Abstract: Artificial neural networks are known for their ability to approximate complex, nonlinear, hard-to-express functions, especially on images, where convolutional neural networks (CNNs) are used, but they are particularly sensitive to the choice of hyperparameters. Finding the best hyperparameters for a given task requires training a set of networks, which demands expert knowledge and substantial computing resources. Although modern neural network architectures have significantly reduced the number of hyperparameters, the remaining ones must still be found by trial and error. Many automatic hyperparameter search algorithms have been proposed, and many of them are based on the principle of Bayesian optimization (BO). Although such methods significantly speed up and simplify the search for the best model, they still require at least a dozen trials, which is expensive, especially when working with large datasets. Moreover, not every element of a dataset is equally valuable, so filtering out the least significant elements from the training set can speed up hyperparameter selection, reducing both training time and cost. The SNT training method presented and analyzed in this paper combines BO-based hyperparameter search with the filtering of irrelevant data elements in order to minimize the cost of finding the best model without large additional computing resources. Data filtering is performed according to the number of forgetting events per example, a dynamic estimate of instance complexity, and a newly proposed measure, first-look hardness. To demonstrate the effectiveness of the methods, they are tested on two publicly available datasets. Empirical results show that, in certain cases, the amount of training data can be reduced by up to 85% while maintaining model accuracy, thereby reducing the training cost several times.
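The abstract does not spell out the filtering procedure itself, so the following is only a hedged sketch of one of the named criteria: counting forgetting events (correct-to-incorrect transitions across training epochs, in the spirit of Toneva et al., 2019) from a recorded per-epoch correctness matrix and keeping only the most-forgotten fraction of examples. The function names, the `keep_fraction` parameter, and the synthetic history are illustrative assumptions, not the paper's SNT implementation.

```python
import numpy as np

def forgetting_counts(correct_history):
    """Count forgetting events per example.

    correct_history: bool array of shape [n_epochs, n_examples],
    True where the example was classified correctly at that epoch.
    A forgetting event is a transition from correct to incorrect.
    """
    h = np.asarray(correct_history, dtype=bool)
    transitions = h[:-1] & ~h[1:]  # True where correct -> incorrect
    return transitions.sum(axis=0)

def keep_indices(correct_history, keep_fraction=0.15):
    """Return indices of the most-forgotten examples to retain.

    keep_fraction=0.15 mirrors the up-to-85% reduction reported in the
    abstract; it is a tunable assumption, not a prescribed value.
    """
    h = np.asarray(correct_history, dtype=bool)
    counts = forgetting_counts(h).astype(float)
    # Examples that were never classified correctly are treated as
    # maximally forgotten and always retained.
    counts[~h.any(axis=0)] = np.inf
    n_keep = max(1, int(h.shape[1] * keep_fraction))
    order = np.argsort(-counts, kind="stable")  # most forgotten first
    return np.sort(order[:n_keep])

# Illustrative usage with a synthetic correctness history (stand-in for
# per-epoch evaluation logs gathered during an early training run).
rng = np.random.default_rng(0)
history = rng.random((10, 1000)) > 0.3
subset = keep_indices(history, keep_fraction=0.15)
print(f"kept {len(subset)} of {history.shape[1]} examples")
```

In a pipeline like the one the abstract describes, a subset selected this way would replace the full training set during the BO trials, so each trial trains on a fraction of the data and the overall search cost drops accordingly.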