Supervising Nyström Methods via Negative Margin Support Vector Selection
Pattern recognition on big data can be challenging for kernel machines, as the complexity grows at least quadratically with the number of training samples. Recently, methods that approximate explicit, low-dimensional feature mappings for kernel functions, such as the Nyström methods and Random Fourier Features, have been applied to overcome this hurdle. The Nyström methods, in particular, create the feature mappings from pairwise comparisons with the training data. However, in previous works the Nyström methods are generally applied without the supervision provided by the training labels in classification/regression problems. This leads to pairwise comparisons with randomly chosen training samples rather than with the most informative ones. In contrast, this work studies a supervised Nyström method that chooses the subset of samples that is critical for the success of the machine learning model. In particular, we select the Nyström support vectors via a negative margin criterion and construct explicit feature maps that are better suited to the classification task at hand. Experimental results on small- to large-scale data sets show that our methods can significantly improve the classification performance achieved via kernel approximation methods and, at times, even exceed the performance of the full-dimensional kernel machines.
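The following is a minimal sketch of the pipeline the abstract describes: train a preliminary kernel SVM, keep the training points with the most negative functional margins as Nyström landmarks, and build an explicit feature map from kernel comparisons against those landmarks. The helper names (`select_negative_margin_landmarks`, `nystrom_features`), the choice of an RBF kernel, and all parameter values are illustrative assumptions, not the paper's exact recipe.

```python
# Hedged sketch of a supervised Nystrom feature map via negative-margin
# landmark selection. Function names, kernel choice, and parameters are
# assumptions for illustration; they are not taken from the paper.
import numpy as np
from sklearn.svm import SVC, LinearSVC
from sklearn.metrics.pairwise import rbf_kernel

def select_negative_margin_landmarks(X, y, n_landmarks, gamma=1.0):
    """Pick the training points with the smallest (most negative)
    functional margins y_i * f(x_i) under a preliminary kernel SVM."""
    svm = SVC(kernel="rbf", gamma=gamma).fit(X, y)
    margins = y * svm.decision_function(X)  # functional margin per sample
    order = np.argsort(margins)             # most-violated points first
    return X[order[:n_landmarks]]

def nystrom_features(X, landmarks, gamma=1.0, eps=1e-8):
    """Explicit Nystrom map Phi(X) = K(X, L) K(L, L)^{-1/2}."""
    K_mm = rbf_kernel(landmarks, landmarks, gamma=gamma)
    K_nm = rbf_kernel(X, landmarks, gamma=gamma)
    # symmetric inverse square root via eigendecomposition
    w, V = np.linalg.eigh(K_mm)
    w = np.maximum(w, eps)                  # guard tiny/negative eigenvalues
    return K_nm @ (V @ np.diag(w ** -0.5) @ V.T)

# Usage on toy data with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = np.sign(X[:, 0] + 0.3 * rng.normal(size=500)).astype(int)

L = select_negative_margin_landmarks(X, y, n_landmarks=50, gamma=0.1)
Phi = nystrom_features(X, L, gamma=0.1)
clf = LinearSVC().fit(Phi, y)  # linear model on explicit Nystrom features
```

A linear model on `Phi` then plays the role of the approximated kernel machine; the intended gain is that landmarks chosen by margin violation cover the decision boundary better than uniformly random ones.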