Support Vector Machine

The Support Vector Machine is a supervised machine learning algorithm that performs well even in non-linear situations. Available in Excel using XLSTAT.

Use this method to perform a binary classification, a multi-class classification or a regression on a set of observations described by qualitative and/or quantitative variables (predictors).

What is the Support Vector Machine?

The Support Vector Machine (SVM) is a supervised machine learning technique invented by Vapnik and Chervonenkis in the context of statistical learning theory (Vapnik and Chervonenkis, 1964). It was not until the 1990s that a practical algorithmic implementation of the SVM was proposed, with the introduction of the kernel trick (Boser, Guyon, & Vapnik, 1992) and the generalization to the non-separable case (Cortes & Vapnik, 1995). Since then, the SVM has seen numerous developments and gained popularity in areas such as machine learning, optimization, neural networks and functional analysis. It is one of the most successful learning algorithms. Its ability to compute a complex model at the cost of a simple one has made it a key component of machine learning, where it has become well known for applications such as text and image recognition.

Binary classification

The SVM aims to find a separation between two classes of objects with the idea that the larger the separation, the more reliable the classification. In its simplest form, the linear and separable case, the algorithm will select a hyperplane that separates the set of observations into two distinct classes in a way that maximizes the distance between the hyperplane and the closest observation of the training set.
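As a rough illustration of the margin idea, the sketch below computes the distance between a hyperplane and the closest training observation for two candidate hyperplanes. The toy data and the names `margin`, `margin_a` and `margin_b` are hypothetical. It only compares two given hyperplanes and does not perform the optimization an SVM solver would run.

```python
import math


def margin(w, b, points):
    """Smallest distance from any point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x in points
    )


# Toy 2-D training set (hypothetical data): two linearly separable classes.
negatives = [(0.0, 0.0), (1.0, 0.0)]
positives = [(3.0, 0.0), (4.0, 1.0)]
points = negatives + positives

# Two candidate separating hyperplanes of the form x1 = c.
margin_a = margin((1.0, 0.0), -2.0, points)  # x1 = 2, midway between the classes
margin_b = margin((1.0, 0.0), -1.5, points)  # x1 = 1.5, closer to the negatives

# The SVM would prefer the hyperplane with the larger margin.
print(margin_a, margin_b)
```

Both hyperplanes separate the two classes, but the first one leaves more room on each side, which is the property the SVM maximizes.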

Multi-class classification

Because the SVM natively handles only binary problems, several methods have been developed to solve multi-class problems. They all rely on the same principle: transform the multi-class problem into several binary problems. XLSTAT offers two methods for multi-class problems:

One versus one: one binary model is generated per pair of classes, i.e. K(K-1)/2 models for K classes.
One versus all: one binary model is generated per class, where the corresponding class is kept and all the other classes are merged into a single class.
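The two decompositions can be sketched as follows. The function names and class labels are hypothetical, and the +1/-1 coding of the kept and merged classes is an assumption. The sketch only shows how the binary problems are built, not how XLSTAT combines their predictions.

```python
from itertools import combinations


def one_versus_one(classes):
    """Return one (class_a, class_b) pair per binary model."""
    return list(combinations(classes, 2))


def one_versus_all(labels, kept_class):
    """Relabel observations: +1 for the kept class, -1 for all merged others."""
    return [1 if label == kept_class else -1 for label in labels]


classes = ["setosa", "versicolor", "virginica"]  # hypothetical class labels

# One versus one: 3 classes give 3 * 2 / 2 = 3 binary models.
pairs = one_versus_one(classes)

# One versus all: the binary problem built for the class "setosa".
labels = ["setosa", "virginica", "versicolor", "setosa"]
relabeled = one_versus_all(labels, "setosa")

print(pairs)
print(relabeled)
```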

Regression

The SVM method has been generalized to regression problems and time series prediction. The training set is {(x_i, y_i)} for i = 1, ..., N, where x_i is the vector of predictors of observation i and y_i ∈ ℝ.

Support Vector Machine options in XLSTAT

Support Vector Machine is available under the Machine Learning menu in XLSTAT.

SMO parameters

This option lets you tune the optimization algorithm to your specific needs. There are 3 tunable parameters:

C: this is the regularization parameter (see the description for more details);
Epsilon: this is a machine-dependent accuracy parameter; its default value is 1 × 10^-12;
Tolerance: this value defines the tolerance used when comparing 2 values during the optimization. This parameter can be used to speed up computations.

Preprocessing

This option lets you select how the explanatory data are rescaled. There are 3 options available:

Rescaling: quantitative explanatory variables are rescaled between 0 and 1 using the observed minimum and maximum for each variable;
Standardisation: both qualitative and quantitative explanatory variables are standardized using the sample mean and variance for each variable;
None: no transformation is applied.
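The first two options can be sketched for a single quantitative variable as below. The function names and data are hypothetical. Using the n-1 denominator for the sample variance is an assumption, since the text does not specify which estimator is used.

```python
import statistics


def rescale(values):
    """Min-max rescaling to [0, 1] using the observed minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant variable: nothing to spread out, map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Center on the sample mean and divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # n-1 denominator: an assumption
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


heights = [150.0, 160.0, 170.0, 180.0, 190.0]  # hypothetical variable
rescaled = rescale(heights)
standardized = standardize(heights)

print(rescaled)
print(standardized)
```

After rescaling, the smallest value maps to 0 and the largest to 1. After standardization, the variable has mean 0 and unit sample variance.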

Kernel

This option lets you select the kernel to apply to your dataset in order to extend the feature space. There are 4 kernels available:

Linear kernel: this is the basic linear dot product;
Power kernel: if you select this kernel, you have to enter the coefficient and gamma parameters;
RBF kernel: this is the Radial Basis Function. If you select this kernel, you have to enter the gamma parameter;
Sigmoid kernel: if you select this kernel, you have to enter the coefficient and gamma parameters.
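For orientation, the sketch below implements commonly used textbook forms of these four kernels. The exact formulas used by XLSTAT are not given in this text, so the expressions below are assumptions. In particular, the `degree` argument of the power kernel is hypothetical and is not one of the parameters listed above.

```python
import math


def dot(x, y):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def power_kernel(x, y, gamma, coef, degree=2):
    # Assumed polynomial form: (gamma * <x, y> + coef) ** degree
    return (gamma * dot(x, y) + coef) ** degree


def rbf_kernel(x, y, gamma):
    # Assumed Gaussian form: exp(-gamma * ||x - y||^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma, coef):
    # Assumed form: tanh(gamma * <x, y> + coef)
    return math.tanh(gamma * dot(x, y) + coef)


x = (1.0, 2.0)
y = (3.0, 0.5)

print(linear_kernel(x, y))
print(power_kernel(x, y, gamma=0.5, coef=1.0))
print(rbf_kernel(x, y, gamma=0.5))
print(sigmoid_kernel(x, y, gamma=0.5, coef=1.0))
```

Under these assumed forms, a larger gamma makes the RBF kernel more local: the similarity between two observations decays faster as they move apart.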

Validation set options

Random: The observations are randomly selected. The “Number of observations” N must then be specified.
N last rows: The last N observations are selected for the validation. The “Number of observations” N must then be specified.
N first rows: The first N observations are selected for the validation. The “Number of observations” N must then be specified.
Group variable: If you choose this option, you need to select a binary variable with only 0s and 1s. The 1s identify the observations to use for the validation.
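The four selection modes can be sketched as a single helper that returns the row indices held out for validation. The function name, the mode strings and the `seed` argument are hypothetical. This is an illustration of the logic, not XLSTAT's implementation.

```python
import random


def validation_indices(n_rows, mode, n=None, group=None, seed=None):
    """Return the sorted 0-based row indices used for validation."""
    if mode == "random":
        # Draw n distinct rows at random.
        return sorted(random.Random(seed).sample(range(n_rows), n))
    if mode == "n_last":
        return list(range(n_rows - n, n_rows))
    if mode == "n_first":
        return list(range(n))
    if mode == "group":
        # group is a 0/1 variable; the 1s mark validation rows.
        return [i for i, flag in enumerate(group) if flag == 1]
    raise ValueError(f"unknown mode: {mode}")


random_rows = validation_indices(10, "random", n=3, seed=0)
last_rows = validation_indices(10, "n_last", n=3)
first_rows = validation_indices(10, "n_first", n=3)
group_rows = validation_indices(5, "group", group=[0, 1, 0, 1, 1])

print(random_rows, last_rows, first_rows, group_rows)
```

All remaining rows form the training sample.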

Support Vector Machine results in XLSTAT

Results regarding the classifier

A summary description of the optimized classifier is displayed. It indicates the positive and negative classes, the training sample size, the optimized bias b and the number of support vectors.

Results regarding the list of support vectors

For each identified support vector, a table displays the value of the class, the optimized value of alpha and the rescaled explanatory variables as they were used during the optimization.
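To show how the class, alpha and bias values fit together, the sketch below evaluates the standard dual-form decision function f(x) = Σ alpha_i · y_i · K(x_i, x) + b on a new observation. The support vectors, the RBF kernel form and the sign convention for b are assumptions for illustration. The text does not state the exact convention XLSTAT uses.

```python
import math


def rbf_kernel(x, y, gamma):
    """Assumed Gaussian kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def decision_value(x, support_vectors, gamma, bias):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b over the support vectors."""
    return sum(
        alpha * label * rbf_kernel(sv, x, gamma)
        for label, alpha, sv in support_vectors
    ) + bias


def predict(x, support_vectors, gamma, bias):
    """Positive class (+1) if f(x) >= 0, negative class (-1) otherwise."""
    return 1 if decision_value(x, support_vectors, gamma, bias) >= 0 else -1


# Hypothetical support vector table: (class, alpha, rescaled predictors).
support_vectors = [
    (+1, 0.8, (1.0, 1.0)),
    (-1, 0.8, (-1.0, -1.0)),
]

near_positive = predict((0.9, 1.1), support_vectors, gamma=0.5, bias=0.0)
near_negative = predict((-1.0, -0.8), support_vectors, gamma=0.5, bias=0.0)

print(near_positive, near_negative)
```

Only the support vectors enter the sum, which is why the classifier can be fully described by this table together with the bias.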

Results regarding the confusion matrices

The confusion matrix is computed from the prior (observed) and posterior (predicted) classifications and is displayed together with the overall percentage of correctly classified observations.
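A minimal sketch of this computation is given below, with rows for the observed classes and columns for the predicted ones. The function name, the nested-dictionary layout and the data are hypothetical.

```python
from collections import Counter


def confusion_matrix(observed, predicted):
    """Return (matrix, percent_correct); matrix[obs][pred] is a count."""
    counts = Counter(zip(observed, predicted))
    classes = sorted(set(observed) | set(predicted))
    matrix = {o: {p: counts[(o, p)] for p in classes} for o in classes}
    correct = sum(counts[(c, c)] for c in classes)
    percent_correct = 100.0 * correct / len(observed)
    return matrix, percent_correct


observed = ["yes", "yes", "no", "no", "yes", "no"]   # hypothetical classes
predicted = ["yes", "no", "no", "no", "yes", "yes"]

matrix, percent_correct = confusion_matrix(observed, predicted)

print(matrix)
print(round(percent_correct, 2))
```

The diagonal cells count the correctly classified observations. Their share of the total gives the overall percentage.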

Results regarding the performance metrics

There are 9 classification metrics displayed if this option is active:

Accuracy, Precision, Recall, F-score, Specificity, FPR, Prevalence, Cohen's kappa, NER.

Indicators in the first column correspond to the training sample and those in the second column to the validation sample (if activated).
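Most of these metrics follow directly from the four cells of a binary confusion matrix. The sketch below uses the usual textbook definitions, which are assumed rather than taken from XLSTAT. The F-score is computed as F1, and NER is left out because this text does not define it. The counts are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Textbook binary metrics; all denominators are assumed to be nonzero."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)  # F1 (assumed)
    specificity = tn / (tn + fp)
    fpr = fp / (fp + tn)
    prevalence = (tp + fn) / total
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "specificity": specificity,
        "fpr": fpr,
        "prevalence": prevalence,
        "kappa": kappa,
    }


# Hypothetical counts: true/false positives and negatives.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```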

Results regarding the predicted classes

The predicted classes obtained using the SVM classifier are displayed for the training, validation and prediction datasets (if activated).

References

Vapnik, V. & Chervonenkis, A. (1964). A note on one class of perceptrons. Automation and Remote Control, 25.

Boser, B., Guyon, I., & Vapnik, V. (1992). A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 144-152, Pittsburgh, ACM.

Cortes, C. & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273-297.
