# Support Vector Machine

The Support Vector Machine is a supervised machine learning algorithm that performs well even in non-linear situations. It is available in Excel through XLSTAT.

## What is the Support Vector Machine?

The **Support Vector Machine (SVM)** is a **supervised machine learning** technique that was invented by Vapnik and Chervonenkis in the context of statistical **learning** theory (Vapnik and Chervonenkis, 1964). A practical algorithm was not proposed until the 1990s, with the introduction of the **kernel trick** (Boser, Guyon & Vapnik, 1992) and the generalization to the non-separable case (Cortes & Vapnik, 1995). Since then, the SVM has seen numerous developments and gained popularity in areas such as machine learning, optimization, neural networks and functional analysis. It is one of the most successful learning algorithms. Its ability to fit a complex model at the cost of a simple one has made it a key component of machine learning, where it is well known for applications such as text and image recognition.

The SVM aims to find a separation between two classes of objects with the idea that the larger the separation, the more reliable the classification. In its simplest form, the linear and separable case, the algorithm will select a hyperplane that separates the set of observations into two distinct classes in a way that maximizes the distance between the hyperplane and the closest observation of the training set.
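To make the idea of "largest separation" concrete, here is a minimal sketch in plain Python that measures the margin of a candidate hyperplane `w.x + b = 0` on a toy dataset. The data, the function names and the two candidate hyperplanes are illustrative assumptions, not XLSTAT output; the sketch only compares margins and does not perform the optimization itself.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, points, labels):
    """Distance from the hyperplane to the closest observation.

    Labels are assumed to be coded +1 / -1. The result is positive only
    if every observation lies on the correct side of the hyperplane.
    """
    return min(y * signed_distance(w, b, x) for x, y in zip(points, labels))

# Toy, linearly separable data (hypothetical values).
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
labels = [-1, -1, 1, 1]

# Two candidate separators of the form x1 = threshold,
# i.e. w = (1, 0) and b = -threshold.
margin_centered = margin((1.0, 0.0), -2.0, points, labels)
margin_off_center = margin((1.0, 0.0), -1.5, points, labels)

print(margin_centered, margin_off_center)
```

Both hyperplanes classify the toy data correctly, but the centered one leaves a larger distance to the closest observation, which is the criterion the linear SVM maximizes.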

XLSTAT supports both **binary** and **multiclass SVM**. There are two options for multiclass SVM in XLSTAT:

- **One versus one**: one binary model per pair of classes is generated.
- **One versus all**: one binary model per class is generated, where the corresponding class is kept and all the other classes are merged into one class.
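The two multiclass strategies differ in how many binary models they require. The following sketch enumerates the binary problems each strategy would generate for a set of class labels. The labels and function names are hypothetical and do not reflect XLSTAT's internal implementation.

```python
from itertools import combinations

def one_versus_one(classes):
    """One binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))

def one_versus_all(classes):
    """One binary problem per class: that class against all others merged."""
    return [(c, [other for other in classes if other != c]) for c in classes]

classes = ["A", "B", "C", "D"]  # hypothetical class labels

ovo_problems = one_versus_one(classes)
ova_problems = one_versus_all(classes)

print(len(ovo_problems), "one-versus-one models")
print(len(ova_problems), "one-versus-all models")
```

With k classes, one versus one needs k(k-1)/2 models while one versus all needs k, so the former grows faster as the number of classes increases.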

## Support Vector Machine options in XLSTAT

Support Vector Machine is available under the **Machine Learning** menu in XLSTAT.

### SMO parameters

This option lets you tune the optimization algorithm to your specific needs. There are 3 tunable parameters:

- **C**: this is the regularization parameter (see the description for more details);
- **Epsilon**: this is a machine-dependent accuracy parameter; its default value is 1 × 10^-12;
- **Tolerance**: this value defines the tolerance used when comparing 2 values during the optimization. This parameter can be used to speed up computations.

### Preprocessing

This option lets you select how the explanatory data are rescaled. There are 3 options available:

- **Rescaling**: quantitative explanatory variables are rescaled between 0 and 1 using the observed minimum and maximum for each variable;
- **Standardisation**: both qualitative and quantitative explanatory variables are standardized using the sample mean and variance for each variable;
- **None**: no transformation is applied.
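The two transformations can be sketched for a single quantitative variable as follows. This is a minimal illustration, assuming the sample variance uses the n-1 denominator (the text above does not say which denominator XLSTAT uses) and that a constant column is mapped to zeros; the function names are hypothetical.

```python
from statistics import mean, stdev

def rescale(values):
    """Min-max rescaling to [0, 1] using the observed minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardise(values):
    """Centre on the sample mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)  # stdev uses the n-1 denominator
    if s == 0:
        # Assumption: same convention as above for a constant column.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]

values = [2.0, 4.0, 6.0, 8.0]  # hypothetical observations

rescaled = rescale(values)
standardised = standardise(values)

print(rescaled)
print(standardised)
```

Rescaling matters for the SVM because kernels are built from dot products or distances, so a variable with a large range would otherwise dominate the others.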

### Kernel

This option lets you select the kernel you wish to apply to your dataset to extend the feature space. There are 4 kernels available:

- **Linear kernel**: this is the basic linear dot product;
- **Power kernel**: if you select this kernel, you have to enter the coefficient and gamma parameters;
- **RBF kernel**: this is the Radial Basis Function. If you select this kernel, you have to enter the gamma parameter;
- **Sigmoid kernel**: if you select this kernel, you have to enter the coefficient and gamma parameters.
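For reference, the four kernels can be written in a few lines of plain Python. The formulas below are the common textbook forms and are an assumption: the text above does not give XLSTAT's exact parametrization, and in particular the `degree` argument of the power kernel is a hypothetical parameter added for illustration.

```python
import math

def dot(x, y):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def power_kernel(x, y, gamma, coef, degree=3):
    # Assumed form: (gamma * <x, y> + coef) ** degree
    return (gamma * dot(x, y) + coef) ** degree

def rbf_kernel(x, y, gamma):
    # Assumed form: exp(-gamma * ||x - y||^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, y, gamma, coef):
    # Assumed form: tanh(gamma * <x, y> + coef)
    return math.tanh(gamma * dot(x, y) + coef)

x, y = (1.0, 2.0), (3.0, 4.0)  # hypothetical observations

k_linear = linear_kernel(x, y)
k_power = power_kernel(x, y, gamma=0.5, coef=1.0, degree=2)
k_rbf = rbf_kernel(x, y, gamma=0.5)
k_sigmoid = sigmoid_kernel(x, y, gamma=0.1, coef=0.0)

print(k_linear, k_power, k_rbf, k_sigmoid)
```

In these forms, gamma scales how quickly similarity changes with the dot product or the distance, and the coefficient shifts the dot product before the non-linearity is applied.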

### Validation set options

- Random: The observations are randomly selected. The “Number of observations” N must then be specified.
- N last rows: The N last observations are selected for the validation. The “Number of observations” N must then be specified.
- N first rows: The N first observations are selected for the validation. The “Number of observations” N must then be specified.
- Group variable: If you choose this option, you need to select a binary variable with only 0s and 1s. The 1s identify the observations to use for the validation.
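The four selection modes above can be sketched as a function that returns row indices for the training and validation samples. The function name, the mode strings and the `seed` argument are hypothetical and only illustrate the logic; they are not XLSTAT options.

```python
import random

def split_indices(n_rows, method, n=None, group=None, seed=None):
    """Return (training, validation) row indices for the chosen method."""
    if method == "random":
        # N observations drawn at random without replacement.
        validation = sorted(random.Random(seed).sample(range(n_rows), n))
    elif method == "n_last":
        validation = list(range(n_rows - n, n_rows))
    elif method == "n_first":
        validation = list(range(n))
    elif method == "group":
        # Binary variable: the 1s mark the validation observations.
        validation = [i for i, flag in enumerate(group) if flag == 1]
    else:
        raise ValueError(f"unknown method: {method}")
    held_out = set(validation)
    training = [i for i in range(n_rows) if i not in held_out]
    return training, validation

train_last, valid_last = split_indices(10, "n_last", n=3)
train_first, valid_first = split_indices(10, "n_first", n=3)
train_group, valid_group = split_indices(5, "group", group=[0, 1, 0, 0, 1])
train_rand, valid_rand = split_indices(10, "random", n=3, seed=0)

print(valid_last, valid_first, valid_group, valid_rand)
```

Whatever the mode, each observation ends up in exactly one of the two samples, so validation metrics are computed on data the classifier was not trained on.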

## Support Vector Machine results in XLSTAT

### Results regarding the classifier

A summary description of the optimized classifier is displayed. It indicates the positive and negative classes, the training sample size, and the two results of the optimization: the bias b and the number of support vectors.

### Results regarding the list of support vectors

For each identified support vector, a table displays the value of the class, the optimized value of alpha, and the rescaled explanatory variables as they were used during the optimization.
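The quantities in this table, together with the bias b, are what a kernel SVM needs to classify a new observation. The sketch below uses the standard decision function f(x) = sum of alpha_i * y_i * K(x_i, x) + b. It assumes classes coded as +1 / -1 and a bias that is added rather than subtracted; XLSTAT's sign convention is not stated here, and the support vectors and values are hypothetical.

```python
def linear_kernel(x, y):
    """Dot product, used here as the simplest possible kernel."""
    return sum(a * b for a, b in zip(x, y))

def decision_value(x, support_vectors, bias, kernel):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + bias."""
    return sum(alpha * y * kernel(sv, x) for y, alpha, sv in support_vectors) + bias

def predict(x, support_vectors, bias, kernel):
    """Positive class (+1) if f(x) >= 0, negative class (-1) otherwise."""
    return 1 if decision_value(x, support_vectors, bias, kernel) >= 0 else -1

# Hypothetical table rows: (class, alpha, rescaled explanatory variables).
support_vectors = [
    (-1, 0.5, (1.0, 0.0)),
    (+1, 0.5, (3.0, 0.0)),
]
bias = -2.0  # hypothetical value of b

value_far_positive = decision_value((4.0, 0.0), support_vectors, bias, linear_kernel)
value_on_margin = decision_value((3.0, 0.0), support_vectors, bias, linear_kernel)
predicted_negative = predict((0.0, 0.0), support_vectors, bias, linear_kernel)

print(value_far_positive, value_on_margin, predicted_negative)
```

A new observation must be rescaled in the same way as the training data before it is passed to the decision function, which is why the table reports the rescaled variables.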

### Results regarding the confusion matrices

The confusion matrix is deduced from the prior (observed) and posterior (predicted) classifications. It is displayed together with the overall percentage of well-classified observations.

### Results regarding the performance metrics

There are 9 classification metrics displayed if this option is active:

Accuracy, Precision, Recall, F-score, Specificity, FPR, Prevalence, Cohen's kappa, NER.

Indicators in the first column correspond to the training sample and those in the second column to the validation sample (if activated).
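Most of these metrics follow directly from the four cells of a binary confusion matrix. The sketch below uses the usual textbook definitions, with the F-score taken as the F1 score (an assumption, since the weighting is not stated above). NER is left out because it is not defined in this section. The counts are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Common binary classification metrics from confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Agreement expected by chance, used by Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / (precision + recall),  # F1 assumed
        "specificity": tn / (tn + fp),
        "fpr": fp / (fp + tn),
        "prevalence": (tp + fn) / n,
        "kappa": (accuracy - expected) / (1 - expected),
    }

# Hypothetical counts: true positives, false positives,
# false negatives, true negatives.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that specificity and FPR always sum to 1, so they carry the same information from two different angles.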

### Results regarding the predicted classes

The predicted classes obtained using the SVM classifier are displayed for the training, validation and prediction datasets (if activated).

## References

**Vapnik, V. & Chervonenkis, A. (1964).** A note on one class of perceptrons. Automation and Remote Control, 25.

**Boser, B., Guyon, I. & Vapnik, V. (1992).** A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop of Computational Learning Theory, 5, 144-152, Pittsburgh, ACM.

**Cortes, C. & Vapnik, V. (1995).** Support-Vector Networks. Machine Learning, 20, 273-297.