# Courses and Webinars

Addinsoft organizes public (inter-company) and private (intra-company) sessions for all levels. You can register for one of our scheduled courses or contact us for a customized training course. All of our courses are available in virtual classrooms.

## Multivariate analysis and classification (PCA, FCA, MCA, HCA, k-means, DFA) with XLSTAT

Master the most commonly used multivariate data analysis methods and classification techniques with this course. The training is intended for anyone interested in studying the relationships among several variables or in grouping objects with common characteristics.

ENGLISH

4 days
28 HOURS


### Documents

This course is intended for people who wish to master the concepts and implementation of multivariate factor analysis. The objective of these analyses is to extract information from data that has a large number of variables, has a large number of individuals, is unstructured, or contains redundant variables (possible confusion between variables).

Main topics covered in this training:

• Principal Component Analysis (PCA)
• Simple Factorial Correspondence Analysis (FCA)
• Multiple Correspondence Analysis (MCA)
• Hierarchical Ascending Classification (HCA)
• k-means
• Discriminant Factor Analysis (DFA)

Required experience:

Participants must have a good knowledge of basic statistical tools: correlation, standard deviation, variance, confidence intervals, hypothesis testing.

Syllabus:

### Introduction to the various methods of multivariate analysis

• Limitations of classical statistics
• Fields of application of the different methods of multivariate analysis
• Introduction to data mining:
  • Description objectives
  • Prediction objectives
• Dataset structure
• Presentation of the range of methods:
  • Principal Component Analysis
  • Simple and multiple correspondence factor analysis
  • Canonical correlation analysis
  • Discriminant Factor Analysis
  • Classification methods: hierarchical ascending classification, k-means
• General principles of the different methods - Notions of:
  • Distance
  • Inertia and variance
  • Factorial Axes

### Concept of correlation

• Definition of correlation coefficient
• Interpretation of the value of the correlation coefficient
• Sources of confusion: correlation, causation, slope
• The different correlation coefficients:
  • Pearson's Coefficient
  • Spearman's Coefficient
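
As an illustration of the two coefficients listed above, here is a minimal sketch in plain Python (not XLSTAT, and not part of the course material). The function names and the sample data are assumptions chosen for the example. It computes Pearson's coefficient from deviations around the mean, and Spearman's coefficient as Pearson's coefficient applied to ranks.

```python
import math


def pearson(x, y):
    """Pearson's coefficient: co-deviation scaled by both spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's coefficient: Pearson's coefficient computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

On this data the relationship is monotonic but curved, so Spearman's coefficient reaches 1 while Pearson's stays below 1. This is one way to see why the two coefficients answer slightly different questions.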

### Implementation of a Principal Component Analysis (PCA)

• Dataset structure and application context
• Interpretation of software outputs

### Implementation of a k-means classification

• Presentation of the objectives of the k-means method
• Advantages and disadvantages of HCA and k-means
• Cluster determination
• Presentation of the different versions of the algorithm
• Using k-means as a complement to PCA
• Classification on large datasets
• Implementation tips
• Interpretation of software outputs
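
To make the k-means topics above concrete, the following is a minimal sketch of one common version of the algorithm (Lloyd's iteration) in plain Python. It is not how XLSTAT implements the method. The function names, the starting centroids, and the sample points are all assumptions made for the example. The points could just as well be the scores of individuals on the first principal components.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100):
    """Lloyd's iteration: assign points, update centroids, repeat until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


def within_cluster_inertia(points, labels, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


# Hypothetical 2-D data with two visibly separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
inertia = within_cluster_inertia(points, labels, centroids)
print(labels)
print(centroids)
print(round(inertia, 3))
```

The result depends on the starting centroids, which is one reason implementations usually offer several initialization strategies or repeated runs.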

### Implementation of a Discriminant Factor Analysis (DFA)

• Dataset structure and application context
• Detailed objectives of DFA
• Concepts of classification and discrimination
• DFA Methodology
• Comparison with PCA
• Interpretation of software outputs: factorial circle, correlations between variables and axes
• Quality of the DFA (of the discrimination obtained):
  • Univariate and multivariate tests (Wilks' lambda)
  • Graph of individuals
  • Confusion matrix (and possibly ROC curve)
• Errors to avoid
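
The confusion matrix mentioned above can be sketched in a few lines of plain Python. This is an illustration only, not XLSTAT output. The class labels, the predictions, and the function name are hypothetical. Rows are the actual classes, columns are the classes predicted by the discriminant model, and the diagonal holds the correctly classified individuals.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Rows: actual class. Columns: predicted class."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


# Hypothetical group memberships and model predictions.
classes = ["A", "B", "C"]
actual = ["A", "A", "A", "B", "B", "C", "C", "C"]
predicted = ["A", "A", "B", "B", "B", "C", "A", "C"]

matrix = confusion_matrix(actual, predicted, classes)
# Share of individuals assigned to their actual class.
accuracy = sum(matrix[i][i] for i in range(len(classes))) / len(actual)

for label, row in zip(classes, matrix):
    print(label, row)
print("Correctly classified:", accuracy)
```

Off-diagonal cells show which groups the model confuses with one another, which is often more informative than the overall rate alone.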