Trainings & Webinars

Addinsoft organizes public (inter-company) and private (intra-company) sessions for all levels. You can register for one of our scheduled courses or contact us for a customized training course. All of our courses are available in virtual classrooms.

Multivariate analysis and classification (PCA, FCA, MCA, HCA, k-means, DFA) with XLSTAT

Master the most commonly used multivariate data analysis methods and classification techniques with this course. The training is intended for anyone interested in studying the relationships between several variables or in grouping objects with common characteristics.

ENGLISH

4 days
28 HOURS

This course is intended for people who wish to master the concepts and implementation of multivariate factor analysis. The objective of these analyses is to extract information from data that:

• Has a large number of variables
• Has a large number of individuals
• Is unstructured
• Contains redundant variables (possible confusion between variables)

Main topics covered in this training:

• Principal Component Analysis (PCA)
• Simple Correspondence Factorial Analysis (CFA)
• Multiple Correspondence Factorial Analysis (MCFA)
• Hierarchical Ascending Classification (HAC)
• k-means
• Discriminant Factor Analysis (DFA)

Required experience:

Participants must have a good knowledge of basic statistical tools: correlation, standard deviation, variance, confidence intervals, hypothesis testing.

Syllabus:

Introduction to the various methods of multivariate analysis

• Limitations of classical statistics
• Fields of application of the different methods of multivariate analysis
• Introduction to data mining:
  • Description objectives
  • Prediction objectives
• Dataset structure
• Presentation of the range of methods:
  • Principal Component Analysis
  • Simple and multiple correspondence factor analysis
  • Canonical correlation analysis
  • Discriminant Factor Analysis
  • Classification methods: hierarchical ascending classification, k-means
• General principles of the different methods - Notions of:
  • Distance
  • Inertia and variance
  • Factorial axes

Concept of correlation

• Definition of correlation coefficient
• Interpretation of the value of the correlation coefficient
• Sources of confusion: correlation, causation, slope
• The different correlation coefficients:
  • Pearson's coefficient
  • Spearman's coefficient
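
As an illustration of how the two coefficients differ, here is a minimal Python sketch. It is not XLSTAT output and is not part of the course material; the function names and the small dataset are hypothetical. Pearson's coefficient measures linear association, while Spearman's coefficient is Pearson's coefficient computed on ranks, so it captures any monotonic relationship.

```python
import math

def pearson(x, y):
    """Pearson's coefficient: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's coefficient: Pearson's coefficient applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y increases with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print("Pearson :", round(pearson(x, y), 3))
print("Spearman:", round(spearman(x, y), 3))
```

On this data the ranks of x and y agree perfectly, so Spearman's coefficient equals 1, while Pearson's coefficient stays below 1 because the relationship is not a straight line.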

Implementation of a Principal Component Analysis (PCA)

• Dataset structure and application context
• Interpretation of software outputs

Implementation of a k-means classification

• Presentation of the objectives of the k-means method
• Advantages and disadvantages of hierarchical ascending classification (HCA) and k-means
• Cluster determination
• Presentation of the different versions of the algorithm
• Using k-means as a complement to PCA
• Classification on large datasets
• Implementation tips
• Interpretation of software outputs
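
To make the method concrete, the following Python sketch shows one common version of k-means (Lloyd's iterations). It is an illustration under stated assumptions, not the XLSTAT implementation: the function names, the farthest-point initialization and the six sample points are all hypothetical. In practice the points could be the coordinates of individuals on the first PCA axes.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def init_centers(points, k):
    """Farthest-point initialization (an assumption, chosen to be deterministic)."""
    centers = [points[0]]
    while len(centers) < k:
        # Add the point whose nearest existing center is farthest away.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    """Alternate assignment and center update until the labels stop changing."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical data: two well-separated groups in two dimensions.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

The result of k-means depends on the starting centers, which is one reason several versions of the algorithm exist; a deterministic initialization is used here only so that the sketch gives the same result on every run.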

Implementation of a Discriminant Factor Analysis (DFA)

• Dataset structure and application context
• Detailed objectives of DFA
• Concepts of classification and discrimination
• DFA Methodology
• Comparison with PCA
• Interpretation of software outputs: factorial circle, correlations between variables and axes
• Quality of the DFA (of the discrimination obtained):
  • Univariate and multivariate tests (Wilks' lambda)
  • Graph of individuals
  • Confusion matrix (and possibly ROC curve)
• Errors to avoid
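
As a small illustration of the confusion matrix mentioned above, this Python sketch cross-tabulates observed classes against predicted classes. The class labels and predictions are hypothetical and do not come from a real DFA; the function name is also an assumption.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Rows are observed classes, columns are predicted classes."""
    classes = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    matrix = [[counts[(a, p)] for p in classes] for a in classes]
    return classes, matrix

# Hypothetical observed and predicted group memberships.
actual = ["A", "A", "A", "B", "B", "C"]
predicted = ["A", "A", "B", "B", "B", "C"]

classes, matrix = confusion_matrix(actual, predicted)
# Share of well-classified individuals: the diagonal divided by the total.
accuracy = sum(matrix[i][i] for i in range(len(classes))) / len(actual)

print(classes)
for row in matrix:
    print(row)
print(round(accuracy, 3))
```

The diagonal holds the correctly classified individuals, and the off-diagonal cells show which groups are confused with one another.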

TRAINER PROFILES

Thierry Anthouard

Statistical instructor

Thierry Anthouard is the head of the Arkesys Group's statistical training program and has always been passionate about the field of statistics. In 1992, he launched the development of the Arkesys Group's statistics training program. His "by example" pedagogical approach allows him to popularize statistics and make it accessible to all learners. As a consultant supporting key accounts, he adapts to all types of contexts and learning issues.

Jérôme-Philippe Garsi

Statistical instructor

Jérôme-Philippe Garsi is a statistical instructor with 13 years of experience in the training field. Since completing his doctorate on clinical issues, he has focused his work mainly on the interests of populations, their health and their well-being. At ease with any audience, he makes pedagogy and the simplification of scientific knowledge a priority. To that end, he always takes the greatest care to be clear in his written documents as well as in his oral presentations.