# Trainings & Webinars

Addinsoft organizes public (inter-company) and private (intra-company) sessions for all levels. You can register for one of our scheduled courses or contact us for a customized training course. All of our courses are available in virtual classrooms.

## Predictive models: linear, logistic, PLS and ANCOVA regressions

Do you already have a good understanding of basic statistical tools? Reach the next level by using XLSTAT to develop predictive models in Excel.

4 days
28H

### Documents

This course is intended for people who wish to implement modeling methods. Several types of models are covered: multiple linear regression, ANCOVA-type general linear models, logistic regression (binary or multinomial response), and PLS regression. The objective is to give participants the methodological know-how to carry out these analyses: context and objectives, conditions for use, measurement of model quality, implementation and interpretation of results, etc.

Main topics covered in this training:

• Context and objectives of the different methods
• Goodness of fit and estimation quality of model coefficients (prediction quality)
• Linear regression
• Stepwise regression
• General linear models (ANCOVA, etc.)
• Logistic regression
• PLS regression
• The problem of multicollinearity

Required experience:

Participants must have:

• A good understanding of basic statistical tools: descriptive statistics, confidence intervals, p-value, alpha risk, hypothesis testing, etc.
• Some knowledge of correlation and linear regression

Syllabus:

### Reviewing the concept of correlation

• Defining the correlation coefficient
• Interpreting the value of the correlation coefficient
• Sources of confusion: correlation, causation, slope, etc.
• The different correlation coefficients:
  • Pearson's coefficient
  • Spearman's coefficient
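
To illustrate the difference between the two coefficients, here is a minimal Python sketch (not part of the course material, which uses XLSTAT in Excel). The data and the helper names `pearson`, `ranks` and `spearman` are hypothetical. Pearson's coefficient measures linear association, whereas Spearman's coefficient is Pearson's coefficient computed on ranks, so it reaches 1 for any strictly increasing relationship.

```python
from statistics import mean

def pearson(x, y):
    """Pearson's r: co-variation of x and y scaled by their spreads."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the ranks of x and y."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y increases with x, but not linearly (y = x**3).
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

r_pearson = pearson(x, y)    # below 1: the relationship is not a straight line
r_spearman = spearman(x, y)  # 1: the relationship is perfectly monotonic
print(round(r_pearson, 3), round(r_spearman, 3))
```

On this example, Spearman's coefficient equals 1 while Pearson's stays below 1, which is one reason the two should not be read interchangeably.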

### Simple linear regression-type modeling

• Mathematical principles and concepts inherent to simple linear regression
• Hypothesis testing of the significance of the model
• Quality of the model:
  • Coefficient of determination R², adjusted R², R² Prev
• Use of the model:
  • Prediction of individual values
  • Confidence intervals of predictions
• Graphical treatment of the results
• Mathematical principles and concepts inherent to multiple linear regression
• Model inference, variable inference (Fisher's F statistic)
• Residual analysis:
  • Residual calculations
  • Physical and statistical significance
  • Homogeneity
  • Distribution, normality
  • Suspect values
  • Graphical analysis
• Suspect values and influential points:
  • Residuals: Studentized residuals
  • Leverage
  • Cook's distance
• Model quality:
  • Goodness of fit, coefficient of determination R², adjusted R²
  • Prediction quality, estimation error
• Use of the model:
  • Prediction (forecast) of individual values
  • Confidence intervals of predictions (forecasts)
  • Graphic illustration of results

### Multiple regression model

• Significance of the coefficients
• Hierarchy of the coefficients
• Problems related to multicollinearity
• Measures of collinearity:
  • Correlation coefficient
  • VIF
• Solving multicollinearity problems
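
As an illustration of the VIF (variance inflation factor), here is a minimal Python sketch, again outside the course material. It assumes the usual definition VIF = 1 / (1 - R²), where R² comes from regressing one explanatory variable on the others. With only two explanatory variables, that R² is simply their squared Pearson correlation, which is the special case handled here. The data and the function name `vif_two_predictors` are hypothetical.

```python
from statistics import mean

def squared_correlation(x, y):
    """Squared Pearson correlation between two variables."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two explanatory variables.

    In that case the R² of x1 regressed on x2 equals their squared
    correlation, so both variables share the same VIF.
    """
    return 1.0 / (1.0 - squared_correlation(x1, x2))

# Hypothetical data: x2 is almost exactly 2 * x1, x3 is only loosely related.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
x3 = [5, 3, 6, 2, 4, 1]

vif_collinear = vif_two_predictors(x1, x2)  # very large: near-duplicate information
vif_mild = vif_two_predictors(x1, x3)       # close to 1: little inflation
print(round(vif_collinear, 1), round(vif_mild, 4))
```

A common rule of thumb flags VIF values above 5 or 10 as a sign of problematic collinearity; the exact threshold is a convention rather than a fixed rule.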

### Analysis of multicollinearity problems through variable selection

• Detection of collinearity:
  • Harmful effects of collinearity between explanatory variables
  • Detection tools: correlation, VIF, sign consistency
• Proposed solutions:
  • Structured experimentation
  • Variable selection
  • PLS
• Treatment of collinearity - Selection of variables:
  • Selection by optimization: R², adjusted R², AIC and BIC criteria
  • Stepwise selection algorithms: forward selection, backward selection, stepwise regression

### Implementation and Interpretation of results for PLS regressions

• Context and objectives
• Introducing the different regression methods on collinear data: PCR, Ridge regression, and PLS
• Mathematical principles and concepts of PCR and PLS Regression
• Presentation of the different versions of PLS regression
• Implementation and interpretation of results: graphs, model coefficients, etc.
• Choosing the number of components (cross-validation)
• Components and regression coefficients
• Quality of fit, quality of prediction
• Q² and R² coefficients
• Importance of explanatory variables for prediction:
  • Standardized coefficients
  • VIP
• Selection of variables

### Implementation and Interpretation of results for ANCOVA (general linear models)

• Context and objectives
• The notion of interaction between qualitative and quantitative explanatory variables
• Combined lines model
• Complete model
• Implementation and interpretation of results for all different models
• Reading and using the model
• Significance tests of the different terms (Fisher's F)
• Model cleaning (selection of influential terms and variables)
• Conditions for using ANCOVA

### Logistic regression-type modeling

• Context and objectives
• Differences between linear and logistic regression
• Definition of the Logit model
• Implementation and interpretation of results
• Classification of quantitative explanatory variables
• Estimation and interpretation of model coefficients
• Tests of the contribution of a variable (Wald test, likelihood ratio tests)
• Interpretation of Wald's Chi-square
• Odds ratios
• Comparison of odds ratios and relative risks
• Analysis of the classification table:
  • Success rate, failure rate
  • True positives, true negatives, false positives, false negatives
• Fitted probabilities and use of the model for prediction purposes
• Conditions for use
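
To illustrate how odds ratios and relative risks differ, here is a minimal Python sketch on a hypothetical 2x2 table (the counts and variable names are invented for the example and are not course data). It also uses the fact that, in a logistic regression with a single binary explanatory variable, the fitted coefficient of that variable is the natural logarithm of the sample odds ratio.

```python
from math import exp, log

# Hypothetical 2x2 table: rows are exposure groups, columns are outcomes.
exposed_events, exposed_non_events = 30, 70
unexposed_events, unexposed_non_events = 10, 90

# Risk = probability of the event within each group.
risk_exposed = exposed_events / (exposed_events + exposed_non_events)
risk_unexposed = unexposed_events / (unexposed_events + unexposed_non_events)

# Odds = events divided by non-events within each group.
odds_exposed = exposed_events / exposed_non_events
odds_unexposed = unexposed_events / unexposed_non_events

relative_risk = risk_exposed / risk_unexposed
odds_ratio = odds_exposed / odds_unexposed

# With one binary explanatory variable, the logistic coefficient is log(OR),
# so exponentiating the coefficient gives back the odds ratio.
logit_coefficient = log(odds_ratio)

print(round(relative_risk, 3), round(odds_ratio, 3), round(logit_coefficient, 3))
```

Here the relative risk is 3 while the odds ratio is about 3.86: the two measures diverge when the event is not rare, so an odds ratio should not be read directly as a ratio of probabilities.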

### TRAINER PROFILES

#### Thierry Anthouard

Statistics instructor Thierry Anthouard is the head of the Arkesys Group's statistical training program and has always been passionate about the field of statistics. In 1992, he launched the development of the Arkesys Group's statistics training program. His "by example" pedagogical approach allows him to popularize statistics and make it accessible to all learners. As a consultant supporting key accounts, he adapts to all types of contexts and learning issues.

#### Jérôme-Philippe Garsi

Jérôme-Philippe Garsi is a statistics instructor with 13 years of experience in training. Since his doctorate on clinical issues, his work has focused mainly on the interests of populations, their health and well-being. At ease with any audience, he makes pedagogy and the simplification of scientific knowledge a priority. To that end, he always takes the greatest care to be clear in both his written documents and his oral presentations.