# Courses and Webinars

Addinsoft organizes public (inter-company) and private (intra-company) sessions for all levels. You can register for one of our scheduled courses or contact us for a customized training course. All of our courses are available in virtual classrooms.

## Predictive models: linear, logistic, PLS and ANCOVA regressions

Do you already have a good understanding of basic statistical tools? Reach the next level by using XLSTAT to develop predictive models in Excel.

ENGLISH

4 days
28 HOURS

### Documents

This course is intended for people who wish to implement modeling methods. It covers four types of modeling: multiple linear regression, ANCOVA-type general linear models, logistic regression (binary or multinomial response), and PLS regression. The objective is to give participants the methodological know-how to carry out these analyses: context and objectives, conditions for use, measurement of model quality, implementation, and interpretation of results.

Main topics covered in this training:

• Context and objectives of the different methods
• Goodness of fit and estimation quality of model coefficients (prediction quality)
• Linear regression
• Stepwise regression
• General linear models (ANCOVA, etc.)
• Logistic regression
• PLS regression
• The problem of multicollinearity

Required experience:

Participants must have:

• A good understanding of basic statistical tools: descriptive statistics, confidence intervals, p-value, alpha risk, hypothesis testing, etc.
• Some knowledge of correlation and linear regression

Syllabus:

### Reviewing the concept of correlation

• Defining the correlation coefficient
• Interpreting the value of the correlation coefficient
• Sources of confusion: correlation, causation, slope, etc.
• The different correlation coefficients:
  • Pearson's coefficient
  • Spearman's coefficient
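
As an illustration of how the two coefficients listed above differ, here is a minimal Python sketch. It is not XLSTAT's implementation; the function names and the data are hypothetical and chosen only for the example. Pearson's coefficient measures linear association, while Spearman's coefficient is Pearson's coefficient applied to ranks, so it measures monotonic association.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r: co-variation of x and y scaled by their spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: y = x**3 is perfectly monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson: {r_pearson:.3f}, Spearman: {r_spearman:.3f}")
```

On this data Spearman's coefficient reaches 1 because the relationship is strictly increasing, whereas Pearson's coefficient stays below 1 because the relationship is not a straight line.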

### Simple linear regression-type modeling

• Mathematical principles and concepts inherent to simple linear regression
• Hypothesis testing of the significance of the model
• Quality of the model
• Coefficient of determination R², adjusted R², predicted R²
• Use of the model:
  • Prediction of individual values
  • Confidence intervals of predictions
• Graphical treatment of the results
• Mathematical principles and concepts inherent to multiple linear regression
• Model inference, variable inference (Fisher's F statistic)
• Residual analysis:
  • Residual calculations
  • Physical and statistical significance
  • Homogeneity
  • Distribution, Normality
  • Suspect values
  • Graphical analysis
• Suspect values and influential points:
  • Residuals: Studentized residuals
  • Leverage
  • Cook's distance
• Model quality:
  • Goodness of fit, coefficient of determination R², adjusted R²
  • Prediction quality, estimation error
• Use of the model:
  • Prediction (forecast) of individual values
  • Confidence intervals of predictions (forecasts)
• Graphic illustration of results
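
The quality and influence measures named in this part of the syllabus can be sketched for the one-predictor case as follows. This is a minimal illustration under stated assumptions, not the XLSTAT procedure: the function name `simple_ols`, the returned dictionary keys, and the data are hypothetical, and the studentized residuals shown are the internally studentized version.

```python
from math import sqrt


def simple_ols(x, y):
    """Fit y = intercept + slope * x and return fit and influence measures."""
    n = len(x)
    p = 2  # estimated parameters: intercept and slope
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx

    fitted = [intercept + slope * a for a in x]
    resid = [b - f for b, f in zip(y, fitted)]
    sse = sum(e ** 2 for e in resid)          # residual sum of squares
    sst = sum((b - my) ** 2 for b in y)       # total sum of squares
    mse = sse / (n - p)                       # residual variance estimate

    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p)

    # Leverage in simple regression: distance of x_i from the mean of x.
    leverage = [1 / n + (a - mx) ** 2 / sxx for a in x]
    # Internally studentized residuals.
    student = [e / sqrt(mse * (1 - h)) for e, h in zip(resid, leverage)]
    # Cook's distance combines residual size and leverage.
    cooks = [t ** 2 * h / (p * (1 - h)) for t, h in zip(student, leverage)]

    return {
        "slope": slope, "intercept": intercept,
        "r2": r2, "adj_r2": adj_r2,
        "leverage": leverage, "studentized": student, "cooks": cooks,
    }


# Hypothetical data: the last point lies far from the other x values
# and departs from the trend of the first five points.
x = [1, 2, 3, 4, 5, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
fit = simple_ols(x, y)
most_influential = max(range(len(x)), key=lambda i: fit["cooks"][i])
print(f"R2 = {fit['r2']:.3f}, adjusted R2 = {fit['adj_r2']:.3f}")
print(f"Most influential observation (0-based index): {most_influential}")
```

The last observation combines high leverage with a sizeable residual, so it has the largest Cook's distance; this is the kind of point that the residual and influence analysis is meant to flag.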

### Multiple regression model

• Significance of the coefficients
• Hierarchy of the coefficients
• Problems related to multicollinearity
• Measures of collinearity:
  • Correlation coefficient
  • VIF
• Solving multicollinearity problems

### Analysis of multicollinearity problems through variable selection

• Detection of collinearity:
  • Harmful effects of collinearity between explanatory variables
  • Detection tools: correlation, VIF, sign consistency
• Proposed solutions:
  • Structured experimentation
  • Variable selection
  • PLS
• Treatment of collinearity - Selection of variables:
  • Selection by optimization: R², adjusted R², AIC and BIC criteria
  • Stepwise selection algorithms: forward selection, backward selection, stepwise regression

### Implementation and Interpretation of results for PLS regressions

• Context and objectives
• Introducing the different regression methods on collinear data: PCR, Ridge regression, and PLS
• Mathematical principles and concepts of PCR and PLS Regression
• Presentation of the different versions of PLS regression
• Implementation and interpretation of results: graphs, model coefficients, etc.
• Choosing the number of components (cross-validation)
• Components and regression coefficients
• Quality of fit, quality of prediction
• Q² and R² coefficients
• Importance of explanatory variables for prediction:
  • Standardized coefficients
  • VIP
• Selection of variables

### Implementation and Interpretation of results for ANCOVA (general linear models)

• Context and objectives
• The notion of interaction between qualitative and quantitative explanatory variables
• Combined lines model
• Complete model
• Implementation and interpretation of results for all different models
• Reading and using the model
• Significance tests of the different terms (Fisher's F)
• Model cleaning (selection of influential terms and variables)
• Conditions for using ANCOVA

### Logistic regression-type modeling

• Context and objectives
• Differences between linear and logistic regression
• Definition of the Logit model
• Implementation and interpretation of results
• Classification of quantitative explanatory variables
• Estimation and interpretation of model coefficients
• Tests of the contribution of a variable (Wald test, likelihood ratio tests)
• Interpretation of Wald's Chi-square
• Odds ratios
• Comparison of odds ratios and relative risks
• Analysis of the classification table:
  • Success rate, failure rate
  • True positives, true negatives, false positives, false negatives
• Fitted probabilities and use of the model for prediction purposes
• Conditions for use
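
Two of the items above, odds ratios and the classification table, lend themselves to a short illustration. The sketch below assumes a binary logit model has already been fitted elsewhere; the coefficient value, the fitted probabilities, the 0.5 threshold, and the function names are all hypothetical, and nothing here reproduces XLSTAT's output format.

```python
from math import exp


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in a variable of a logit model."""
    return exp(coefficient)


def classification_table(actual, fitted_prob, threshold=0.5):
    """Count true/false positives and negatives at a given threshold."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for observed, prob in zip(actual, fitted_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and observed == 1:
            counts["TP"] += 1
        elif predicted == 0 and observed == 0:
            counts["TN"] += 1
        elif predicted == 1 and observed == 0:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts


# Hypothetical coefficient: a value near ln(2) roughly doubles the odds.
example_or = odds_ratio(0.693)

# Hypothetical observed responses and fitted probabilities.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
fitted_prob = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1, 0.8, 0.3]

table = classification_table(actual, fitted_prob)
success_rate = (table["TP"] + table["TN"]) / len(actual)
failure_rate = 1 - success_rate
print(f"Odds ratio: {example_or:.2f}")
print(table, f"success rate = {success_rate:.2f}")
```

Changing the threshold shifts observations between the four cells, which is why the classification table should be read together with the fitted probabilities rather than on its own.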