Running a multinomial logit model with XLSTAT

Logistic regression for binary response data and polytomous variables (Logit, Probit) is part of:
  • Pro Core statistical software

  • System configuration

    • Windows:
      • Versions: 9x/Me/NT/2000/XP/Vista/Win 7/Win 8
      • Excel: 97 and later
      • Processor: 32 or 64 bits
      • Hard disk: 150 MB
    • Mac OS X:
      • OS: OS X
      • Excel: X, 2004 and 2011
      • Hard disk: 150 MB

Benefits

  • Easy and user-friendly
    XLSTAT is fully integrated with Microsoft Excel, the most popular spreadsheet worldwide. This integration makes it one of the simplest tools to work with, as it follows the same philosophy as Microsoft Excel. The program is accessible in a dedicated XLSTAT tab. The analyses are grouped into functional menus. The dialog boxes are user-friendly, and setting up an analysis is straightforward.
  • Data and results shared seamlessly
    One of the greatest advantages of XLSTAT is that data and results can be shared seamlessly. As the results are stored in Microsoft Excel, anyone can access them. The recipient needs neither an XLSTAT license nor an additional viewer, which makes teamwork easier and more affordable. In addition, results can easily be integrated into other Microsoft Office software such as PowerPoint, so that you can create striking presentations in minutes.
  • Modular
    XLSTAT is a modular product. XLSTAT-Pro is the core statistical module of XLSTAT and includes all the mainstream functionalities in statistics and multivariate analysis. More advanced features contained in add-on modules can be added for specific applications. This way you can adapt the software to your needs, which makes it more cost-efficient.
  • Didactic
    The results of XLSTAT are organized by analysis and are easy to navigate. Moreover, useful information is provided along with the results to assist you in your interpretation.
  • Affordable
    XLSTAT is a complete and modular analytical solution that can suit any business analytics need. It is very reasonably priced, so that the return on your investment is almost immediate. Any XLSTAT license comes with top-level support and assistance.
  • Accessible - Available in many languages
    We have ensured XLSTAT is accessible to everyone by making the program available in many languages, including Chinese, English, French, German, Italian, Japanese, Polish, Portuguese and Spanish.
  • Automatable and customizable
    Most of the statistical functions available in XLSTAT can be called directly from the Visual Basic window of Microsoft Excel. They can be modified and integrated into larger programs to fit the specifics of your domain. Adding tables and plots, as well as modifying existing outputs, becomes easy. Furthermore, the XLSTAT dialog boxes include special tools that automatically generate the VBA code needed to reproduce your analysis in the VBA editor, or that simply load pre-set settings. This automation of routine analyses can save you a great deal of time.

Multinomial logit model

The multinomial logit model is a generalization of the logit model to the case where the response variable has more than two categories. This method is very useful when one wants to understand, or to predict, the effect of a series of variables on an unordered qualitative response variable (a variable which can take more than two values). For example, the multinomial logit model can be used to model the effect of some descriptive variables on the choice of a brand in a market with more than two brands. All results are given relative to a reference category (for example, the brand that is best established).

With XLSTAT you can run the multinomial logit model on raw data. The dialog box for the multinomial logit model is the same as the one used for logistic regression.

The multinomial logit model aims at modeling the probability associated with each category of the response variable as a function of the values of the explanatory variables, which can be categorical or numerical.
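To make the link between the model and the category probabilities concrete, here is a minimal Python sketch (not XLSTAT code). It assumes the usual multinomial logit form, in which each non-reference category has a linear predictor equal to the log of its probability ratio to the reference category, and the reference category has a linear predictor of 0. The function name, the category labels and the numeric values are hypothetical and are not taken from the tutorial dataset.

```python
import math

def category_probabilities(linear_predictors):
    """Convert log-odds versus the reference category into probabilities.

    linear_predictors maps each non-reference category to its value of
    log(P(category) / P(reference)). The reference category has a linear
    predictor of 0 by construction, so its term is exp(0) = 1.
    """
    exp_terms = {cat: math.exp(eta) for cat, eta in linear_predictors.items()}
    denominator = 1.0 + sum(exp_terms.values())
    probabilities = {"reference": 1.0 / denominator}
    for cat, term in exp_terms.items():
        probabilities[cat] = term / denominator
    return probabilities

# Hypothetical log-odds for two non-reference categories (not fitted values).
example = category_probabilities({"B": 0.5, "C": -1.0})
for category, probability in example.items():
    print(category, round(probability, 3))
```

Whatever the values of the linear predictors, the probabilities sum to 1, and the ratio of any category's probability to that of the reference category equals the exponential of its linear predictor.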

Dataset for running a multinomial logit model

The example treated here is a marketing case where we want to detect if customers are likely to choose one of three brands depending on their age and sex. An Excel sheet with both the data and the results can be downloaded by clicking here.

The data consist of 750 observations. The reference category is brand 1.

Goal of this multinomial logit model

Our goal is to understand whether customers are more likely to choose brand 2 or brand 3 than brand 1 depending on their age and sex.

Setting up a multinomial logit model

To activate the Multinomial Logit Model dialog box, start XLSTAT, then select the XLSTAT / Modeling data / Logistic regression for binary response data command, or click on the logistic regression button of the Modeling Data toolbar (see below).

barlog1.gif

When you click on the button, the Logistic regression dialog box appears.

To activate the multinomial logit model, change the response type to multinomial. A new box appears where you can choose the control or reference category (in our case we choose a1=0, meaning that the relative effect for the first category is set to 0).

logmult1.gif

Select the data on the Excel sheet. The Response corresponds to the column where the variable to be explained is stored. In this particular case we have two quantitative explanatory variables (AGE and FEMALE).

As we selected the column titles of all variables, we have selected the option Variable labels included.

logmultf2.gif

Many options are available in the dialog box.

Note: The default options correspond to the basic choice one would make; please consult the XLSTAT help for more details.

The computations begin once you have clicked on the OK button.

Interpret the results of a multinomial logit model

The following table gives several indicators of model quality (goodness of fit). These results are equivalent to the R² and to the analysis of variance table in linear regression and ANOVA. The most important value to look at is the probability of the Chi-square test on the log-likelihood ratio. This is the counterpart of Fisher's F test: it evaluates whether the variables bring significant information by comparing the model as it is defined with a simpler model containing only a constant. In this case, as the probability is lower than 0.0001, we can conclude that the variables bring significant information.
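The logic of this test can be sketched in a few lines of Python. This is an illustration only, not the XLSTAT implementation: the test statistic is taken to be -2 times the difference between the log-likelihood of the constant-only model and that of the full model, compared with a Chi-square distribution. The degrees of freedom are assumed to be 4 here (two explanatory variables times two non-reference categories). The log-likelihood values are hypothetical and are not the values from the tutorial output. The closed-form tail probability used below holds only for an even number of degrees of freedom.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of a Chi-square variable (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1, built term by term.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def likelihood_ratio_test(loglik_null, loglik_full, df):
    """Return the test statistic and its p-value."""
    statistic = -2.0 * (loglik_null - loglik_full)
    return statistic, chi2_upper_tail_even_df(statistic, df)

# Hypothetical log-likelihoods, for illustration only.
statistic, p_value = likelihood_ratio_test(loglik_null=-800.0,
                                           loglik_full=-700.0, df=4)
print(statistic, p_value < 0.0001)
```

A very small p-value means that the full model fits significantly better than the model without explanatory variables.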

logmultf3.gif

The next table gives details on the model. This table is helpful in understanding the effect of the various variables on the categories of the response variable. It is quite different from the logistic regression table. Parameters are obtained for each variable and for each category of the response variable (except the reference category). Odds ratios are also available for a better understanding of the results.

logmultf4.gif

Interpreting the parameters is not straightforward. The model equation for category 2 is:

Log(P(Response=2)/P(Response=1)) = -11.775 + 0.524*FEMALE + 0.368*AGE

For example, for a one-unit increase in the variable AGE, the log of the ratio of the two probabilities, P(Response=2)/P(Response=1), increases by 0.368. Therefore, in general, the older a person is, the more likely that person is to prefer brand 2 over brand 1. The ratio of the probability of choosing one outcome category to the probability of choosing the reference category is often referred to as the odds (it is also sometimes referred to as the relative risk). So another way of interpreting the regression results is in terms of odds ratios, which are obtained by taking the exponential of the parameters. For a one-unit increase in the variable AGE, we expect the relative risk of choosing brand 2 over brand 1 to be multiplied by exp(0.368) ≈ 1.445.
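The arithmetic behind this interpretation can be checked with a short Python sketch that uses the parameters of the equation for category 2. The function name is hypothetical, and the coding of FEMALE as 1 for women and 0 for men is an assumption, as is the example age.

```python
import math

# Parameters of the model equation for category 2 (brand 2 versus brand 1).
INTERCEPT = -11.775
COEF_FEMALE = 0.524
COEF_AGE = 0.368

def log_odds_brand2_vs_brand1(female, age):
    """Log of P(Response=2)/P(Response=1) for one customer.

    female is assumed to be coded 1 for women and 0 for men.
    """
    return INTERCEPT + COEF_FEMALE * female + COEF_AGE * age

# Odds ratio for AGE: factor applied to P(2)/P(1) per additional year.
odds_ratio_age = math.exp(COEF_AGE)
print(round(odds_ratio_age, 3))

# Check on a hypothetical customer: one more year of age multiplies the
# ratio P(2)/P(1) by the odds ratio, whatever the starting age.
ratio_at_32 = math.exp(log_odds_brand2_vs_brand1(female=1, age=32))
ratio_at_33 = math.exp(log_odds_brand2_vs_brand1(female=1, age=33))
print(math.isclose(ratio_at_33 / ratio_at_32, odds_ratio_age))
```

The same calculation applied to the FEMALE parameter gives the factor by which the ratio changes between men and women of the same age.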

By looking at the probabilities of the Chi-square tests, we see that the variable that most influences the response variable for both categories 2 and 3 is the age of the customer. The intercepts are significant. The marketing experts should focus on older people if they want to increase the market share of brand 1.