Running a Multiple Correspondence Analysis (MCA) with XLSTAT

Multiple Correspondence Analysis (MCA) is part of:
  • Pro Core statistical software

  • System configuration

    • Windows:
      • Versions: 9x/Me/NT/2000/XP/Vista/Win 7/Win 8
      • Excel: 97 and later
      • Processor: 32 or 64 bits
      • Hard disk: 150 MB
    • Mac OS X:
      • OS: OS X
      • Excel: X, 2004 and 2011
      • Hard disk: 150 MB

Benefits

  • Easy and user-friendly
    XLSTAT is flawlessly integrated with Microsoft Excel, the most popular spreadsheet worldwide. This integration makes it one of the simplest tools to work with, as it follows the same philosophy as Microsoft Excel. The program is accessible in a dedicated XLSTAT tab, and the analyses are grouped into functional menus. The dialog boxes are user-friendly and setting up an analysis is straightforward.
  • Data and results shared seamlessly
    One of the greatest advantages of XLSTAT is that data and results can be shared seamlessly. As the results are stored in Microsoft Excel, anyone can access them. The recipient needs neither an XLSTAT license nor any additional viewer, which makes teamwork easier and more affordable. In addition, results are easy to integrate into other Microsoft Office software such as PowerPoint, so you can create striking presentations in minutes.
  • Modular
    XLSTAT is a modular product. XLSTAT-Pro is the core statistical module of XLSTAT and includes all the mainstream functionalities in statistics and multivariate analysis. More advanced features, contained in add-on modules, can be added for specific applications. This way you can adapt the software to your needs, which makes it more cost-efficient.
  • Didactic
    The results of XLSTAT are organized by analysis and are easy to navigate. Moreover, useful information is provided along with the results to assist you in your interpretation.
  • Affordable
    XLSTAT is a complete and modular analytical solution that can suit any analytical business need. It is very reasonably priced, so the return on your investment is almost immediate. Every XLSTAT license comes with top-level support and assistance.
  • Accessible - Available in many languages
    We have ensured XLSTAT is accessible to everyone by making the program available in many languages, including Chinese, English, French, German, Italian, Japanese, Polish, Portuguese and Spanish.
  • Automatable and customizable
    Most of the statistical functions available in XLSTAT can be called directly from the Visual Basic window of Microsoft Excel. They can be modified and integrated into larger code to fit the specifics of your domain. Adding tables and plots, as well as modifying existing outputs, becomes easy. Furthermore, the XLSTAT dialog boxes include special tools that automatically generate the VBA code needed to reproduce your analysis in the VBA editor, or simply to load preset settings. This automation of routine analyses can save you a great deal of time.

Multiple Correspondence Analysis

Multiple Correspondence Analysis (MCA) is a method for studying the associations between two or more qualitative variables.

Multiple Correspondence Analysis is to qualitative variables what Principal Component Analysis is to quantitative variables. It produces maps on which one can visually observe the distances between the categories of the qualitative variables and between the observations. For detailed information on the method, we recommend the recent book by Michael Greenacre and Jörg Blasius.


Dataset to run a Multiple Correspondence Analysis

An Excel sheet containing both the data and the results used in this tutorial can be downloaded by clicking here.

The data come from a survey conducted by a car dealer, in which 28 customers were asked five questions one week after picking up their car following a mechanical repair. The questions were:

  • Are you globally satisfied by the service? (Yes/No)
  • Do you consider the problem is solved? (Yes/No/Don't know)
  • How good was the welcome? (1 to 5)
  • Is the quality/price ratio satisfactory? (Yes/No)
  • Will you use our services again? (Yes/No/Don't know)

By running a Multiple Correspondence Analysis (MCA), we want to identify the relationships between the various possible answers to the questions.

Setting up the Multiple Correspondence Analysis dialog box

After opening XLSTAT, select the XLSTAT / Analyzing data / Multiple Correspondence Analysis command, or click on the corresponding button of the Analyzing data toolbar (see below).

barmca.gif

Once you've clicked on the button, the Multiple Correspondence Analysis dialog box appears.

Here, the data format is Observations/Variables.

We select the data on the Excel sheet, using the column selection method: just click on the name of the columns you want to select (see the tutorial on how to select data for more information on this topic).

The Observations labels are selected in the corresponding field, and the Variable labels option is left activated because the first row of the table contains the names of the variables.

mca1.gif

In the Options tab we activate the Supplementary data option and then go to the corresponding tab: the "Come back" variable (the answer to "Will you use our services again?") is used as a supplementary variable because we don't want it to influence the computations; however, we want to know how the categories of this variable are positioned on the correspondence map.

The 1/p option is our filtering choice: the detailed results for factors whose eigenvalue is less than 1/p (where p is the number of active qualitative variables) will not be displayed.
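The 1/p filter can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not XLSTAT's code: the function name `keep_factors` and the eigenvalues are hypothetical, and p = 4 assumes the four survey questions left active once "Come back" is made supplementary.

```python
def keep_factors(eigenvalues, n_active_variables):
    """Return (axis number, eigenvalue) pairs whose eigenvalue is at least 1/p."""
    threshold = 1.0 / n_active_variables
    return [(k, ev) for k, ev in enumerate(eigenvalues, start=1) if ev >= threshold]


# Hypothetical eigenvalues, for illustration only (not the tutorial's results).
example_eigenvalues = [0.60, 0.40, 0.30, 0.25, 0.20]

# Four active variables, so the threshold is 1/4 = 0.25.
kept = keep_factors(example_eigenvalues, n_active_variables=4)
print(kept)
```

With these made-up values, the fifth factor falls below 0.25 and would be left out of the detailed results.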

mca1-2.gif mca1-3.gif

The following Outputs and Charts options have been activated.

mca1-4.gif mca1-5.gif

The computations begin once you have clicked on OK. The results will then be displayed.

Interpreting the results of a Multiple Correspondence Analysis

The first results displayed are the tables used for the computations (full disjunctive table, Burt's table).
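To make these two tables concrete, here is a minimal Python sketch that builds a full disjunctive (indicator) table and the corresponding Burt table. The four rows are invented for illustration and are not the survey data; the function names are hypothetical, and the column order (categories sorted alphabetically within each variable) is an arbitrary choice that may differ from XLSTAT's layout.

```python
def disjunctive_table(rows, variables):
    """One 0/1 column per (variable, category); each row has a single 1 per variable."""
    columns = []
    for var in variables:
        categories = sorted({row[var] for row in rows})
        columns.extend((var, cat) for cat in categories)
    table = [[1 if row[var] == cat else 0 for (var, cat) in columns] for row in rows]
    return columns, table


def burt_table(table):
    """Cross-product of the disjunctive table with itself (all two-way cross-tabulations)."""
    n_cols = len(table[0])
    return [[sum(r[a] * r[b] for r in table) for b in range(n_cols)]
            for a in range(n_cols)]


# Invented answers to two of the questions, for illustration only.
rows = [
    {"Satisfied": "Yes", "Solved": "Yes"},
    {"Satisfied": "No", "Solved": "No"},
    {"Satisfied": "Yes", "Solved": "Don't know"},
    {"Satisfied": "Yes", "Solved": "Yes"},
]
columns, table = disjunctive_table(rows, ["Satisfied", "Solved"])
burt = burt_table(table)

for col, counts in zip(columns, burt):
    print(col, counts)
```

The diagonal of the Burt table holds the category frequencies, and each off-diagonal block is the contingency table of a pair of variables.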

The total inertia is equal to 2. It depends only on the number of variables and categories, not on the associations between the variables, so it has no statistical interpretation.

The next table shows the eight non-null eigenvalues and the corresponding percentages of inertia. However, unlike in CA (correspondence analysis performed on only two variables), the percentages of inertia are here pessimistic estimates of the quality of the representation, that is, of how close the representation is to reality.

Greenacre et al. (2005) suggested an adjusted inertia which gives a better idea of the quality of the maps. We see here that while the usual computation gives only 46.6% for the first two axes, the method based on the adjusted inertia gives 87.3%.
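The value of 2 can be checked by hand. For an MCA of the disjunctive table, a standard result is that the total inertia equals J/p - 1, where J is the total number of active categories and p the number of active variables. The sketch below assumes the four active questions have 2, 3, 5 and 2 categories (that is, every welcome rating from 1 to 5 was used at least once); the function names are hypothetical.

```python
def mca_total_inertia(categories_per_variable):
    """Total inertia of the disjunctive table: J/p - 1."""
    p = len(categories_per_variable)
    total_categories = sum(categories_per_variable)
    return total_categories / p - 1


def mca_max_axes(categories_per_variable):
    """Maximum number of non-null eigenvalues: J - p."""
    return sum(categories_per_variable) - len(categories_per_variable)


# Assumed category counts for the four active variables:
# satisfaction (2), problem solved (3), welcome (5), quality/price (2).
active = [2, 3, 5, 2]

print(mca_total_inertia(active))  # 12/4 - 1 = 2.0
print(mca_max_axes(active))       # 12 - 4 = 8
```

Under that assumption, the same counts also give 8 as the maximum number of non-null eigenvalues, in line with the eigenvalue table.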
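The tutorial does not spell out the adjustment formula, so the following Python sketch uses one commonly cited form: each eigenvalue above 1/p is replaced by (p/(p-1))² (λ - 1/p)², and the percentages are taken relative to the average off-diagonal inertia of the Burt table. Whether XLSTAT uses exactly this denominator is an assumption. The eigenvalues are hypothetical (they only sum to the total inertia of 2) and are not meant to reproduce the 46.6% and 87.3% figures.

```python
def adjusted_inertia(eigenvalues, n_variables, n_categories):
    """Adjusted eigenvalues and their percentages (one common form of the adjustment)."""
    p = n_variables
    scale = p / (p - 1)
    # Only axes whose eigenvalue exceeds 1/p receive a non-zero adjusted inertia.
    adjusted = [(scale * (ev - 1 / p)) ** 2 if ev > 1 / p else 0.0
                for ev in eigenvalues]
    # Total inertia of the Burt table is the sum of squared eigenvalues.
    burt_total = sum(ev ** 2 for ev in eigenvalues)
    denominator = scale * (burt_total - (n_categories - p) / p ** 2)
    percentages = [100 * a / denominator for a in adjusted]
    return adjusted, percentages


# Hypothetical eigenvalues for 4 variables and 12 categories (they sum to 2).
example = [0.60, 0.40, 0.30, 0.25, 0.20, 0.12, 0.08, 0.05]

raw_pct = [100 * ev / sum(example) for ev in example]
adjusted, adj_pct = adjusted_inertia(example, n_variables=4, n_categories=12)

print("raw % on first two axes:     ", round(sum(raw_pct[:2]), 1))
print("adjusted % on first two axes:", round(sum(adj_pct[:2]), 1))
```

Even with made-up values, the adjusted percentages on the first axes come out markedly higher than the raw ones, which is the effect described above.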

mca2.gif

The percentages displayed on the scree plot are based on the adjusted inertia.

mca3.gif

Next, a table displays the coordinates of the categories in the factor space. The results for the supplementary variable are displayed in blue.

The coordinates of the observations are displayed further down.

The contributions, the test values and the squared cosines help in interpreting the results. Before concluding that two categories are close on the map, one should check that their contributions to the axes of the map, or their squared cosines, are high.
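As a rough guide to what these two quantities measure, the Python sketch below uses the usual definitions: the contribution of a category to an axis is its mass times its squared coordinate, divided by the inertia of the axis, and its squared cosine on an axis is its squared coordinate divided by its squared distance to the origin. The masses and coordinates are invented, the function name is hypothetical, and the squared cosines are computed over the two axes supplied only, whereas a full analysis uses all axes.

```python
def contributions_and_cos2(masses, coords):
    """Contributions (per axis) and squared cosines (per category) from masses and coordinates."""
    n_axes = len(coords[0])
    # Inertia carried by each axis: sum over categories of mass * coordinate^2.
    axis_inertia = [sum(m * c[k] ** 2 for m, c in zip(masses, coords))
                    for k in range(n_axes)]
    contributions = [[m * c[k] ** 2 / axis_inertia[k] for k in range(n_axes)]
                     for m, c in zip(masses, coords)]
    # Squared cosine: share of a category's squared distance to the origin on each axis.
    cos2 = [[c[k] ** 2 / sum(x ** 2 for x in c) for k in range(n_axes)]
            for c in coords]
    return contributions, cos2


# Invented masses and (F1, F2) coordinates for three categories.
masses = [0.15, 0.10, 0.05]
coords = [(0.8, -0.1), (-0.9, 0.3), (-0.6, -1.2)]

contributions, cos2 = contributions_and_cos2(masses, coords)
for ctr, c2 in zip(contributions, cos2):
    print([round(v, 3) for v in ctr], [round(v, 3) for v in c2])
```

A category with a low squared cosine on both axes is poorly represented on the map, so its apparent closeness to other categories should not be over-interpreted.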

The following chart corresponds to the correspondence map where both the categories and the observations are displayed on the first two axes.

mca4.gif

To better visualize the relative positions of the categories, we used XLSTAT-3DPlot to build a visualization in the F1/F2/F3 space.

mca5.gif

These charts indicate that a customer will come back only if he is satisfied with the intervention, the welcome and the price. We also notice that there seems to be a link between an unsatisfactory repair and a bad welcome. This should be investigated further: did the customer describe the problem imprecisely because he had been badly welcomed, or did he call back to report that the problem was still there and was then badly received by the representative?

The following video shows you how to run this tutorial.