Generalized Procrustes Analysis with XLSTAT

  • ADA Advanced Data Analysis on Multiple tables software

  • System configuration

    • Windows:
      • Versions: 9x/Me/NT/2000/XP/Vista/Win 7
      • Excel: 97 and later
      • Processor: 32 or 64 bits
      • Hard disk: 150 MB
    • Mac OS X:
      • OS: OS X
      • Excel: X, 2004 and 2011
      • Hard disk: 150 MB
  • MX Market research and sensory analysis software

  • System configuration

    • Windows:
      • Versions: 9x/Me/NT/2000/XP/Vista/Win 7
      • Excel: 97 and later
      • Processor: 32 or 64 bits
      • Hard disk: 150 MB
    • Mac OS X:
      • OS: OS X
      • Excel: X, 2004 and 2011
      • Hard disk: 150 MB

Benefits

  • Easy and user-friendly
    XLSTAT is flawlessly integrated with Microsoft Excel, which is the most popular spreadsheet worldwide. This integration makes it one of the simplest available tools to work with, as it follows the same philosophy as Microsoft Excel. The program is accessible in a dedicated XLSTAT tab. The analyses are grouped into functional menus. The dialog boxes are user-friendly and setting up an analysis is straightforward.
  • Data and results shared seamlessly
    One of the greatest advantages of XLSTAT is the way you can share data and results seamlessly. As the results are stored in Microsoft Excel, anyone can access them. The recipient needs neither an XLSTAT license nor any additional viewer, which makes teamwork easier and more affordable. In addition, results are easily integrated into other Microsoft Office software such as PowerPoint, so that you can create striking presentations in minutes.
  • Modular
    XLSTAT is a modular product. XLSTAT-Pro is the core statistical module of XLSTAT and includes all the mainstream functionalities in statistics and multivariate analysis. More advanced features contained in add-on modules can be added for specific applications. This way you can adapt the software to your needs, which makes it more cost-efficient.
  • Didactic
    The results of XLSTAT are organized by analysis and are easy to navigate. Moreover, useful information is provided along with the results to assist you in your interpretation.
  • Affordable
    XLSTAT is a complete and modular analytical solution that can suit any analytical business needs. It is very reasonably priced so that the return on your investment is almost immediate. Any XLSTAT license comes with top level support and assistance.
  • Accessible - Available in many languages
    We have ensured XLSTAT is accessible to everyone by making the program available in many languages, including Chinese, English, French, German, Italian, Japanese, Polish, Portuguese and Spanish.
  • Automatable and customizable
    Most of the statistical functions available in XLSTAT can be called directly from the Visual Basic window of Microsoft Excel. They can be modified and integrated into your own code to fit the specifics of your domain. Adding tables and plots as well as modifying existing outputs becomes easy. Furthermore, the XLSTAT dialog boxes include special tools that automatically generate the VBA code needed to reproduce your analysis in the VBA editor, or that simply load pre-set settings. This automation of routine analyses can save you a great deal of time.

Generalized Procrustes Analysis

Generalized Procrustes Analysis (GPA) is a method used in several domains. In sensory analysis, it is applied before a Preference Mapping to reduce scale effects and to obtain a consensus configuration. It also makes it possible to compare the proximity between the terms that different experts use to describe products.

Dataset for Generalized Procrustes Analysis

An Excel sheet with both the data and the results can be downloaded by clicking here.

The data used in this tutorial come from a study in which a product marketing team wants to determine how four slightly different cheeses are evaluated. Ten experts have been asked to rate the four cheeses several times (without knowing which is which), using three criteria: acidity, strangeness, and hardness.

The values used here correspond to the average rating for each cheese and each expert.

Goal of this Generalized Procrustes Analysis

Our goal is to transform the data to remove scaling effects (some experts might use a wider scale) and position effects (some experts might tend to use the lower or the higher part of the rating scales more), in order to obtain a consensus configuration that will then be used in an external preference mapping.
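To make the two effects concrete, here is a minimal sketch in Python with NumPy. It is not XLSTAT's algorithm: the ratings are hypothetical, and the function simply centers each configuration (position effect) and divides it by its overall norm (scale effect). Two experts who perceive the same differences between cheeses but use the scale differently then end up with identical configurations.

```python
import numpy as np

def center_and_normalize(config):
    """Remove the position effect (column means) and the overall
    scale effect (Frobenius norm) from one configuration."""
    centered = config - config.mean(axis=0)
    return centered / np.linalg.norm(centered)

# Hypothetical ratings: 4 cheeses (rows) x 3 criteria (columns).
expert_a = np.array([[2.0, 1.0, 3.0],
                     [4.0, 2.0, 1.0],
                     [3.0, 5.0, 2.0],
                     [1.0, 3.0, 4.0]])

# Same perception, but a scale twice as wide and shifted up by one point.
expert_b = 2.0 * expert_a + 1.0

same = np.allclose(center_and_normalize(expert_a),
                   center_and_normalize(expert_b))
print(same)
```

A full GPA estimates the scaling factors jointly with the rotations rather than normalizing each configuration on its own, so this only illustrates what "scaling" and "position" effects mean.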

Setting up a Generalized Procrustes Analysis

To activate the Generalized Procrustes Analysis dialog box, start XLSTAT, and select the XLSTAT-MX / Generalized Procrustes Analysis or XLSTAT-ADA / Generalized Procrustes Analysis command, or click on the corresponding button of the XLSTAT-MX (or XLSTAT-ADA) toolbar (see below).

bargpa.gif

Once you have clicked on the button, the dialog box appears. Then, select the data that correspond to the configurations (a configuration corresponds here to the set of ratings given by an expert).

The number of configurations must be entered. As we have 10 experts, we enter 10.

As each expert gave ratings for each of the three dimensions, we can let XLSTAT know that the number of dimensions is constant by selecting the Equal option.

When the number of dimensions differs for at least one configuration, you need to select a column that contains the number of dimensions for each configuration.
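When configurations have different numbers of dimensions, one common convention in GPA is to pad the smaller configurations with zero columns so that all of them share the same shape. Whether XLSTAT does exactly this internally is an assumption here; the sketch below, with a hypothetical `pad_configurations` helper and made-up ratings, only illustrates the idea.

```python
import numpy as np

def pad_configurations(configs):
    """Pad each (objects x dimensions) array with zero columns on the
    right so that all configurations have the same number of dimensions."""
    max_dims = max(c.shape[1] for c in configs)
    return np.stack([
        np.pad(c, ((0, 0), (0, max_dims - c.shape[1])))
        for c in configs
    ])

# Hypothetical example: the second expert rated only two criteria.
expert_1 = np.array([[2.0, 1.0, 3.0],
                     [4.0, 2.0, 1.0],
                     [3.0, 5.0, 2.0],
                     [1.0, 3.0, 4.0]])
expert_2 = np.array([[2.5, 1.5],
                     [3.5, 2.0],
                     [3.0, 4.5],
                     [1.5, 3.0]])

padded = pad_configurations([expert_1, expert_2])
print(padded.shape)
```

A criterion that was not rated then contributes nothing to the analysis for that expert.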

To make the results easier to read, we also select the configuration labels and the object labels (in our case, the cheeses).

gpa1.gif

The following options have been selected.

gpa12.gif gpa13.gif gpa14.gif

After you have clicked on the OK button, the computations start and the results are displayed on a new Excel sheet.

Interpreting the results of a Generalized Procrustes Analysis

The first result is the PANOVA table that summarizes the efficiency of each Generalized Procrustes Analysis transformation in terms of reduction of the total variability. We can see that the Scaling transformation is the most efficient (lowest p-value).

gpa2.gif

The second table and the corresponding chart give the residuals by object after the transformations. We can see that the C3 cheese has the smallest residual. This indicates that the experts most probably agree on this cheese.

gpa3.gif

The third table and the corresponding chart give the residuals by configuration after the transformations. We can see that Expert2 has the highest residual, which means that this expert gave ratings that are furthest from the consensus.

gpa4.gif
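The two residual tables can be read as two ways of summing the same squared deviations from the consensus. The sketch below assumes the configurations have already been transformed and uses their mean as the consensus; the data and the `residuals` helper are hypothetical, and XLSTAT's exact normalization may differ.

```python
import numpy as np

def residuals(aligned):
    """aligned has shape (configurations, objects, dimensions).
    Returns the residual sum of squares by object and by configuration."""
    consensus = aligned.mean(axis=0)
    squared = (aligned - consensus) ** 2
    by_object = squared.sum(axis=(0, 2))   # sum over experts and dimensions
    by_config = squared.sum(axis=(1, 2))   # sum over objects and dimensions
    return by_object, by_config

# Hypothetical data: 3 experts, 2 objects, 2 dimensions.
# The third expert disagrees with the others on the second object only.
aligned = np.array([[[1.0, 0.0], [0.0, 1.0]],
                    [[1.0, 0.0], [0.0, 1.0]],
                    [[1.0, 0.0], [0.0, 4.0]]])

by_object, by_config = residuals(aligned)
print(by_object)
print(by_config)
```

An object with a small residual is one the experts agree on; a configuration with a large residual belongs to an expert who is far from the consensus.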

The next table and chart give the scaling factors of the Generalized Procrustes Analysis transformations. A factor lower than 1 indicates that the corresponding expert was using a wider scale than the others. A factor higher than 1 indicates that the corresponding expert was not using the rating scale as widely as the other experts. We can see here that experts 1 and 3 tend to use a wider scale than the other experts.

gpa-5.png
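The direction of the scaling factor can be checked with a small least-squares sketch. It assumes the factor is the value s that minimizes the squared distance between s times the centered configuration and the centered consensus; this is one common form, not necessarily the estimator XLSTAT uses, and the numbers are hypothetical.

```python
import numpy as np

def scaling_factor(config, consensus):
    """Least-squares factor s minimizing ||s * X - Y||^2, where X and Y
    are the centered configuration and the centered consensus."""
    x = config - config.mean(axis=0)
    y = consensus - consensus.mean(axis=0)
    return np.sum(x * y) / np.sum(x * x)

# Hypothetical consensus: 4 cheeses x 3 criteria.
consensus = np.array([[2.0, 1.0, 3.0],
                      [4.0, 2.0, 1.0],
                      [3.0, 5.0, 2.0],
                      [1.0, 3.0, 4.0]])

wide_expert = 2.0 * consensus + 3.0   # uses a scale twice as wide
narrow_expert = 0.5 * consensus       # uses a scale half as wide

s_wide = scaling_factor(wide_expert, consensus)
s_narrow = scaling_factor(narrow_expert, consensus)
print(s_wide, s_narrow)
```

The expert with the wider scale gets a factor below 1 (the ratings must be shrunk), and the expert with the narrower scale gets a factor above 1.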

A consensus test is then performed to check whether the consensus configuration is a true consensus. This permutation test determines whether the observed Rc value (Rc corresponds to the proportion of the original variance explained by the consensus configuration) is higher than the 95th percentile of the Rc values obtained when permuting the data.

gpa51.gif gpa52.gif
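The logic of the consensus test can be sketched as follows. This is a simplified illustration under stated assumptions, not XLSTAT's implementation: the `gpa` function only centers and rotates (no scaling step), the permutation scheme shuffles the objects independently within each configuration, and the ratings are simulated.

```python
import numpy as np

def gpa(configs, n_iter=10):
    """Center each configuration, then repeatedly rotate each one onto
    the running consensus (scaling is left out to keep the sketch short)."""
    x = configs - configs.mean(axis=1, keepdims=True)
    consensus = x.mean(axis=0)
    for _ in range(n_iter):
        for i in range(len(x)):
            # Orthogonal Procrustes: rotation R minimizing ||x[i] R - consensus||
            u, _, vt = np.linalg.svd(x[i].T @ consensus)
            x[i] = x[i] @ (u @ vt)
        consensus = x.mean(axis=0)
    return x, consensus

def rc(configs):
    """Proportion of the total variance explained by the consensus."""
    aligned, consensus = gpa(configs)
    return len(aligned) * np.sum(consensus ** 2) / np.sum(aligned ** 2)

def consensus_test(configs, n_perm=200, seed=0):
    """Compare the observed Rc with the 95th percentile of the Rc values
    obtained after shuffling the objects within each configuration."""
    rng = np.random.default_rng(seed)
    n_objects = configs.shape[1]
    observed = rc(configs)
    permuted = np.array([
        rc(np.stack([c[rng.permutation(n_objects)] for c in configs]))
        for _ in range(n_perm)
    ])
    return observed, np.quantile(permuted, 0.95)

# Simulated data: 10 experts rate 4 objects on 3 criteria, all seeing the
# same underlying configuration up to a small amount of noise.
rng = np.random.default_rng(42)
true_shape = np.array([[ 2.0,  0.5, -1.0],
                       [-1.5,  1.0,  0.5],
                       [ 0.5, -2.0,  1.5],
                       [-1.0,  0.5, -1.0]])
experts = true_shape + rng.normal(scale=0.05, size=(10, 4, 3))

observed, threshold = consensus_test(experts)
print(observed > threshold)
```

When the observed Rc exceeds the threshold, the agreement between experts is stronger than what random relabeling of the objects would produce.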

Another permutation test is used to verify how many dimensions should be retained to display the results. We see here that for the third dimension, the F value is below the 95th percentile. So we can conclude that two dimensions are enough.

gpa53.gif

The next results correspond to the results of the PCA step (unstandardized PCA). While the Generalized Procrustes Analysis already includes a rotation step for each configuration, so that it matches the consensus configuration, the PCA corresponds here to the optimal transformation of the consensus configuration under the usual PCA constraints. The PCA transformation is then applied to each configuration corresponding to each expert.

The eigenvalues show how much of the variability corresponds to each axis. Here, 99% of the variability is represented on the first two axes. When the variability is split between the experts, we see that the results are almost identical for all experts.

gpa6.gif gpa7.gif
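The PCA step can be illustrated with a short sketch: an unstandardized PCA of the consensus gives the axes and the share of variability on each axis, and the same rotation can then be applied to every expert's configuration. The consensus values below are hypothetical, the `consensus_pca` helper is not an XLSTAT function, and dividing by the number of objects (rather than that number minus one) is an assumption that does not affect the shares.

```python
import numpy as np

def consensus_pca(consensus):
    """Unstandardized PCA of the consensus (objects x dimensions).
    Returns the axes (as columns), the eigenvalues, and the share of
    variability carried by each axis."""
    centered = consensus - consensus.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = singular_values ** 2 / centered.shape[0]
    share = eigenvalues / eigenvalues.sum()
    return vt.T, eigenvalues, share

# Hypothetical consensus: 4 objects x 3 dimensions, with most of the
# spread along the first column.
consensus = np.array([[ 3.0,  0.2,  0.00],
                      [-3.0,  0.2,  0.00],
                      [ 0.1, -0.3,  0.05],
                      [-0.1, -0.1, -0.05]])

axes, eigenvalues, share = consensus_pca(consensus)
consensus_coords = (consensus - consensus.mean(axis=0)) @ axes
print(np.round(share, 3))

# The same rotation would be applied to each aligned configuration, e.g.
# expert_coords = aligned_expert @ axes  (aligned_expert is hypothetical)
```

The first columns of `consensus_coords` are the object coordinates that would be plotted on the first axes of the map.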

The results are then separated into the results corresponding to the consensus configuration, and the results for each individual configuration. The object coordinates of the consensus configuration could be used later in a PREFMAP analysis as the coordinates of the products on the preference map.

On the correlation circle we can see that "Strangeness" is most of the time on the negative side of the first axis, and that Acidity and Hardness are often mixed. The Strangeness point at the origin of the plot corresponds to the 6th expert, who did not rate the products on that criterion.

gpa8.gif

The next two charts are the maps of the objects, respectively colored by configuration and by object (see below). The points are all close to the first axis because 96% of the variability is concentrated on the first axis, and because XLSTAT displays orthonormal maps to avoid misleading interpretations.

gpa9.gif gpa10.gif

To make the chart more legible, we change the scale options (as you can do with any Excel chart; you can also do this with the XLSTAT AxesZoomer). We obtain the following map:

gpa11.gif

We can see that the cheeses C1 and C3 are clearly separated on the map, while the border between products C2 and C4 is not as clear. This means that the experts differentiate C1 and C3 well and that there is a consensus on these products, whereas they do not distinguish C2 and C4 as well.