Generalized Procrustes Analysis (GPA)

Generalized Procrustes Analysis (GPA) is part of:
  • ADA Advanced Data Analysis on Multiple tables software

  • System configuration

    • Windows:
      • Versions: 9x/Me/NT/2000/XP/Vista/Win 7
      • Excel: 97 and later
      • Processor: 32 or 64 bits
      • Hard disk: 150 MB
    • Mac OS X:
      • OS: OS X
      • Excel: X, 2004 and 2011
      • Hard disk: 150 MB
  • MX Market research and sensory analysis software

Benefits

  • Easy and user-friendly
    XLSTAT is fully integrated with Microsoft Excel, the most popular spreadsheet worldwide. This integration makes it one of the simplest tools to work with, as it follows the same philosophy as Microsoft Excel. The program is accessible in a dedicated XLSTAT tab. The analyses are grouped into functional menus. The dialog boxes are user-friendly and setting up an analysis is straightforward.
  • Data and results shared seamlessly
    One of the greatest advantages of XLSTAT is that data and results can be shared seamlessly. As the results are stored in Microsoft Excel, anyone can access them. The recipient does not need an XLSTAT license or any additional viewer, which makes teamwork easier and more affordable. In addition, results can easily be integrated into other Microsoft Office software such as PowerPoint, so you can create striking presentations in minutes.
  • Modular
    XLSTAT is a modular product. XLSTAT-Pro is the core statistical module of XLSTAT and includes all the mainstream functionalities in statistics and multivariate analysis. More advanced features, contained in add-on modules, can be added for specific applications. This way you can adapt the software to your needs, which makes it more cost-efficient.
  • Didactic
    The results of XLSTAT are organized by analysis and are easy to navigate. Moreover, useful information is provided along with the results to assist you in your interpretation.
  • Affordable
    XLSTAT is a complete and modular analytical solution that can suit any analytical business need. It is very reasonably priced, so that the return on your investment is almost immediate. Every XLSTAT license comes with top-level support and assistance.
  • Accessible - Available in many languages
    We have ensured XLSTAT is accessible to everyone by making the program available in many languages, including Chinese, English, French, German, Italian, Japanese, Polish, Portuguese and Spanish.
  • Automatable and customizable
    Most of the statistical functions available in XLSTAT can be called directly from the Visual Basic window of Microsoft Excel. They can be modified and integrated into larger programs to fit the specifics of your domain. Adding tables and plots, as well as modifying existing outputs, becomes easy. Furthermore, the XLSTAT dialog boxes include special tools that automatically generate the VBA code needed to reproduce your analysis from the VBA editor, or simply to load pre-set settings. Automating routine analyses in this way can save you a great deal of time.

When to use Generalized Procrustes Analysis

Generalized Procrustes Analysis is used in sensory data analysis before a Preference Mapping to reduce scale effects and to obtain a consensus configuration. It also makes it possible to compare the proximity between the terms that different experts use to describe products.

Principle of Generalized Procrustes Analysis

We call a configuration an n x p matrix that describes n objects (or individuals/cases/products) on p dimensions (or attributes/variables/criteria/descriptors).

We call the consensus configuration the mean configuration computed from the m configurations (for example, one configuration per expert). Procrustes Analysis is an iterative method that applies transformations to the configurations (rescaling, translations, rotations, reflections) in order to reduce the distance between the m configurations and the consensus configuration, the latter being updated after each transformation.
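The iteration described above can be sketched in a few lines of NumPy. This is only a minimal illustration under simplifying assumptions, not the algorithm implemented in XLSTAT: every configuration is rescaled once to unit sum of squares, the rotation step uses the standard orthogonal Procrustes solution, and the consensus is updated after each full sweep rather than after each single transformation. The function names `rotate_to` and `gpa` are hypothetical.

```python
import numpy as np

def rotate_to(X, target):
    """Rotate/reflect X so that it is as close as possible to target (least squares)."""
    U, _, Vt = np.linalg.svd(X.T @ target)
    return X @ (U @ Vt)

def gpa(configs, n_iter=100, tol=1e-10):
    """Hypothetical minimal GPA for m configurations of identical shape (n, p)."""
    X = [np.asarray(c, dtype=float) for c in configs]
    # Translation: center each configuration on its column means.
    X = [c - c.mean(axis=0) for c in X]
    # Rescaling (simplified): unit total sum of squares for every configuration.
    X = [c / np.linalg.norm(c) for c in X]
    consensus = np.mean(X, axis=0)
    prev = np.inf
    for _ in range(n_iter):
        # Rotation/reflection of every configuration toward the current consensus.
        X = [rotate_to(c, consensus) for c in X]
        # The consensus is the mean of the transformed configurations.
        consensus = np.mean(X, axis=0)
        loss = sum(np.sum((c - consensus) ** 2) for c in X)
        if prev - loss < tol:
            break
        prev = loss
    return X, consensus, loss
```

Calling `gpa(configs)` on a list of m arrays returns the transformed configurations, the consensus, and the residual sum of squares around the consensus.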

Take the example of 5 experts rating 4 cheeses according to 3 criteria, with ratings from 1 to 10. One expert may tend to rate harshly, which shifts all of his ratings downward, while another may give ratings close to the average, without daring to use extreme values. Working on a plain average configuration could then lead to false interpretations. A translation of the first expert's ratings is clearly needed, and rescaling the second expert's ratings may bring them closer to those of the other experts.
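The effect of translation and rescaling can be illustrated with made-up numbers. The ratings below are invented for this sketch (they are not XLSTAT data): a "harsh" expert gives the reference ratings shifted down by one point, and a "timid" expert gives the same ratings compressed toward the column means. Centering and rescaling to unit sum of squares is one simple normalization, assumed here for illustration.

```python
import numpy as np

# Invented ratings of 4 cheeses (rows) on 3 criteria (columns), scale 1 to 10.
reference = np.array([[2., 9., 5.],
                      [8., 3., 6.],
                      [5., 5., 9.],
                      [9., 7., 2.]])
harsh = reference - 1.0                      # same opinions, shifted downward
center = reference.mean(axis=0)
timid = center + 0.5 * (reference - center)  # same opinions, compressed toward the mean

def center_and_scale(X):
    Xc = X - X.mean(axis=0)          # translation
    return Xc / np.linalg.norm(Xc)   # rescaling to unit sum of squares

a, b, c = (center_and_scale(X) for X in (reference, harsh, timid))
print(np.allclose(a, b), np.allclose(a, c))  # prints "True True"
```

Once the shift and the compression are removed, the three experts turn out to describe the cheeses identically, which a plain average of the raw ratings would have blurred.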

Once the consensus configuration has been obtained, it is possible to run a PCA (Principal Components Analysis) on the consensus configuration in order to allow an optimal visualization in two or three dimensions.
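As a rough sketch of this step, a PCA of the consensus configuration can be computed from a singular value decomposition. The function name `pca_scores` is hypothetical, and the sketch assumes a PCA on the centered (not standardized) consensus; the options XLSTAT actually applies may differ.

```python
import numpy as np

def pca_scores(consensus, n_components=2):
    """Project the objects of the consensus onto its first principal axes."""
    Xc = consensus - consensus.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # object coordinates
    explained = s ** 2 / np.sum(s ** 2)                # share of variance per axis
    return scores, explained[:n_components]
```

The first two columns of `scores` give the coordinates for a two-dimensional map of the objects.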

There are two cases:

  1. If the number and the designation of the p dimensions are identical for the m configurations, one speaks in sensory analysis of conventional profiles.
  2. If the number p and the designation of the dimensions vary from one configuration to the other, one speaks in sensory analysis of free profiles, and the data can then only be represented by a series of m matrices of size n x p(k), k=1,2, …, m.
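For free profiles, a convention often used in the GPA literature is to pad the narrower configurations with columns of zeros so that all m matrices share the same number of columns before the rotations are computed. Whether XLSTAT proceeds exactly this way is an assumption of this sketch, and `pad_configurations` is a hypothetical helper.

```python
import numpy as np

def pad_configurations(configs):
    """Pad each n x p(k) configuration with zero columns up to max p(k)."""
    configs = [np.asarray(c, dtype=float) for c in configs]
    p_max = max(c.shape[1] for c in configs)
    return [
        np.hstack([c, np.zeros((c.shape[0], p_max - c.shape[1]))])
        for c in configs
    ]
```

The padded matrices all have size n x max p(k), so they can be compared with a single consensus configuration.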

Algorithms for Generalized Procrustes Analysis used in XLSTAT

XLSTAT is the only product offering a choice between the two main available algorithms: the one based on the work initiated by John Gower (1975), and the later one described in the thesis of Jacques Commandeur (1991). Which algorithm performs best (in terms of least squares) depends on the dataset, but the Commandeur algorithm is the only one that can handle missing data. By missing data we mean here that, for a given configuration and a given observation (row), the values were not recorded for all the dimensions of the configuration. This can happen in sensory data analysis if one of the judges has not evaluated a product.

Results for the Generalized Procrustes Analysis in XLSTAT

PANOVA table

Inspired by the format of the analysis of variance table of the linear model, this table allows you to evaluate the relative contribution of each transformation to the change in variance. It displays the residual variance before and after the transformations, and the contribution of the rescaling, rotation and translation steps to the change in variance. Fisher's F statistic enables you to compare the relative contributions of the transformations. The corresponding probabilities help you determine whether the contributions are significant.

Residuals

Residuals by object: This table and the corresponding bar chart allow you to visualize the distribution of the residual variance by object. It is thus possible to identify the objects for which the GPA has been the least efficient, in other words, the objects that are the farthest from the consensus configuration.

Residuals by configuration: This table and the corresponding bar chart allow you to visualize the distribution of the residual variance by configuration. It is thus possible to identify the configurations for which the GPA has been the least efficient, in other words, the configurations that are the farthest from the consensus configuration.
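Both breakdowns can be read off the squared deviations between the transformed configurations and the consensus. The sketch below assumes the residuals are reported as raw sums of squares; XLSTAT may normalize them differently, and the function name `residuals` is hypothetical.

```python
import numpy as np

def residuals(transformed, consensus):
    """Residual sums of squares by object (row) and by configuration."""
    X = np.stack(transformed)             # shape (m, n, p)
    sq = (X - consensus) ** 2
    by_object = sq.sum(axis=(0, 2))       # one value per object, length n
    by_configuration = sq.sum(axis=(1, 2))  # one value per configuration, length m
    return by_object, by_configuration
```

Both vectors add up to the same total residual, so they are two views of the same quantity.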

Scaling factors for each configuration

The scaling factors applied to each configuration, presented either in a table or in a plot, can be compared with one another. In sensory analysis they are used to understand how the experts use the rating scales.

Results of the consensus test

To evaluate the effectiveness of the Generalized Procrustes Analysis, XLSTAT displays the number of permutations that have been performed, the value of Rc, which corresponds to the proportion of the original variance explained by the consensus configuration, and the quantile corresponding to Rc, calculated from the distribution of Rc obtained from the permutations. You need to set a confidence level (typically 95%); if the quantile exceeds this level, one concludes that the Generalized Procrustes Analysis significantly reduced the variance.
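A permutation test of this kind can be sketched as follows. The permutation scheme (rows shuffled independently within each configuration), the simplified GPA used to compute Rc, and the names `consensus_ratio` and `consensus_test` are all assumptions made for illustration; they are not taken from the XLSTAT implementation. Configurations are assumed to be non-constant, so that rescaling is well defined.

```python
import numpy as np

def _rotate_to(X, target):
    """Orthogonal Procrustes rotation/reflection of X toward target."""
    U, _, Vt = np.linalg.svd(X.T @ target)
    return X @ (U @ Vt)

def consensus_ratio(configs, n_iter=50):
    """Rc: share of the total variance carried by the consensus after a simplified GPA."""
    X = [c - c.mean(axis=0) for c in configs]      # translation
    X = [c / np.linalg.norm(c) for c in X]         # rescaling to unit sum of squares
    consensus = np.mean(X, axis=0)
    for _ in range(n_iter):
        X = [_rotate_to(c, consensus) for c in X]  # rotation/reflection
        consensus = np.mean(X, axis=0)
    total = sum(np.sum(c ** 2) for c in X)
    return len(X) * np.sum(consensus ** 2) / total

def consensus_test(configs, n_perm=200, seed=0):
    """Observed Rc and its quantile in the permutation distribution."""
    rng = np.random.default_rng(seed)
    configs = [np.asarray(c, dtype=float) for c in configs]
    observed = consensus_ratio(configs)
    permuted = np.array([
        # Shuffle the objects independently within each configuration.
        consensus_ratio([c[rng.permutation(len(c))] for c in configs])
        for _ in range(n_perm)
    ])
    quantile = np.mean(permuted <= observed)
    return observed, quantile
```

With a 95% confidence level, a returned quantile above 0.95 would lead to the conclusion that the consensus explains more variance than expected by chance.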

Results of the dimensions test

To evaluate whether a dimension contributes significantly to the quality of the Generalized Procrustes Analysis, XLSTAT displays, for each factor retained at the end of the PCA step, the number of permutations that have been performed, the F calculated after the Generalized Procrustes Analysis (F is here the ratio of the variance between the objects to the variance between the configurations), and the quantile corresponding to F, calculated from the distribution of F obtained from the permutations. You need to set a confidence level (typically 95%); if the quantile exceeds this level, one concludes that the factor contributes significantly. As an indication, the critical values and the p-value corresponding to Fisher's F distribution for the selected alpha significance level are also displayed. The conclusions drawn from Fisher's F distribution may be very different from what the permutation test indicates: using Fisher's F distribution requires assuming that the data are normally distributed, which is not necessarily the case.

Results for the consensus configuration

Results for the configurations after transformations
