# Create contingency tables

Contingency tables are a way to visualize the relationship between two qualitative variables. Create them in Excel using the XLSTAT add-on software.

## What is a contingency table

A **contingency table**, also called cross-tab, is an efficient way to summarize the **relation** (or correspondence) between** two categorical variables** V_{1} and V_{2}. It has the following structure:

V_{1} V_{2} | Category 1 | Category j | … | Category m_{2} |

Category 1 | n(1,1) | n(1,j) | … | n(1,m_{2}) |

… | … | … | … | … |

Category i | n(i,1) | n(i,j) | … | n(i,m_{2}) |

… | … | … | … | … |

Category m_{1} | n(m_{1},1) | n(m_{1},j) | … | n(m_{1},m_{2}) |

where *n(i,j)* is the frequency of observations that show both characteristic *i* for variable *V _{1}*, and characteristic

*j*for variable

*V*.

_{2}To create a contingency table from two qualitative variables *V _{1}* and

*V*, the first transformation consists of recoding the two qualitative variables V

_{2}_{1}and V

_{2}as two disjunctive tables

*Z*and

_{1}*Z*or indicator (or dummy) variables. For each category of a variable there is a column in the respective disjunctive table. Each time the category

_{2}*c*of variable

*V*occurs for an observation

_{1}*i*, the value of

*Z*is set to one (the same rule is applied to the

_{1}(i,c)*V*variable). The other values of

_{2}*Z*and

_{1 }*Z*are zero. The contingency table of the two variables is the table

_{2}*Z*(where ‘ indicates matrix transpose).

_{1}'Z_{2}The **Chi-square** distance has been suggested to measure the distance between two categories. The **Pearson chi-square** statistic, which is the sum of the Chi-square distances, is used to test the independence between rows and columns. Is has asymptotically a Chi-square distribution with *(m _{1}-1)(m_{2}-1)* degrees of freedom.

**Inertia **is a measure inspired from physics that is often used in Correspondence Analysis, a method that is used to analyse in depth contingency tables. The inertia of a set of points is the weighted mean of the squared distances to the center of gravity.

## XLSTAT tool for creating a contingency table

From two or more qualitative variables, XLSTAT enables you to create instantly contingency tables summarizing the structure of the dataset. The data should be selecting in the general tab of the tool using the below fields:

**Row variable(s):** Select the data that correspond to the variable(s) that will be used to construct the rows of the contingency table(s).

**Column variable(s):** Select the data that correspond to the variable(s) that will be used to construct the columns of the contingency table(s).

**By group analysis:** Activate this option and select the data that describe to which group each observation belongs. XLSTAT will then run the analysis for each group separately. Useful for creating three-way crosstabs.

**Weights**: Activate this option if the observations are weighted. If you do not activate this option, the weights will be considered as 1. Weights must be greater than or equal to 0. If a column header has been selected, check that the "Variable labels" option is activated.

## XLSTAT tests on contingency tables

XLSTAT provides you with two criteria to characterize the relationship between the two variables:

**The Chi-square distance**has been suggested to measure the distance between two categories. The Pearson chi-square statistic, which is the sum of the Chi-square distances, is used to test the independence between rows and columns.

The Chi-square by cell option available in the XLSTAT dialog box allows you to display the contribution to the chi-square of each cell of the contingency table.

**Inertia**is a measure inspired from physics that is often used in Correspondence Analysis, a method that is used to analyse in depth contingency tables. The inertia of a set of points is the weighted mean of the squared distances to the center of gravity.

The Inertia by cell option available in the XLSTAT dialog box allows you to display the inertia for each cell of the contingency table.

You may also use a more complete feature dedicated to tests on contingency tables available in the Correlations/Association tests menu.

## XLSTAT charts for contingency tables

**3D view of the contingency table**: Activate this option to display the 3D bar chart corresponding to the contingency table.

**2D Bar Charts options**

Chart type

- Grouped: Choose this option to display the graphs as bars grouped by modality.
- Stacked bars: Choose this option to display the chart as stacked bars. These charts are used to compare the frequencies of sub-samples to those of a full sample.

Bar charts

- Frequencies: Choose this option to display the frequencies corresponding to each bar.
- Percentages: Choose this option to display the % of population corresponding to each bar.

## Advantages of using XLSTAT cross-tabs instead of Excel pivot tables

Among the many advantages of using the XLSTAT contingency table feature compared to Excel pivot tables:

- XLSTAT is able to automatically output test results on the contingency tables.
- You can enter as many qualitative variables as you want in both the row and column variable fields in XLSTAT. XLSTAT will produce one result for each possible pair of row/column variables.

## Tutorial on how to create and interpret a contingency table

A tutorial which explains how to configure the XLSTAT dialog box in order to generate a cross-tab based on two qualitative variables is available.