Generating box plots with XLSTAT
Dataset for generating a box plot
An Excel sheet with both the data and the results can be downloaded by clicking here.
The data describe a sample of 150 iris flowers, on each of which 4 variables were measured. The flowers belong to 3 different species. Fisher used this now-famous dataset when he developed his theory of discriminant analysis. In this example, we analyze the variable "Sepal length" and check visually whether it differs between the three species.
Setting up the dialog box for the box plot
Once XLSTAT is open, select the XLSTAT / Describing data / Descriptive Statistics command, or click on the corresponding button of the Describing data toolbar (see below).


Once you have clicked on the button, the Descriptive Statistics dialog box appears.
The data corresponding to the variable "Sepal length" were selected on the Excel sheet. Note that a box plot requires numerical data.
As the name of the variable was included in the selection, the Labels included option was also selected.
The "Species" data were selected as the sub-sample descriptor so that the species can be compared.
The Sheet option was selected because we wanted the results displayed on a new sheet of the workbook.
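
For readers who want to see what the sub-sample descriptor does, the following is a minimal Python sketch of the same idea: one numerical column is split by a label column, and each sub-sample is summarized separately. It does not reproduce XLSTAT's implementation. The function name `describe_by_group` and the values are hypothetical and chosen for illustration only (they are not the iris measurements).

```python
from collections import defaultdict
from statistics import mean, median


def describe_by_group(values, groups):
    """Split values by their group label and summarize each sub-sample."""
    if len(values) != len(groups):
        raise ValueError("values and groups must have the same length")
    buckets = defaultdict(list)
    for value, label in zip(values, groups):
        # float() fails on non-numerical input, mirroring the requirement
        # that box plot data be numerical.
        buckets[label].append(float(value))
    return {
        label: {
            "n": len(sample),
            "min": min(sample),
            "median": median(sample),
            "mean": mean(sample),
            "max": max(sample),
        }
        for label, sample in buckets.items()
    }


# Illustrative values only (hypothetical, not taken from the iris dataset).
lengths = [5.0, 5.2, 4.8, 6.0, 6.4, 5.6, 6.8, 7.2, 6.4]
species = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
summary = describe_by_group(lengths, species)
```

Each key of `summary` plays the role of one sub-sample, so `summary["A"]` holds the statistics for the rows labelled "A" only.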

In the Options tab, the following options have been activated.

The Normalize or Rescale options can be used when you want to compare several variables measured on different scales. They are not needed here because we are dealing with only one variable.
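
To illustrate why such options help when variables are on different scales, here is a small Python sketch of two common transformations: standardization (z-scores) and min-max rescaling. This is an assumption about what these kinds of options typically do; the exact formulas XLSTAT applies (for example, population versus sample standard deviation, or the target range) are not stated in this tutorial and may differ. The function names are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: subtract the mean, divide by the standard deviation.

    Assumption: the population standard deviation is used here.
    """
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - m) / s for v in values]


def rescale(values, low=0.0, high=1.0):
    """Map values linearly onto [low, high] (min-max rescaling).

    Assumption: the target range is a free parameter of this sketch.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant variable")
    return [low + (v - lo) * (high - low) / (hi - lo) for v in values]


# Illustrative values only (hypothetical).
z_scores = standardize([2.0, 4.0, 6.0])
unit_range = rescale([2.0, 4.0, 6.0])
```

After either transformation, variables that were originally recorded in different units share a comparable scale and can be drawn on the same axis.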
In the Charts tab, the Box plots option is checked. The Group plots option has been chosen so that the box plots are displayed on the same chart rather than on separate charts.
The Minimum/Maximum and Outliers options have been checked so that the corresponding values are displayed on the box plots.
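
The quantities a box plot displays can be sketched in a few lines of Python. The example below computes the quartiles, the minimum and maximum, and flags outliers using the common 1.5 × IQR convention. Both the quartile method (`method="inclusive"`) and the 1.5 × IQR rule are assumptions made for this sketch; XLSTAT's own definitions may differ. The function name `box_plot_stats` and the sample values are hypothetical.

```python
from statistics import quantiles


def box_plot_stats(sample):
    """Compute the values typically drawn on a box plot for one sample."""
    data = sorted(float(v) for v in sample)
    # Quartiles; the interpolation method is an assumption of this sketch.
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr   # assumed outlier rule: 1.5 x IQR
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "min": data[0],
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": data[-1],
        "whisker_low": inside[0],    # most extreme non-outlier values
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Illustrative values only (hypothetical, not the iris measurements).
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 20])
```

In this sketch the minimum and maximum are reported separately from the whisker ends, which is why a single extreme value can appear both as the maximum and as an outlier.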

Interpreting a box plot
The results are displayed on the new sheet named "Desc". They include a full set of descriptive statistics.

Then, the box plots are displayed.

The box plots clearly show that the Sepal length variable differs between the three species. The blue rhombuses correspond to the minimum and maximum values.
Watch this video to see how to generate this boxplot.