Generating Bootstrap statistics using Resampling in XLSTAT

Dataset for Resampling (XLS, 843 KB)

  • Pro Core statistical software

  • System configuration

    • Windows:
      • Versions: 9x/Me/NT/2000/XP/Vista/Win 7
      • Excel: 97 and later
      • Processor: 32 or 64 bits
      • Hard disk: 150 MB
    • Mac OS X:
      • OS: OS X
      • Excel: X, 2004 and 2011
      • Hard disk: 150 MB

Benefits

  • Easy and user-friendly
    XLSTAT is integrated with Microsoft Excel, the most popular spreadsheet worldwide. This integration makes it one of the simplest tools to work with, as it follows the same philosophy as Microsoft Excel. The program is accessible from a dedicated XLSTAT tab. The analyses are grouped into functional menus. The dialog boxes are user-friendly, and setting up an analysis is straightforward.
  • Data and results shared seamlessly
    One of the greatest advantages of XLSTAT is that data and results can be shared seamlessly. As the results are stored in Microsoft Excel, anyone can access them. The recipient does not need an XLSTAT license or any additional viewer, which makes teamwork easier and more affordable. In addition, results are easy to integrate into other Microsoft Office software such as PowerPoint, so you can create striking presentations in minutes.
  • Modular
    XLSTAT is a modular product. XLSTAT-Pro is the core statistical module of XLSTAT and includes all the mainstream functionalities in statistics and multivariate analysis. More advanced features, contained in add-on modules, can be added for specific applications. This way you can adapt the software to your needs, which makes it more cost-efficient.
  • Didactic
    The results of XLSTAT are organized by analysis and are easy to navigate. Moreover, useful information is provided along with the results to assist you in your interpretation.
  • Affordable
    XLSTAT is a complete and modular analytical solution that can suit any analytical business need. It is very reasonably priced, so the return on your investment is almost immediate. Any XLSTAT license comes with top-level support and assistance.
  • Accessible - Available in many languages
    We have ensured XLSTAT is accessible to everyone by making the program available in many languages, including Chinese, English, French, German, Italian, Japanese, Polish, Portuguese and Spanish.
  • Automatable and customizable
    Most of the statistical functions available in XLSTAT can be called directly from the Visual Basic window of Microsoft Excel. They can be modified and integrated into other code to fit the specifics of your domain. Adding tables and plots, as well as modifying existing outputs, becomes easy. Furthermore, the XLSTAT dialog boxes include special tools that automatically generate the VBA code needed to reproduce your analysis in the VBA editor, or that simply load pre-set settings. This automation of routine analyses can save you a great deal of time.

XLSTAT has a resampling toolbox that can be used to obtain bootstrap resamples, standard deviations and confidence intervals. It also lets you build charts based on the bootstrap distribution.

Dataset to generate Bootstrap statistics using Resampling

An Excel sheet with both the data and the results can be downloaded by clicking here.

The data correspond to a sample of 150 irises on which 4 variables were measured. The flowers belong to 3 different species. Fisher used this now-famous dataset when he developed his discriminant analysis theory. In this example, we analyze the variable Sepal length.

Goal of this tutorial

Using the XLSTAT resampling toolbox, we want to obtain bootstrap means, bootstrap standard deviations and bootstrap confidence intervals for some statistical measures. These bootstrap statistics are obtained without any distributional assumption. We will study the mean and the standard deviations of the Sepal length variable.

Setting up a resampling

Once XLSTAT is open, select the XLSTAT / Describing data / Resampled Statistics command, or click on the corresponding button of the Describing data toolbar (see below).

boot0.gif
boot1.gif

Once you have clicked on the button, the Resampled Statistics dialog box appears. The data corresponding to the variable "Sepal length" were selected on the Excel sheet.

Note that resampling requires numerical data.

As the name of the variable was included in the selection, the Labels included option was also selected.

The Sheet option was selected because we wanted the results displayed on a new sheet of the workbook. The chosen resampling method is the bootstrap method with 200 resamples.
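To make the idea of "bootstrap method with 200 resamples" concrete, here is a minimal Python sketch of resampling with replacement. It is not XLSTAT's implementation: the function name `bootstrap_resamples`, the seed, and the `sample` values are all assumptions made for illustration, and the values are placeholders rather than the iris data from the workbook.

```python
import random
import statistics


def bootstrap_resamples(data, n_resamples=200, seed=0):
    """Draw n_resamples samples of len(data) values each, with replacement.

    Hypothetical helper for illustration; not an XLSTAT function.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    return [rng.choices(data, k=n) for _ in range(n_resamples)]


# Placeholder measurements, NOT the Sepal length column from the tutorial.
sample = [50.0, 47.0, 61.0, 58.0, 66.0, 55.0, 72.0, 49.0, 63.0, 57.0, 68.0, 52.0]

resamples = bootstrap_resamples(sample, n_resamples=200)

# One statistic per resample gives the bootstrap distribution of that statistic.
boot_means = [statistics.fmean(r) for r in resamples]

print("number of resamples:", len(resamples))
print("bootstrap mean of the mean:", statistics.fmean(boot_means))
print("bootstrap standard deviation of the mean:", statistics.stdev(boot_means))
```

Each resample has the same size as the original sample, and a given observation may appear several times or not at all, which is what lets the procedure work without a distributional assumption.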

boot2.gif

In the Outputs tab, select the statistics to be studied. We select the mean and both standard deviations.

The 95 % standard bootstrap interval is selected. You can display all 200 resamples and the corresponding resampled statistics using the Resamples and Resampled statistics options.
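The text does not spell out how the standard bootstrap interval is computed. A common definition, assumed here, is the original estimate plus or minus a normal quantile times the standard deviation of the resampled statistics. The sketch below follows that assumption; the function name `standard_bootstrap_interval` and the data are hypothetical, and XLSTAT's exact formula may differ.

```python
import random
from statistics import NormalDist, fmean, stdev


def standard_bootstrap_interval(data, stat=fmean, n_resamples=200,
                                level=0.95, seed=0):
    """Normal-approximation interval: estimate +/- z * bootstrap standard error.

    Assumed definition for illustration; not taken from XLSTAT.
    """
    rng = random.Random(seed)
    n = len(data)
    # Statistic computed on each resample drawn with replacement.
    boot_stats = [stat(rng.choices(data, k=n)) for _ in range(n_resamples)]
    estimate = stat(data)           # statistic on the original sample
    std_error = stdev(boot_stats)   # bootstrap standard deviation
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    return estimate - z * std_error, estimate + z * std_error


# Placeholder measurements, NOT the Sepal length column from the tutorial.
sample = [50.0, 47.0, 61.0, 58.0, 66.0, 55.0, 72.0, 49.0, 63.0, 57.0, 68.0, 52.0]

lower, upper = standard_bootstrap_interval(sample)
print("95 % standard bootstrap interval for the mean:", (lower, upper))
```

Under this definition the interval is always centered on the original estimate, unlike percentile-type intervals, which are read directly from the bootstrap distribution.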

boot3.gif

In the Charts tab, histograms have been selected.

boot4.gif

Click on OK to launch the analysis.

Interpreting the results of the resampled statistics

The results are displayed on the new sheet named "Resampling".

The following table shows the bootstrap statistics obtained for the sample mean and the standard deviations. The bootstrap estimates are very close to the original estimates, even with only 200 resamples. For the mean, the 95 % standard bootstrap confidence interval is very narrow. For the standard deviations, the sample-based and the population-based values are very close to each other, and the bootstrap estimates are very similar to the original estimates. Their confidence intervals are also very narrow, even with 200 resamples.
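One reason the two standard deviations are so close can be checked directly. Assuming that "sample-based" means the n-1 divisor and "population-based" means the n divisor (the usual convention, not confirmed by the text), their ratio depends only on the sample size. The values below are placeholders, not the iris data.

```python
import math
from statistics import pstdev, stdev

# Placeholder measurements, NOT the Sepal length column from the tutorial.
values = [50.0, 47.0, 61.0, 58.0, 66.0, 55.0, 72.0, 49.0, 63.0, 57.0, 68.0, 52.0]
n = len(values)

sd_population = pstdev(values)  # divides the sum of squares by n
sd_sample = stdev(values)       # divides the sum of squares by n - 1

# The ratio equals sqrt((n - 1) / n) whatever the data are.
ratio = sd_population / sd_sample
print("ratio for this sample:", ratio)
print("matches sqrt((n - 1) / n):", math.isclose(ratio, math.sqrt((n - 1) / n)))

# For a sample of 150 observations, as in the tutorial, the ratio is about 0.997.
ratio_150 = math.sqrt(149 / 150)
print("ratio for n = 150:", ratio_150)
```

With 150 observations the two definitions differ by roughly 0.3 %, so near-identical values in the results table are expected.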

boot5.gif

Histograms can be used to visualise the bootstrap distribution of the mean and of the standard deviations. For the mean, 56 of the 200 values fall in the range [58.2, 58.64], which also contains the mean of the original sample. A table of intervals is also provided to help you understand the distribution. The resampling tool chooses the number of intervals automatically. If you wish to adapt it to your analysis, use the "Histograms" function of XLSTAT on the resampled statistics, which are displayed when the "Resampled statistics" option is selected on the "Outputs" tab.

boot6.gif
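The table of intervals is essentially a count of resampled statistics per bin. The sketch below shows one way to build such a table with equal-width intervals; the helper `interval_counts`, the choice of 8 intervals, and the data are assumptions for illustration, and XLSTAT's own rule for choosing the number of intervals is not described here.

```python
import random
from statistics import fmean


def interval_counts(values, n_bins):
    """Count values in n_bins equal-width intervals spanning [min, max].

    Hypothetical helper for illustration; not an XLSTAT function.
    """
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("all values are identical; cannot build intervals")
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is assigned to the last interval.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(n_bins + 1)]
    return edges, counts


# Placeholder measurements, NOT the Sepal length column from the tutorial.
sample = [50.0, 47.0, 61.0, 58.0, 66.0, 55.0, 72.0, 49.0, 63.0, 57.0, 68.0, 52.0]

rng = random.Random(0)
boot_means = [fmean(rng.choices(sample, k=len(sample))) for _ in range(200)]

n_bins = 8  # arbitrary choice for this sketch
edges, counts = interval_counts(boot_means, n_bins)
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"[{left:.2f}, {right:.2f}]: {count}")
```

Changing `n_bins` plays the same role as re-running the histogram on the resampled statistics with a different number of intervals.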

This tool allows you to calculate different types of confidence intervals for a large number of descriptive statistics. You can also add weights and process several variables simultaneously.