Using differencing to obtain a stationary time series

Dataset for the differencing transformation

An Excel sheet with both the data and results can be downloaded by clicking here.

The data are taken from [Box, G.E.P. and Jenkins, G.M. (1976). Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco] and correspond to the monthly number of international airline passengers (in thousands) from January 1949 to December 1960. This series is widely used as an example of a nonstationary seasonal time series.

Our goal is to show how helpful descriptive analysis can be before a modeling approach.

hw1.gif

The chart shows a global upward trend and a similar cycle that repeats every year, while the variability within a year seems to increase over time. To confirm these observations, we are going to analyse the autocorrelation function of the series.

Setting up a descriptive analysis of time series

After opening XLSTAT, select the XLSTAT / XLSTAT-Time / Descriptive analysis command, or click on the corresponding button of the "XLSTAT-Time" toolbar (see below).

bardesc.gif

Once you've clicked on the button, the Descriptive analysis dialog appears. Select the data on the Excel sheet. The "Time series" field corresponds to the series of interest, the Passengers. The option "Series labels" is activated because the first row of the selected data contains the header of the variable.

desc1a.gif

In the options tab, automatic time steps are selected:

desc1b.gif

The outputs and charts tabs are as follows:

desc1c.gif

desc1d.gif

The computations begin once you have clicked on "OK". The results will then be displayed.

Interpreting the descriptive statistics of a time series

The first table displays the summary statistics. Then the "Normality test and white noise tests" table is displayed. The Jarque-Bera test is a normality test based on the skewness and kurtosis coefficients. The larger the value of the Chi-square statistic, the stronger the evidence against the null hypothesis that the data are normally distributed. Here the p-value, which is the probability of obtaining a statistic at least this large if the null hypothesis were true, is close to 0.01. At the alpha=0.05 significance level, one should reject the null hypothesis.
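To make the link between skewness, kurtosis and the Chi-square statistic concrete, here is a minimal sketch of the textbook Jarque-Bera statistic in plain Python. The function name `jarque_bera` is an assumption made for illustration, and XLSTAT's exact implementation may differ (for example in small-sample corrections). The p-value uses the fact that a Chi-square variable with 2 degrees of freedom has survival function exp(-x/2).

```python
import math

def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value) for a sample.

    Textbook form: JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample
    skewness and K the sample kurtosis (both from moments divided by n).
    """
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Chi-square with 2 degrees of freedom: P(X > x) = exp(-x / 2)
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Toy sample (not the airline data), only to show the call
jb, p = jarque_bera([1.0, 2.0, 3.0, 4.0, 5.0])
reject_normality = p < 0.05  # decision at the alpha=0.05 level
```

The decision rule in the last line mirrors the one used in the text: the null hypothesis of normality is rejected when the p-value falls below alpha.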

The other three tests (Box-Pierce, Ljung-Box, McLeod-Li) are computed at different time lags. They test whether the data can be assumed to be white noise. These tests are also based on the Chi-square distribution. They all agree that the data cannot be assumed to be generated by a white noise process. While the order of the observations has no influence on the Jarque-Bera test, it does influence the three other tests, which are specifically designed for time series analysis.
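As an illustration of how these white noise statistics depend on the order of the observations, the sketch below computes the Box-Pierce and Ljung-Box Q statistics from the sample autocorrelations. The function names are assumptions made for this example, and the formulas are the standard textbook ones rather than a confirmed copy of XLSTAT's code. Under the white noise hypothesis, each Q is compared to a Chi-square distribution with `max_lag` degrees of freedom. The McLeod-Li statistic is commonly described as the Ljung-Box statistic applied to the squared series, which is how it is sketched here.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = sum((values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n))
    return num / denom

def box_pierce(values, max_lag):
    """Q = n * sum of squared autocorrelations for lags 1..max_lag."""
    n = len(values)
    return n * sum(autocorrelation(values, k) ** 2 for k in range(1, max_lag + 1))

def ljung_box(values, max_lag):
    """Q = n (n + 2) * sum of r_k^2 / (n - k) for lags 1..max_lag."""
    n = len(values)
    return n * (n + 2) * sum(
        autocorrelation(values, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

def mcleod_li(values, max_lag):
    """Assumed form: Ljung-Box statistic computed on the squared series."""
    return ljung_box([v * v for v in values], max_lag)
```

Because the autocorrelations pair each observation with its predecessors, shuffling the data changes all three statistics, whereas the Jarque-Bera statistic only depends on the set of values.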

desc2.gif

Below the table that displays the descriptive functions of the time series, two bar charts display the evolution of the autocorrelation function (ACF) and of the partial autocorrelation function (PACF). The 95% confidence intervals are also displayed. By looking at the autocorrelogram, we can identify a clear lag 1 autocorrelation, as well as a seasonality with a period that seems to be 12 months.

desc3.gif

desc4.gif
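The way an autocorrelogram reveals a 12-month seasonality can be reproduced with a short sketch. It computes the sample ACF and flags the lags that fall outside an approximate 95% band of plus or minus 1.96/sqrt(n). That band is the usual white noise approximation and is an assumption here; XLSTAT may compute its confidence intervals differently. The series used is a toy 12-period cycle, not the airline data.

```python
import math

def acf(values, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

def white_noise_bound(n, z=1.96):
    """Approximate 95% bound for the ACF of a white noise series of length n."""
    return z / math.sqrt(n)

# Toy series: a pure cycle of period 12 observed over 144 time steps
toy = [math.sin(2 * math.pi * t / 12) for t in range(144)]
r = acf(toy, 24)
bound = white_noise_bound(len(toy))
# Lags whose autocorrelation lies outside the approximate 95% band
significant_lags = [k for k in range(1, 25) if abs(r[k]) > bound]
```

On such a series the autocorrelation is strongly positive at lags 12 and 24 and strongly negative at lag 6, which is the kind of periodic pattern to look for in the bar chart.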

Transformation of a time series

In order to improve the normality of the data, we want to perform two transformations:

  1. First, we want to stabilize the increasing variability of the series,
  2. Second, we want to remove the autocorrelations by differencing the series.

Setting up the transformation of a time series

This can be done using the Time series transformation tool. To activate the corresponding dialog box, select the XLSTAT / XLSTAT-Time / Transforming series command, or click on the corresponding button of the XLSTAT-Time toolbar (see below).

bartrans.gif

Once you've clicked on the button, the dialog appears.

Select the data on the Excel sheet. The "Time series" field corresponds to the series of interest, the "Passengers".

desc5a.gif

In the options tab, select the Box-Cox option.

While we could ask for an optimized transformation (the lambda parameter of the Box-Cox transformation would be adjusted so that the likelihood of a regression model, in which the transformed Y is a simple linear function of time, would be as high as possible), we decide here to fix the lambda value to 0, which corresponds to a log transformation of the series.

The log transformation is often a good choice for removing increasing variability. Then, in order to remove the trend and the seasonal component, we decide to use the differencing method. We set the d value to 1 to remove the trend, and D and s to 1 and 12 to remove the 12-month seasonal component.

desc5b.gif
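The settings chosen above (lambda=0, d=1, D=1, s=12) can be sketched in a few lines of Python. The function names are hypothetical and this is not XLSTAT's code; it only illustrates the operations: a Box-Cox transform that reduces to the natural log when lambda is 0, followed by one regular difference and one seasonal difference at lag 12.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform: log(y) if lam == 0, else (y**lam - 1) / lam."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def difference(values, lag=1):
    """Return values[t] - values[t - lag]; the result is `lag` items shorter."""
    return [values[t] - values[t - lag] for t in range(lag, len(values))]

def transform(values, lam=0.0, d=1, D=1, s=12):
    """Apply Box-Cox, then d regular differences, then D differences at lag s."""
    out = box_cox(values, lam)
    for _ in range(d):
        out = difference(out, 1)
    for _ in range(D):
        out = difference(out, s)
    return out

# Toy series (not the airline data): exponential trend times a fixed
# 12-month seasonal pattern, so variability grows with the level.
season = [1.0 + 0.1 * m for m in range(12)]
toy = [math.exp(0.01 * t) * season[t % 12] for t in range(36)]
stationary = transform(toy)  # 36 - (1 + 12) = 23 values, all close to 0
```

On this toy series the log turns the multiplicative pattern into an additive one, the first difference removes the linear trend and the lag 12 difference removes the repeating seasonal pattern. Note that d + D*s = 13 observations are lost at the start of the series.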

The computations begin once you have clicked on OK.

Results of the transformation of a time series

We first see a table and a chart that correspond to the Box-Cox transformation. We can see the transformed series on the chart below. It looks like the log transformation has removed the increasing variability.

desc6.gif

Next, a table and a chart display the differencing transformation. We see that the differencing has removed the trend well, but it is not clear whether we have obtained white noise.

desc7.gif

Descriptive statistics on transformed time series

In order to check whether the transformations have made the series look like white noise and be normally distributed, we need to perform a descriptive analysis on the transformed series.

desc8.gif

In the "missing data" tab, select the "remove the observations" option.

desc8a.gif

The Jarque-Bera test confirms that the series looks more like a normal sample (the p-value rose from 0.01 to 0.04), although normality is still rejected at the alpha=0.05 level. Looking at the white noise tests, however, it appears that the transformations have not been efficient enough. The autocorrelogram indicates that we removed too much of the lag 1 and lag 12 components, as they now have negative autocorrelation coefficients. Furthermore, the lag 3 and lag 9 coefficients also seem to be significant. Therefore, it seems that further work is necessary to understand the underlying phenomenon.

desc9.gif

desc10.gif