How do I use differencing to obtain a stationary time series?

Dataset for the differencing transformation

An Excel sheet with both the data and results can be downloaded by clicking here.

The data are taken from [Box, G.E.P. and Jenkins, G.M. (1976). Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco] and correspond to monthly international airline passengers (in thousands) from January 1949 to December 1960. The series is widely used as an example of a nonstationary seasonal time series.

Our goal is to show how helpful descriptive analysis can be before a modeling approach.

hw1.gif

On the chart, we notice a global upward trend and a similar cycle that restarts every year, while the variability within a year seems to increase over time. To confirm these observations, we are going to analyze the autocorrelation function of the series.

Setting up a descriptive analysis of time series

After opening XLSTAT, select the XLSTAT / XLSTAT-Time / Descriptive analysis command, or click on the corresponding button of the "XLSTAT-Time" toolbar (see below).

bardesc.gif

Once you've clicked on the button, the Descriptive analysis dialog appears. Select the data on the Excel sheet. The "Time series" field corresponds to the series of interest, the Passengers. The "Series labels" option is activated because the first row of the selected data contains the header of the variable.

desc1a.gif

In the options tab, automatic time steps are selected:

desc1b.gif

The outputs and charts tabs are as follows:

desc1c.gif

desc1d.gif

The computations begin once you have clicked on "OK". The results will then be displayed.

Interpreting the descriptive statistics of a time series

The first table displays the summary statistics. Then the "Normality test and white noise tests" table is displayed. The Jarque-Bera test is a normality test based on the skewness and kurtosis coefficients. The larger the value of the Chi-square statistic, the less plausible the null hypothesis that the data are normally distributed. Here the p-value, which is the probability of observing a statistic at least this large if the null hypothesis were true, is close to 0.01. At the alpha=0.05 significance level, one should reject the null hypothesis.
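
To make the link between skewness, kurtosis and the Chi-square statistic concrete, here is a minimal Python sketch of the textbook form of the Jarque-Bera test. It is not XLSTAT's implementation: the function name `jarque_bera` and the toy samples are illustrative assumptions, and XLSTAT may apply corrections that this sketch omits.

```python
import math


def jarque_bera(x):
    """Return (JB statistic, p-value) using the textbook formula."""
    n = len(x)
    mean = sum(x) / n
    # Central moments, dividing by n (population form).
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under normality JB is approximately Chi-square with 2 degrees of
    # freedom, whose upper-tail probability is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Toy samples for illustration only (not the airline passenger data).
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 1, 2, 2, 3, 10]
print(jarque_bera(symmetric))
print(jarque_bera(right_skewed))
```

Note that the statistic depends only on the values, not on their order, which is why sorting the data would leave it unchanged.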

The three other tests (Box-Pierce, Ljung-Box, McLeod-Li) are computed at different time lags. They test whether the data can be assumed to be white noise. These tests are also based on the Chi-square distribution. They all agree that the data cannot be assumed to be generated by a white noise process. While the ordering of the data has no influence on the Jarque-Bera test, it does influence the three other tests, which are specifically designed for time series analysis.
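
As a rough illustration of how such white noise tests are built, the sketch below computes the Box-Pierce and Ljung-Box statistics from the sample autocorrelations, using their standard textbook definitions. The function names and the toy series are assumptions made for this example; it does not reproduce XLSTAT's output, and p-values are left out (each statistic would be compared with a Chi-square distribution whose degrees of freedom equal the number of lags).

```python
def autocorrelations(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def box_pierce(x, max_lag):
    """Q = n * sum of squared autocorrelations up to max_lag."""
    n = len(x)
    return n * sum(r ** 2 for r in autocorrelations(x, max_lag))


def ljung_box(x, max_lag):
    """Q = n (n + 2) * sum of r_k^2 / (n - k), a small-sample refinement."""
    n = len(x)
    r = autocorrelations(x, max_lag)
    return n * (n + 2) * sum(
        r[k - 1] ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Toy trending series for illustration only.
toy = [1, 2, 3, 4, 5, 6, 7, 8]
print(box_pierce(toy, 1), ljung_box(toy, 1))
```

Because the autocorrelations pair each observation with its predecessors, shuffling the data changes these statistics. The McLeod-Li test is commonly described as the same kind of statistic applied to the squared series; that variant is not shown here.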

desc2.gif

Below the table that displays the descriptive functions of the time series, two bar charts display the autocorrelation function (ACF) and the partial autocorrelation function (PACF) at each lag. The 95% confidence intervals are also displayed. Looking at the autocorrelogram, we can identify a clear lag 1 autocorrelation, as well as a seasonality whose period seems to be 12 months.
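
The way an autocorrelogram is read can be sketched in a few lines of Python. The example assumes the simple large-sample white noise bound of ±1.96/√n for the 95% interval; XLSTAT may compute its intervals differently. The series is a synthetic stand-in with a trend and a 12-step cycle, not the airline data, and the function names are hypothetical.

```python
import math


def acf(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag (r_0 = 1 is omitted)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def significant_lags(x, max_lag, z=1.96):
    """Return the bound and the lags whose |r_k| exceeds it."""
    bound = z / math.sqrt(len(x))
    r = acf(x, max_lag)
    return bound, [k for k, rk in enumerate(r, start=1) if abs(rk) > bound]


# Synthetic series: linear trend plus a cycle of period 12 (assumption).
series = [0.5 * t + 10.0 * math.sin(2.0 * math.pi * t / 12.0)
          for t in range(144)]
bound, lags = significant_lags(series, 24)
print(round(bound, 4), lags)
```

With a strong trend, many lags exceed the bound and the autocorrelations decay slowly, which is the typical signature of a nonstationary series.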

desc3.gif

desc4.gif

Transformation of a time series

In order to improve the normality of the data, we want to perform two transformations:

  1. First, we want to stabilize the increasing variability of the series,
  2. Second, we want to remove the autocorrelations by differencing the series.

Setting up the transformation of a time series

This can be done using the Time series transformation tool. To activate the corresponding dialog box, select the XLSTAT / XLSTAT-Time / Transforming series command, or click on the corresponding button of the XLSTAT-Time toolbar (see below).

bartrans.gif

Once you've clicked on the button, the dialog appears.

Select the data on the Excel sheet. The "Time series" field corresponds to the series of interest, "Passengers".

desc5a.gif

In the options tab, select the Box-Cox option.

While we could ask for an optimized transformation (the lambda parameter of the Box-Cox transformation would be adjusted to maximize the likelihood of a regression model in which the transformed Y is a simple linear function of time), we decide here to fix the lambda value to 0, which corresponds to a log transformation of the series.
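
For reference, the usual one-parameter Box-Cox transformation can be sketched as follows. This is a minimal illustration of the standard formula, not XLSTAT's code; the function name `box_cox` is an assumption, and the optimization of lambda mentioned above is not attempted.

```python
import math


def box_cox(series, lam):
    """Apply the one-parameter Box-Cox transformation to positive values."""
    if any(v <= 0 for v in series):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The limit of (y**lam - 1) / lam as lam -> 0 is log(y).
        return [math.log(v) for v in series]
    return [(v ** lam - 1.0) / lam for v in series]


# Toy values for illustration only.
print(box_cox([1.0, 10.0, 100.0], 0))    # log transformation
print(box_cox([1.0, 4.0, 9.0], 0.5))     # square-root-like transformation
```

Setting lambda to 0 turns multiplicative growth into additive growth, which is why it tends to even out variability that increases with the level of the series.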

The log transformation is often a good choice for removing increasing variability. Then, in order to remove the trend and the seasonal component, we decide to use the differencing method. We set the d value to 1 to remove the trend, and D and s to 1 and 12 to remove the 12-month seasonal component.
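
The combined effect of these settings can be sketched in Python. The code below takes logs, then applies d ordinary differences and D seasonal differences of period s. The function names and the synthetic series are assumptions for illustration; this is not how XLSTAT is implemented, only the same arithmetic.

```python
import math


def difference(series, lag=1):
    """Return y[t] - y[t - lag]; the first `lag` values are lost."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def log_and_difference(series, d=1, D=1, s=12):
    """Log transform, then d ordinary and D seasonal (period s) differences."""
    out = [math.log(v) for v in series]
    for _ in range(d):
        out = difference(out, 1)
    for _ in range(D):
        out = difference(out, s)
    return out


# Synthetic monthly series (assumption): exponential growth times a fixed
# 12-month pattern, so that log + differencing should leave only zeros.
pattern = [0.0, -0.1, 0.1, 0.05, 0.1, 0.3, 0.5, 0.5, 0.2, 0.0, -0.2, 0.0]
toy = [math.exp(4.0 + 0.01 * t + pattern[t % 12]) for t in range(144)]
result = log_and_difference(toy)
print(len(result), max(abs(v) for v in result))
```

Each difference shortens the series, so with d=1, D=1 and s=12 the first 13 observations have no transformed value. On real data the result is not exactly zero; what remains is the part of the series that the trend and the seasonal pattern do not explain.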

desc5b.gif

The computations begin once you have clicked on OK.

Results of the transformation of a time series

We first see a table and a chart that correspond to the Box-Cox transformation. We can see the transformed series on the chart below. It looks like the log transformation has removed the increasing variability.

desc6.gif

Next, a table and a chart display the differencing transformation. We see that the differencing has removed the trend well, but it is not clear whether the result is white noise.

desc7.gif

Descriptive statistics on transformed time series

To verify whether the transformations have made the series look like white noise and follow a normal distribution, we need to perform a descriptive analysis on the transformed series.

desc8.gif

In the "missing data" tab, select the "remove the observations" option.

desc8a.gif

The Jarque-Bera test confirms that the series looks more like a normal sample (the p-value rose from 0.01 to 0.04). However, the white noise tests indicate that the transformations have not been efficient enough. The autocorrelogram indicates that we removed too much of the lag 1 and lag 12 components, as they now have negative autocorrelation coefficients. Furthermore, the lag 3 and lag 9 coefficients also seem to be significant. Therefore, it seems that further work is necessary to understand the underlying phenomenon.

desc9.gif

desc10.gif