Missing data

There are several ways to deal with missing data, including imputation or removal. Handle missing data in Excel using the XLSTAT add-on statistical software.

Missing data management

There are three types of missing values (Allison, 2001): data missing completely at random (MCAR), data missing at random (MAR) and data not missing at random (NMAR).

Data are missing completely at random (MCAR) if the event that causes a value to be missing is independent of both observable variables and unobservable parameters; the missingness occurs entirely at random. When data are MCAR, the analyses performed on the observed data are unbiased.

Data are missing at random (MAR) when the event that causes a value to be missing is related to a particular observed variable, but not to the value of the variable that has missing data. This is the most common case.

Data are not missing at random (NMAR) when values are missing for a particular reason, typically one related to the missing value itself. An example is filtered questions in a questionnaire: the question is intended only for some respondents, so the answers of the others are missing.
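The practical difference between the three mechanisms can be illustrated with a small simulation. The sketch below is only an illustration in Python, not part of XLSTAT: the variables (age, income), the missingness probabilities and the function name `simulate` are all hypothetical, and NMAR is modeled here as missingness that depends on the hidden value itself. It compares the mean of the complete data with the mean of the values that remain observed.

```python
import random
import statistics

def simulate(mechanism, n=20000, seed=0):
    """Return (complete_mean, observed_mean) of a made-up income variable."""
    rng = random.Random(seed)
    incomes, observed = [], []
    for _ in range(n):
        age = rng.uniform(20, 70)
        income = 1000 + 40 * age + rng.gauss(0, 300)  # income rises with age
        if mechanism == "MCAR":
            p_missing = 0.3                            # same chance for everyone
        elif mechanism == "MAR":
            p_missing = 0.6 if age > 45 else 0.1       # depends on observed age only
        elif mechanism == "NMAR":
            p_missing = 0.6 if income > 2800 else 0.1  # depends on the hidden value
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        incomes.append(income)
        if rng.random() >= p_missing:                  # value stays observed
            observed.append(income)
    return statistics.mean(incomes), statistics.mean(observed)

for mechanism in ("MCAR", "MAR", "NMAR"):
    complete_mean, observed_mean = simulate(mechanism)
    print(f"{mechanism}: complete={complete_mean:.0f} observed={observed_mean:.0f}")
```

In this toy setup the observed mean stays close to the complete mean under MCAR, while it is pulled downward under MAR and NMAR. Under MAR the shift can in principle be corrected using the observed age; under NMAR it cannot be corrected from the observed data alone.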

Handling missing data in Excel with XLSTAT

Most XLSTAT functions include options to handle missing data. However, only a few approaches are available in those options. The dedicated missing data tool allows you to complete or clean your dataset using more advanced missing value treatment methods.

The methods available in this tool correspond to the MCAR and MAR cases.

Different methods are available depending on your needs and data:

  • For quantitative data, XLSTAT allows you to:

    • Remove observations with missing values.

    • Use a mean imputation method.

    • Use a nearest neighbor approach.

    • Use the NIPALS algorithm.

    • Use an MCMC multiple imputation algorithm.

  • For qualitative data, XLSTAT allows you to:

    • Remove observations with missing values.

    • Use a mode imputation method.

    • Use a nearest neighbor approach.
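To make the simpler methods in the list concrete, here is a minimal sketch in Python of removal, mean imputation, mode imputation and a nearest neighbor approach. It does not reproduce XLSTAT's implementation: the function names, the use of `None` for an empty cell, the single-neighbor Euclidean rule and the sample table are assumptions chosen for illustration. NIPALS and MCMC multiple imputation are more involved and are not shown.

```python
import math
import statistics

MISSING = None  # assumed marker for an empty cell

def drop_incomplete(rows):
    """Removal: keep only observations without any missing value."""
    return [list(row) for row in rows if MISSING not in row]

def impute_column(rows, col, how):
    """Fill missing cells of one column with its mean (quantitative data)
    or its mode (qualitative data)."""
    present = [row[col] for row in rows if row[col] is not MISSING]
    fill = statistics.mean(present) if how == "mean" else statistics.mode(present)
    return [row[:col] + [fill] + row[col + 1:] if row[col] is MISSING else list(row)
            for row in rows]

def impute_nearest(rows, col, feature_cols):
    """Nearest neighbor: copy the value from the closest observation that has
    one, using Euclidean distance on complete, numeric feature columns."""
    donors = [row for row in rows if row[col] is not MISSING]
    result = []
    for row in rows:
        if row[col] is MISSING:
            point = [row[c] for c in feature_cols]
            nearest = min(donors,
                          key=lambda d: math.dist([d[c] for c in feature_cols], point))
            row = row[:col] + [nearest[col]] + row[col + 1:]
        result.append(list(row))
    return result

# Hypothetical table: columns are size, weight, color.
table = [
    [1.0, 10.0, "red"],
    [2.2, MISSING, "blue"],
    [3.0, 30.0, "red"],
    [4.0, 40.0, MISSING],
]

print(drop_incomplete(table))            # keeps the two complete rows
print(impute_column(table, 1, "mean"))   # weight filled with the column mean
print(impute_column(table, 2, "mode"))   # color filled with "red"
print(impute_nearest(table, 1, [0]))     # weight copied from the row with size 3.0
```

Removal is the simplest choice but discards information. Mean and mode imputation keep every observation but shrink the variability of the imputed column, while the nearest neighbor approach uses the other variables to pick a more plausible value.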