Kaplan-Meier analysis

Kaplan-Meier analysis is a widely used method for analyzing survival-time data. It is available in Excel using the XLSTAT statistical software.

What is Kaplan-Meier analysis

The Kaplan-Meier method, also called product-limit analysis, belongs to the descriptive methods of survival analysis, as does life table analysis. The life table analysis method was developed first, but the Kaplan-Meier method has been shown to be superior in many cases.

Kaplan-Meier analysis allows you to quickly obtain a population survival curve and essential statistics such as the median survival time. Its main result is the Kaplan-Meier table. The analysis is based on irregular time intervals, whose bounds are the observed event times, whereas life table analysis uses regular time intervals.

Use of Kaplan-Meier analysis

Kaplan-Meier analysis is used to analyze how a given population evolves with time. This technique is mostly applied to survival data and product quality data. There are three main reasons why an individual or product may leave the surveyed population: it dies (the product fails), it is healed (repaired), or its trace is lost (for example, the individual moves away or the study is terminated). Data of the first kind are usually called failure data, or event data, while data of the other two kinds are called censored data.

Kaplan-Meier analysis also allows you to compare populations through their survival curves. For example, it can be of interest to compare the survival times of two samples of the same product produced in two different locations. Tests can be performed to check whether the survival curves have arisen from identical survival functions. These results can later be used to model the survival curves and to predict probabilities of failure.
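The text above does not name the test used to compare survival curves. As an illustration only, the following sketch implements the two-sample log-rank test, one common choice for this purpose. The function name, the argument layout and the 1/0 coding of events versus censored observations are assumptions made for this example, not XLSTAT's interface.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test (illustrative sketch).

    times_*  : observed times for each group
    events_* : 1 if the event was observed at that time, 0 if censored
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    # Distinct times at which at least one event occurred in either group.
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Individuals still under observation just before time t.
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Events recorded exactly at time t.
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        # Expected events in group A if both groups share one survival function.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("not enough information to compare the two groups")
    statistic = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```

A small p-value suggests that the two samples, for example products from two production sites, do not share the same survival function.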

Censoring data for Kaplan-Meier analysis

Types of censoring

There are several types of censoring of survival data:

Left censoring: when an event is reported at time t=t_(i), we know that the event occurred at t ≤ t_(i).
Right censoring: when an event is reported at time t=t_(i), we know that the event occurred at t ≥ t_(i), if it ever occurred.
Interval censoring: when an event is reported at time t=t_(i), we know that the event occurred during [t_(i-1); t_(i)].
Exact censoring: when an event is reported at time t=t_(i), we know that the event occurred exactly at t=t_(i).

Independent censoring

The Kaplan-Meier method relies on two conditions. First, the observations must be independent. Second, the censoring must be independent: if you consider two random individuals in the study at time t-1, and one of them is censored at time t while the other survives, then both must have equal chances of surviving at time t. There are four different types of independent censoring:

Simple type I: all individuals are censored at the same time or equivalently individuals are followed during a fixed time interval.
Progressive type I: all individuals are censored at the same date (for example, when the study terminates).
Type II: the study is continued until n events have been recorded.
Random: the time when a censoring occurs is independent of the survival time.
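To make the first three schemes concrete, the sketch below applies each of them to a list of true lifetimes and returns (observed time, event flag) pairs, with flag 1 for an observed event and 0 for a censored observation. The function names and the data layout are hypothetical, chosen for this example only.

```python
def censor_simple_type_i(lifetimes, follow_up):
    """Simple type I: every individual is followed for the same duration."""
    return [(min(t, follow_up), 1 if t <= follow_up else 0) for t in lifetimes]


def censor_progressive_type_i(entry_dates, lifetimes, end_date):
    """Progressive type I: individuals enter at different dates and are all
    censored at the same calendar date (for example, the end of the study)."""
    observations = []
    for entry, t in zip(entry_dates, lifetimes):
        available = end_date - entry  # follow-up time available to this individual
        observations.append((min(t, available), 1 if t <= available else 0))
    return observations


def censor_type_ii(lifetimes, n_events):
    """Type II: the study stops as soon as the n-th event has been recorded.
    With tied lifetimes at the stopping time, more than n events may be kept."""
    if not 1 <= n_events <= len(lifetimes):
        raise ValueError("n_events must be between 1 and the sample size")
    stop = sorted(lifetimes)[n_events - 1]
    return [(min(t, stop), 1 if t <= stop else 0) for t in lifetimes]
```

Random censoring has no fixed rule to apply: it would be simulated by drawing a censoring time for each individual independently of its lifetime and keeping the smaller of the two.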

Results for the Kaplan-Meier analysis in XLSTAT

Kaplan-Meier table

This table displays the various results obtained from the analysis, including:

Interval start time: lower bound of the time interval.
At risk: number of individuals that were at risk.
Events: number of events recorded.
Censored: number of censored data recorded.
Proportion failed: proportion of individuals who "failed" (the event did occur).
Survival rate: proportion of individuals who "survived" (the event did not occur).
Survival distribution function (SDF): probability that an individual survives at least until the time of interest. Also called cumulative survival distribution function, or survival curve.
Survival distribution function standard error.
Survival distribution function confidence interval.
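The columns listed above can be reproduced with a short product-limit computation. The following is a minimal sketch, not XLSTAT's implementation: the function name and dictionary keys are hypothetical, events are assumed to be coded 1 and censored observations 0, and "proportion failed" is interpreted here as the share of at-risk individuals who fail at that time. The standard error uses Greenwood's formula.

```python
import math


def kaplan_meier_table(times, events):
    """Build one row per distinct observed time (illustrative sketch)."""
    rows = []
    survival = 1.0        # running product-limit estimate of the SDF
    greenwood_sum = 0.0   # running sum of d / (n * (n - d)) over event times
    at_risk = len(times)
    for t in sorted(set(times)):
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        c = sum(1 for x, e in zip(times, events) if x == t and not e)
        proportion_failed = d / at_risk
        if d > 0:
            survival *= 1.0 - proportion_failed
            if at_risk > d:
                greenwood_sum += d / (at_risk * (at_risk - d))
            # When everyone still at risk fails, the SDF is 0 and the
            # Greenwood term is skipped, so the standard error is 0.
        rows.append({
            "time": t,
            "at_risk": at_risk,
            "events": d,
            "censored": c,
            "proportion_failed": proportion_failed,
            "survival_rate": 1.0 - proportion_failed,
            "sdf": survival,
            "sdf_std_error": survival * math.sqrt(greenwood_sum),
        })
        # Individuals who failed or were censored at t leave the risk set.
        at_risk -= d + c
    return rows
```

For example, with times [1, 2, 2, 3, 4] and event flags [1, 1, 0, 1, 0], the SDF column reads 0.8, 0.6, 0.3 and 0.3: the estimate only drops at times where an event is recorded, while censored observations reduce the number at risk.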

Mean and Median residual lifetime

The mean and median residual lifetimes are computed and displayed in two tables.

The first table displays the mean residual lifetime, its standard error, and a confidence range.
The second table displays statistics (estimator and confidence range) for the three quartiles, including the median residual lifetime (50%). The median residual lifetime is one of the key results of the Kaplan-Meier analysis, as it allows you to evaluate the time remaining for half of the population to "fail".
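As an illustration of how a quartile can be read off a survival curve, the sketch below returns the first time at which the estimated SDF falls to 1 - p or below. This is one common convention; the exact rule applied by XLSTAT is not stated here, and the function name and the (time, SDF) input layout are assumptions.

```python
def survival_quantile(curve, p):
    """Return the smallest time at which the SDF is at most 1 - p.

    curve : list of (time, sdf) pairs sorted by increasing time
    p     : proportion of the population that has "failed" (0.5 for the median)
    Returns None when the curve never drops that low, which can happen
    when many observations are censored.
    """
    for t, s in curve:
        if s <= 1.0 - p:
            return t
    return None


# Hypothetical survival curve used only to show the call.
example_curve = [(1, 0.8), (2, 0.6), (3, 0.3), (4, 0.3)]
median_time = survival_quantile(example_curve, 0.5)
```

With this example curve, the median is reached at time 3, the first time at which no more than half of the population survives.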

Confidence interval for the Kaplan-Meier analysis function

Computing confidence intervals for the survival function can be done using three different methods:

Greenwood’s method
Exponential Greenwood’s method
Log-transformed method

These three approaches give similar results, but the latter two are preferred when samples are small.
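The sketch below shows the usual textbook forms of these three intervals. The mapping of names to formulas is an assumption: "Greenwood" is taken to be the plain interval on the SDF, "exponential Greenwood" the interval built on log(-log(SDF)), and "log-transformed" the interval built on log(SDF). The function name, the dictionary keys and the default z = 1.96 (roughly a 95% interval) are also choices made for this example.

```python
import math


def survival_confidence_intervals(survival, greenwood_sum, z=1.96):
    """Three confidence intervals for the SDF at one time point (sketch).

    survival      : estimated SDF at that time, strictly between 0 and 1
    greenwood_sum : sum of d / (n * (n - d)) over event times up to that time
    z             : normal quantile, 1.96 for an approximate 95% interval
    """
    if not 0.0 < survival < 1.0:
        raise ValueError("survival must be strictly between 0 and 1")
    root = math.sqrt(greenwood_sum)

    # Plain Greenwood interval: symmetric around the estimate, clipped to [0, 1].
    half_width = z * survival * root
    greenwood = (max(0.0, survival - half_width), min(1.0, survival + half_width))

    # Interval built on log(SDF); only the upper bound can exceed 1.
    log_transformed = (
        survival * math.exp(-z * root),
        min(1.0, survival * math.exp(z * root)),
    )

    # Interval built on log(-log(SDF)); it always stays inside (0, 1).
    theta = z * root / abs(math.log(survival))
    exponential_greenwood = (
        survival ** math.exp(theta),
        survival ** math.exp(-theta),
    )

    return {
        "greenwood": greenwood,
        "exponential_greenwood": exponential_greenwood,
        "log_transformed": log_transformed,
    }
```

The plain interval has to be clipped because it can fall outside [0, 1] when the sample is small, which is one reason the transformed intervals are often preferred in that situation.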

Charts for Kaplan-Meier analysis

XLSTAT offers the following charts:

Survival distribution function (SDF)
-Log(SDF), the negative logarithm of the survival distribution function.
Log(-Log(SDF)), the logarithm of the negative logarithm of the survival distribution function.

It is also possible to identify on the charts the times when censored data have been recorded.
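The two transformed charts plot simple functions of the SDF. The following sketch computes them for a list of (time, SDF) pairs, assuming that Log denotes the natural logarithm; the function name and the use of None for undefined values are choices made for this example.

```python
import math


def transform_sdf(curve):
    """Return (time, -log(SDF), log(-log(SDF))) for each point of the curve.

    A value is None where the transform is undefined:
    -log(SDF) when SDF = 0, and log(-log(SDF)) when SDF is 0 or 1.
    """
    rows = []
    for t, s in curve:
        neg_log = -math.log(s) if s > 0.0 else None
        if neg_log is not None and neg_log > 0.0:
            log_neg_log = math.log(neg_log)
        else:
            log_neg_log = None
        rows.append((t, neg_log, log_neg_log))
    return rows
```

The quantity -log(SDF) is an estimate of the cumulative hazard, so a roughly straight -Log(SDF) chart suggests a constant failure rate over time.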