Exercise 1: Life tables and Kaplan-Meier estimates of survival

(a) Hand calculation: Life table and Kaplan-Meier estimates of survival

Using hand calculation (i.e., using a spreadsheet program or pen, paper, and a calculator) estimate the cause-specific survivor function for the sample of 35 patients diagnosed with colon carcinoma (see the table below) using both the Kaplan-Meier method (up to at least 30 months) and the actuarial method (at least the first 5 annual intervals).

In the lectures we estimated the observed survivor function (i.e. all deaths were considered to be events) using the Kaplan-Meier and actuarial methods; your task is to estimate the cause-specific survivor function (only deaths due to colon carcinoma are considered events) using the same data. The next page includes some hints to help you get started.

##       sex age     stage mmdx yydx surv_mm surv_yy       status                subsite
## 1    Male  72 Localised    2 1989       2    0.01  Dead: other Descending and sigmoid
## 2  Female  82   Distant   12 1991       2    0.01 Dead: cancer Descending and sigmoid
## 3    Male  73   Distant   11 1993       3    0.01 Dead: cancer Descending and sigmoid
## 4    Male  63   Distant    6 1988       5    0.01 Dead: cancer             Transverse
## 5    Male  67 Localised    5 1989       7    0.01 Dead: cancer             Transverse
## 6    Male  74  Regional    7 1992       8    0.01 Dead: cancer   Coecum and ascending
## 7  Female  56   Distant    1 1986       9    0.01 Dead: cancer             Transverse
## 8  Female  52   Distant    5 1986      11    0.01 Dead: cancer   Coecum and ascending
## 9    Male  64 Localised   11 1994      13    1.00        Alive Descending and sigmoid
## 10 Female  70 Localised   10 1994      14    1.00        Alive Descending and sigmoid
## 11 Female  83 Localised    7 1990      19    1.00  Dead: other Descending and sigmoid
## 12   Male  64   Distant    8 1989      22    1.00 Dead: cancer Descending and sigmoid
## 13 Female  79 Localised   11 1993      25    2.00        Alive Descending and sigmoid
## 14 Female  70   Distant    6 1988      27    2.00 Dead: cancer   Coecum and ascending
## 15   Male  70  Regional    9 1993      27    2.00        Alive   Coecum and ascending
## 16 Female  68   Distant    9 1991      28    2.00 Dead: cancer Descending and sigmoid
## 17   Male  58 Localised   11 1990      32    2.00 Dead: cancer Descending and sigmoid
## 18   Male  54   Distant    4 1990      32    2.00 Dead: cancer   Coecum and ascending
## 19 Female  86 Localised    4 1993      32    2.00        Alive Descending and sigmoid
## 20   Male  31 Localised    1 1990      33    2.00 Dead: cancer   Coecum and ascending
## 21 Female  75 Localised    1 1993      35    2.00        Alive Descending and sigmoid
## 22 Female  85 Localised   11 1992      37    3.00        Alive   Coecum and ascending
## 23 Female  68   Distant    7 1986      43    3.00 Dead: cancer Descending and sigmoid
## 24   Male  54  Regional    6 1985      46    3.00 Dead: cancer             Transverse
## 25   Male  80 Localised    6 1991      54    4.00        Alive   Coecum and ascending
## 26 Female  52 Localised    7 1989      77    6.00        Alive             Transverse
## 27   Male  52 Localised    6 1989      78    6.00        Alive Descending and sigmoid
## 28   Male  65 Localised    1 1989      83    6.00        Alive Descending and sigmoid
## 29   Male  60 Localised   11 1988      85    7.00        Alive             Transverse
## 30 Female  71 Localised   11 1987      97    8.00        Alive Descending and sigmoid
## 31   Male  58 Localised    8 1987     100    8.00        Alive Descending and sigmoid
## 32 Female  80 Localised    5 1987     102    8.00 Dead: cancer Descending and sigmoid
## 33   Male  66 Localised    1 1986     103    8.00  Dead: other   Coecum and ascending
## 34   Male  67 Localised    3 1987     105    8.00        Alive   Coecum and ascending
## 35 Female  56   Distant   12 1986     108    9.00        Alive             Transverse

Actuarial approach

We suggest you start with the actuarial approach. Your task is to construct a life table with the following structure.

interval \([u,v)\) \(l\) \(d\) \(w\) \(l'\) \(p\) \(S(u)\) \(S(v)\)
[0-1) 35 1.000
[1-2)
[2-3)
[3-4)
[4-5)
[5-6)

We have already entered \(l\) (the number of people alive) at the start of the first interval. The next step is to add the number who experienced the event (\(d\)) and the number censored (\(w\)) during the interval. From \(l\), \(d\), and \(w\) you will then be able to calculate \(l'\) (the effective number at risk), followed by \(p\) (conditional probability of surviving the interval) and finally \(S(t)\), the cumulative probability of surviving from time zero until the end of the interval.

Kaplan-Meier approach

To estimate survival using the Kaplan-Meier approach you will find it easiest to add a line to the table at each and every time there is an event or censoring. We should use time in months. The first time at which there is an event or censoring is time equal to 2 months. The trick is what to do when there are both events and censorings at the same time.

time \(t\) # at risk \(d\) \(w\) \(p\) \(S(t)\)
2 35
3
5
7
8
9
11
\(\ldots\)

(b) Using R to validate the hand calculations done in part 1 (a)

We will now use R to reproduce the same analyses done by hand calculation in section 1.1 although you can do this part without having done the hand calculations, since this question also serves as an introduction to survival analysis using R. Our aim is to estimate the cause-specific survivor function for the sample of 35 patients diagnosed with colon carcinoma using both the Kaplan-Meier method and the actuarial method. In the lectures we estimated the all-cause survivor function (i.e. all deaths were considered to be events) using the Kaplan-Meier and actuarial methods whereas we will now estimate the cause-specific survivor function (only deaths due to colon carcinoma are considered events).

Life tables are available using the lifetab function from the KMsurv package on CRAN. We have written a small wrapper lifetab2 which allows for a Surv object and a dataset. The following command will give the actuarial estimates:

A listing of the Kaplan-Meier estimates and a graph are obtained as follows