Biostatistics III in R

Code by Annika Tillander, Andreas Karlsson and Mark Clements

Exercise 2. Comparing survival proportions and mortality rates by stage for cause-specific and all-cause survival

The purpose of this exercise is to study patient survival using two alternative measures: survival proportions and mortality rates. A second purpose is to study the difference between cause-specific and all-cause survival.

You may have to install the required packages the first time you use them. To install a package, run `install.packages("package_of_interest")` once for each package you require.

Load the dependencies and define two 0/1 indicator variables for the events that we are interested in: death due to cancer and death from any cause.
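A minimal sketch of this setup. It assumes the melanoma data are available as the `melanoma` data frame in the `biostat3` package and that vital status is stored in a factor `status` with the levels "Alive", "Dead: cancer", "Dead: other" and "Lost to follow-up". The indicator names `death_cancer` and `death_all` are illustrative choices.

```r
library(survival)  # Surv(), survfit()
library(biostat3)  # assumed source of the melanoma data and survRate()

# Assumption: `status` distinguishes cancer deaths from other deaths.
melanoma <- transform(
  biostat3::melanoma,
  death_cancer = as.integer(status == "Dead: cancer"),
  death_all    = as.integer(status %in% c("Dead: cancer", "Dead: other"))
)

# Cross-check the new indicators against the original coding
xtabs(~ status + death_cancer, data = melanoma)
xtabs(~ status + death_all, data = melanoma)
```

Cross-tabulating each indicator against `status` is a cheap way to confirm that the recoding did what was intended before any model is fitted.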

We then list the first few observations to get an idea about the data.

(a) Plot estimates of the survivor function and hazard function by stage

We now tabulate the distribution of the melanoma patients by cancer stage at diagnosis.
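One possible way to produce this tabulation, again assuming the `melanoma` data frame from `biostat3` with a factor column `stage`:

```r
library(biostat3)  # assumed source of the melanoma data

# Counts per stage, with the corresponding proportions alongside
freq <- xtabs(~ stage, data = biostat3::melanoma)
stage_table <- cbind(Freq = freq, Prop = prop.table(freq))
round(stage_table, 3)
```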

We then plot the survivor and hazard functions by stage. Does it appear that stage is associated with patient survival?
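A sketch of the survivor-function part only, using Kaplan-Meier estimates from the `survival` package. It assumes `surv_mm` holds follow-up time in months (hence the division by 12) and that `death_cancer` is defined as a 0/1 indicator for death due to cancer. The hazard curves need a smoothing method and are not shown here.

```r
library(survival)
library(biostat3)  # assumed source of the melanoma data

melanoma <- transform(biostat3::melanoma,
                      death_cancer = as.integer(status == "Dead: cancer"))

# Kaplan-Meier estimate of cause-specific survival, one curve per stage
mfit <- survfit(Surv(surv_mm / 12, death_cancer) ~ stage, data = melanoma)

plot(mfit, col = 1:4,
     xlab = "Years since diagnosis",
     ylab = "Cause-specific survival")
legend("bottomleft", legend = names(mfit$strata), col = 1:4, lty = 1, bty = "n")
```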

We can also look at the confidence intervals for each hazard curve.

(b) Estimate the mortality rates for each stage using, for example, the `survRate` command

What are the units of the estimated rates? The `survRate` function, as the name suggests, is used to estimate rates. Look at the help pages if you are not familiar with the function (e.g. `?survRate` or `help(survRate)`).
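A hedged sketch of the call, assuming `survRate` comes from the `biostat3` package and that `surv_mm` is follow-up time in months. The unit of the rate is the unit of the time variable passed to `Surv()`, so under that assumption the rates here are per person-month. The hand calculation underneath (events divided by person-time) is included only to make that explicit.

```r
library(survival)
library(biostat3)  # assumed source of the melanoma data and survRate()

melanoma <- transform(biostat3::melanoma,
                      death_cancer = as.integer(status == "Dead: cancer"))

# Rates by stage; time is in months, so these are per person-month
rates_month <- survRate(Surv(surv_mm, death_cancer) ~ stage, data = melanoma)
rates_month

# The same point estimates by hand: events / person-time
events  <- tapply(melanoma$death_cancer, melanoma$stage, sum)
pmonths <- tapply(melanoma$surv_mm, melanoma$stage, sum)
events / pmonths
```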

(c)

If you haven’t already done so, estimate the mortality rates for each stage per person-year and per 1000 person-years of follow-up.
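Changing the unit only requires rescaling the time variable. The sketch below assumes `surv_mm` is in months: dividing by 12 gives person-years, and dividing by a further 1000 gives rates per 1000 person-years.

```r
library(survival)
library(biostat3)  # assumed source of the melanoma data and survRate()

melanoma <- transform(biostat3::melanoma,
                      death_cancer = as.integer(status == "Dead: cancer"))

# Per person-year: months / 12
rates_py <- survRate(Surv(surv_mm / 12, death_cancer) ~ stage, data = melanoma)

# Per 1000 person-years: months / 12 / 1000
rates_kpy <- survRate(Surv(surv_mm / 12 / 1000, death_cancer) ~ stage, data = melanoma)

rates_py
rates_kpy
```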

(d)

Study whether survival is different for males and females (both by plotting the survivor function and by tabulating mortality rates). Is there a difference in survival between males and females? If yes, is the difference present throughout the follow-up?
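Both parts can be sketched by swapping `stage` for `sex` in the earlier calls. This assumes a column `sex` in the `melanoma` data frame from `biostat3`; the rates are scaled to per 1000 person-years.

```r
library(survival)
library(biostat3)  # assumed source of the melanoma data and survRate()

melanoma <- transform(biostat3::melanoma,
                      death_cancer = as.integer(status == "Dead: cancer"))

# Kaplan-Meier curves of cause-specific survival by sex
sfit <- survfit(Surv(surv_mm / 12, death_cancer) ~ sex, data = melanoma)
plot(sfit, col = 1:2,
     xlab = "Years since diagnosis",
     ylab = "Cause-specific survival")
legend("bottomleft", legend = names(sfit$strata), col = 1:2, lty = 1, bty = "n")

# Cancer-specific mortality rates per 1000 person-years by sex
survRate(Surv(surv_mm / 12 / 1000, death_cancer) ~ sex, data = melanoma)
```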

(e)

The plots you made above were based on cause-specific survival (i.e., only deaths due to cancer are counted as events, deaths due to other causes are censored). In the next part of this question we will estimate all-cause survival (i.e., any death is counted as an event). First, however, study the coding of vital status and tabulate vital status by age group.
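A possible tabulation, assuming the columns `status` and `agegrp` in the `melanoma` data frame from `biostat3`. Column proportions show the distribution of vital status within each age group.

```r
library(biostat3)  # assumed source of the melanoma data

# Counts of vital status by age group
tab <- xtabs(~ status + agegrp, data = biostat3::melanoma)
tab

# Distribution of vital status within each age group (columns sum to 1)
status_by_age <- prop.table(tab, margin = 2)
round(status_by_age, 3)
```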

How many patients die of each cause? Does the distribution of cause of death depend on age?

(f)

To get all-cause survival, specify all deaths (both cancer and other) as events.

Now plot the survivor proportion for all-cause survival by stage. Give the plot a title so that it can be told apart from the cause-specific plot. Is the survivor proportion different from the cause-specific survival you estimated above? Why?
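A sketch of the all-cause analysis. The only change from the cause-specific version is the event indicator, here called `death_all` (an illustrative name), which counts deaths from cancer and from other causes. The `status` level names are assumed.

```r
library(survival)
library(biostat3)  # assumed source of the melanoma data

# All deaths count as events; everyone else is censored
melanoma <- transform(
  biostat3::melanoma,
  death_all = as.integer(status %in% c("Dead: cancer", "Dead: other"))
)

afit <- survfit(Surv(surv_mm / 12, death_all) ~ stage, data = melanoma)
plot(afit, col = 1:4,
     xlab = "Years since diagnosis",
     ylab = "All-cause survival",
     main = "All-cause survival by stage")
legend("topright", legend = names(afit$strata), col = 1:4, lty = 1, bty = "n")
```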

(g)

It is more common to die from a cause other than cancer in older ages. How does this impact the survivor proportion for different stages? Compare cause-specific and all-cause survival by plotting the survivor proportion by stage for the oldest age group (75+ years) for both cause-specific and all-cause survival.
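One way to set up the comparison is to restrict the data to the oldest age group and draw the two sets of curves side by side. The sketch assumes that `agegrp` has a level "75+" and uses the illustrative indicators `death_cancer` and `death_all`.

```r
library(survival)
library(biostat3)  # assumed source of the melanoma data

melanoma <- transform(
  biostat3::melanoma,
  death_cancer = as.integer(status == "Dead: cancer"),
  death_all    = as.integer(status %in% c("Dead: cancer", "Dead: other"))
)
old <- subset(melanoma, agegrp == "75+")  # oldest age group only

op <- par(mfrow = c(1, 2))  # two panels side by side
plot(survfit(Surv(surv_mm / 12, death_cancer) ~ stage, data = old), col = 1:4,
     xlab = "Years since diagnosis", ylab = "Survival",
     main = "Cancer-specific, 75+")
plot(survfit(Surv(surv_mm / 12, death_all) ~ stage, data = old), col = 1:4,
     xlab = "Years since diagnosis", ylab = "Survival",
     main = "All-cause, 75+")
legend("topright", legend = levels(old$stage), col = 1:4, lty = 1, bty = "n")
par(op)  # restore the previous plot layout
```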

(h)

Now estimate both cancer-specific and all-cause survival for each age group.
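The same two-panel layout can be reused with `agegrp` as the grouping variable. As before, this is a sketch that assumes the `melanoma` data frame from `biostat3` and the illustrative indicators `death_cancer` and `death_all`.

```r
library(survival)
library(biostat3)  # assumed source of the melanoma data

melanoma <- transform(
  biostat3::melanoma,
  death_cancer = as.integer(status == "Dead: cancer"),
  death_all    = as.integer(status %in% c("Dead: cancer", "Dead: other"))
)

fit_cancer <- survfit(Surv(surv_mm / 12, death_cancer) ~ agegrp, data = melanoma)
fit_all    <- survfit(Surv(surv_mm / 12, death_all) ~ agegrp, data = melanoma)

op <- par(mfrow = c(1, 2))  # two panels side by side
plot(fit_cancer, col = 1:4,
     xlab = "Years since diagnosis", ylab = "Survival",
     main = "Cancer-specific")
plot(fit_all, col = 1:4,
     xlab = "Years since diagnosis", ylab = "Survival",
     main = "All-cause")
legend("topright", legend = names(fit_all$strata), col = 1:4, lty = 1, bty = "n")
par(op)  # restore the previous plot layout
```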
