Exercise 9. Localised melanoma: modelling cause-specific mortality using Cox regression

In exercise 7 we modelled the cause-specific mortality of patients diagnosed with localised melanoma using Poisson regression. We will now model cause-specific mortality using Cox regression and compare the results to those we obtained using the Poisson regression model.

To fit a Cox proportional hazards model (for cause-specific survival) with calendar period as the only explanatory variable, the following commands can be used. Note that we are censoring all survival times at 120 months (10 years) in order to facilitate comparisons with the Poisson regression model in exercise 7.


You may have to install the required packages the first time you use them. You can install a package by install.packages("package_of_interest") for each package you require.

(a)

## Call:
## coxph(formula = Surv(surv_mm, death_cancer) ~ year8594, data = melanoma.l2)
## 
##   n= 5318, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## year8594Diagnosed 85-94 -0.25297   0.77649  0.06579 -3.845 0.000121 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## year8594Diagnosed 85-94    0.7765      1.288    0.6825    0.8834
## 
## Concordance= 0.533  (se = 0.008 )
## Likelihood ratio test= 14.83  on 1 df,   p=1e-04
## Wald test            = 14.78  on 1 df,   p=1e-04
## Score (logrank) test = 14.86  on 1 df,   p=1e-04

Patients diagnosed during 1985–94 experience only 77.7% of the cancer mortality experienced by those diagnosed 1975–84. That is, mortality due to skin melanoma has decreased by 22.3% in the latter period compared to the earlier period. This estimate is not adjusted for potential confounders. There is strong evidence of a statistically significant difference in survival between the two periods (based on the test statistic or the fact that the CI for the hazard ratio does not contain 1).

The regression equation is

\[\begin{align*} h(t|\text{year8594}) &= h_0(t) \exp(\beta_1 I(\text{year8594}="\text{Diagnosed 85-94}")) \end{align*}\] where \(h(t|\text{year8594})\) is the hazard at time \(t\) given covariate \(\text{year8594}\), with baseline hazard \(h_0(t)\) and regression coefficient \(\beta_1\) for the log hazard ratio for the calendar period 1985–1994 compared with the reference calendar period 1974–1984.

(b)

## Call:
## survdiff(formula = Surv(surv_mm, death_cancer) ~ year8594, data = melanoma.l)
## 
##                             N Observed Expected (O-E)^2/E (O-E)^2/V
## year8594=Diagnosed 75-84 2145      572      512      7.03      15.5
## year8594=Diagnosed 85-94 3173      441      501      7.18      15.5
## 
##  Chisq= 15.5  on 1 degrees of freedom, p= 8e-05
## Call:
## coxph(formula = Surv(surv_mm, death_cancer) ~ year8594, data = melanoma.l)
## 
##   n= 5318, number of events= 1013 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## year8594Diagnosed 85-94 -0.25790   0.77267  0.06565 -3.928 8.56e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## year8594Diagnosed 85-94    0.7727      1.294    0.6794    0.8788
## 
## Concordance= 0.533  (se = 0.008 )
## Likelihood ratio test= 15.49  on 1 df,   p=8e-05
## Wald test            = 15.43  on 1 df,   p=9e-05
## Score (logrank) test = 15.51  on 1 df,   p=8e-05

The three test statistics are

    log-rank

    :   14.85 

    Wald

    :   $-3.84^2=14.75$ (from the $z$ test above)

    Likelihood ratio

    :   14.78 (from the output above)

The three test statistics are very similar. We would expect each of these test statistics to be similar since they each test the same null hypothesis that survival is independent of calendar period. The null hypothesis in each case is that survival depends on calendar period in such a way that the hazard ratio between the two periods is constant over follow-up time (i.e. proportional hazards).

(c)

## Call:
## coxph(formula = Surv(surv_mm, death_cancer) ~ year8594 + sex + 
##     agegrp, data = melanoma.l2)
## 
##   n= 5318, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## year8594Diagnosed 85-94 -0.33339   0.71649  0.06618 -5.037 4.72e-07 ***
## sexFemale               -0.53061   0.58825  0.06545 -8.107 5.19e-16 ***
## agegrp45-59              0.28283   1.32688  0.09417  3.003  0.00267 ** 
## agegrp60-74              0.62006   1.85904  0.09088  6.823 8.90e-12 ***
## agegrp75+                1.21801   3.38045  0.10443 11.663  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## year8594Diagnosed 85-94    0.7165     1.3957    0.6293    0.8157
## sexFemale                  0.5882     1.7000    0.5174    0.6688
## agegrp45-59                1.3269     0.7536    1.1032    1.5959
## agegrp60-74                1.8590     0.5379    1.5557    2.2215
## agegrp75+                  3.3804     0.2958    2.7547    4.1483
## 
## Concordance= 0.646  (se = 0.009 )
## Likelihood ratio test= 212.7  on 5 df,   p=<2e-16
## Wald test            = 217.9  on 5 df,   p=<2e-16
## Score (logrank) test = 226.8  on 5 df,   p=<2e-16
## Linear hypothesis test
## 
## Hypothesis:
## agegrp45 - 59 = 0
## agegrp60 - 74 = 0
## agegrp75  +  = 0
## 
## Model 1: restricted model
## Model 2: Surv(surv_mm, death_cancer) ~ year8594 + sex + agegrp
## 
##   Res.Df Df  Chisq Pr(>Chisq)    
## 1   5316                         
## 2   5313  3 154.38  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The regression equation is

\[\begin{align*} h(t|\text{year8594},\text{sex},\text{agegrp}) &= h_0(t) \exp(\beta_1 I(\text{year8594}="\text{Diagnosed 85-94}")+\beta_2 I(\text{sex}="\text{Female}")+\beta_3 I(\text{agegrp}="\text{45-59}")+\beta_4 I(\text{agegrp}="\text{60-74}")+\beta_5 I(\text{agegrp}="\text{75+}")) \end{align*}\] where \(h(t|\text{year8594},\text{sex},\text{agegrp})\) is the hazard at time \(t\) given covariates \(\text{year8594}\), \(\text{sex}\) and \(\text{agegrp}\), with baseline hazard \(h_0(t)\) and regression coefficients representing log hazard ratios for \(\beta_1\) for the calendar period 1985–1994, \(\beta_2\) for females, \(\beta_3\) for those aged 45–59 years at diagnosis, \(\beta_4\) for those aged 60–74 years and \(\beta_5\) for those aged 75 years and over.

  1. For patients of the same sex diagnosed in the same calendar period, those aged 60–74 at diagnosis have an estimated 86% higher risk of death due to skin melanoma than those aged 0–44 at diagnosis. The difference is statistically significant.

    If this were an exam question the previous paragraph would be awarded full marks. It is worth noting, however, that the analysis is adjusted for the fact that mortality may depend on time since diagnosis (since this is the underlying time scale) and the mortality ratio between the two age groups is assumed to be the same at each point during the follow-up (i.e., proportional hazard).

  2. The parameter estimate for period changes from 0.78 to 0.72 when age and sex are added to the model. Whether this is ‘strong confounding’, or even ‘confounding’, is a matter of judgement. I would consider this confounding but not strong confounding but there is no correct answer to this question.

  3. Age (modelled as a categorical variable with 4 levels) is highly significant in the model.

(d)

## Call:
## coxph(formula = Surv(surv_mm, death_cancer) ~ year8594 + sex, 
##     data = melanoma.l2)
## 
##   n= 5318, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## year8594Diagnosed 85-94 -0.27976   0.75596  0.06588 -4.246 2.17e-05 ***
## sexFemale               -0.47825   0.61987  0.06494 -7.364 1.78e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## year8594Diagnosed 85-94    0.7560      1.323    0.6644    0.8602
## sexFemale                  0.6199      1.613    0.5458    0.7040
## 
## Concordance= 0.578  (se = 0.009 )
## Likelihood ratio test= 69.32  on 2 df,   p=9e-16
## Wald test            = 69.03  on 2 df,   p=1e-15
## Score (logrank) test = 70.02  on 2 df,   p=6e-16
## Analysis of Deviance Table
##  Cox model: response is  Surv(surv_mm, death_cancer)
##  Model 1: ~ year8594 + sex + agegrp
##  Model 2: ~ year8594 + sex
##    loglik  Chisq Df Pr(>|Chi|)    
## 1 -7792.7                         
## 2 -7864.4 143.37  3  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Analysis of Deviance Table
##  Cox model: response is Surv(surv_mm, death_cancer)
## Terms added sequentially (first to last)
## 
##           loglik   Chisq Df Pr(>|Chi|)    
## NULL     -7899.0                          
## year8594 -7891.6  14.829  1  0.0001177 ***
## sex      -7864.4  54.495  1  1.559e-13 ***
## agegrp   -7792.7 143.365  3  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Age (modelled as a categorical variable with 4 levels) is highly significant in the model. The Wald test is an approximation to the LR test and we would expect the two to be similar (which they are).

(e)

## Call:
## coxph(formula = Surv(surv_mm, death_cancer) ~ year8594 + sex + 
##     agegrp, data = melanoma.l2)
## 
##   n= 5318, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## year8594Diagnosed 85-94 -0.33339   0.71649  0.06618 -5.037 4.72e-07 ***
## sexFemale               -0.53061   0.58825  0.06545 -8.107 5.19e-16 ***
## agegrp45-59              0.28283   1.32688  0.09417  3.003  0.00267 ** 
## agegrp60-74              0.62006   1.85904  0.09088  6.823 8.90e-12 ***
## agegrp75+                1.21801   3.38045  0.10443 11.663  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## year8594Diagnosed 85-94    0.7165     1.3957    0.6293    0.8157
## sexFemale                  0.5882     1.7000    0.5174    0.6688
## agegrp45-59                1.3269     0.7536    1.1032    1.5959
## agegrp60-74                1.8590     0.5379    1.5557    2.2215
## agegrp75+                  3.3804     0.2958    2.7547    4.1483
## 
## Concordance= 0.646  (se = 0.009 )
## Likelihood ratio test= 212.7  on 5 df,   p=<2e-16
## Wald test            = 217.9  on 5 df,   p=<2e-16
## Score (logrank) test = 226.8  on 5 df,   p=<2e-16
## 
## Call:
## glm(formula = death_cancer ~ fu + year8594 + sex + agegrp + offset(log(pt)), 
##     family = poisson, data = melanoma.spl)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.5646  -0.2672  -0.2098  -0.1597   3.6806  
## 
## Coefficients:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -6.85172    0.14322 -47.841  < 2e-16 ***
## fu12                     1.26827    0.13592   9.331  < 2e-16 ***
## fu24                     1.30657    0.13806   9.464  < 2e-16 ***
## fu36                     1.07575    0.14627   7.354 1.92e-13 ***
## fu48                     0.89517    0.15559   5.753 8.75e-09 ***
## fu60                     0.81370    0.16368   4.971 6.65e-07 ***
## fu72                     0.58637    0.17957   3.265  0.00109 ** 
## fu84                     0.25361    0.20758   1.222  0.22181    
## fu96                     0.36427    0.21006   1.734  0.08290 .  
## fu108                   -0.22796    0.27844  -0.819  0.41296    
## year8594Diagnosed 85-94 -0.32516    0.06618  -4.913 8.97e-07 ***
## sexFemale               -0.53180    0.06545  -8.125 4.48e-16 ***
## agegrp45-59              0.28352    0.09417   3.011  0.00261 ** 
## agegrp60-74              0.62185    0.09088   6.843 7.76e-12 ***
## agegrp75+                1.22386    0.10444  11.718  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 8651.5  on 34308  degrees of freedom
## Residual deviance: 8233.4  on 34294  degrees of freedom
## AIC: 10183
## 
## Number of Fisher Scoring iterations: 7
##                          exp(beta)        2.5 %      97.5 %
## (Intercept)             0.00105764 0.0007987845 0.001400381
## fu12                    3.55468470 2.7233409232 4.639809589
## fu24                    3.69349752 2.8178702496 4.841217922
## fu36                    2.93219656 2.2013365582 3.905707478
## fu48                    2.44775331 1.8043764564 3.320535593
## fu60                    2.25623262 1.6370304171 3.109646344
## fu72                    1.79745329 1.2641700059 2.555699257
## fu84                    1.28866663 0.8579195554 1.935684625
## fu96                    1.43945962 0.9536609622 2.172726024
## fu108                   0.79615726 0.4613049163 1.374072448
## year8594Diagnosed 85-94 0.72241051 0.6345232662 0.822470947
## sexFemale               0.58754651 0.5168075778 0.667967959
## agegrp45-59             1.32779475 1.1040048878 1.596948453
## agegrp60-74             1.86237635 1.5585268021 2.225464233
## agegrp75+               3.40028687 2.7708463710 4.172714492
## Call:
## coxph(formula = Surv(surv_mm, death_cancer) ~ year8594 + sex + 
##     agegrp, data = melanoma.l2)
## 
##   n= 5318, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## year8594Diagnosed 85-94 -0.33339   0.71649  0.06618 -5.037 4.72e-07 ***
## sexFemale               -0.53061   0.58825  0.06545 -8.107 5.19e-16 ***
## agegrp45-59              0.28283   1.32688  0.09417  3.003  0.00267 ** 
## agegrp60-74              0.62006   1.85904  0.09088  6.823 8.90e-12 ***
## agegrp75+                1.21801   3.38045  0.10443 11.663  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## year8594Diagnosed 85-94    0.7165     1.3957    0.6293    0.8157
## sexFemale                  0.5882     1.7000    0.5174    0.6688
## agegrp45-59                1.3269     0.7536    1.1032    1.5959
## agegrp60-74                1.8590     0.5379    1.5557    2.2215
## agegrp75+                  3.3804     0.2958    2.7547    4.1483
## 
## Concordance= 0.646  (se = 0.009 )
## Likelihood ratio test= 212.7  on 5 df,   p=<2e-16
## Wald test            = 217.9  on 5 df,   p=<2e-16
## Score (logrank) test = 226.8  on 5 df,   p=<2e-16
  1. Both models adjust for the same factors. When fitting the Poisson regression model we split time since diagnosis into annual intervals and explicitly estimated the effect of this factor in the model. The Cox model does not estimate the effect of ‘time’ but the other estimates are adjusted for ‘time’.

  2. Since the two models are conceptually similar we would expect the parameter estimates to be similar, which they are.

  3. Yes, both models assume ‘proportional hazards’. The proportional hazards assumption implies that the risk ratios for sex, period, and age are constant across all levels of follow-up time. In other words, the assumption is that there is no effect modification by follow-up time. This assumption is implicit in Poisson regression (as it is in logistic regression) where it is assumed that estimated risk ratios are constant across all combination of the other covariates. We can, of course, relax this assumption by fitting interaction terms.

(f)

## List of 16
##  $ n        : int 5318
##  $ time     : num [1:121] 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5 ...
##  $ n.risk   : num [1:121] 5318 5315 5304 5298 5292 ...
##  $ n.event  : num [1:121] 0 5 1 4 4 7 5 8 7 11 ...
##  $ n.censor : num [1:121] 3 6 5 2 11 9 6 7 9 6 ...
##  $ surv     : num [1:121] 1 0.999 0.999 0.998 0.997 ...
##  $ std.err  : num [1:121] 0 0.000421 0.000461 0.000596 0.000706 ...
##  $ cumhaz   : num [1:121] 0 0.000941 0.001129 0.001884 0.00264 ...
##  $ std.chaz : num [1:121] 0 0.000421 0.000461 0.000596 0.000706 ...
##  $ type     : chr "right"
##  $ logse    : logi TRUE
##  $ conf.int : num 0.95
##  $ conf.type: chr "log"
##  $ lower    : num [1:121] 1 0.998 0.998 0.997 0.996 ...
##  $ upper    : num [1:121] 1 1 1 0.999 0.999 ...
##  $ call     : language survfit(formula = Surv(surv_mm, event = death_cancer) ~ 1, data = melanoma.l2)
##  - attr(*, "class")= chr "survfit"
## [1] 0.5 1.5 2.5 3.5 4.5 5.5
## `summarise()` has grouped output by 'fu', 'year8594', 'sex'. You can override using the `.groups`
## argument.
##                         exp(beta)     2.5 %    97.5 %
## year8594Diagnosed 85-94 0.7168836 0.6296723 0.8161739
## sexFemale               0.5888144 0.5179256 0.6694059
## agegrp45-59             1.3263971 1.1028413 1.5952697
## agegrp60-74             1.8573227 1.5542946 2.2194298
## agegrp75+               3.3726522 2.7483711 4.1387363
## Call:
## coxph(formula = Surv(surv_mm, death_cancer) ~ year8594 + sex + 
##     agegrp, data = melanoma.l2)
## 
##   n= 5318, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## year8594Diagnosed 85-94 -0.33339   0.71649  0.06618 -5.037 4.72e-07 ***
## sexFemale               -0.53061   0.58825  0.06545 -8.107 5.19e-16 ***
## agegrp45-59              0.28283   1.32688  0.09417  3.003  0.00267 ** 
## agegrp60-74              0.62006   1.85904  0.09088  6.823 8.90e-12 ***
## agegrp75+                1.21801   3.38045  0.10443 11.663  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## year8594Diagnosed 85-94    0.7165     1.3957    0.6293    0.8157
## sexFemale                  0.5882     1.7000    0.5174    0.6688
## agegrp45-59                1.3269     0.7536    1.1032    1.5959
## agegrp60-74                1.8590     0.5379    1.5557    2.2215
## agegrp75+                  3.3804     0.2958    2.7547    4.1483
## 
## Concordance= 0.646  (se = 0.009 )
## Likelihood ratio test= 212.7  on 5 df,   p=<2e-16
## Wald test            = 217.9  on 5 df,   p=<2e-16
## Score (logrank) test = 226.8  on 5 df,   p=<2e-16

(g)

## `summarise()` has grouped output by 'mid', 'year8594', 'sex'. You can override using the `.groups`
## argument.
## 
## Call:
## glm(formula = death_cancer ~ ns(mid, df = 3) + year8594 + sex + 
##     agegrp + offset(log(pt)), family = poisson, data = melanoma.spl4)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.6814  -0.9307  -0.3826   0.5299   3.2291  
## 
## Coefficients:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -6.57844    0.12618 -52.134  < 2e-16 ***
## ns(mid, df = 3)1        -0.81303    0.16775  -4.847 1.26e-06 ***
## ns(mid, df = 3)2         1.24995    0.27265   4.584 4.55e-06 ***
## ns(mid, df = 3)3        -0.99353    0.19434  -5.112 3.18e-07 ***
## year8594Diagnosed 85-94 -0.32337    0.06622  -4.884 1.04e-06 ***
## sexFemale               -0.53157    0.06545  -8.121 4.61e-16 ***
## agegrp45-59              0.28455    0.09417   3.022  0.00251 ** 
## agegrp60-74              0.62317    0.09088   6.857 7.01e-12 ***
## agegrp75+                1.22532    0.10445  11.731  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 1101.87  on 639  degrees of freedom
## Residual deviance:  733.79  on 631  degrees of freedom
## AIC: 1781
## 
## Number of Fisher Scoring iterations: 5
##                           exp(beta)       2.5 %      97.5 %
## (Intercept)             0.001390012 0.001085452 0.001780027
## ns(mid, df = 3)1        0.443510637 0.319238413 0.616159200
## ns(mid, df = 3)2        3.490164898 2.045326838 5.955650114
## ns(mid, df = 3)3        0.370266633 0.252983714 0.541921761
## year8594Diagnosed 85-94 0.723704143 0.635618981 0.823996297
## sexFemale               0.587678725 0.516922269 0.668120343
## agegrp45-59             1.329162191 1.105140512 1.598595030
## agegrp60-74             1.864821637 1.560572459 2.228387231
## agegrp75+               3.405250925 2.774846360 4.178874200
## lattice theme set by effectsTheme()
## See ?effectsTheme for details.