Exercise 13. Modelling the diet data using Cox regression


Now fit a model to the localised melanoma data where the outcome is observed survival (i.e. all deaths are considered to be events).

You may have to install the required packages the first time you use them. You can install a package by install.packages("package_of_interest") for each package you require.

Load the melanoma data and explore it.

(a)

## 
## Call:
## glm(formula = chd ~ hieng + offset(log(y)), family = poisson, 
##     data = diet)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.7382  -0.6337  -0.4899  -0.3891   3.0161  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -4.2980     0.1890 -22.744   <2e-16 ***
## hienghigh    -0.6532     0.3021  -2.162   0.0306 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 262.82  on 336  degrees of freedom
## Residual deviance: 258.00  on 335  degrees of freedom
## AIC: 354
## 
## Number of Fisher Scoring iterations: 6
##              exp(beta)       2.5 %     97.5 %
## (Intercept) 0.01359599 0.009387713 0.01969074
## hienghigh   0.52035993 0.287843219 0.94070120
## Call:
## coxph(formula = Surv(y, chd) ~ hieng, data = diet)
## 
##   n= 337, number of events= 46 
## 
##              coef exp(coef) se(coef)      z Pr(>|z|)  
## hienghigh -0.6475    0.5234   0.3022 -2.143   0.0321 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##           exp(coef) exp(-coef) lower .95 upper .95
## hienghigh    0.5234      1.911    0.2895    0.9462
## 
## Concordance= 0.581  (se = 0.037 )
## Likelihood ratio test= 4.73  on 1 df,   p=0.03
## Wald test            = 4.59  on 1 df,   p=0.03
## Score (logrank) test = 4.75  on 1 df,   p=0.03

These two models are conceptually different since the Cox model adjusts for ‘time’ even though this is not explicit in the coxph function. In this example, ‘time’ refers to ‘time on study’ (time since entry) which we do not expect to be a strong confounder. That is, we would expect the estimates of the effect of high energy to be similar for the two models, which they are.

(b)

If we use a different timescale then this amounts to adjusting for a different factor. As such, we would not expect the estimates to be identical. Attained age, unlike time since entry, is expected to be a confounder but we see that it is not a strong confounder.

## Call:
## coxph(formula = Surv((doe - dob)/scale, (dox - dob)/scale, chd) ~ 
##     hieng, data = diet)
## 
##   n= 337, number of events= 46 
## 
##              coef exp(coef) se(coef)      z Pr(>|z|)  
## hienghigh -0.6114    0.5426   0.3028 -2.019   0.0435 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##           exp(coef) exp(-coef) lower .95 upper .95
## hienghigh    0.5426      1.843    0.2997    0.9823
## 
## Concordance= 0.584  (se = 0.037 )
## Likelihood ratio test= 4.2  on 1 df,   p=0.04
## Wald test            = 4.08  on 1 df,   p=0.04
## Score (logrank) test = 4.2  on 1 df,   p=0.04