Exercise 13. Modelling the diet data using Cox regression
Now fit a model to the localised melanoma data where the outcome is observed survival (i.e. all deaths are considered to be events).
You may have to install the required packages the first time you use
them. You can install a package by
install.packages("package_of_interest")
for each package
you require.
Load the melanoma data and explore it.
(a)
## y is the observed time
## so it already measures time since entry
poisson13a <- glm(chd ~ hieng + offset(log(y)), family=poisson, data=diet)
summary(poisson13a)
##
## Call:
## glm(formula = chd ~ hieng + offset(log(y)), family = poisson,
## data = diet)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.2980 0.1890 -22.744 <2e-16 ***
## hienghigh -0.6532 0.3021 -2.162 0.0306 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 262.82 on 336 degrees of freedom
## Residual deviance: 258.00 on 335 degrees of freedom
## AIC: 354
##
## Number of Fisher Scoring iterations: 6
## exp(beta) 2.5 % 97.5 %
## (Intercept) 0.01359599 0.009387713 0.01969074
## hienghigh 0.52035993 0.287843219 0.94070120
## Call:
## coxph(formula = Surv(y, chd) ~ hieng, data = diet)
##
## n= 337, number of events= 46
##
## coef exp(coef) se(coef) z Pr(>|z|)
## hienghigh -0.6475 0.5234 0.3022 -2.143 0.0321 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## hienghigh 0.5234 1.911 0.2895 0.9462
##
## Concordance= 0.581 (se = 0.037 )
## Likelihood ratio test= 4.73 on 1 df, p=0.03
## Wald test = 4.59 on 1 df, p=0.03
## Score (logrank) test = 4.75 on 1 df, p=0.03
These two models are conceptually different since the Cox model
adjusts for ‘time’ even though this is not explicit in the
coxph
function. In this example, ‘time’ refers to ‘time on
study’ (time since entry) which we do not expect to be a strong
confounder. That is, we would expect the estimates of the effect of high
energy to be similar for the two models, which they are.
(b)
If we use a different timescale then this amounts to adjusting for a different factor. As such, we would not expect the estimates to be identical. Attained age, unlike time since entry, is expected to be a confounder but we see that it is not a strong confounder.
## counting process data with coxph
scale <- 365.24
cox13b <- coxph(Surv((doe-dob)/scale, (dox-dob)/scale, chd) ~ hieng, data=diet)
summary(cox13b)
## Call:
## coxph(formula = Surv((doe - dob)/scale, (dox - dob)/scale, chd) ~
## hieng, data = diet)
##
## n= 337, number of events= 46
##
## coef exp(coef) se(coef) z Pr(>|z|)
## hienghigh -0.6114 0.5426 0.3028 -2.019 0.0435 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## hienghigh 0.5426 1.843 0.2997 0.9823
##
## Concordance= 0.584 (se = 0.037 )
## Likelihood ratio test= 4.2 on 1 df, p=0.04
## Wald test = 4.08 on 1 df, p=0.04
## Score (logrank) test = 4.2 on 1 df, p=0.04