Exercise 13. Modelling the diet data using Cox regression
Now fit a model to the localised melanoma data where the outcome is observed survival (i.e. all deaths are considered to be events).
You may have to install the required packages the first time you use them. You can install a package by install.packages("package_of_interest")
for each package you require.
Load the melanoma data and explore it.
(a)
## y is the observed time
## so it already measures time since entry
poisson13a <- glm( chd ~ hieng + offset( log( y ) ), family=poisson, data=diet)
summary(poisson13a)
##
## Call:
## glm(formula = chd ~ hieng + offset(log(y)), family = poisson,
## data = diet)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.7382 -0.6337 -0.4899 -0.3891 3.0161
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.2980 0.1890 -22.744 <2e-16 ***
## hienghigh -0.6532 0.3021 -2.162 0.0306 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 262.82 on 336 degrees of freedom
## Residual deviance: 258.00 on 335 degrees of freedom
## AIC: 354
##
## Number of Fisher Scoring iterations: 6
## exp(beta) 2.5 % 97.5 %
## (Intercept) 0.01359599 0.009387713 0.01969074
## hienghigh 0.52035993 0.287843219 0.94070120
## Call:
## coxph(formula = Surv(y, chd) ~ hieng, data = diet)
##
## n= 337, number of events= 46
##
## coef exp(coef) se(coef) z Pr(>|z|)
## hienghigh -0.6475 0.5234 0.3022 -2.143 0.0321 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## hienghigh 0.5234 1.911 0.2895 0.9462
##
## Concordance= 0.581 (se = 0.037 )
## Likelihood ratio test= 4.73 on 1 df, p=0.03
## Wald test = 4.59 on 1 df, p=0.03
## Score (logrank) test = 4.75 on 1 df, p=0.03
These two models are conceptually different since the Cox model adjusts for ‘time’ even though this is not explicit in the coxph
function. In this example, ‘time’ refers to ‘time on study’ (time since entry) which we do not expect to be a strong confounder. That is, we would expect the estimates of the effect of high energy to be similar for the two models, which they are.
(b)
If we use a different timescale then this amounts to adjusting for a different factor. As such, we would not expect the estimates to be identical. Attained age, unlike time since entry, is expected to be a confounder but we see that it is not a strong confounder.
## counting process data with coxph
scale <- 365.24
cox13b <- coxph(Surv((doe-dob)/scale, (dox-dob)/scale, chd) ~ hieng, data=diet)
summary(cox13b)
## Call:
## coxph(formula = Surv((doe - dob)/scale, (dox - dob)/scale, chd) ~
## hieng, data = diet)
##
## n= 337, number of events= 46
##
## coef exp(coef) se(coef) z Pr(>|z|)
## hienghigh -0.6114 0.5426 0.3028 -2.019 0.0435 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## hienghigh 0.5426 1.843 0.2997 0.9823
##
## Concordance= 0.584 (se = 0.037 )
## Likelihood ratio test= 4.2 on 1 df, p=0.04
## Wald test = 4.08 on 1 df, p=0.04
## Score (logrank) test = 4.2 on 1 df, p=0.04