Exercise 10. Examining the proportional hazards hypothesis (localised melanoma)
Load the diet data using time-on-study as the timescale with a maximum of 10 years follow-up.
You may have to install the required packages the first time you use
them. You can install a package by
install.packages("package_of_interest")
for each package
you require.
library(biostat3)
library(bshazard)
library(car) # car::linearHypothesis -> biostat3::lincom
library(tinyplot) # plt()
Load melanoma data and explore it.
localised <-
subset(biostat3::melanoma, stage == "Localised") |>
transform(death_cancer = ifelse( status == "Dead: cancer" & surv_mm <= 120, 1, 0), #censoring for > 120 months
trunc_yy = pmin(surv_mm/12,10)) #scale to years and truncate to 10 years
(a)
If we look at the hazard curves, at their peak the ratio is approximately \(0.04/0.065\approx0.62\). The ratio seems to vary with respect to time.
# Using bshazard to smooth the Kaplan-Meier hazards by strata
hazDiaDate <- lapply(levels(localised$year8594),
function(level) bshazard(Surv(trunc_yy,death_cancer)~1, v=F,
data=subset(localised, year8594==level)) |>
with(data.frame(time,hazard,lower.ci,upper.ci,
year8594=level))) |>
do.call(what=rbind)
## Comparing hazards
with(hazDiaDate, plt(hazard~time|year8594, ymin=lower.ci, ymax=upper.ci,
group=year8594,
type="ribbon",
xlab="Time since diagnosis (years)",
ylab="Hazard"))
(b)
There is some evidence against the assumption of proportional hazards since we see non-parallel curves when plotting the instantaneous cause-specific hazard on the log scale.
(c)
If the proportional hazards assumption is appropriate then we should see parallel lines. This looks okay; we shouldn’t put too much weight on the fact that the curves cross early in the follow-up since there are so few deaths there. The difference between the two log-cumulative hazard curves is similar during the part of the follow-up where we have the most information (most deaths). Note that these curves are not based on the estimated Cox model (i.e., they are unadjusted).
(d)
The estimated hazard ratio from the Cox model is \(0.78\) which is similar (as it should be) to the estimate made by looking at the hazard function plot.
# Cox regression with time-since-entry as the timescale
# Note that R uses the Efron method for approximating the likelihood in the presence of ties
# whereas Stata (and some other software) use the Breslow method
cox1 <- coxph(Surv(trunc_yy, death_cancer==1) ~ year8594, data=localised)
summary(cox1)
## Call:
## coxph(formula = Surv(trunc_yy, death_cancer == 1) ~ year8594,
## data = localised)
##
## n= 5318, number of events= 960
##
## coef exp(coef) se(coef) z Pr(>|z|)
## year8594Diagnosed 85-94 -0.25297 0.77649 0.06579 -3.845 0.000121 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## year8594Diagnosed 85-94 0.7765 1.288 0.6825 0.8834
##
## Concordance= 0.533 (se = 0.008 )
## Likelihood ratio test= 14.83 on 1 df, p=1e-04
## Wald test = 14.78 on 1 df, p=1e-04
## Score (logrank) test = 14.86 on 1 df, p=1e-04
(e)
The plot of the scaled Schoenfeld residuals for the effect of period. Under proportional hazards, the smoother will be a horizontal line. The line does vary (with possibly an increasing log hazard ratio at around 4 years), but not in a linear pattern, which is consistent with the p-value of 0.10 for linear trend.
cox2 <- coxph(Surv(trunc_yy, death_cancer==1) ~ sex + year8594 + agegrp, data=localised)
summary(cox2)
## Call:
## coxph(formula = Surv(trunc_yy, death_cancer == 1) ~ sex + year8594 +
## agegrp, data = localised)
##
## n= 5318, number of events= 960
##
## coef exp(coef) se(coef) z Pr(>|z|)
## sexFemale -0.53061 0.58825 0.06545 -8.107 5.19e-16 ***
## year8594Diagnosed 85-94 -0.33339 0.71649 0.06618 -5.037 4.72e-07 ***
## agegrp45-59 0.28283 1.32688 0.09417 3.003 0.00267 **
## agegrp60-74 0.62006 1.85904 0.09088 6.823 8.90e-12 ***
## agegrp75+ 1.21801 3.38045 0.10443 11.663 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## sexFemale 0.5882 1.7000 0.5174 0.6688
## year8594Diagnosed 85-94 0.7165 1.3957 0.6293 0.8157
## agegrp45-59 1.3269 0.7536 1.1032 1.5959
## agegrp60-74 1.8590 0.5379 1.5557 2.2215
## agegrp75+ 3.3804 0.2958 2.7547 4.1483
##
## Concordance= 0.646 (se = 0.009 )
## Likelihood ratio test= 212.7 on 5 df, p=<2e-16
## Wald test = 217.9 on 5 df, p=<2e-16
## Score (logrank) test = 226.8 on 5 df, p=<2e-16
## Plot of the scaled Schoenfeld residuals for calendar period 1985–94.
## The smooth line shows the estimated log hazard ratio as a function of time.
cox2.phtest <- cox.zph(cox2, terms=FALSE) # for separate plots
print(cox2.phtest)
## chisq df p
## sexFemale 0.823 1 0.36431
## year8594Diagnosed 85-94 2.632 1 0.10471
## agegrp45-59 0.860 1 0.35365
## agegrp60-74 0.855 1 0.35508
## agegrp75+ 10.502 1 0.00119
## GLOBAL 22.247 5 0.00047
(f)
Based on the tests for linear trends, we expect a trend for the log hazard ratio comparing ages less than 45 years with those aged 75 years and over. All three log hazard ratios tend to decrease by time of diagnosis, while the trend is more marked for the oldest age group (75+). There is evidence for non-proportionality, particularly for those patients aged 75 years and over.
(g)
It seems that there is evidence of non-proportional hazards by age (particularly for the comparison of the oldest to youngest) but not for calendar period. The plot of Schoenfeld residuals suggested non-proportionality for period but this was not statistically significant.
## chisq df p
## sex 0.823 1 0.36431
## year8594 2.632 1 0.10471
## agegrp 16.858 3 0.00076
## GLOBAL 22.247 5 0.00047
(h)
The hazard ratios for age in the top panel are for the first two
years subsequent to diagnosis. To obtain the hazard ratios for the
period two years or more following diagnosis we multiply the hazard
ratios in the top and bottom panel. That is, during the first two years
following diagnosis patients aged 75 years or more at diagnosis have 5.4
times higher cancer-specific mortality than patients aged 0–44 at
diagnosis. During the period two years or more following diagnosis the
corresponding hazard ratio is \(5.4 \times
0.49=2.66\).
Using survSplit
to split on time will give you the same
results as above. We see that the age:follow up interaction is
statistically significant.
melanoma2p8Split <- survSplit(localised, cut=c(2), end="trunc_yy", start="start",
event="death_cancer", episode="fu") |>
transform(fu = as.factor(fu))
##Tabulate ageband including risk_time
melanoma2p8Split |> subset(id<=3, select=c(id, start, trunc_yy))
## id start trunc_yy
## 1 1 0 2.000000
## 2 1 2 2.208333
## 3 2 0 2.000000
## 4 2 2 4.625000
## 5 3 0 2.000000
## 6 3 2 10.000000
## sex age stage mmdx yydx surv_mm surv_yy status subsite
## 1 Female 81 Localised 2 1981 26.5 2.5 Dead: other Head and Neck
## 2 Female 81 Localised 2 1981 26.5 2.5 Dead: other Head and Neck
## 3 Female 75 Localised 9 1975 55.5 4.5 Dead: other Head and Neck
## 4 Female 75 Localised 9 1975 55.5 4.5 Dead: other Head and Neck
## 5 Female 78 Localised 2 1978 177.5 14.5 Dead: other Limbs
## 6 Female 78 Localised 2 1978 177.5 14.5 Dead: other Limbs
## year8594 dx exit agegrp id ydx yexit start
## 1 Diagnosed 75-84 1981-02-02 1983-04-20 75+ 1 1981.088 1983.298 0
## 2 Diagnosed 75-84 1981-02-02 1983-04-20 75+ 1 1981.088 1983.298 2
## 3 Diagnosed 75-84 1975-09-21 1980-05-07 75+ 2 1975.720 1980.348 0
## 4 Diagnosed 75-84 1975-09-21 1980-05-07 75+ 2 1975.720 1980.348 2
## 5 Diagnosed 75-84 1978-02-21 1992-12-07 75+ 3 1978.140 1992.934 0
## 6 Diagnosed 75-84 1978-02-21 1992-12-07 75+ 3 1978.140 1992.934 2
## trunc_yy death_cancer fu
## 1 2.000000 0 1
## 2 2.208333 0 2
## 3 2.000000 0 1
## 4 4.625000 0 2
## 5 2.000000 0 1
## 6 10.000000 0 2
cox2p8Split1 <- coxph(Surv(start, trunc_yy, death_cancer) ~ sex + year8594 + agegrp*fu,
data=melanoma2p8Split)
summary(cox2p8Split1)
## Call:
## coxph(formula = Surv(start, trunc_yy, death_cancer) ~ sex + year8594 +
## agegrp * fu, data = melanoma2p8Split)
##
## n= 9856, number of events= 960
##
## coef exp(coef) se(coef) z Pr(>|z|)
## sexFemale -0.52742 0.59012 0.06543 -8.061 7.58e-16 ***
## year8594Diagnosed 85-94 -0.33548 0.71499 0.06623 -5.065 4.08e-07 ***
## agegrp45-59 0.53058 1.69992 0.19634 2.702 0.00689 **
## agegrp60-74 0.90046 2.46074 0.18741 4.805 1.55e-06 ***
## agegrp75+ 1.68918 5.41503 0.19175 8.809 < 2e-16 ***
## fu2 NA NA 0.00000 NA NA
## agegrp45-59:fu2 -0.32093 0.72547 0.22382 -1.434 0.15161
## agegrp60-74:fu2 -0.36715 0.69270 0.21467 -1.710 0.08720 .
## agegrp75+:fu2 -0.70783 0.49271 0.23207 -3.050 0.00229 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## sexFemale 0.5901 1.6946 0.5191 0.6709
## year8594Diagnosed 85-94 0.7150 1.3986 0.6280 0.8141
## agegrp45-59 1.6999 0.5883 1.1569 2.4978
## agegrp60-74 2.4607 0.4064 1.7043 3.5529
## agegrp75+ 5.4150 0.1847 3.7186 7.8853
## fu2 NA NA NA NA
## agegrp45-59:fu2 0.7255 1.3784 0.4678 1.1250
## agegrp60-74:fu2 0.6927 1.4436 0.4548 1.0550
## agegrp75+:fu2 0.4927 2.0296 0.3126 0.7765
##
## Concordance= 0.645 (se = 0.009 )
## Likelihood ratio test= 222.5 on 8 df, p=<2e-16
## Wald test = 224.5 on 8 df, p=<2e-16
## Score (logrank) test = 238 on 8 df, p=<2e-16
cox2p8Split1b <- coxph(Surv(start, trunc_yy, death_cancer) ~ sex + year8594 + agegrp +
I(agegrp=="45-59" & fu=="2") + I(agegrp=="60-74" & fu=="2") +
I(agegrp=="75+" & fu=="2"), data=melanoma2p8Split)
summary(cox2p8Split1b)
## Call:
## coxph(formula = Surv(start, trunc_yy, death_cancer) ~ sex + year8594 +
## agegrp + I(agegrp == "45-59" & fu == "2") + I(agegrp == "60-74" &
## fu == "2") + I(agegrp == "75+" & fu == "2"), data = melanoma2p8Split)
##
## n= 9856, number of events= 960
##
## coef exp(coef) se(coef) z
## sexFemale -0.52742 0.59012 0.06543 -8.061
## year8594Diagnosed 85-94 -0.33548 0.71499 0.06623 -5.065
## agegrp45-59 0.53058 1.69992 0.19634 2.702
## agegrp60-74 0.90046 2.46074 0.18741 4.805
## agegrp75+ 1.68918 5.41503 0.19175 8.809
## I(agegrp == "45-59" & fu == "2")TRUE -0.32093 0.72547 0.22382 -1.434
## I(agegrp == "60-74" & fu == "2")TRUE -0.36715 0.69270 0.21467 -1.710
## I(agegrp == "75+" & fu == "2")TRUE -0.70783 0.49271 0.23207 -3.050
## Pr(>|z|)
## sexFemale 7.58e-16 ***
## year8594Diagnosed 85-94 4.08e-07 ***
## agegrp45-59 0.00689 **
## agegrp60-74 1.55e-06 ***
## agegrp75+ < 2e-16 ***
## I(agegrp == "45-59" & fu == "2")TRUE 0.15161
## I(agegrp == "60-74" & fu == "2")TRUE 0.08720 .
## I(agegrp == "75+" & fu == "2")TRUE 0.00229 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## sexFemale 0.5901 1.6946 0.5191 0.6709
## year8594Diagnosed 85-94 0.7150 1.3986 0.6280 0.8141
## agegrp45-59 1.6999 0.5883 1.1569 2.4978
## agegrp60-74 2.4607 0.4064 1.7043 3.5529
## agegrp75+ 5.4150 0.1847 3.7186 7.8853
## I(agegrp == "45-59" & fu == "2")TRUE 0.7255 1.3784 0.4678 1.1250
## I(agegrp == "60-74" & fu == "2")TRUE 0.6927 1.4436 0.4548 1.0550
## I(agegrp == "75+" & fu == "2")TRUE 0.4927 2.0296 0.3126 0.7765
##
## Concordance= 0.645 (se = 0.009 )
## Likelihood ratio test= 222.5 on 8 df, p=<2e-16
## Wald test = 224.5 on 8 df, p=<2e-16
## Score (logrank) test = 238 on 8 df, p=<2e-16
The regression equation for the cox2p8Split1 model is \[\begin{align*} h(t|\text{year8594},\text{sex},\text{agegrp},\text{fu}) &= h_0(t) \exp(\beta_1 I(\text{sex}=\text{"Female"})+\beta_2 I(\text{year8594}=\text{"Diagnosed 85-94"})+\\ &\qquad\beta_3 I(\text{agegrp}=\text{"45-59"})+\beta_4 I(\text{agegrp}=\text{"60-74"})+\beta_5 I(\text{agegrp}=\text{"75+"}) + \\ &\qquad \beta_6 I(\text{agegrp}=\text{"45-59"} \&\ \text{fu}=2)+\beta_7 I(\text{agegrp}=\text{"60-74"} \&\ \text{fu}=2)+\beta_8 I(\text{agegrp}=\text{"75+"} \&\ \text{fu}=2)) \end{align*}\] where \(h(t|\text{year8594},\text{sex},\text{agegrp},\text{fu})\) is the hazard at time \(t\) given covariates \(\text{year8594}\), \(\text{sex}\), \(\text{agegrp}\) and \(\text{fu}\), with baseline hazard \(h_0(t)\) and regression coefficients representing log hazard ratios for \(\beta_1\) for females, \(\beta_2\) for the calendar period 1985–1994, \(\beta_3\) for those aged 45–59 years at diagnosis, \(\beta_4\) for those aged 60–74 years and \(\beta_5\) for those aged 75 years and over, with interaction terms for the change in log hazard ratio for the second follow-up period being \(\beta_6\) for those aged 45–59 years at diagnosis, \(\beta_7\) for those aged 60–74 years and \(\beta_8\) for those aged 75 years and over.
(i)
0–2 years | 2+ years | |
---|---|---|
Agegrp0-44 | 1.00 | 1.00 |
Agegrp45-59 | 1.70 | 1.23 |
Agegrp60-74 | 2.46 | 1.70 |
Agegrp75+ | 5.42 | 2.67 |
cox2p8Split2 <- coxph(Surv(start, trunc_yy, death_cancer) ~ sex + year8594 + fu +
fu:agegrp, data=melanoma2p8Split)
summary(cox2p8Split2)
## Call:
## coxph(formula = Surv(start, trunc_yy, death_cancer) ~ sex + year8594 +
## fu + fu:agegrp, data = melanoma2p8Split)
##
## n= 9856, number of events= 960
##
## coef exp(coef) se(coef) z Pr(>|z|)
## sexFemale -0.52742 0.59012 0.06543 -8.061 7.58e-16 ***
## year8594Diagnosed 85-94 -0.33548 0.71499 0.06623 -5.065 4.08e-07 ***
## fu2 NA NA 0.00000 NA NA
## fu1:agegrp45-59 0.53058 1.69992 0.19634 2.702 0.00689 **
## fu2:agegrp45-59 0.20965 1.23325 0.10774 1.946 0.05167 .
## fu1:agegrp60-74 0.90046 2.46074 0.18741 4.805 1.55e-06 ***
## fu2:agegrp60-74 0.53331 1.70456 0.10479 5.089 3.59e-07 ***
## fu1:agegrp75+ 1.68918 5.41503 0.19175 8.809 < 2e-16 ***
## fu2:agegrp75+ 0.98135 2.66806 0.13157 7.458 8.75e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## sexFemale 0.5901 1.6946 0.5191 0.6709
## year8594Diagnosed 85-94 0.7150 1.3986 0.6280 0.8141
## fu2 NA NA NA NA
## fu1:agegrp45-59 1.6999 0.5883 1.1569 2.4978
## fu2:agegrp45-59 1.2332 0.8109 0.9985 1.5232
## fu1:agegrp60-74 2.4607 0.4064 1.7043 3.5529
## fu2:agegrp60-74 1.7046 0.5867 1.3881 2.0932
## fu1:agegrp75+ 5.4150 0.1847 3.7186 7.8853
## fu2:agegrp75+ 2.6681 0.3748 2.0616 3.4530
##
## Concordance= 0.645 (se = 0.009 )
## Likelihood ratio test= 222.5 on 8 df, p=<2e-16
## Wald test = 224.5 on 8 df, p=<2e-16
## Score (logrank) test = 238 on 8 df, p=<2e-16
The regression equation for the cox2p8Split2
model is
\[\begin{align*}
h(t|\text{year8594},\text{sex},\text{agegrp},\text{fu}) &= h_0(t)
\exp(\beta_1 I(\text{sex}=\text{"Female"})+\beta_2
I(\text{year8594}=\text{"Diagnosed 85-94"})+\\
&\qquad\beta_3 I(\text{agegrp}=\text{"45-59"} \&\
\text{fu}=1)+\beta_4 I(\text{agegrp}=\text{"45-59"} \&\
\text{fu}=2)+\beta_5 I(\text{agegrp}=\text{"60-74"} \&\
\text{fu}=1) + \\
&\qquad \beta_6 I(\text{agegrp}=\text{"60-74"} \&\
\text{fu}=2)+\beta_7 I(\text{agegrp}=\text{"75+"} \&\
\text{fu}=1)+\beta_8 I(\text{agegrp}=\text{"75+"} \&\
\text{fu}=2))
\end{align*}\] where \(h(t|\text{year8594},\text{sex},\text{agegrp},\text{agegrp},\text{fu})\)
is the hazard at time \(t\) given
covariates \(\text{year8594}\), \(\text{sex}\) and \(\text{agegrp}\), \(\text{agegrp}\) and \(\text{fu}\), with baseline hazard \(h_0(t)\) and regression coefficients
representing log hazard ratios for \(\beta_1\) for the calendar period
1985–1994, \(\beta_2\) for females,
with log hazard ratios for the first and second follow-up period being
\(\beta_3\) and \(\beta_4\) for those aged 45–59 years at
diagnosis, \(\beta_5\) and \(\beta_6\) for those aged 60–74 years and
\(\beta_7\) and \(\beta_8\) for those aged 75 years and
over.
We can also use the tt
argument in coxph
for modelling for time-varying effects:
## Alternative approach using tt():
## http://cran.r-project.org/web/packages/survival/vignettes/timedep.pdf
cox2p8tvc2 <- coxph(Surv(trunc_yy, death_cancer) ~ sex + year8594 + agegrp +
tt(agegrp=="45-59") + tt(agegrp=="60-74") + tt(agegrp=="75+"),
data=localised,
tt = function(x, t, ...) x*(t>=2))
summary(cox2p8tvc2)
## Call:
## coxph(formula = Surv(trunc_yy, death_cancer) ~ sex + year8594 +
## agegrp + tt(agegrp == "45-59") + tt(agegrp == "60-74") +
## tt(agegrp == "75+"), data = localised, tt = function(x, t,
## ...) x * (t >= 2))
##
## n= 5318, number of events= 960
##
## coef exp(coef) se(coef) z Pr(>|z|)
## sexFemale -0.52742 0.59012 0.06543 -8.061 7.58e-16 ***
## year8594Diagnosed 85-94 -0.33548 0.71499 0.06623 -5.065 4.08e-07 ***
## agegrp45-59 0.53058 1.69992 0.19634 2.702 0.00689 **
## agegrp60-74 0.90046 2.46074 0.18741 4.805 1.55e-06 ***
## agegrp75+ 1.68918 5.41503 0.19175 8.809 < 2e-16 ***
## tt(agegrp == "45-59") -0.32093 0.72547 0.22382 -1.434 0.15161
## tt(agegrp == "60-74") -0.36715 0.69270 0.21467 -1.710 0.08720 .
## tt(agegrp == "75+") -0.70783 0.49271 0.23207 -3.050 0.00229 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## sexFemale 0.5901 1.6946 0.5191 0.6709
## year8594Diagnosed 85-94 0.7150 1.3986 0.6280 0.8141
## agegrp45-59 1.6999 0.5883 1.1569 2.4978
## agegrp60-74 2.4607 0.4064 1.7043 3.5529
## agegrp75+ 5.4150 0.1847 3.7186 7.8853
## tt(agegrp == "45-59") 0.7255 1.3784 0.4678 1.1250
## tt(agegrp == "60-74") 0.6927 1.4436 0.4548 1.0550
## tt(agegrp == "75+") 0.4927 2.0296 0.3126 0.7765
##
## Concordance= 0.645 (se = 0.009 )
## Likelihood ratio test= 222.5 on 8 df, p=<2e-16
## Wald test = 224.5 on 8 df, p=<2e-16
## Score (logrank) test = 238 on 8 df, p=<2e-16
## The tt labels do not play nicely with lincom:(
cox2p8tvct <- coxph(Surv(trunc_yy, death_cancer) ~ sex + year8594 + agegrp + tt(agegrp),
data=localised,
tt = function(x, t, ...) cbind(`45-59`=(x=="45-59")*t,
`60-74`=(x=="60-74")*t,
`75+`=(x=="75+")*t))
summary(cox2p8tvct)
## Call:
## coxph(formula = Surv(trunc_yy, death_cancer) ~ sex + year8594 +
## agegrp + tt(agegrp), data = localised, tt = function(x, t,
## ...) cbind(`45-59` = (x == "45-59") * t, `60-74` = (x ==
## "60-74") * t, `75+` = (x == "75+") * t))
##
## n= 5318, number of events= 960
##
## coef exp(coef) se(coef) z Pr(>|z|)
## sexFemale -0.52429 0.59198 0.06541 -8.016 1.10e-15 ***
## year8594Diagnosed 85-94 -0.33768 0.71342 0.06627 -5.095 3.48e-07 ***
## agegrp45-59 0.49614 1.64237 0.18573 2.671 0.007555 **
## agegrp60-74 0.99490 2.70445 0.17856 5.572 2.52e-08 ***
## agegrp75+ 1.89383 6.64478 0.20455 9.259 < 2e-16 ***
## tt(agegrp)45-59 -0.05256 0.94880 0.04054 -1.297 0.194801
## tt(agegrp)60-74 -0.09839 0.90630 0.04040 -2.435 0.014875 *
## tt(agegrp)75+ -0.21623 0.80555 0.05739 -3.768 0.000165 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## sexFemale 0.5920 1.6893 0.5208 0.6729
## year8594Diagnosed 85-94 0.7134 1.4017 0.6265 0.8124
## agegrp45-59 1.6424 0.6089 1.1412 2.3635
## agegrp60-74 2.7044 0.3698 1.9058 3.8377
## agegrp75+ 6.6448 0.1505 4.4501 9.9218
## tt(agegrp)45-59 0.9488 1.0540 0.8763 1.0273
## tt(agegrp)60-74 0.9063 1.1034 0.8373 0.9810
## tt(agegrp)75+ 0.8055 1.2414 0.7198 0.9015
##
## Concordance= 0.645 (se = 0.009 )
## Likelihood ratio test= 229.7 on 8 df, p=<2e-16
## Wald test = 231.3 on 8 df, p=<2e-16
## Score (logrank) test = 245.9 on 8 df, p=<2e-16
## Estimate 2.5 % 97.5 % Chisq Pr(>Chisq)
## agegrp75+ 6.644778 4.450099 9.921819 85.72268 2.070247e-20
## Estimate 2.5 % 97.5 % Chisq Pr(>Chisq)
## agegrp75+ + tt(agegrp)75+ 5.35269 3.915088 7.318171 110.5215 7.532419e-26
## Estimate 2.5 % 97.5 % Chisq Pr(>Chisq)
## agegrp75+ + 2*tt(agegrp)75+ 4.31185 3.373556 5.511113 136.2282 1.778638e-31
cox2p8tvclogt <- coxph(Surv(trunc_yy, death_cancer) ~ sex + year8594 + agegrp +
tt(agegrp),
data=localised,
tt = function(x, t, ...) cbind(`45-59`=(x=="45-59")*log(t),
`60-74`=(x=="60-74")*log(t),
`75+`=(x=="75+")*log(t)))
summary(cox2p8tvclogt)
## Call:
## coxph(formula = Surv(trunc_yy, death_cancer) ~ sex + year8594 +
## agegrp + tt(agegrp), data = localised, tt = function(x, t,
## ...) cbind(`45-59` = (x == "45-59") * log(t), `60-74` = (x ==
## "60-74") * log(t), `75+` = (x == "75+") * log(t)))
##
## n= 5318, number of events= 960
##
## coef exp(coef) se(coef) z Pr(>|z|)
## sexFemale -0.52385 0.59224 0.06540 -8.010 1.15e-15 ***
## year8594Diagnosed 85-94 -0.33810 0.71313 0.06628 -5.101 3.38e-07 ***
## agegrp45-59 0.56930 1.76703 0.19477 2.923 0.00347 **
## agegrp60-74 1.02025 2.77389 0.18503 5.514 3.51e-08 ***
## agegrp75+ 1.89910 6.67989 0.18956 10.018 < 2e-16 ***
## tt(agegrp)45-59 -0.23758 0.78853 0.14336 -1.657 0.09747 .
## tt(agegrp)60-74 -0.34509 0.70816 0.13838 -2.494 0.01264 *
## tt(agegrp)75+ -0.73311 0.48041 0.15837 -4.629 3.67e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## sexFemale 0.5922 1.6885 0.5210 0.6732
## year8594Diagnosed 85-94 0.7131 1.4023 0.6263 0.8121
## agegrp45-59 1.7670 0.5659 1.2063 2.5884
## agegrp60-74 2.7739 0.3605 1.9302 3.9864
## agegrp75+ 6.6799 0.1497 4.6070 9.6855
## tt(agegrp)45-59 0.7885 1.2682 0.5954 1.0444
## tt(agegrp)60-74 0.7082 1.4121 0.5399 0.9288
## tt(agegrp)75+ 0.4804 2.0816 0.3522 0.6553
##
## Concordance= 0.643 (se = 0.009 )
## Likelihood ratio test= 237.2 on 8 df, p=<2e-16
## Wald test = 235.8 on 8 df, p=<2e-16
## Score (logrank) test = 257.7 on 8 df, p=<2e-16
lincom(cox2p8tvclogt, "agegrp75+ - 0.6931472*tt(agegrp)75+",eform=TRUE) # t=0.5 => log(t)=-0.6931472
## Estimate 2.5 % 97.5 % Chisq
## agegrp75+ - 0.6931472*tt(agegrp)75+ 11.10348 6.347921 19.42166 71.20567
## Pr(>Chisq)
## agegrp75+ - 0.6931472*tt(agegrp)75+ 3.218622e-17
## Estimate 2.5 % 97.5 % Chisq Pr(>Chisq)
## agegrp75+ 6.679893 4.606969 9.685537 100.3693 1.264746e-23
## Estimate 2.5 % 97.5 % Chisq
## agegrp75+ + 0.6931472*tt(agegrp)75+ 4.018649 3.17166 5.091825 132.6644
## Pr(>Chisq)
## agegrp75+ + 0.6931472*tt(agegrp)75+ 1.070656e-30
The regression equation for the cox2p8tvct
model is
\[\begin{align*}
h(t|\text{year8594},\text{sex},\text{agegrp}) &= h_0(t) \exp(\beta_1
I(\text{sex}=\text{"Female"})+\beta_2
I(\text{year8594}=\text{"Diagnosed 85-94"})+\\
&\qquad\beta_3 I(\text{agegrp}=\text{"45-59"})+\beta_4
I(\text{agegrp}=\text{"60-64"}) + \beta_5
I(\text{agegrp}=\text{"75+"}) + \\
&\qquad \beta_6 I(\text{agegrp}=\text{"45-59"}) t +\beta_7
I(\text{agegrp}=\text{"60-74"})t + \beta_8
I(\text{agegrp}=\text{"75+"}) t)
\end{align*}\] where \(h(t|\text{year8594},\text{sex},\text{agegrp},\text{agegrp})\)
is the hazard at time \(t\) given
covariates \(\text{year8594}\), \(\text{sex}\) and \(\text{agegrp}\) and \(\text{agegrp}\), with baseline hazard \(h_0(t)\) and regression coefficients
representing log hazard ratios for \(\beta_1\) for the calendar period
1985–1994, \(\beta_2\) for females,
with log hazard ratios at time 0 for those aged 45–59 years, 60–74 years
and 75 years and over are \(\beta_3\),
\(\beta_4\) and \(\beta_5\), respectively, while the change
in log hazard ratios per year for those aged those aged 45–59 years,
60–74 years and 75 years and over are \(\beta_3\), \(\beta_4\) and \(\beta_5\), respectively.
The hazard ratio for model cox2p8tvct
for the those aged
75 years and over compared with those aged less than 45 years is \[\begin{align*}
\frac{h(t|\text{year8594},\text{sex},\text{agegrp}=\text{"75+"})}{h(t|\text{year8594},\text{sex},\text{agegrp}=\text{"0-44"})}
&=
\frac{h_0(t) \exp(\beta_1
I(\text{sex}=\text{"Female"})+\beta_2
I(\text{year8594}=\text{"Diagnosed 85-94"})+ \beta_5 + \beta_8
t)}{h_0(t)\exp(\beta_1 I(\text{sex}=\text{"Female"})+\beta_2
I(\text{year8594}=\text{"Diagnosed 85-94"}))} \\
&= \exp(\beta_5 + \beta_8 t)
\end{align*}\]
We have shown several ways to use the tt
functionality
for a factor variable, including using different tt arguments for each
factor level (as per model cox2p8tvc2
) and using a
tt
term that returns a set of columns (as per model
cox2p8tvct
). We have used the lincom
function
to estimate the hazard ratio for agegrp75+
. We will later
describe a more flexible approach to modelling time-dependent effects
using stpm2
.
(j)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:car':
##
## recode
## The following object is masked from 'package:MASS':
##
## select
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
##
## Attaching package: 'rstpm2'
## The following objects are masked from 'package:biostat3':
##
## colon, eform, popmort
## The following object is masked from 'package:survival':
##
## colon
time.cuts <- seq(0,10,length=100)
delta <- diff(time.cuts)[1]
## split and collapse
melanoma2p8Split2 <- survSplit(Surv(trunc_yy,death_cancer)~., data=localised,
cut=time.cuts, end="tstop", start="tstart",
event="death_cancer") |>
mutate(fu=cut(tstop,time.cuts),
mid=time.cuts[unclass(fu)]+delta/2) |>
group_by(mid,sex,year8594,agegrp) |>
summarise(pt=sum(tstop-tstart), death_cancer=sum(death_cancer),
.groups="keep") |>
mutate(age75 = (agegrp=="75+")+0)
poisson2p8tvc <- glm(death_cancer ~ sex + year8594 + agegrp + ns(mid,df=3) +
age75:ns(mid,df=3) + offset(log(pt)),
data=melanoma2p8Split2, family=poisson)
newdata <- data.frame(mid=seq(0,max(time.cuts),length=100), year8594="Diagnosed 85-94",
sex="Male", agegrp="75+", age75=1, pt=1)
predictnl(poisson2p8tvc,
fun=function(fit,newdata)
predict(fit, newdata) -
predict(fit, transform(newdata, agegrp='0-44', age75=0)),
newdata=newdata) |>
cbind(newdata) |>
transform(RateRatio=exp(fit),
lower.ci=exp(fit-1.96*se.fit),
upper.ci=exp(fit+1.96*se.fit)) |>
with(plt(mid, RateRatio, type="ribbon", xlab="Time since diagnosis (months)",
ymin=lower.ci, ymax=upper.ci,
ylab="Rate ratio", main="Ages 75+ compared with ages 0-44 years"))
##
## Call:
## glm(formula = death_cancer ~ sex + year8594 + agegrp + ns(mid,
## df = 3) + age75:ns(mid, df = 3) + offset(log(pt)), family = poisson,
## data = melanoma2p8Split2)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.29223 0.14458 -29.688 < 2e-16 ***
## sexFemale -0.52750 0.06541 -8.064 7.36e-16 ***
## year8594Diagnosed 85-94 -0.32668 0.06628 -4.929 8.28e-07 ***
## agegrp45-59 0.28753 0.09418 3.053 0.002265 **
## agegrp60-74 0.62943 0.09089 6.925 4.36e-12 ***
## agegrp75+ 1.71564 0.23189 7.399 1.38e-13 ***
## ns(mid, df = 3)1 -0.67587 0.17890 -3.778 0.000158 ***
## ns(mid, df = 3)2 1.71393 0.32526 5.269 1.37e-07 ***
## ns(mid, df = 3)3 -0.92248 0.20709 -4.454 8.41e-06 ***
## ns(mid, df = 3)1:age75 -0.93947 0.56050 -1.676 0.093713 .
## ns(mid, df = 3)2:age75 -1.51156 0.80963 -1.867 0.061907 .
## ns(mid, df = 3)3:age75 -0.52292 0.85343 -0.613 0.540055
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 1841.4 on 1583 degrees of freedom
## Residual deviance: 1460.0 on 1572 degrees of freedom
## AIC: 2893.5
##
## Number of Fisher Scoring iterations: 6
newdata <- expand.grid(mid=c(0.5,2), year8594="Diagnosed 85-94",
sex="Male", agegrp=levels(localised$agegrp), pt=1) |>
transform(age75=(agegrp=="75+")+0)
predictnl(poisson2p8tvc,
fun=function(fit,newdata)
predict(fit, newdata) -
predict(fit, transform(newdata, agegrp='0-44', age75=0)),
newdata=newdata) |>
cbind(newdata) |>
transform(RateRatio=exp(fit),
lower.ci=exp(fit-1.96*se.fit),
upper.ci=exp(fit+1.96*se.fit)) |>
subset(select=c(agegrp, mid, RateRatio, lower.ci, upper.ci))
## agegrp mid RateRatio lower.ci upper.ci
## 1 0-44 0.5 1.000000 1.000000 1.000000
## 2 0-44 2.0 1.000000 1.000000 1.000000
## 3 45-59 0.5 1.333128 1.108426 1.603381
## 4 45-59 2.0 1.333128 1.108426 1.603381
## 5 60-74 0.5 1.876535 1.570315 2.242468
## 6 60-74 2.0 1.876535 1.570315 2.242468
## 7 75+ 0.5 5.091943 3.557290 7.288662
## 8 75+ 2.0 3.817641 2.996414 4.863940
We use the predictnl
function from rstpm2
to calculate the hazard ratios and confidence intervals based on the
delta method. We see clear evidence for a rate ratio for those aged 75
years and over compared with those aged less than 45 years declining by
time since diagnosis. Fitting for an interaction between age 75+ and
time from diagnosis, we get a rate ratio for those aged 75 years at 0.5
years from diagnosis of 5.19 (95% confidence interval (CI): 3.56, 7.29),
while at 2 years the rate ratio is 3.82 (95% CI: 3.00, 4.86).