Exercise 14. Non-collapsibility of proportional hazards models


We simulate for time-to-event data assuming constant hazards and then investigate whether we can estimate the underlying parameters. Note that the binary variable \(X\) is essentially a coin toss and we have used a large variance for the normally distributed \(U\).

You may have to install the required packages the first time you use them. You can install a package by install.packages("package_of_interest") for each package you require.

The assumed causal diagram is reproduced below:

(b) Fitting models with only \(X\)

We now model by excluding the variable \(U\). This variable could be excluded when it is not measured or perhaps when the variable is not considered to be a confounding variable – from the causal diagram, the two variables \(X\) and \(U\) are not correlated and are only connected through the time variable \(T\).

Again, we suggest investigating whether the hazard ratio for \(X\) is time-varying.

What do you see from the time-varing hazard ratio? Is \(U\) a potential confounder for \(X\)?

(e) Accelerated failure time models

As an alternative model class, we can fit accelerated failure time models with a smooth baseline survival function. We can use the rstpm2::aft function, which uses splines to model baseline survival. Using the baseline simulation, fit and interpret smooth accelerated failure time models:

## Maximum likelihood estimation
## 
## Call:
## bbmle::mle2(minuslogl = negll, start = coef, eval.only = TRUE, 
##     vecpar = TRUE, gr = gradient, control = control)
## 
## Coefficients:
##                                        Estimate Std. Error  z value  Pr(z)    
## x                                     -1.036381   0.087589 -11.8324 <2e-16 ***
## u                                     -0.993865   0.048413 -20.5289 <2e-16 ***
## nsx(logtstar, df, intercept = TRUE)1  -3.424935   0.362341  -9.4523 <2e-16 ***
## nsx(logtstar, df, intercept = TRUE)2   0.285719   0.263517   1.0843 0.2783    
## nsx(logtstar, df, intercept = TRUE)3 -14.160032   0.770291 -18.3827 <2e-16 ***
## nsx(logtstar, df, intercept = TRUE)4   4.529814   0.488246   9.2777 <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## -2 log L: 7681.266
## Maximum likelihood estimation
## 
## Call:
## bbmle::mle2(minuslogl = negll, start = coef, eval.only = TRUE, 
##     vecpar = TRUE, gr = gradient, control = control)
## 
## Coefficients:
##                                        Estimate Std. Error  z value   Pr(z)    
## x                                     -1.094700   0.094337 -11.6042 < 2e-16 ***
## nsx(logtstar, df, intercept = TRUE)1  -3.076847   0.303330 -10.1436 < 2e-16 ***
## nsx(logtstar, df, intercept = TRUE)2   0.577282   0.226405   2.5498 0.01078 *  
## nsx(logtstar, df, intercept = TRUE)3 -13.490435   0.660353 -20.4291 < 2e-16 ***
## nsx(logtstar, df, intercept = TRUE)4   4.657197   0.419495  11.1019 < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## -2 log L: 8334.838