Biostat III exercises in R

Laboratory exercise 7

Suggested solutions by

Author: Johan Zetterqvist, 2015-03-03

Model cause-specific mortality with poisson regression

In this exercise we model, using Poisson regression, cause-specific mortality of patients diagnosed with localised (stage==1) melanoma.

In exercise 9 we model cause-specific mortality using Cox regression and in exercise 28 we use flexible parametric models. The aim is to illustrate that these three methods are very similar.

The aim of these exercises is to explore the similarities and differences to these three approaches to modelling. We will be comparing the results (and their interpretation) as we proceed through the exercises.


Load the diet data using time-on-study as the timescale.

You may have to install the required packages the first time you use them. You can install a package by install.packages("package_of_interest") for each package you require.

require(survival) # for Surv and survfit
require(dplyr)    # for data manipulation
require(foreign)  # for reading data set from Stata


###########################################
### A help function to calculate ###
### and print incidence (hazard) ratios
### from a fitted poisson regression
### from glm
###########################################

IRR <- function(fit){
    summfit <- summary(fit )$coefficients
    IRfit <- exp( cbind( summfit[, 1:2], summfit[, 1] - 1.96*summfit[, 2], summfit[, 1] +
                        1.96*summfit[, 2] ) )
    colnames(IRfit) <- c("IRR", "Std. err", "CI_lower", "CI_upper")
    print(IRfit)
}

Load the melanoma data and explore it.

## Read melanoma data
melanoma <- tbl_df( read.dta("http://biostat3.net/download/melanoma.dta") )

## Create a new dataset with only localised cancer
melanoma.l <- filter(melanoma, stage=="Localised")
head( melanoma.l )
summary(melanoma.l)

Rates can be modelled on different timescales, e.g., attained age, time-since-entry, calendar time. Plot the CHD incidence rates both by attained age and by time-since-entry. Is there a difference? Do the same for CHD hazard by different energy intakes (hieng).

(a) i. Plot Kaplan-Meier estimates of cause-specific survival as a function of calendar period of diagnosis.

## Plot Kaplan-Meier curve using survfit
## Create a new event indicator
melanoma.l <- mutate(melanoma.l,
                   death_cancer = as.numeric(status=="Dead: cancer") )

## Create a fitted object for our subcohort
## using survfit
sfit7a1 <- survfit(Surv(surv_mm, event=death_cancer) ~ year8594,
                   data = melanoma.l )

## Have a look at the fitted object
str(sfit7a1, 1)
## List of 14
##  $ n        : int [1:2] 2145 3173
##  $ time     : num [1:381] 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5 ...
##  $ n.risk   : num [1:381] 2145 2143 2140 2138 2135 ...
##  $ n.event  : num [1:381] 0 3 0 2 2 0 4 5 4 5 ...
##  $ n.censor : num [1:381] 2 0 2 1 3 2 3 2 5 3 ...
##  $ surv     : num [1:381] 1 0.999 0.999 0.998 0.997 ...
##  $ type     : chr "right"
##  $ strata   : Named int [1:2] 249 132
##   ..- attr(*, "names")= chr [1:2] "year8594=Diagnosed 75-84" "year8594=Diagnosed 85-94"
##  $ std.err  : num [1:381] 0 0.000809 0.000809 0.001045 0.001237 ...
##  $ upper    : num [1:381] 1 1 1 1 0.999 ...
##  $ lower    : num [1:381] 1 0.997 0.997 0.996 0.994 ...
##  $ conf.type: chr "log"
##  $ conf.int : num 0.95
##  $ call     : language survfit(formula = Surv(surv_mm, event = death_cancer) ~ year8594,      data = melanoma.l)
##  - attr(*, "class")= chr "survfit"
## Plot the survival curve (with some bells and whistles)
plot(sfit7a1,
     ## No automatic labelling of the curve (we do that ourselves)
     mark.time=F,
     ## Time is measured in months,  but we want to see it in years
     xscale=12,
     ## Make the plot prettier
     xlab="Years since diagnose",
     ylab="S(t)",
     col=c("blue","red"),
     lty=c("solid","dashed"))
## Add legend too
legend("bottomleft",legend=levels(melanoma.l$year8594),col=c("blue","red"),lty=c("solid","dashed"))

### TRY IF YOU WANT ###
## install.packages("survMisc")
## require(survMisc)
## autoplot(sfit7a1)
  1. Now plot the estimated hazard function (cause-specific mortality rate) as a function of calendar period of diagnosis.
## To plot smoothed hazards, we need use the muhaz package
require(muhaz)

## The plot.muhaz cannot rescale so we create a new variable
## Create a new variable base on the number of months
## translated to years (more accurate than surv_yy)
melanoma.l <- mutate( melanoma.l, surv_yy1 = surv_mm/12)

## Since the muhaz interface does not support data frames
## we do this the old school way
hazfit7a.early <- with(melanoma.l,
                       muhaz(times=surv_yy1,
                             delta=status=="Dead: cancer",
                             subset=year8594=="Diagnosed 75-84"))

hazfit7a.late <- with(melanoma.l,
                      muhaz(times=surv_yy1,
                            delta=status=="Dead: cancer",
                            subset=year8594=="Diagnosed 85-94",
                            max.time=max(surv_yy1)))
## Warning in muhaz(times = surv_yy1, delta = status == "Dead: cancer", subset = year8594 == : maximum time > maximum Survival Time
## With bells and whistles
plot(hazfit7a.early,xlab="Years since diagnose",col="blue",lty="solid")
lines(hazfit7a.late,col="red",lty="dashed")
legend("topright",legend=levels(melanoma$year8594),col=c("blue","red"),lty=c("solid","dashed"))

During which calendar period (the early or the latter) is mortality the lowest?

  1. Is the interpretation (with respect to how prognosis depends on period) based on the hazard consistent with the interpretation of the survival plot?
## Compare with Kaplan-Meier plot
par(mfrow=c(1,2)) ## Two graphs in the same window

plot(sfit7a1,
     ## No automatic labelling of the curve (we do that ourselves)
     mark.time=F,
     ## Time is measured in months,  but we want to see it in years
     xscale=12,
     ## Make the plot prettier
     xlab="Years since diagnose",
     ylab="S(t)",
     col=c("blue","red"),
     lty=c("solid","dashed"))
legend("bottomleft",legend=levels(melanoma.l$year8594),col=c("blue","red"),lty=c("solid","dashed"))

plot(hazfit7a.early,xlab="Years since diagnose",col="blue",lty="solid")
lines(hazfit7a.late,col="red",lty="dashed")
legend("topright",legend=levels(melanoma$year8594),col=c("blue","red"),lty=c("solid","dashed"))

(b) Estimate the cause-specific mortality rate for each calendar period.

## Calculate the incidence rate by time of diagnosis
rates_by_diag_yr <- melanoma.l %>%

    ## Stratify on year of diagnosis
    group_by(year8594) %>%

    ## We only need the deaths and the persontime
    select(death_cancer, surv_mm) %>%

    summarise(
        ## Calculate the number of deaths
        D = sum(death_cancer),
        ## Calculate the total person time
        Y = sum(surv_mm)/12/1000,
        ## Calculate the rate
        Rate = D/Y) %>%

    ## Add confidence intervals for the rates
    mutate(CI_low = Rate + qnorm(0.025) * Rate / sqrt(D),
           CI_high = Rate + qnorm(0.975) * Rate / sqrt(D))

rates_by_diag_yr
## Source: local data frame [2 x 6]
## 
##          year8594   D        Y     Rate   CI_low CI_high
## 1 Diagnosed 75-84 572 22.66279 25.23961 23.17122 27.3080
## 2 Diagnosed 85-94 441 15.96379 27.62502 25.04673 30.2033

During which calendar period (the early or the latter) is mortality the lowest? Is this consistent with what you found earlier? If not, why the inconsistency?

(c) The reason for the inconsistency between parts 7a and 7b was confounding by time since diagnosis. The comparison in part 7a was adjusted for time since diagnosis (since we compare the differences between the curves at each point in time) whereas the comparison in part 7b was not. Understanding this concept is central to the remainder of the exercise so please ask for help if you don’t follow.

Two approaches for controlling for confounding are ‘restriction’ and ‘statistical adjustment’. We will first use restriction to control for confounding. We will restrict the potential follow-up time to a maximum of 120 months. Individuals who survive more than 120 months are censored at 120 months.

  1. Estimate the cause-specific mortality rate for each calendar period.
## Calculate the incidence rate by time of diagnosis
## but with new variables
melanoma.l2 <- mutate(melanoma.l,
                       ## Update the death indicator (only count deaths within 120 months)
                       death_cancer = death_cancer * as.numeric(surv_mm<=120),
                       ## Create a new time variable
                       surv_mm = pmin(surv_mm, 120) )

## Calculate the rates on the truncated data
rates_by_diag_yr2 <- melanoma.l2 %>%

    ## Stratify on year of diagnosis
    group_by(year8594) %>%

    ## We only need the deaths and the persontime
    select(death_cancer, surv_mm) %>%

    summarise(
        ## Calculate the number of deaths
        D = sum(death_cancer),
        ## Calculate the total person time
        Y = sum(surv_mm)/12/1000,
        ## Calculate the rate
        Rate = D/Y) %>%

    ## Add confidence intervals for the rates
    mutate(CI_low = Rate + qnorm(0.025) * Rate / sqrt(D),
           CI_high = Rate + qnorm(0.975) * Rate / sqrt(D))

rates_by_diag_yr2
## Source: local data frame [2 x 6]
## 
##          year8594   D        Y     Rate   CI_low  CI_high
## 1 Diagnosed 75-84 519 16.50104 31.45256 28.74661 34.15851
## 2 Diagnosed 85-94 441 15.87562 27.77843 25.18583 30.37104
rates_by_diag_yr
## Source: local data frame [2 x 6]
## 
##          year8594   D        Y     Rate   CI_low CI_high
## 1 Diagnosed 75-84 572 22.66279 25.23961 23.17122 27.3080
## 2 Diagnosed 85-94 441 15.96379 27.62502 25.04673 30.2033

During which calendar period (the early of the latter) is mortality the lowest? Is this consistent with what you found in part 7b?

  1. Calculate by hand the ratio (85–94/75–84) of the two mortality rates (i.e., a mortality rate ratio) and interpret the estimate (i.e., during which period is mortality higher/lower and by how much).
## MRR full data
rates_by_diag_yr2[2, "Rate"] / rates_by_diag_yr2[1, "Rate"]
##        Rate
## 1 0.8831852
## ## MRR truncated data
## rates_by_diag_yr2[2, "Rate"] / rates_by_diag_yr2[1, "Rate"]
  1. Now use Poisson regression to estimate the same mortality rate ratio. Write the linear predictor using pen and paper and draw a graph of the fitted hazard rates.
## Use glm to estimate the rate ratios
## we scale the offset term to personyears
poisson7c <- glm( death_cancer ~ year8594 + offset( log( surv_mm/12/1000 ) ), family=poisson, data=melanoma.l2 )
summary( poisson7c )
## 
## Call:
## glm(formula = death_cancer ~ year8594 + offset(log(surv_mm/12/1000)), 
##     family = poisson, data = melanoma.l2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.7931  -0.7568  -0.5632  -0.3081   3.0550  
## 
## Coefficients:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              3.44848    0.04389  78.566   <2e-16 ***
## year8594Diagnosed 85-94 -0.12422    0.06476  -1.918   0.0551 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 4814.3  on 5317  degrees of freedom
## Residual deviance: 4810.6  on 5316  degrees of freedom
## AIC: 6734.6
## 
## Number of Fisher Scoring iterations: 6
## IRR
IRR(poisson7c)
##                                IRR Std. err   CI_lower  CI_upper
## (Intercept)             31.4525600 1.044870 28.8598349 34.278212
## year8594Diagnosed 85-94  0.8831851 1.066905  0.7779032  1.002716
## with confidence intervals (takes some time to compute)
## exp(cbind(coef(poisson7c),confint(poisson7c)))


## Note that the scaling of the offset term only has an impact on the intercept
summary( glm( death_cancer ~ year8594 + offset( log( surv_mm ) ),
             family=poisson, data=melanoma.l2 ) )
## 
## Call:
## glm(formula = death_cancer ~ year8594 + offset(log(surv_mm)), 
##     family = poisson, data = melanoma.l2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.7931  -0.7568  -0.5632  -0.3081   3.0550  
## 
## Coefficients:
##                         Estimate Std. Error  z value Pr(>|z|)    
## (Intercept)             -5.94418    0.04389 -135.426   <2e-16 ***
## year8594Diagnosed 85-94 -0.12422    0.06476   -1.918   0.0551 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 4814.3  on 5317  degrees of freedom
## Residual deviance: 4810.6  on 5316  degrees of freedom
## AIC: 6734.6
## 
## Number of Fisher Scoring iterations: 6

(7d) In order to adjust for time since diagnosis (i.e., adjust for the fact that we expect mortality to depend on time since diagnosis) we need to split the data by this timescale. We will restrict our analysis to mortality up to 10 years following diagnosis.

## Add a new variable for year
melanoma.l2 <- mutate( melanoma.l2, surv_yy1 = surv_mm/12)

## Split follow up by year
melanoma.spl <- survSplit(melanoma.l2, cut=0:9, end="surv_yy1", start="start",
                           event="death_cancer")

## Calculate persontime and
## recode start time as a factor
melanoma.spl <- mutate(melanoma.spl,
                       pt = surv_yy1 - start,
                       fu = as.factor(start) )

(e) Now tabulate (and produce a graph of) the rates by follow-up time.

## Calculate the incidence rate by observation year
yearly_rates <- melanoma.spl %>%

    ## Stratify on follow-up year
    group_by(fu) %>%

    ## We only need the deaths and the persontime
    select(death_cancer, pt) %>%

    summarise(
        ## Calculate the number of deaths
        D = sum(death_cancer),
        ## Calculate the total person time measured in 1000
        Y = sum(pt)/1000,
        ## Calculate the rate
        Rate = D/Y) %>%

    ## Add confidence intervals for the rates
    mutate(CI_low = Rate + qnorm(0.025) * Rate / sqrt(D),
           CI_high = Rate + qnorm(0.975) * Rate / sqrt(D))

yearly_rates
## Source: local data frame [10 x 6]
## 
##    fu   D        Y     Rate    CI_low  CI_high
## 1   0  71 5.257000 13.50580 10.364286 16.64732
## 2   1 228 4.857917 46.93370 40.841621 53.02578
## 3   2 202 4.235458 47.69260 41.115663 54.26953
## 4   3 138 3.711583 37.18090 30.977517 43.38428
## 5   4 100 3.265583 30.62240 24.620520 36.62428
## 6   5  80 2.864667 27.92646 21.806917 34.04600
## 7   6  56 2.524792 22.18005 16.370845 27.98925
## 8   7  35 2.190250 15.97991 10.685856 21.27397
## 9   8  34 1.886375 18.02399 11.965566 24.08241
## 10  9  16 1.583042 10.10712  5.154725 15.05953
## Plot by year
with(yearly_rates, matplot(fu,
                           cbind(Rate, CI_low,
                             CI_high),
                           lty=c("solid","dashed","dashed"),
                           col=c("black","gray","gray"),
                           type="l",
                           main="Cancer deaths per 1000 person-year by years since diagnose",
                           ylab="IR",
                           xlab="Years since diagnose") )

Mortality appears to be quite low during the first year of follow-up. Does this seem reasonable considering the disease with which these patients have been diagnosed?

(f) Compare the plot of the estimated rates to a plot of the hazard rate as a function of continuous time.

hazfit7f <- with(melanoma.l2, muhaz(times=surv_yy1, delta=death_cancer, max.time=10.5) )

# Plot smoothed hazards
frame() # Open new graphics device (window)
plot(hazfit7f,xlab="Years since diagnose",col="blue",lty="solid")

par(mfrow=c(1,2))
with(yearly_rates, matplot(fu,
                           cbind(Rate, CI_low,
                             CI_high),
                           lty=c("solid","dashed","dashed"),
                           col=c("black","gray","gray"),
                           type="l",
                           main="Cancer deaths per 1000 person-year by years since diagnose",
                           ylab="IR",
                           xlab="Years since diagnose") )
plot(hazfit7f,xlab="Years since diagnose",col="blue",lty="solid")

Is the interpretation similar? Do you think it is sufficient to classify follow-up time into annual intervals or might it be preferable to use, for example, narrower intervals?

(g) Use Poisson regression to estimate incidence rate ratios as a function of follow-up time. Write the linear predictor using pen and paper.

## Run Poisson regression
summary(poisson7g <- glm( death_cancer ~ fu + offset( log(pt) ),
                         family = poisson,
                         data = melanoma.spl ))
## 
## Call:
## glm(formula = death_cancer ~ fu + offset(log(pt)), family = poisson, 
##     data = melanoma.spl)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.3088  -0.2889  -0.2363  -0.1644   3.6805  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -4.3046     0.1187 -36.276  < 2e-16 ***
## fu1           1.2456     0.1359   9.166  < 2e-16 ***
## fu2           1.2617     0.1380   9.145  < 2e-16 ***
## fu3           1.0127     0.1460   6.934 4.08e-12 ***
## fu4           0.8186     0.1552   5.275 1.33e-07 ***
## fu5           0.7265     0.1630   4.456 8.36e-06 ***
## fu6           0.4961     0.1787   2.776  0.00551 ** 
## fu7           0.1682     0.2065   0.815  0.41531    
## fu8           0.2886     0.2085   1.384  0.16641    
## fu9          -0.2899     0.2765  -1.048  0.29452    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 8651.5  on 34308  degrees of freedom
## Residual deviance: 8446.4  on 34299  degrees of freedom
## AIC: 10386
## 
## Number of Fisher Scoring iterations: 6
## IRR
IRR(poisson7g)
##                   IRR Std. err   CI_lower   CI_upper
## (Intercept) 0.0135058 1.125992 0.01070313 0.01704237
## fu1         3.4750768 1.145561 2.66249592 4.53565336
## fu2         3.5312671 1.147924 2.69463908 4.62765041
## fu3         2.7529574 1.157242 2.06770351 3.66530999
## fu4         2.2673515 1.167868 1.67273474 3.07334006
## fu5         2.0677380 1.177078 1.50216436 2.84625343
## fu6         1.6422607 1.195671 1.15697204 2.33110231
## fu7         1.1831887 1.229369 0.78936358 1.77349886
## fu8         1.3345367 1.231877 0.88678589 2.00836313
## fu9         0.7483554 1.318553 0.43522761 1.28676550

Does the pattern of estimated incident rate ratios mirror the pattern you observed in the plots? Draw a graph of the fitted hazard rate using pen and paper.

(h) Now estimate the effect of calendar period of diagnosis while adjusting for time since diagnosis. Before fitting this model, predict what you expect the estimated effect to be (i.e., will it be higher, lower, or similar to the value we obtained in part c). Write the linear predictor using pen and paper and draw a graph of the fitted hazard rates.

summary(poisson7h <- glm( death_cancer ~ fu + year8594 + offset( log(pt) ),
                         family = poisson,
                         data = melanoma.spl ))
## 
## Call:
## glm(formula = death_cancer ~ fu + year8594 + offset(log(pt)), 
##     family = poisson, data = melanoma.spl)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.3297  -0.2900  -0.2207  -0.1761   3.6707  
## 
## Coefficients:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -4.16612    0.12380 -33.651  < 2e-16 ***
## fu1                      1.24352    0.13589   9.151  < 2e-16 ***
## fu2                      1.25370    0.13797   9.087  < 2e-16 ***
## fu3                      0.99738    0.14610   6.827 8.68e-12 ***
## fu4                      0.79438    0.15532   5.115 3.14e-07 ***
## fu5                      0.69247    0.16329   4.241 2.23e-05 ***
## fu6                      0.45104    0.17911   2.518 0.011796 *  
## fu7                      0.10844    0.20710   0.524 0.600559    
## fu8                      0.21049    0.20954   1.004 0.315139    
## fu9                     -0.39239    0.27780  -1.413 0.157802    
## year8594Diagnosed 85-94 -0.24444    0.06579  -3.715 0.000203 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 8651.5  on 34308  degrees of freedom
## Residual deviance: 8432.6  on 34298  degrees of freedom
## AIC: 10375
## 
## Number of Fisher Scoring iterations: 6
## IRR
IRR(poisson7h)
##                                IRR Std. err   CI_lower   CI_upper
## (Intercept)             0.01551228 1.131793 0.01217005 0.01977237
## fu1                     3.46780081 1.145561 2.65691769 4.52616296
## fu2                     3.50326901 1.147943 2.67318800 4.59110761
## fu3                     2.71116147 1.157309 2.03607897 3.61007438
## fu4                     2.21306295 1.168028 1.63224549 3.00055821
## fu5                     1.99864158 1.177377 1.45124418 2.75251278
## fu6                     1.56993640 1.196153 1.10514659 2.23020217
## fu7                     1.11453748 1.230112 0.74268278 1.67257652
## fu8                     1.23427730 1.233114 0.81855162 1.86114157
## fu9                     0.67543732 1.320224 0.39184648 1.16427120
## year8594Diagnosed 85-94 0.78314061 1.068003 0.68839475 0.89092661
# Add interaction term
summary(poisson7h2 <- glm( death_cancer ~ fu*year8594 + offset( log(pt) ), family=poisson, data=melanoma.spl ))
## 
## Call:
## glm(formula = death_cancer ~ fu * year8594 + offset(log(pt)), 
##     family = poisson, data = melanoma.spl)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.3546  -0.2682  -0.2244  -0.1764   3.6684  
## 
## Coefficients:
##                             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 -4.16333    0.17408 -23.917  < 2e-16 ***
## fu1                          1.28028    0.19807   6.464 1.02e-10 ***
## fu2                          1.39664    0.19729   7.079 1.45e-12 ***
## fu3                          0.81978    0.21609   3.794 0.000148 ***
## fu4                          0.65306    0.22519   2.900 0.003732 ** 
## fu5                          0.66513    0.22711   2.929 0.003404 ** 
## fu6                          0.41495    0.24264   1.710 0.087240 .  
## fu7                          0.03655    0.27163   0.135 0.892966    
## fu8                          0.30339    0.25453   1.192 0.233279    
## fu9                         -0.38675    0.31895  -1.213 0.225302    
## year8594Diagnosed 85-94     -0.24965    0.23795  -1.049 0.294089    
## fu1:year8594Diagnosed 85-94 -0.07110    0.27234  -0.261 0.794025    
## fu2:year8594Diagnosed 85-94 -0.30852    0.27725  -1.113 0.265806    
## fu3:year8594Diagnosed 85-94  0.34410    0.29326   1.173 0.240642    
## fu4:year8594Diagnosed 85-94  0.29777    0.31086   0.958 0.338115    
## fu5:year8594Diagnosed 85-94  0.06578    0.32894   0.200 0.841494    
## fu6:year8594Diagnosed 85-94  0.09658    0.36443   0.265 0.790988    
## fu7:year8594Diagnosed 85-94  0.22179    0.42829   0.518 0.604555    
## fu8:year8594Diagnosed 85-94 -0.51530    0.53954  -0.955 0.339542    
## fu9:year8594Diagnosed 85-94 -0.06040    0.79245  -0.076 0.939240    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 8651.5  on 34308  degrees of freedom
## Residual deviance: 8419.4  on 34289  degrees of freedom
## AIC: 10379
## 
## Number of Fisher Scoring iterations: 7
## IRR
IRR(poisson7h2)
##                                    IRR Std. err   CI_lower   CI_upper
## (Intercept)                 0.01555564 1.190148 0.01105885 0.02188093
## fu1                         3.59765631 1.219047 2.44016588 5.30420127
## fu2                         4.04160662 1.218098 2.74546930 5.94965096
## fu3                         2.26998933 1.241219 1.48621340 3.46710072
## fu4                         1.92140436 1.252563 1.23575373 2.98748416
## fu5                         1.94474961 1.254970 1.24607016 3.03518307
## fu6                         1.51429324 1.274610 0.94117447 2.43640694
## fu7                         1.03722422 1.312098 0.60905846 1.76638887
## fu8                         1.35444106 1.289856 0.82243078 2.23059575
## fu9                         0.67926283 1.375688 0.36352881 1.26921988
## year8594Diagnosed 85-94     0.77907144 1.268642 0.48868908 1.24200098
## fu1:year8594Diagnosed 85-94 0.93136483 1.313031 0.54613612 1.58832278
## fu2:year8594Diagnosed 85-94 0.73453606 1.319496 0.42659267 1.26477380
## fu3:year8594Diagnosed 85-94 1.41072301 1.340786 0.79399346 2.50649346
## fu4:year8594Diagnosed 85-94 1.34685580 1.364600 0.73233622 2.47703240
## fu5:year8594Diagnosed 85-94 1.06799324 1.389488 0.56049676 2.03499759
## fu6:year8594Diagnosed 85-94 1.10140218 1.439693 0.53918372 2.24985790
## fu7:year8594Diagnosed 85-94 1.24831405 1.534628 0.53921004 2.88994613
## fu8:year8594Diagnosed 85-94 0.59732398 1.715213 0.20746593 1.71978088
## fu9:year8594Diagnosed 85-94 0.94138428 2.208793 0.19916969 4.44949406

Is the estimated effect of calendar period of diagnosis consistent with what you expected? Add an interaction between follow-up and calendar period of diagnosis and interpret the results.

(i) Now control for age, sex, and calendar period. Write the linear predictor using pen and paper.

  1. Interpret the estimated hazard ratio for the parameter labelled agegrp 2, including a comment on statistical significance.
  2. Is the effect of calendar period strongly confounded by age and sex? That is, does the inclusion of sex and age in the model change the estimate for the effect of calendar period?
  3. Perform a Wald test of the overall effect of age and interpret the results.
summary(poisson7i <- glm( death_cancer ~ fu + agegrp + year8594 + sex + offset( log(pt) ), family=poisson, data=melanoma.spl ))
## 
## Call:
## glm(formula = death_cancer ~ fu + agegrp + year8594 + sex + offset(log(pt)), 
##     family = poisson, data = melanoma.spl)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.5646  -0.2672  -0.2098  -0.1597   3.6806  
## 
## Coefficients:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -4.36681    0.14322 -30.490  < 2e-16 ***
## fu1                      1.26827    0.13592   9.331  < 2e-16 ***
## fu2                      1.30657    0.13806   9.464  < 2e-16 ***
## fu3                      1.07575    0.14627   7.354 1.92e-13 ***
## fu4                      0.89517    0.15559   5.753 8.75e-09 ***
## fu5                      0.81370    0.16368   4.971 6.65e-07 ***
## fu6                      0.58637    0.17957   3.265  0.00109 ** 
## fu7                      0.25361    0.20758   1.222  0.22181    
## fu8                      0.36427    0.21006   1.734  0.08290 .  
## fu9                     -0.22796    0.27844  -0.819  0.41296    
## agegrp45-59              0.28352    0.09417   3.011  0.00261 ** 
## agegrp60-74              0.62185    0.09088   6.843 7.76e-12 ***
## agegrp75+                1.22386    0.10444  11.718  < 2e-16 ***
## year8594Diagnosed 85-94 -0.32516    0.06618  -4.913 8.97e-07 ***
## sexFemale               -0.53180    0.06545  -8.125 4.48e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 8651.5  on 34308  degrees of freedom
## Residual deviance: 8233.4  on 34294  degrees of freedom
## AIC: 10183
## 
## Number of Fisher Scoring iterations: 7
## IRR
IRR(poisson7i)
##                                IRR Std. err    CI_lower   CI_upper
## (Intercept)             0.01269168 1.153983 0.009585364 0.01680466
## fu1                     3.55468470 1.145595 2.723327591 4.63983230
## fu2                     3.69349752 1.148044 2.817856238 4.84124199
## fu3                     2.93219656 1.157511 2.201324962 3.90572805
## fu4                     2.44775331 1.168350 1.804366345 3.32055420
## fu5                     2.25623262 1.177841 1.637020767 3.10966468
## fu6                     1.79745329 1.196705 1.264161830 2.55571579
## fu7                     1.28866663 1.230698 0.857913141 1.93569910
## fu8                     1.43945962 1.233755 0.953653747 2.17274246
## fu9                     0.79615726 1.321071 0.461300290 1.37408623
## agegrp45-59             1.32779475 1.098749 1.104001143 1.59695387
## agegrp60-74             1.86237635 1.095132 1.558521701 2.22547152
## agegrp75+               3.40028687 1.110094 2.770835948 4.17273019
## year8594Diagnosed 85-94 0.72241051 1.068424 0.634521754 0.82247291
## sexFemale               0.58754651 1.067642 0.516806360 0.66796953
## Test if the effect of age is significant
## For this we use the car package
require(car)
linearHypothesis(poisson7i,c("agegrp45-59 = 0","agegrp60-74 = 0","agegrp75+ = 0"))
## Linear hypothesis test
## 
## Hypothesis:
## agegrp45 - 59 = 0
## agegrp60 - 74 = 0
## agegrp75  +  = 0
## 
## Model 1: restricted model
## Model 2: death_cancer ~ fu + agegrp + year8594 + sex + offset(log(pt))
## 
##   Res.Df Df  Chisq Pr(>Chisq)    
## 1  34297                         
## 2  34294  3 155.82  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## ADVANCED:
## Alternative by comparing deviances
## poisson7i_2 <- update(poisson7i,. ~ . - agegrp)
## anova(poisson7i_2,poisson7i,test="Chisq")

(j) Is the effect of sex modified by calendar period (whilst adjusting for age and follow-up)? Fit an appropriate interaction term to test this hypothesis. Write the linear predictor using pen and paper.

summary(poisson7j <- glm( death_cancer ~ fu + agegrp + year8594*sex + offset( log(pt) ), family=poisson, data=melanoma.spl ))
## 
## Call:
## glm(formula = death_cancer ~ fu + agegrp + year8594 * sex + offset(log(pt)), 
##     family = poisson, data = melanoma.spl)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.5611  -0.2689  -0.2108  -0.1593   3.6840  
## 
## Coefficients:
##                                   Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                       -4.37900    0.14596 -30.001  < 2e-16 ***
## fu1                                1.26830    0.13592   9.331  < 2e-16 ***
## fu2                                1.30659    0.13806   9.464  < 2e-16 ***
## fu3                                1.07569    0.14627   7.354 1.92e-13 ***
## fu4                                0.89511    0.15559   5.753 8.77e-09 ***
## fu5                                0.81360    0.16369   4.971 6.68e-07 ***
## fu6                                0.58630    0.17958   3.265 0.001095 ** 
## fu7                                0.25340    0.20759   1.221 0.222197    
## fu8                                0.36405    0.21007   1.733 0.083090 .  
## fu9                               -0.22829    0.27845  -0.820 0.412297    
## agegrp45-59                        0.28270    0.09419   3.001 0.002688 ** 
## agegrp60-74                        0.62118    0.09089   6.835 8.23e-12 ***
## agegrp75+                          1.22364    0.10444  11.717  < 2e-16 ***
## year8594Diagnosed 85-94           -0.29917    0.08840  -3.384 0.000714 ***
## sexFemale                         -0.50562    0.08813  -5.737 9.64e-09 ***
## year8594Diagnosed 85-94:sexFemale -0.05792    0.13061  -0.443 0.657440    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 8651.5  on 34308  degrees of freedom
## Residual deviance: 8233.2  on 34293  degrees of freedom
## AIC: 10185
## 
## Number of Fisher Scoring iterations: 7
## IRR
IRR(poisson7j)
##                                          IRR Std. err    CI_lower
## (Intercept)                       0.01253791 1.157150 0.009418494
## fu1                               3.55479530 1.145595 2.723411882
## fu2                               3.69354707 1.148045 2.817891675
## fu3                               2.93201254 1.157511 2.201183545
## fu4                               2.44760423 1.168351 1.804252130
## fu5                               2.25601958 1.177844 1.636858558
## fu6                               1.79732528 1.196709 1.264063290
## fu7                               1.28840069 1.230704 0.857729137
## fu8                               1.43915154 1.233761 0.953440609
## fu9                               0.79589580 1.321077 0.461144810
## agegrp45-59                       1.32670920 1.098771 1.103055378
## agegrp60-74                       1.86113111 1.095147 1.557438165
## agegrp75+                         3.39953913 1.110085 2.770266410
## year8594Diagnosed 85-94           0.74143513 1.092423 0.623486770
## sexFemale                         0.60313385 1.092132 0.507450969
## year8594Diagnosed 85-94:sexFemale 0.94372451 1.139528 0.730573802
##                                     CI_upper
## (Intercept)                       0.01669048
## fu1                               4.63997741
## fu2                               4.84131100
## fu3                               3.90548873
## fu4                               3.32035993
## fu5                               3.10938556
## fu6                               2.55555098
## fu7                               1.93531533
## fu8                               2.17229802
## fu9                               1.37364685
## agegrp45-59                       1.59571072
## agegrp60-74                       2.22404271
## agegrp75+                         4.17175267
## year8594Diagnosed 85-94           0.88169641
## sexFemale                         0.71685830
## year8594Diagnosed 85-94:sexFemale 1.21906362

(k) Based on the interaction model you fitted in exercise 7j, estimate the hazard ratio for the effect of sex (with 95% confidence interval) for each calendar period.

ADVANCED: Do this with each of the following methods and confirm that the results are the same:

  1. Using hand-calculation on the estimates from exercise 7j.
# hand calculations
hz7k <- exp(coef(poisson7j))
hz7k["sexFemale"]
## sexFemale 
## 0.6031338
hz7k["sexFemale"]*hz7k["year8594Diagnosed 85-94:sexFemale"]
## sexFemale 
## 0.5691922
  1. Using the estimates from exercise 7j.
linearHypothesis(poisson7j,c("sexFemale = 0","year8594Diagnosed 85-94:sexFemale = 0"))
## Linear hypothesis test
## 
## Hypothesis:
## sexFemale = 0
## year8594Diagnosed 85 - 94:sexFemale = 0
## 
## Model 1: restricted model
## Model 2: death_cancer ~ fu + agegrp + year8594 * sex + offset(log(pt))
## 
##   Res.Df Df  Chisq Pr(>Chisq)    
## 1  34295                         
## 2  34293  2 66.077  4.484e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# ADVANCED:
# hand calculations with confidence intervals
# use estimates with covariance matrix from glm
# to obtain estimates with ci's
length(coef(poisson7j))
## [1] 16
linvec <- c(rep(0,14),c(1,1))
coef.Female8594 <- crossprod(coef(poisson7j),linvec)
var.Female8594 <- t(linvec)%*%vcov(poisson7j)%*%linvec
ci.d <- 1.96*sqrt(var.Female8594)
ci.lower <- coef.Female8594 - ci.d
ci.upper <- coef.Female8594 + ci.d
exp(c(coef.Female8594,ci.lower,ci.upper))
## [1] 0.5691922 0.4705525 0.6885093
  1. Creating appropriate dummy variables that represent the effects of sex for each calendar period.
## Create dummies and Poisson regression
melanoma.spl <- melanoma.spl %>%

    ## Add confidence intervals for the rates
    mutate(femaleEarly = sex=="Female" & year8594=="Diagnosed 75-84",
           femaleLate = sex=="Female" & year8594=="Diagnosed 85-94")

summary(poisson7k <- glm( death_cancer ~ fu + agegrp + year8594 + femaleEarly +
                         femaleLate + offset( log(pt) ), family=poisson,
                         data=melanoma.spl ))
## 
## Call:
## glm(formula = death_cancer ~ fu + agegrp + year8594 + femaleEarly + 
##     femaleLate + offset(log(pt)), family = poisson, data = melanoma.spl)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.5611  -0.2689  -0.2108  -0.1593   3.6840  
## 
## Coefficients:
##                         Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -4.37900    0.14596 -30.001  < 2e-16 ***
## fu1                      1.26830    0.13592   9.331  < 2e-16 ***
## fu2                      1.30659    0.13806   9.464  < 2e-16 ***
## fu3                      1.07569    0.14627   7.354 1.92e-13 ***
## fu4                      0.89511    0.15559   5.753 8.77e-09 ***
## fu5                      0.81360    0.16369   4.971 6.68e-07 ***
## fu6                      0.58630    0.17958   3.265 0.001095 ** 
## fu7                      0.25340    0.20759   1.221 0.222197    
## fu8                      0.36405    0.21007   1.733 0.083090 .  
## fu9                     -0.22829    0.27845  -0.820 0.412297    
## agegrp45-59              0.28270    0.09419   3.001 0.002688 ** 
## agegrp60-74              0.62118    0.09089   6.835 8.23e-12 ***
## agegrp75+                1.22364    0.10444  11.717  < 2e-16 ***
## year8594Diagnosed 85-94 -0.29917    0.08840  -3.384 0.000714 ***
## femaleEarlyTRUE         -0.50562    0.08813  -5.737 9.64e-09 ***
## femaleLateTRUE          -0.56354    0.09710  -5.804 6.48e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 8651.5  on 34308  degrees of freedom
## Residual deviance: 8233.2  on 34293  degrees of freedom
## AIC: 10185
## 
## Number of Fisher Scoring iterations: 7
## IRR
IRR(poisson7k)
##                                IRR Std. err    CI_lower   CI_upper
## (Intercept)             0.01253791 1.157150 0.009418494 0.01669048
## fu1                     3.55479530 1.145595 2.723411882 4.63997741
## fu2                     3.69354707 1.148045 2.817891675 4.84131100
## fu3                     2.93201254 1.157511 2.201183545 3.90548873
## fu4                     2.44760423 1.168351 1.804252130 3.32035993
## fu5                     2.25601958 1.177844 1.636858558 3.10938556
## fu6                     1.79732528 1.196709 1.264063290 2.55555098
## fu7                     1.28840069 1.230704 0.857729137 1.93531533
## fu8                     1.43915154 1.233761 0.953440609 2.17229802
## fu9                     0.79589580 1.321077 0.461144810 1.37364685
## agegrp45-59             1.32670920 1.098771 1.103055378 1.59571072
## agegrp60-74             1.86113111 1.095147 1.557438165 2.22404271
## agegrp75+               3.39953913 1.110085 2.770266410 4.17175267
## year8594Diagnosed 85-94 0.74143513 1.092423 0.623486770 0.88169641
## femaleEarlyTRUE         0.60313385 1.092132 0.507450969 0.71685830
## femaleLateTRUE          0.56919219 1.101968 0.470552474 0.68850930

Write the linear predictor using pen and paper.

  1. Using syntax available from Stata 11 and onwards to repeat the previous model.
## Add interaction term
summary(poisson7k2 <- glm( death_cancer ~ fu + agegrp + year8594 + year8594:sex +
                         offset( log(pt) ), family=poisson,
                         data=melanoma.spl ))
## 
## Call:
## glm(formula = death_cancer ~ fu + agegrp + year8594 + year8594:sex + 
##     offset(log(pt)), family = poisson, data = melanoma.spl)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.5611  -0.2689  -0.2108  -0.1593   3.6840  
## 
## Coefficients:
##                                   Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                       -4.37900    0.14596 -30.001  < 2e-16 ***
## fu1                                1.26830    0.13592   9.331  < 2e-16 ***
## fu2                                1.30659    0.13806   9.464  < 2e-16 ***
## fu3                                1.07569    0.14627   7.354 1.92e-13 ***
## fu4                                0.89511    0.15559   5.753 8.77e-09 ***
## fu5                                0.81360    0.16369   4.971 6.68e-07 ***
## fu6                                0.58630    0.17958   3.265 0.001095 ** 
## fu7                                0.25340    0.20759   1.221 0.222197    
## fu8                                0.36405    0.21007   1.733 0.083090 .  
## fu9                               -0.22829    0.27845  -0.820 0.412297    
## agegrp45-59                        0.28270    0.09419   3.001 0.002688 ** 
## agegrp60-74                        0.62118    0.09089   6.835 8.23e-12 ***
## agegrp75+                          1.22364    0.10444  11.717  < 2e-16 ***
## year8594Diagnosed 85-94           -0.29917    0.08840  -3.384 0.000714 ***
## year8594Diagnosed 75-84:sexFemale -0.50562    0.08813  -5.737 9.64e-09 ***
## year8594Diagnosed 85-94:sexFemale -0.56354    0.09710  -5.804 6.48e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 8651.5  on 34308  degrees of freedom
## Residual deviance: 8233.2  on 34293  degrees of freedom
## AIC: 10185
## 
## Number of Fisher Scoring iterations: 7
## IRR
IRR(poisson7k2)
##                                          IRR Std. err    CI_lower
## (Intercept)                       0.01253791 1.157150 0.009418494
## fu1                               3.55479530 1.145595 2.723411882
## fu2                               3.69354707 1.148045 2.817891675
## fu3                               2.93201254 1.157511 2.201183545
## fu4                               2.44760423 1.168351 1.804252130
## fu5                               2.25601958 1.177844 1.636858558
## fu6                               1.79732528 1.196709 1.264063290
## fu7                               1.28840069 1.230704 0.857729137
## fu8                               1.43915154 1.233761 0.953440609
## fu9                               0.79589580 1.321077 0.461144810
## agegrp45-59                       1.32670920 1.098771 1.103055378
## agegrp60-74                       1.86113111 1.095147 1.557438165
## agegrp75+                         3.39953913 1.110085 2.770266410
## year8594Diagnosed 85-94           0.74143513 1.092423 0.623486770
## year8594Diagnosed 75-84:sexFemale 0.60313385 1.092132 0.507450969
## year8594Diagnosed 85-94:sexFemale 0.56919219 1.101968 0.470552474
##                                     CI_upper
## (Intercept)                       0.01669048
## fu1                               4.63997741
## fu2                               4.84131100
## fu3                               3.90548873
## fu4                               3.32035993
## fu5                               3.10938556
## fu6                               2.55555098
## fu7                               1.93531533
## fu8                               2.17229802
## fu9                               1.37364685
## agegrp45-59                       1.59571072
## agegrp60-74                       2.22404271
## agegrp75+                         4.17175267
## year8594Diagnosed 85-94           0.88169641
## year8594Diagnosed 75-84:sexFemale 0.71685830
## year8594Diagnosed 85-94:sexFemale 0.68850930

Using syntax available from Stata 11 and onwards to repeat the previous model.

(l) Now fit a separate model for each calendar period in order to estimate the hazard ratio for the effect of sex (with 95% confidence interval) for each calendar period.

Why do the estimates differ from those you obtained in the previous part?

melanoma.spl %>%

    group_by(year8594) %>%

    summarize()
## Source: local data frame [2 x 1]
## 
##          year8594
## 1 Diagnosed 75-84
## 2 Diagnosed 85-94
summary( poisson7l.early <- glm( death_cancer ~ fu + agegrp + sex + offset( log(pt) ),
                       family = poisson, data = melanoma.spl,
                       subset = which(year8594 == "Diagnosed 75-84" ) ) )
## 
## Call:
## glm(formula = death_cancer ~ fu + agegrp + sex + offset(log(pt)), 
##     family = poisson, data = melanoma.spl, subset = which(year8594 == 
##         "Diagnosed 75-84"))
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.5488  -0.2801  -0.2186  -0.1698   3.6866  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -4.35024    0.19749 -22.028  < 2e-16 ***
## fu1          1.29711    0.19807   6.549 5.80e-11 ***
## fu2          1.43389    0.19734   7.266 3.70e-13 ***
## fu3          0.87511    0.21622   4.047 5.18e-05 ***
## fu4          0.72162    0.22538   3.202  0.00137 ** 
## fu5          0.74575    0.22738   3.280  0.00104 ** 
## fu6          0.50579    0.24297   2.082  0.03737 *  
## fu7          0.13806    0.27198   0.508  0.61172    
## fu8          0.41333    0.25503   1.621  0.10508    
## fu9         -0.26674    0.31931  -0.835  0.40352    
## agegrp45-59  0.36623    0.12112   3.024  0.00250 ** 
## agegrp60-74  0.59417    0.11933   4.979 6.39e-07 ***
## agegrp75+    1.02300    0.15322   6.677 2.45e-11 ***
## sexFemale   -0.48356    0.08839  -5.471 4.48e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 4649.9  on 16933  degrees of freedom
## Residual deviance: 4430.6  on 16920  degrees of freedom
## AIC: 5496.6
## 
## Number of Fisher Scoring iterations: 6
IRR(poisson7l.early)
##                    IRR Std. err    CI_lower   CI_upper
## (Intercept) 0.01290368 1.218341 0.008762065 0.01900292
## fu1         3.65871653 1.219049 2.481572437 5.39424377
## fu2         4.19499842 1.218162 2.849375955 6.17609331
## fu3         2.39915019 1.241370 1.570402620 3.66525219
## fu4         2.05775938 1.252803 1.322953370 3.20069759
## fu5         2.10802429 1.255308 1.349972720 3.29174535
## fu6         1.65830219 1.275036 1.030005311 2.66985630
## fu7         1.14804722 1.312559 0.673668979 1.95646892
## fu8         1.51184459 1.290498 0.917112444 2.49225064
## fu9         0.76587332 1.376184 0.409591721 1.43206494
## agegrp45-59 1.44229093 1.128756 1.137513421 1.82872841
## agegrp60-74 1.81153335 1.126747 1.433726763 2.28889714
## agegrp75+   2.78152101 1.165585 2.059947185 3.75585315
## sexFemale   0.61658148 1.092411 0.518505690 0.73320839
summary( poisson7l.late <- glm( death_cancer ~ fu + agegrp + sex + offset( log(pt) ),
                       family = poisson, data = melanoma.spl,
                       subset = which(year8594 == "Diagnosed 85-94" ) ) )
## 
## Call:
## glm(formula = death_cancer ~ fu + agegrp + sex + offset(log(pt)), 
##     family = poisson, data = melanoma.spl, subset = which(year8594 == 
##         "Diagnosed 85-94"))
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.5041  -0.2578  -0.1938  -0.1457   3.6334  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -4.716255   0.198840 -23.719  < 2e-16 ***
## fu1          1.242842   0.186946   6.648 2.97e-11 ***
## fu2          1.158910   0.194930   5.945 2.76e-09 ***
## fu3          1.269035   0.198560   6.391 1.65e-10 ***
## fu4          1.091821   0.214814   5.083 3.72e-07 ***
## fu5          0.907839   0.238726   3.803 0.000143 ***
## fu6          0.712226   0.272810   2.611 0.009036 ** 
## fu7          0.467548   0.331976   1.408 0.159019    
## fu8          0.003177   0.476361   0.007 0.994678    
## fu9         -0.219501   0.725878  -0.302 0.762352    
## agegrp45-59  0.170696   0.149540   1.141 0.253672    
## agegrp60-74  0.657696   0.140647   4.676 2.92e-06 ***
## agegrp75+    1.384288   0.148765   9.305  < 2e-16 ***
## sexFemale   -0.588835   0.097576  -6.035 1.59e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 3997.8  on 17374  degrees of freedom
## Residual deviance: 3780.7  on 17361  degrees of freedom
## AIC: 4690.7
## 
## Number of Fisher Scoring iterations: 7
IRR(poisson7l.late)
##                     IRR Std. err    CI_lower   CI_upper
## (Intercept) 0.008948632 1.219987 0.006060392 0.01321334
## fu1         3.465448196 1.205563 2.402300548 4.99909606
## fu2         3.186458106 1.215226 2.174603986 4.66913301
## fu3         3.557416952 1.219645 2.410554906 5.24991791
## fu4         2.979696169 1.239631 1.955775170 4.53967787
## fu5         2.478958882 1.269631 1.552606986 3.95801204
## fu6         2.038523435 1.313650 1.194250387 3.47965371
## fu7         1.596076214 1.393720 0.832663877 3.05940891
## fu8         1.003182304 1.610204 0.394360631 2.55191481
## fu9         0.802919497 2.066546 0.193549378 3.33082816
## agegrp45-59 1.186130380 1.161300 0.884792362 1.59009655
## agegrp60-74 1.930340028 1.151019 1.465252202 2.54305206
## agegrp75+   3.991981639 1.160400 2.982343303 5.34342153
## sexFemale   0.554973666 1.102495 0.458367860 0.67194015
# compare with results in i
IRR(poisson7i)
##                                IRR Std. err    CI_lower   CI_upper
## (Intercept)             0.01269168 1.153983 0.009585364 0.01680466
## fu1                     3.55468470 1.145595 2.723327591 4.63983230
## fu2                     3.69349752 1.148044 2.817856238 4.84124199
## fu3                     2.93219656 1.157511 2.201324962 3.90572805
## fu4                     2.44775331 1.168350 1.804366345 3.32055420
## fu5                     2.25623262 1.177841 1.637020767 3.10966468
## fu6                     1.79745329 1.196705 1.264161830 2.55571579
## fu7                     1.28866663 1.230698 0.857913141 1.93569910
## fu8                     1.43945962 1.233755 0.953653747 2.17274246
## fu9                     0.79615726 1.321071 0.461300290 1.37408623
## agegrp45-59             1.32779475 1.098749 1.104001143 1.59695387
## agegrp60-74             1.86237635 1.095132 1.558521701 2.22547152
## agegrp75+               3.40028687 1.110094 2.770835948 4.17273019
## year8594Diagnosed 85-94 0.72241051 1.068424 0.634521754 0.82247291
## sexFemale               0.58754651 1.067642 0.516806360 0.66796953
# compare with results in j
IRR(poisson7j)
##                                          IRR Std. err    CI_lower
## (Intercept)                       0.01253791 1.157150 0.009418494
## fu1                               3.55479530 1.145595 2.723411882
## fu2                               3.69354707 1.148045 2.817891675
## fu3                               2.93201254 1.157511 2.201183545
## fu4                               2.44760423 1.168351 1.804252130
## fu5                               2.25601958 1.177844 1.636858558
## fu6                               1.79732528 1.196709 1.264063290
## fu7                               1.28840069 1.230704 0.857729137
## fu8                               1.43915154 1.233761 0.953440609
## fu9                               0.79589580 1.321077 0.461144810
## agegrp45-59                       1.32670920 1.098771 1.103055378
## agegrp60-74                       1.86113111 1.095147 1.557438165
## agegrp75+                         3.39953913 1.110085 2.770266410
## year8594Diagnosed 85-94           0.74143513 1.092423 0.623486770
## sexFemale                         0.60313385 1.092132 0.507450969
## year8594Diagnosed 85-94:sexFemale 0.94372451 1.139528 0.730573802
##                                     CI_upper
## (Intercept)                       0.01669048
## fu1                               4.63997741
## fu2                               4.84131100
## fu3                               3.90548873
## fu4                               3.32035993
## fu5                               3.10938556
## fu6                               2.55555098
## fu7                               1.93531533
## fu8                               2.17229802
## fu9                               1.37364685
## agegrp45-59                       1.59571072
## agegrp60-74                       2.22404271
## agegrp75+                         4.17175267
## year8594Diagnosed 85-94           0.88169641
## sexFemale                         0.71685830
## year8594Diagnosed 85-94:sexFemale 1.21906362
# Poisson-regression with effects specific for diagnose period
summary(poisson7l2 <- glm( death_cancer ~ fu + fu:year8594 + agegrp + agegrp:year8594
                          + sex*year8594 + offset( log(pt) ),
                          family=poisson, data=melanoma.spl ))
## 
## Call:
## glm(formula = death_cancer ~ fu + fu:year8594 + agegrp + agegrp:year8594 + 
##     sex * year8594 + offset(log(pt)), family = poisson, data = melanoma.spl)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.5488  -0.2630  -0.2067  -0.1606   3.6866  
## 
## Coefficients:
##                                     Estimate Std. Error z value Pr(>|z|)
## (Intercept)                         -4.35024    0.19751 -22.026  < 2e-16
## fu1                                  1.29711    0.19808   6.548 5.82e-11
## fu2                                  1.43389    0.19736   7.266 3.72e-13
## fu3                                  0.87511    0.21623   4.047 5.18e-05
## fu4                                  0.72162    0.22540   3.202  0.00137
## fu5                                  0.74575    0.22740   3.280  0.00104
## fu6                                  0.50579    0.24299   2.082  0.03738
## fu7                                  0.13806    0.27202   0.508  0.61177
## fu8                                  0.41333    0.25504   1.621  0.10510
## fu9                                 -0.26674    0.31945  -0.835  0.40372
## agegrp45-59                          0.36623    0.12113   3.024  0.00250
## agegrp60-74                          0.59417    0.11934   4.979 6.40e-07
## agegrp75+                            1.02300    0.15323   6.676 2.45e-11
## sexFemale                           -0.48356    0.08839  -5.471 4.48e-08
## year8594Diagnosed 85-94             -0.36601    0.28026  -1.306  0.19156
## fu1:year8594Diagnosed 85-94         -0.05427    0.27237  -0.199  0.84207
## fu2:year8594Diagnosed 85-94         -0.27498    0.27739  -0.991  0.32153
## fu3:year8594Diagnosed 85-94          0.39392    0.29357   1.342  0.17965
## fu4:year8594Diagnosed 85-94          0.37020    0.31137   1.189  0.23445
## fu5:year8594Diagnosed 85-94          0.16209    0.32970   0.492  0.62298
## fu6:year8594Diagnosed 85-94          0.20643    0.36533   0.565  0.57204
## fu7:year8594Diagnosed 85-94          0.32949    0.42919   0.768  0.44267
## fu8:year8594Diagnosed 85-94         -0.41015    0.54034  -0.759  0.44781
## fu9:year8594Diagnosed 85-94          0.04724    0.79306   0.060  0.95250
## year8594Diagnosed 85-94:agegrp45-59 -0.19554    0.19244  -1.016  0.30959
## year8594Diagnosed 85-94:agegrp60-74  0.06352    0.18446   0.344  0.73056
## year8594Diagnosed 85-94:agegrp75+    0.36129    0.21357   1.692  0.09070
## year8594Diagnosed 85-94:sexFemale   -0.10527    0.13166  -0.800  0.42397
##                                        
## (Intercept)                         ***
## fu1                                 ***
## fu2                                 ***
## fu3                                 ***
## fu4                                 ** 
## fu5                                 ** 
## fu6                                 *  
## fu7                                    
## fu8                                    
## fu9                                    
## agegrp45-59                         ** 
## agegrp60-74                         ***
## agegrp75+                           ***
## sexFemale                           ***
## year8594Diagnosed 85-94                
## fu1:year8594Diagnosed 85-94            
## fu2:year8594Diagnosed 85-94            
## fu3:year8594Diagnosed 85-94            
## fu4:year8594Diagnosed 85-94            
## fu5:year8594Diagnosed 85-94            
## fu6:year8594Diagnosed 85-94            
## fu7:year8594Diagnosed 85-94            
## fu8:year8594Diagnosed 85-94            
## fu9:year8594Diagnosed 85-94            
## year8594Diagnosed 85-94:agegrp45-59    
## year8594Diagnosed 85-94:agegrp60-74    
## year8594Diagnosed 85-94:agegrp75+   .  
## year8594Diagnosed 85-94:sexFemale      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 8651.5  on 34308  degrees of freedom
## Residual deviance: 8211.4  on 34281  degrees of freedom
## AIC: 10187
## 
## Number of Fisher Scoring iterations: 7
IRR(poisson7l2)
##                                            IRR Std. err    CI_lower
## (Intercept)                         0.01290368 1.218361 0.008761789
## fu1                                 3.65871657 1.219065 2.481512002
## fu2                                 4.19499847 1.218178 2.849304080
## fu3                                 2.39915022 1.241385 1.570364523
## fu4                                 2.05775940 1.252819 1.322918711
## fu5                                 2.10802431 1.255328 1.349931744
## fu6                                 1.65830221 1.275056 1.029974056
## fu7                                 1.14804717 1.312607 0.673621336
## fu8                                 1.51184461 1.290518 0.917084502
## fu9                                 0.76587284 1.376367 0.409484495
## agegrp45-59                         1.44229095 1.128767 1.137492583
## agegrp60-74                         1.81153338 1.126757 1.433701327
## agegrp75+                           2.78152106 1.165596 2.059911332
## sexFemale                           0.61658148 1.092417 0.518500871
## year8594Diagnosed 85-94             0.69349480 1.323475 0.400387597
## fu1:year8594Diagnosed 85-94         0.94717591 1.313075 0.555371233
## fu2:year8594Diagnosed 85-94         0.75958505 1.319686 0.441015710
## fu3:year8594Diagnosed 85-94         1.48278208 1.341201 0.834044889
## fu4:year8594Diagnosed 85-94         1.44802943 1.365288 0.786570537
## fu5:year8594Diagnosed 85-94         1.17596314 1.390545 0.616241529
## fu6:year8594Diagnosed 85-94         1.22928344 1.440996 0.600721199
## fu7:year8594Diagnosed 85-94         1.39025317 1.536006 0.599464655
## fu8:year8594Diagnosed 85-94         0.66354855 1.716590 0.230105184
## fu9:year8594Diagnosed 85-94         1.04837181 2.210152 0.221537835
## year8594Diagnosed 85-94:agegrp45-59 0.82239328 1.212205 0.563988399
## year8594Diagnosed 85-94:agegrp60-74 1.06558347 1.202566 0.742290439
## year8594Diagnosed 85-94:agegrp75+   1.43517937 1.238087 0.944308900
## year8594Diagnosed 85-94:sexFemale   0.90008164 1.140720 0.695362176
##                                       CI_upper
## (Intercept)                         0.01900352
## fu1                                 5.39437527
## fu2                                 6.17624924
## fu3                                 3.66534120
## fu4                                 3.20078151
## fu5                                 3.29184531
## fu6                                 2.66993736
## fu7                                 1.95660711
## fu8                                 2.49232662
## fu9                                 1.43243816
## agegrp45-59                         1.82876198
## agegrp60-74                         2.28893783
## agegrp75+                           3.75591865
## sexFemale                           0.73321519
## year8594Diagnosed 85-94             1.20117365
## fu1:year8594Diagnosed 85-94         1.61539192
## fu2:year8594Diagnosed 85-94         1.30827413
## fu3:year8594Diagnosed 85-94         2.63612034
## fu4:year8594Diagnosed 85-94         2.66573579
## fu5:year8594Diagnosed 85-94         2.24407029
## fu6:year8594Diagnosed 85-94         2.51553927
## fu7:year8594Diagnosed 85-94         3.22421658
## fu8:year8594Diagnosed 85-94         1.91345834
## fu9:year8594Diagnosed 85-94         4.96115456
## year8594Diagnosed 85-94:agegrp45-59 1.19919257
## year8594Diagnosed 85-94:agegrp60-74 1.52968174
## year8594Diagnosed 85-94:agegrp75+   2.18121404
## year8594Diagnosed 85-94:sexFemale   1.16507194

Can you fit a single model that reproduces the estimates you obtained from the stratified models?