Exercise 8. Diet data: Using Poisson regression to study the effect of energy intake adjusting for confounders on two different timescales

Use Poisson regression to study the association between energy intake (hieng) and CHD, adjusted for potential confounders (job, BMI). We know that people who expend a lot of energy (i.e., are physically active) require a higher energy intake. We do not have data on physical activity, but we hope that occupation (job) will serve as a surrogate measure of work-time physical activity (conductors on London double-decker buses expend energy walking up and down the stairs all day).

Fit models without adjusting for ‘time’, then adjusting for attained age (you will need to split the data) and for time-since-entry, and compare the results.


Load the diet data using time-on-study as the timescale.

You may have to install the required packages the first time you use them. You can install a package by running install.packages("package_of_interest") for each package you require.

Load diet data and explore it.

Rates can be modelled on different timescales, e.g., attained age, time-since-entry, calendar time. Plot the CHD incidence rates both by attained age and by time-since-entry. Is there a difference? Do the same for the CHD hazard by level of energy intake (hieng).

(b)

Fit a Poisson model to estimate the incidence rate ratio for the high energy group compared to the low energy group, without adjusting for any timescale.
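As a rough illustration of the idea (not the exercise solution), the sketch below fits an unadjusted Poisson model with a log person-time offset to a small made-up data frame. The data frame toy and its values are hypothetical; only the variable names chd, y and hieng are chosen to resemble the diet data, and it is an assumption that y holds person-years at risk.

```r
# Toy data (hypothetical values): one row per subject with an event
# indicator, person-years at risk, and a binary exposure.
toy <- data.frame(
  chd   = c(1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0),
  y     = c(10, 12, 8, 15, 9, 11, 14, 13, 10, 12, 16, 9),
  hieng = factor(rep(c("low", "high"), each = 6), levels = c("low", "high"))
)

# Poisson regression: the offset log(y) turns the model for counts
# into a model for rates (events per person-year).
fit <- glm(chd ~ hieng + offset(log(y)), family = poisson, data = toy)

# Exponentiated coefficient = incidence rate ratio, high vs low,
# with a Wald-type 95% confidence interval.
irr <- exp(coef(fit))["hienghigh"]
ci  <- exp(confint.default(fit))["hienghigh", ]

# Crude rates (events / person-years) by group, for comparison.
crude <- with(toy, tapply(chd, hieng, sum) / tapply(y, hieng, sum))

print(irr)
print(ci)
print(crude["high"] / crude["low"])
```

With a single binary covariate and no other terms, the model-based rate ratio should agree with the ratio of the crude rates, which is a useful sanity check before adding confounders.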

(d)

Now fit the model for CHD, both without and with the adjustment for job and bmi. Is the effect of hieng on CHD confounded by age, BMI or job? Write the linear predictors using pen and paper.

First, adjust for the timescale attained age. To do this in Poisson regression you must split the data on the age timescale, so that each subject contributes one record for each age band in which they are at risk. After splitting, the risktime variable contains the correct amount of risk time for each time band.
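One possible way to split follow-up on the age timescale is survSplit() from the survival package; the sketch below applies it to a tiny made-up cohort. The data frame toy, the column names age_entry and age_exit, and the cut points 50 and 60 are all assumptions for illustration, not values taken from the diet data.

```r
library(survival)

# Toy cohort (hypothetical): age at entry and exit in years, event indicator.
toy <- data.frame(
  id        = 1:4,
  age_entry = c(45, 52, 58, 61),
  age_exit  = c(56, 63, 66, 70),
  chd       = c(0, 1, 0, 1)
)

# Split each subject's follow-up at ages 50 and 60. Each output row covers
# the interval (tstart, tstop]; the event is kept only on the row in which
# the follow-up actually ends.
split <- survSplit(Surv(age_entry, age_exit, chd) ~ ., data = toy,
                   cut = c(50, 60),
                   start = "tstart", end = "tstop", event = "chd")

# Person-time contributed within each age band.
split$risktime <- split$tstop - split$tstart

print(split)
```

Splitting should leave the total person-time and the total number of events unchanged; it only redistributes them across age bands.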

Fitting the model for CHD, without adjustment for job and bmi.

Fitting the model for CHD, with adjustment for job and bmi.
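To make the comparison concrete, here is a minimal sketch on simulated data: follow-up is split into age bands, and Poisson models are then fitted with and without the extra confounders. Everything about the data (toy, the simulated values, the job labels, the cut points 50 and 60) is hypothetical, and survSplit() from the survival package is just one way to do the splitting.

```r
library(survival)

# Simulated cohort (hypothetical values, for illustration only).
set.seed(1)
n <- 300
toy <- data.frame(
  id        = seq_len(n),
  age_entry = runif(n, 40, 60),
  hieng     = factor(sample(c("low", "high"), n, replace = TRUE),
                     levels = c("low", "high")),
  job       = factor(sample(c("driver", "conductor", "bank"), n, replace = TRUE)),
  bmi       = rnorm(n, 25, 3)
)
followup     <- pmin(rexp(n, rate = 0.03), 15)  # administrative censoring at 15 years
toy$chd      <- as.integer(followup < 15)
toy$age_exit <- toy$age_entry + followup

# Split follow-up on attained age and compute risk time per age band.
spl <- survSplit(Surv(age_entry, age_exit, chd) ~ ., data = toy,
                 cut = c(50, 60),
                 start = "tstart", end = "tstop", event = "chd")
spl$ageband  <- cut(spl$tstop, breaks = c(40, 50, 60, 75))
spl$risktime <- spl$tstop - spl$tstart

# Model 1: exposure plus age band (rate assumed constant within each band).
fit_age <- glm(chd ~ hieng + ageband + offset(log(risktime)),
               family = poisson, data = spl)

# Model 2: additionally adjusted for job and bmi.
fit_full <- update(fit_age, . ~ . + job + bmi)

# Incidence rate ratios for high vs low energy intake from both models.
print(exp(coef(fit_age))["hienghigh"])
print(exp(coef(fit_full))["hienghigh"])
```

Because the simulated exposure is unrelated to the outcome and to the other covariates, the two rate ratios here carry no substantive meaning; the point is only the mechanics of comparing the estimate before and after adjustment.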

The effect of high energy intake is somewhat confounded by age, and also by job and BMI. What assumption is being made about the shape of the baseline hazard (HINT: the baseline hazard takes the shape of the timescale)?