Biostatistics III in R

Code by Johan Zetterqvist, Andreas Karlsson and Mark Clements

Exercise 6. Diet data: tabulating incidence rates and modelling with Poisson regression

Show the first six rows of data and a summary of the data-frame:

(a)

Tabulate CHD incidence rates per 1000 person-years for each category of hieng. Calculate (by hand) the ratio of the two incidence rates.

We could also calculate the rates using:

Code for the rate ratios:

(b)

Fit a poisson model to find the incidence rate ratio for the high energy group compared to the low energy group and compare the estimate to the one you obtained in the previous question.

Also write out the regression equation, defining your notation.

(c)

Grouping the values of total energy into just two groups does not tell us much about how the CHD rate changes with total energy. It is a useful exploratory device, but to look more closely we need to group the total energy into perhaps 3 or 4 groups.

In this example we shall use the cut points 1500, 2500, 3000, 4500. Compare with a normal distribution to check if these cutpoints seem reasonable.

(d)

Create a new variable eng3 coded 1500 for values of energy in the range 1500–2499, 2500 for values in the range 2500–2999, and 3000 for values in the range 3000–4500.

(e)

Estimate the rates for different levels of eng3. Calculate (by hand) the ratio of rates in the second and third levels to the first level.

(f)

Create your own indicator variables for the three levels of eng3.

(g)

Check the indicator variables.

(h)

Use poisson to compare the second and third levels with the first.

Compare your estimates with those you obtained in part 6e. Write the linear predictor using pen and paper.

Also write out the regression equation, defining your notation.

(i)

Use a poisson model to compare the first and third levels with the second. Write the linear predictor using pen and paper.

(j)

Repeat the analysis comparing the second and third levels with the first but this time have R create the indicators automatically y using the categorical variable eng3.

(k)

Calculate the total number of events during follow-up, person-time at risk, and the crude incidence rate (per 1000 person-years)