Welcome to the Biostatistics III in R course for autumn 2021. This course introduces statistical methods for survival analysis with emphasis on the application of such methods to the analysis of epidemiological cohort studies. Topics covered include methods for estimating patient survival (life table and Kaplan-Meier methods), comparing survival between patient subgroups (log-rank test), and modelling survival (primarily Poisson regression, Cox proportional hazards model and flexible parametric models). The course addresses the concept of 'time' as a potential confounder or effect modifier and approaches to defining 'time' (e.g., time since entry, attained age, calendar time). The course will emphasise the basic concepts of statistical modelling in epidemiology, such as controlling for confounding and assessing effect modification.

Link to course description in the official KI course catalogue

Link to the course syllabus


Teachers: Alexander Ploner, Xinhe Mao, Benjamin Christoffersen, Joshua Entrop, Mark Clements, Anna Johansson.

Course director and examiner: Mark Clements.
[Click here for faculty biographies]

Date and Location

Teaching (lectures and supervised practical sessions) will be held online, primarily using Zoom. See the schedule for further details.


The course grade is based solely on a take-home written examination. The exam requires you to understand the concepts of survival analysis and interpret output from standard statistical software. Instructions:

  • The examination is individual: you are not allowed to cooperate with anyone, although you are encouraged to consult the available literature. The teachers will use Urkund to check for potential plagiarism.
  • The examination will be made available at 12:00 on Wednesday 18 November 2020 and is due by 17:00 on Wednesday 25 November 2020.
  • The examination will be graded and results will be returned to you by 4 December 2020.
  • Students who do not obtain a passing grade in the first examination will be offered a second examination within 2 months of the final day of the course.
  • Do not write answers by hand: please use Word, R markdown, LaTeX or a similar format for your examination report.
  • Justify all answers and show all calculations in your examination report, but keep each answer as concise as possible without loss of clarity. Define any notation that you use for equations. The examination report should be written in English.
  • Email the examination report containing the answers as a PDF file to Gunilla Nilsson Roos. Write your name in the email, but do not write your name in the document containing the answers.

Prerequisite Knowledge

Participants are expected to have prerequisite knowledge equivalent to the learning outcomes of the courses Epidemiology I, Biostatistics I and Biostatistics II. We have provided a self-assessment test for you to confirm that you understand the central concepts. We advise all potential applicants to take the test prior to applying for Biostatistics III. If you attempt the test under examination conditions (i.e., without referring to the answers) we would recommend:

  1. if you score 70% or more then you possess the required prerequisite knowledge;
  2. if you score at least 40% but less than 70% you should revise the areas where you lost marks;
  3. if you score less than 40% you should, at a minimum, undertake an extensive review of central concepts in statistical modelling and possibly consider studying intermediate-level courses (e.g., Biostatistics II) before taking Biostatistics III.

Knowledge of R is assumed

Participants are expected to possess basic knowledge of R prior to the start of the course. We will use R versions 3.6.3 and 4.0 during the course. There may be issues with installing some software with version 3 of R; for example, the foreign package on CRAN by default now needs R version 4 (see here for a solution).
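A quick way to see which side of the version 3 / version 4 divide your installation falls on is to compare the running R version against 4.0.0. The following is only a minimal sketch: the helper name r_is_at_least() is hypothetical and is not part of the course materials, and the 4.0.0 threshold is taken from the note above about the foreign package.

```r
# Minimal sketch (hypothetical helper, not part of the course materials):
# report whether the running R is at least a given version.
r_is_at_least <- function(minimum = "4.0.0") {
  # getRversion() returns a numeric_version, which can be compared
  # directly with a version string.
  getRversion() >= minimum
}

if (r_is_at_least("4.0.0")) {
  message("Running R ", getRversion(),
          ": packages that require R 4 can be installed.")
} else {
  message("Running R ", getRversion(),
          ": packages that require R 4 (e.g. the current foreign) ",
          "will not install without a workaround.")
}
```

If the second message appears, the options are to upgrade R or to follow the workaround linked above.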

R, Stata and SAS users

Although we will be using R exclusively during the course, we have made the data sets available in Stata and SAS formats, and users are welcome to attempt the exercises using Stata or SAS. Please keep in mind that if you choose to work with Stata or SAS, we expect you to be familiar with that software. The teaching assistants can help with the statistical concepts, and some of them are experienced Stata and SAS users.

We have written brief notes on how the methods described during the course can be implemented in SAS (see notes_for_sas_users.pdf).

All files relevant for R users are available from:
http://biostat3.net/download/index.php?dir=R/, with the computing labs available from http://biostat3.net/download/R/

All files relevant for SAS users are available from:

All files relevant for Stata users are available from:
http://biostat3.net/download/do_files and http://biostat3.net/download/Data

Course language

The course language will be English. All instruction and course materials will be in English.