Biostat III - Survival analysis for epidemiologists

Welcome to Biostatistics III for spring 2016. This course introduces statistical methods for survival analysis with emphasis on the application of such methods to the analysis of epidemiological cohort studies. Topics covered include methods for estimating patient survival (life table and Kaplan-Meier methods), comparing survival between patient subgroups (log-rank test), and modelling survival (primarily Poisson regression and the Cox proportional hazards model). The course addresses the concept of 'time' as a potential confounder or effect modifier and approaches to defining 'time' (e.g., time since entry, attained age, calendar time). The course will emphasise the basic concepts of statistical modelling in epidemiology, such as controlling for confounding and assessing effect modification.

Link to course description in the official KI course catalogue

Link to the course syllabus

Applications for the course should be made on the KI web (Link-to-apply) between 2015-10-15 and 2015-11-16.

The 2016 examination is here. The datasets for the 2016 exam are under Download/exams/2016 or here.


Primary teachers: Mark Clements and Hannah Bower.
Teaching assistants: Andreas Karlsson, Annika Tillander, Bénédicte Delcoigne, Henrik Olsson, Johan Zetterqvist, Xing-Rong Liu.
[Click here for faculty biographies]

Date and Location

Teaching (lectures and supervised practical sessions) will be held at KI Solna campus, primarily at the Department of Medical Epidemiology and Biostatistics, Nobelsväg 12A, KI Solna. A map is available at (then click on 'Contact'). See the schedule for further details and specific locations for each day.


The course grade is based solely on a take-home written examination. This is the first year that Biostatistics III has used a take-home examination. The content of the examination will be similar to that of the previous written examinations; in addition to the learning outcomes assessed in the previous examinations, the take-home examination will also require you to fit models using standard statistical software. Instructions:

  • The examination is individual-based: you are not allowed to cooperate with anyone, although you are encouraged to consult the available literature. The teachers will use Urkund to check for potential plagiarism.
  • The examination will be made available at 09:00 on Wednesday 16 March 2016 and the examination is due by 17:00 on Monday 28 March 2016.
  • The examination will be graded and results will be returned to you by 8 April 2016.
  • Students who do not obtain a passing grade in the first examination will be offered a second examination within 2 months of the final day of the course.
  • Do not write answers by hand: please use Word, LaTeX or a similar format for your examination report.
  • Justify all answers and show all calculations in your examination report, but keep each answer as brief as possible without loss of clarity. Define any notation that you use for equations. The examination report should be written in English.
  • You are expected to write computer code to read the data and for your analysis. Include your computer code in your report. You are encouraged to use Stata, R or SAS for your analysis; if you wish to use other software, please contact Mark Clements.
  • Email the examination report containing the answers as a pdf file to Gunilla Nilsson Roos. Write your name in the email, but do not write your name in the document containing the answers.

Prerequisite Knowledge

Participants are expected to have prerequisite knowledge equivalent to the learning outcomes of the courses Epidemiology I, Biostatistics I and Biostatistics II. We have provided a self-assessment test for you to confirm that you understand the central concepts. We advise all potential applicants to take the test prior to applying for Biostatistics III. If you attempt the test under examination conditions (i.e., without referring to the answers), we suggest interpreting your score as follows:

  1. if you score 70% or more then you possess the required prerequisite knowledge;
  2. if you score at least 40% but less than 70% you should revise the areas where you lost marks;
  3. if you score less than 40% you should, at a minimum, undertake an extensive review of central concepts in statistical modelling and possibly consider studying intermediate-level courses (e.g., Biostatistics II) before taking Biostatistics III.

Knowledge of Stata is assumed

Participants are expected to possess basic knowledge of Stata (e.g., through using Stata in Biostatistics I and Biostatistics II) prior to the start of the course. If you have not used Stata, please work through the Introduction to Stata document and the practice questions in section 3. We will use Stata 14 during the course, although we will not rely on any features that are new in that version. That is, you will be able to do the exercises in Stata versions 11 through 14.

For SAS and R users

Although we will be using Stata exclusively during the course, we have made the data sets available in SAS and R formats, and users are welcome to attempt the exercises using SAS or R. Neither SAS nor R is installed on the PCs in our computer lab, so you will need to bring a laptop if you wish to work with either package during the lab sessions. Please keep in mind that if you choose to work with SAS or R we expect you to be familiar with that software. The teaching assistants can help with the statistical concepts, and many are experienced SAS and R users.

We have written brief notes on how the methods described during the course can be implemented in SAS (see notes_for_sas_users.pdf) and have provided SAS code for reproducing some of the key exercises.

We have completed more exercises in R: see this page.

All files relevant for SAS users are available from:

All files relevant for R users are available from:

Course language

The course language will be English. All instruction and course materials will be in English.