Biostat III - Survival analysis for epidemiologists in Stata

Welcome to the Biostatistics III in Stata course for spring 2018. This course introduces statistical methods for survival analysis, with emphasis on applying those methods to the analysis of epidemiological cohort studies. Topics covered include methods for estimating patient survival (life table and Kaplan-Meier methods), comparing survival between patient subgroups (log-rank test), and modelling survival (primarily Poisson regression and the Cox proportional hazards model). The course addresses the concept of 'time' as a potential confounder or effect modifier and approaches to defining 'time' (e.g., time since entry, attained age, calendar time). The course will emphasise the basic concepts of statistical modelling in epidemiology, such as controlling for confounding and assessing effect modification.

Link to course description in the official KI course catalogue

Link to the course syllabus


Primary teachers: Anna Johansson, Therese Andersson, Caroline Weibull.
Teaching assistants: Mark Clements, Henrik Olsson, Alex Ploner.

Course director and examiner: Mark Clements.
[Click here for faculty biographies]

Date and Location

Teaching (lectures and supervised practical sessions) will be held in the Wargentin room at the Department of Medical Epidemiology and Biostatistics, Nobelsväg 12A, KI Solna campus. See the schedule for further details and specific locations for each day. A map is available at:


The course grade is based solely on a take-home written examination. The examination will require you to fit models using standard statistical software. Instructions:

  • The examination is individual-based: you are not allowed to cooperate with anyone, although you are encouraged to consult the available literature. The teachers will use Urkund to check for potential plagiarism.
  • The examination will be made available at 17:00 on Wednesday 21 February 2018 and is due by 17:00 on Monday 26 February 2018.
  • The examination will be graded and results will be returned to you by 7 March 2018.
  • Students who do not obtain a passing grade in the first examination will be offered a second examination within 2 months of the final day of the course.
  • Do not write answers by hand: please use Word, LaTeX or a similar format for your examination report.
  • Justify all answers and show all calculations in your examination report, but keep each answer as brief as possible without loss of clarity. Define any notation that you use for equations. The examination report should be written in English.
  • You are expected to write computer code to read the data and for your analysis. Include your computer code in your report. You are encouraged to use R, Stata or SAS for your analysis; if you wish to use other software, please contact Mark Clements.
  • Email the examination report containing the answers as a PDF file to Gunilla Nilsson Roos. Write your name in the email, but do not write your name in the document containing the answers.

Prerequisite Knowledge

Participants are expected to have prerequisite knowledge equivalent to the learning outcomes of the courses Epidemiology I, Biostatistics I and Biostatistics II. We have provided a self-assessment test for you to confirm that you understand the central concepts. We advise all potential applicants to take the test prior to applying for Biostatistics III. If you attempt the test under examination conditions (i.e., without referring to the answers), we suggest interpreting your score as follows:

  1. if you score 70% or more then you possess the required prerequisite knowledge;
  2. if you score at least 40% but less than 70% you should revise the areas where you lost marks;
  3. if you score less than 40% you should, at a minimum, undertake an extensive review of central concepts in statistical modelling and possibly consider studying intermediate-level courses (e.g., Biostatistics II) before taking Biostatistics III.

Knowledge of Stata is assumed

Participants are expected to possess basic knowledge of Stata prior to the start of the course.

For SAS and R users

Although we will be using Stata exclusively during the course, we have made the data sets available in R and SAS formats, and users are welcome to attempt the exercises using R or SAS. SAS is not installed on the PCs in our computer lab, so you will need to bring a laptop if you wish to work with that software during the lab sessions. Please keep in mind that if you choose to work with SAS or R, we expect you to be familiar with that software. The teaching assistants can help with the statistical concepts, and some are experienced R and SAS users.

We have written brief notes on how the methods described during the course can be implemented in SAS (see notes_for_sas_users.pdf).

All files relevant for R users are available from:

All files relevant for SAS users are available from:

All files relevant for Stata users are available from: and

Course language

The course language will be English. All instruction and course materials will be in English.