This course introduces fundamental statistical concepts and tools relevant to the analysis of high-dimensional genomics data arising from population-based association studies. A first course in statistics is assumed.
The course introduces advanced central topics in biostatistics and health data science, including survival analysis, design and analysis of clinical trials, models for correlated data, Bayesian modeling, and causal inference. The course motivates statistical reasoning and methods through substantive research questions and features of data typically available in public health and biomedical research. Students will obtain hands-on experience in applying selected methods to real data using the statistical programming language R.
This course will focus on developing 1) a disciplined approach to planning, conducting, and verifying analysis programs, including those that produce analysis datasets, via a range of joining and derivation strategies; and 2) a purpose-driven foundation in SAS programming, supported by web-based resources, that gives students the tools to keep building their SAS skills as new purposes (such as new analyses or data handling challenges) arise during their careers.