# HarvardX: PH525.1x Data Analysis for Life Sciences 1: Statistics and R

## Instructors

## Course Description

An introduction to basic statistical concepts and R programming skills necessary for analyzing data in the life sciences. We will learn the basics of statistical inference in order to understand and compute p-values and confidence intervals. We will provide examples by programming in R in a way that will help make the connection between concepts and implementation. Problem sets requiring R programming will be used to test understanding and ability to implement basic data analyses. We will use visualization techniques to explore new data sets and determine the most appropriate approach. We will describe robust statistical techniques as alternatives when data do not fit assumptions required by the standard approaches. We will also introduce the basics of using R scripts to conduct reproducible research.

Topics:

- Distributions
- Inference
- Exploratory Data Analysis
- Non-parametric statistics

## Course Syllabus

Course content will be discussed on a weekly basis with the following schedule:

**Week 1: Getting Started**

- Using RStudio
- R programming skills
- Getting organized

**Week 2: Random Variables, Probability Distributions, and the Central Limit Theorem**

- Introduction to random variables
- Introduction to the null distribution
- Probability distributions
- The normal distribution

**Week 3: Inference**

- t-tests
- The Central Limit Theorem
- Association tests
- Monte Carlo methods
- Permutation tests
- Power
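
As a rough illustration of two of the Week 3 topics, the sketch below runs a t-test and a permutation test in R. It is a minimal example under stated assumptions, not course material: the data are simulated, and the names `control`, `treatment`, `null_diffs`, and `perm_p` are hypothetical.

```r
# Simulated data only (an assumption for illustration, not a course data set)
set.seed(1)
control   <- rnorm(12, mean = 24, sd = 3)
treatment <- rnorm(12, mean = 27, sd = 3)

# Welch two-sample t-test: gives a p-value and a 95% confidence interval
tt <- t.test(treatment, control)
tt$p.value
tt$conf.int

# Permutation test: shuffle the pooled values to approximate the
# null distribution of the difference in group means
obs    <- mean(treatment) - mean(control)
pooled <- c(control, treatment)
n_ctrl <- length(control)
null_diffs <- replicate(1000, {
  shuffled <- sample(pooled)
  mean(shuffled[-seq_len(n_ctrl)]) - mean(shuffled[seq_len(n_ctrl)])
})

# Two-sided p-value; adding 1 to numerator and denominator
# avoids reporting a p-value of exactly 0
perm_p <- (sum(abs(null_diffs) >= abs(obs)) + 1) / (length(null_diffs) + 1)
perm_p
```

The t-test relies on a distributional approximation, while the permutation test builds the null distribution directly from the data, so comparing the two p-values is a simple way to see how the approaches relate.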

**Week 4: Exploratory Data Analysis and Robust Summaries**

- Exploratory data analysis
  - Histogram
  - QQ-plot
  - Boxplot
  - Scatterplot
  - Log transformation
- Robust summaries
  - Median, MAD, and Spearman correlation
  - Mann-Whitney-Wilcoxon test
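
To give a feel for why robust summaries matter, here is a small R sketch that compares standard and robust summaries when a single outlier is present. It is only an illustration on simulated data: the outlier values and the variable names (`x`, `y`, `a`, `b`) are assumptions, not taken from the course.

```r
# Simulated data (an assumption for illustration)
set.seed(2)
x <- rnorm(50)
y <- x + rnorm(50, sd = 0.5)

# Corrupt one pair to mimic a single recording error
x[1] <- 20
y[1] <- -20

# Location and spread: mean/SD versus median/MAD
c(mean = mean(x), median = median(x))
c(sd = sd(x), mad = mad(x))

# Association: Pearson versus Spearman (rank-based) correlation
pearson  <- cor(x, y)
spearman <- cor(x, y, method = "spearman")
c(pearson = pearson, spearman = spearman)

# Two-group comparison: t-test versus Mann-Whitney-Wilcoxon test,
# with one extreme value placed in the second group
a <- rnorm(20, mean = 0)
b <- rnorm(20, mean = 1)
b[1] <- 100
t.test(a, b)$p.value
wilcox.test(a, b)$p.value
```

The rank-based and median-based summaries change little when one value is extreme, whereas the mean, standard deviation, and Pearson correlation can shift substantially.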

## Prerequisites

- **Basic programming skills.** We will assume that learners are familiar with very basic programming concepts (variables, functions).
- **Familiarity with the R language.** The course will use R to demonstrate data analyses. In the first week, we will review the R commands you will need in the following weeks, but this is not a comprehensive R course, and we will not go into depth on R syntax. Please see below for online R resources.