Statistical Thinking for Data Science and Analytics

Provided by Columbia University (ColumbiaX)

$99 USD for graded exams and assignments, plus a certificate

Learn how statistics plays a central role in the data science approach.

Start Date: Apr 22, 2019

Before you start

High School Math. Some exposure to computer programming.

Learning on edX

This course is instructor-paced: plan to complete it within the defined time period.

What you will learn

  • Data collection, analysis and inference
  • Data classification to identify key traits and customers
  • Conditional Probability: How to judge the probability of an event, based on certain conditions
  • How to use Bayesian modeling and inference for forecasting and studying public opinion
  • Basics of Linear Regression
  • Data Visualization: How to use data to create compelling graphics

Week 1 – Introduction to Data Science

Week 2 – Statistical Thinking

  • Examples of Statistical Thinking
  • Numerical Data, Summary Statistics
  • From Population to Sampled Data
  • Different Types of Biases
  • Introduction to Probability
  • Introduction to Statistical Inference 

Week 3 – Statistical Thinking 2

  • Association and Dependence
  • Association and Causation
  • Conditional Probability and Bayes' Rule
  • Simpson's Paradox, Confounding
  • Introduction to Linear Regression
  • Special Regression Models
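
As a rough illustration of the conditional probability and Bayes' rule topic listed above (a sketch, not taken from the course materials), the Python snippet below computes the probability that a condition is present given a positive test result. The function name `posterior` and every number used are hypothetical, chosen only to show the arithmetic.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(condition | positive test) using Bayes' rule.

    All arguments are probabilities in [0, 1]; the names are illustrative
    assumptions, not terminology from the course.
    """
    # Law of total probability: overall chance of a positive result,
    # counting both true positives and false positives.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    # Bayes' rule: P(C | +) = P(+ | C) * P(C) / P(+)
    return sensitivity * prior / p_positive


# Hypothetical screening test: 1% prevalence, 95% sensitivity,
# 5% false positive rate.
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(p, 3))  # 0.161
```

Even with a fairly accurate test, the posterior probability stays low here because the condition is rare, which is the kind of counterintuitive result conditional probability is meant to expose.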

Week 4 – Exploratory Data Analysis and Visualization

  • Goals of statistical graphics and data visualization
  • Graphs of Data
  • Graphs of Fitted Models
  • Graphs to Check Fitted Models
  • What makes a good graph?
  • Principles of graphics

Week 5 – Introduction to Bayesian Modeling

  • Bayesian inference: combining models and data in a forecasting problem
  • Bayesian hierarchical modeling for studying public opinion
  • Bayesian modeling for Big Data


This statistics and data analysis course lays the statistical foundation for our discussion of data science.

You will learn how data scientists exercise statistical thinking in designing data collection, derive insights from visualizing data, obtain supporting evidence for data-based decisions and construct models for predicting future trends from data.
