• Length:
    5 Weeks
  • Effort:
    7–10 hours per week
  • Price:

    FREE
    Add a Verified Certificate for $99 USD

  • Institution
  • Subject:
  • Level:
    Introductory
  • Language:
    English
  • Video Transcript:
    English

Prerequisites

High School Math. Some exposure to computer programming.

About this course

This statistics and data analysis course lays the statistical foundation for our discussion of data science.

You will learn how data scientists exercise statistical thinking in designing data collection, derive insights from visualizing data, obtain supporting evidence for data-based decisions, and construct models for predicting future trends from data.

What you'll learn

  • Data collection, analysis and inference
  • Data classification to identify key traits and customers
  • Conditional Probability: How to judge the probability of an event given certain conditions
  • How to use Bayesian modeling and inference for forecasting and studying public opinion
  • Basics of Linear Regression
  • Data Visualization: How to use data to create compelling graphics
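
As a rough illustration of the conditional probability topic listed above, the short Python sketch below applies Bayes' rule to update the probability of an event after seeing evidence. It is not taken from the course materials: the function name `posterior` and the numbers used (a 1% prior, a 95% hit rate, a 5% false positive rate) are hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(event | evidence) using Bayes' rule.

    prior               -- P(event), before seeing the evidence
    likelihood          -- P(evidence | event)
    false_positive_rate -- P(evidence | no event)
    All inputs here are hypothetical illustration values.
    """
    # Total probability of seeing the evidence at all
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: a rare event (1%) and a fairly accurate signal
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(f"{p:.3f}")  # prints 0.161
```

Even with a fairly accurate signal, the updated probability stays low because the event is rare to begin with, which is the kind of reasoning the conditional probability material is meant to support.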

Week 1 – Introduction to Data Science


Week 2 – Statistical Thinking

  • Examples of Statistical Thinking
  • Numerical Data, Summary Statistics
  • From Population to Sampled Data
  • Different Types of Biases
  • Introduction to Probability
  • Introduction to Statistical Inference 


Week 3 – Statistical Thinking 2

  • Association and Dependence
  • Association and Causation
  • Conditional Probability and Bayes' Rule
  • Simpson's Paradox, Confounding
  • Introduction to Linear Regression
  • Special Regression Models


Week 4 – Exploratory Data Analysis and Visualization

  • Goals of statistical graphics and data visualization
  • Graphs of Data
  • Graphs of Fitted Models
  • Graphs to Check Fitted Models
  • What makes a good graph?
  • Principles of graphics


Week 5 – Introduction to Bayesian Modeling

  • Bayesian inference: combining models and data in a forecasting problem
  • Bayesian hierarchical modeling for studying public opinion
  • Bayesian modeling for Big Data

Meet your instructors

Andrew Gelman
Professor of Statistics and Political Science
Columbia University
David Madigan
Executive Vice President and Dean of Faculty of Arts and Sciences
Columbia University
Lauren Hannah
Assistant Professor in the Department of Statistics
Columbia University
Eva Ascarza
Assistant Professor of Marketing at Columbia Business School
Columbia University
James Curley
Assistant Professor of Psychology
Columbia University
Tian Zheng
Series Creator
Columbia University

Pursue a Verified Certificate ($99.00) to highlight the knowledge and skills you gain

  • Official and Verified

    Receive an instructor-signed certificate with the institution's logo to verify your achievement and increase your job prospects

  • Easily Shareable

    Add the certificate to your CV or resume, or post it directly on LinkedIn

  • Proven Motivator

    Give yourself an additional incentive to complete the course

  • Support our Mission

    EdX, a non-profit, relies on verified certificates to help fund free education for everyone globally