HarvardX: Introduction to Data Science with Python

4.5 stars
19 ratings

Learn the concepts and techniques that make up the foundation of data science and machine learning.

8 weeks
3–4 hours per week
Self-paced
Progress at your own speed
Free
Optional upgrade available

There is one session available:

109,105 already enrolled! After a course session ends, it will be archived.
Starts Mar 27
Ends Oct 23

About this course

Every single minute, computers across the world collect millions of gigabytes of data. What can you do to make sense of this mountain of data? How do data scientists use this data for the applications that power our modern world?

Data science is an ever-evolving field, using algorithms and scientific methods to parse complex data sets. Data scientists use a range of programming languages, such as Python and R, to harness and analyze data. This course focuses on using Python in data science. By the end of the course, you’ll have a fundamental understanding of machine learning models and basic concepts around Machine Learning (ML) and Artificial Intelligence (AI).

Using Python, learners will study regression models (Linear, Multilinear, and Polynomial) and classification models (kNN, Logistic), using popular libraries such as sklearn, Pandas, matplotlib, and NumPy. The course covers key machine learning concepts: picking the right complexity, preventing overfitting, regularization, assessing uncertainty, weighing trade-offs, and model evaluation. Participation in this course will build your confidence in using Python, preparing you for more advanced study in ML and AI and for advancement in your career.
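To give a feel for what "picking the right complexity" and "preventing overfitting" can look like in practice, here is a minimal sketch that compares polynomial regression fits of different degrees using k-fold cross-validation. It is not taken from the course materials: the helper name cv_mse_for_degree, the synthetic data, and the choice of NumPy's polyfit are all assumptions made for illustration.

```python
import numpy as np


def cv_mse_for_degree(x, y, degree, n_folds=5, seed=0):
    """Mean validation MSE of a polynomial fit of the given degree (k-fold CV).

    Hypothetical helper for illustration; not part of any library.
    """
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))
    folds = np.array_split(indices, n_folds)
    errors = []
    for i in range(n_folds):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        # Fit on the training folds only, then score on the held-out fold.
        coeffs = np.polyfit(x[train_idx], y[train_idx], degree)
        preds = np.polyval(coeffs, x[val_idx])
        errors.append(np.mean((preds - y[val_idx]) ** 2))
    return float(np.mean(errors))


# Synthetic data (an assumption for illustration): a quadratic trend plus noise.
rng = np.random.default_rng(42)
x = np.linspace(-3, 3, 60)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=1.0, size=x.size)

# Compare model complexities by their cross-validated error.
scores = {d: cv_mse_for_degree(x, y, d) for d in range(1, 7)}
best_degree = min(scores, key=scores.get)
print(scores)
print("degree with lowest CV error:", best_degree)
```

A degree that is too low underfits the curved trend, while a very high degree tends to chase the noise; the cross-validated error is one common way to weigh that trade-off.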

Learners need a baseline of programming knowledge (preferably in Python) and statistics to succeed in this course. The Python prerequisite can be met with CS50’s Introduction to Programming with Python, and the statistics prerequisite can be met with Fat Chance or with Stat110, offered through HarvardX.

At a glance

  • Language: English
  • Video Transcripts: اَلْعَرَبِيَّةُ, Deutsch, Español, Français, हिन्दी, Bahasa Indonesia, Português, Kiswahili, తెలుగు, Türkçe, 中文
  • Associated skills: Parsing, Python (Programming Language), Pandas (Python Package), Artificial Intelligence, Scikit-learn (Machine Learning Library), R (Programming Language), Machine Learning, NumPy, Algorithms, Data Science, Scientific Methods, Matplotlib

What you'll learn

  • Gain hands-on experience and practice using Python to solve real data science challenges
  • Practice Python programming and coding for modeling, statistics, and storytelling
  • Use popular libraries such as Pandas, NumPy, matplotlib, and sklearn
  • Run basic machine learning models using Python, evaluate how those models are performing, and apply those models to real-world problems
  • Build a foundation for the use of Python in machine learning and artificial intelligence, preparing you for future Python study

Course Outline:

  1. Linear Regression
  2. Multiple and Polynomial Regression
  3. Model Selection and Cross-Validation
  4. Bias, Variance, and Hyperparameters
  5. Classification and Logistic Regression
  6. Multi-logistic Regression and Missingness
  7. Bootstrap, Confidence Intervals, and Hypothesis Testing
  8. Capstone Project
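
As a rough illustration of the bootstrap and confidence-interval topic in the outline above, the sketch below estimates a percentile confidence interval for a sample mean by resampling with replacement. It uses only the Python standard library; the function name bootstrap_ci, the sample values, and the number of resamples are assumptions for illustration, not course material.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic.

    Hypothetical helper for illustration; not part of any library.
    """
    rng = random.Random(seed)
    n = len(sample)
    # Recompute the statistic on many resamples drawn with replacement.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up measurements, used only to exercise the function.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 3.1, 4.6, 3.3]
low, high = bootstrap_ci(data)
print(f"sample mean: {statistics.mean(data):.2f}")
print(f"approx. 95% bootstrap CI: ({low:.2f}, {high:.2f})")
```

The percentile method shown here is the simplest bootstrap interval; other variants exist and may behave better for small or skewed samples.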

Who can take this course?

Unfortunately, learners residing in one or more of the following countries or regions will not be able to register for this course: Iran, Cuba and the Crimea region of Ukraine. While edX has sought licenses from the U.S. Office of Foreign Assets Control (OFAC) to offer our courses to learners in these countries and regions, the licenses we have received are not broad enough to allow us to offer this course in all locations. edX truly regrets that U.S. sanctions prevent us from offering all of our courses to everyone, no matter where they live.

This course is part of the Learning Python for Data Science Professional Certificate Program

Expert instruction
3 skill-building courses
Self-paced
Progress at your own speed
6 months
3 - 6 hours per week
