What you will learn
- Demonstrate knowledge of Data Science and Machine Learning
- Apply the Data Science process to a real-life scenario
- Explore the New York City 311 Complaints and Housing datasets
- Analyze and visualize data using Python
- Perform feature engineering using Python
- Build and validate a predictive machine learning model using Python
- Create and share actionable insights for real-life data problems
New Yorkers use the 311 system to report complaints about the non-emergency problems they face. Each complaint is assigned to one of several New York City agencies. The data on these complaints is available through NYC Open Data. The data shows that in the last few years the number of 311 complaints routed to the Department of Housing Preservation and Development in New York City has increased significantly.
In this Capstone project, your task is to answer a set of questions that will help the Department of Housing Preservation and Development in New York City tackle the 311 complaints it receives more effectively. You will use Python together with Data Science and Machine Learning techniques such as Data Ingestion, Data Exploration, Data Visualization, Feature Engineering, Probabilistic Modeling, and Model Validation.
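As a rough illustration of the ingestion and exploration steps named above, the sketch below counts complaints per year for one agency using pandas. It is not taken from the course materials: the column names (`created_date`, `agency`, `complaint_type`), the helper `complaints_per_year`, and the sample rows are all assumptions made up for this example, and the real dataset may be laid out differently.

```python
import pandas as pd

# Made-up sample rows for illustration only; these are NOT real 311 records.
# Column names are assumptions and may not match the actual dataset.
sample = pd.DataFrame(
    {
        "created_date": [
            "2015-01-03",
            "2015-06-11",
            "2016-02-20",
            "2016-02-21",
            "2016-11-05",
        ],
        "agency": ["HPD", "NYPD", "HPD", "HPD", "DOT"],
        "complaint_type": [
            "HEAT/HOT WATER",
            "Noise",
            "PLUMBING",
            "HEAT/HOT WATER",
            "Street Condition",
        ],
    }
)


def complaints_per_year(df, agency):
    """Count complaints per calendar year for a single agency (hypothetical helper)."""
    # Keep only the rows assigned to the requested agency.
    subset = df[df["agency"] == agency].copy()
    # Parse the date strings and extract the calendar year.
    subset["year"] = pd.to_datetime(subset["created_date"]).dt.year
    # One count per year, indexed by year.
    return subset.groupby("year").size()


hpd_counts = complaints_per_year(sample, "HPD")
print(hpd_counts)
```

With real data, the same grouping would presumably be applied to a file loaded with `pd.read_csv`, and the yearly counts could then be plotted to check whether complaint volume is in fact rising.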
By the end of this course, you will have used real-world Data Science tools to create a showcase project that demonstrates to employers that you are job-ready and a worthy candidate in the field of Data Science.
Who can take this course?
Unfortunately, learners from one or more of the following countries or regions will not be able to register for this course: Iran, Cuba and the Crimea region of Ukraine. While edX has sought licenses from the U.S. Office of Foreign Assets Control (OFAC) to offer our courses to learners in these countries and regions, the licenses we have received are not broad enough to allow us to offer this course in all locations. EdX truly regrets that U.S. sanctions prevent us from offering all of our courses to everyone, no matter where they live.
IBM's Python Data Science Professional Certificate
Earn a Professional Certificate in 2-4 months if courses are taken one at a time.
- 10–20 hours of effort
In this course, you will learn how to analyze data in Python using multi-dimensional arrays in NumPy, manipulate DataFrames in pandas, use the SciPy library of mathematical routines, and perform machine learning using scikit-learn!
- Data Science and Machine Learning Capstone Project
- 10–20 hours of effort
Data visualization is the graphical representation of data in order to interactively and efficiently convey insights to clients, customers, and stakeholders in general.
- 20–30 hours of effort
Machine Learning can be an incredibly beneficial tool to uncover hidden insights and predict future trends. This Machine Learning with Python course will give you all the tools you need to get started with supervised and unsupervised learning.
- 2–5 hours of effort
This Python course provides a beginner-friendly introduction to Python for Data Science. Practice through lab exercises, and you'll be ready to create your first Python scripts on your own!