  • Length:
    15 Weeks
  • Effort:
    6–9 hours per week
  • Price:

    Add a Verified Certificate for $750 USD

  • Institution
  • Subject:
  • Level:
  • Language:
  • Video Transcript:
  • Course Type:
    Instructor-led on a course schedule


Undergraduate calculus and basic knowledge of statistical analysis.

About this course

Low-cost, ubiquitous sensors in city infrastructure provide highly granular data at unprecedented spatiotemporal scales. “Smart Cities” aim to use this data to provide a healthy, happy and sustainable urban ecosystem by integrating information and communication technology (ICT), the Internet of Things (IoT) and citizen participation to manage city infrastructure and services effectively. “Data Science” is an interdisciplinary field of scientific methods, processes, algorithms and systems for extracting knowledge from data in various forms; it provides a fast and efficient understanding of the current dynamics of cities and of ways to improve their services. This course introduces scientific techniques for the analysis, inference and prediction of large-scale data present in city networks (e.g. GPS vehicular data, social media data, mobile phone data and individual social network data). It presents the basics of the data science methods used to analyze these datasets, focusing on both the methods and their application to smart-city problems. Python is used to demonstrate the application of each method on datasets available to the instructor. Examples of problems discussed include ridesharing platforms, smart and energy-efficient buildings, evacuation modeling, decision making during extreme events, and urban resilience.

What you'll learn

  • Classify the different types of data generated by smart cities.
  • Apply the basics of various data mining techniques.
  • Match the appropriate data mining tool to each smart city application.
  • Code and apply data mining algorithms in Python to solve problems.
  • Interpret the results from the data mining tools and connect them to policy making for smart city applications.

Unit 1. Introduction to Data Mining

Week 1: Introduction to the course and syllabus, Review of statistical methods

Instructor introduction, introduction to data mining, course overview, student introductions, introduction to statistical methods, modeling uncertainty, random variables, populations and samples, and statistical inference.

Week 2: Optimization, Data pre-processing

Introduction to optimization, basic optimization concepts, optimization problem formulation, optimization algorithms, data and measurement, types of datasets, data quality, data pre-processing, and task identification.

Week 3: Project discussion/Introduction to Python

Introduction to Python, Python for data mining, optimization using Python, and data pre-processing using Python.

Unit 2. Data Mining Tasks

Week 4: Regression analysis, Association rule mining

Introduction to regression analysis, linear regression, logistic regression models, Poisson regression models, applications of regression analysis to smart cities, introduction to association rule mining, association rule mining applications to urban systems, and association rule mining approaches.
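
As an illustration of the linear regression topic listed above, here is a minimal sketch of simple (one-variable) least-squares regression in plain Python. It is not taken from the course materials: the function name, the variable names and the temperature/energy numbers are all made up for this example.

```python
def fit_simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: outdoor temperature (deg C) vs. building energy use (kWh).
temperature_c = [10.0, 15.0, 20.0, 25.0, 30.0]
energy_kwh = [120.0, 135.0, 150.0, 165.0, 180.0]

slope, intercept = fit_simple_linear_regression(temperature_c, energy_kwh)
predicted_at_22 = intercept + slope * 22.0
```

In practice a library such as scikit-learn or statsmodels would normally be used; the closed-form version is shown only to make the computation explicit.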

Week 5: Association rule mining, Statistical classification

Apriori algorithm, FP-growth algorithm, ECLAT, evaluation methods, introduction to the classification problem, logistic regression, Naïve Bayes classifier, and Bayesian network classifier.
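
The level-wise idea behind the Apriori algorithm named above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the course's implementation: the function name `apriori_frequent_itemsets` and the travel-mode transactions are hypothetical.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        raise ValueError("at least one transaction is required")

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: build size-k candidates from frequent (k-1)-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical data: travel modes used together on four multi-leg trips.
trips = [
    {"bus", "metro"},
    {"bus", "metro", "bike"},
    {"bus", "bike"},
    {"metro"},
]
frequent = apriori_frequent_itemsets(trips, min_support=0.5)
```

The prune step is what distinguishes Apriori from brute-force counting: a candidate is discarded without scanning the data if any of its subsets is already known to be infrequent.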

Weeks 6 and 7: Decision tree, Support vector machines

Introduction to decision trees, decision tree training, decision tree algorithms, practical issues with decision trees, introduction to support vector machines, support vector machines, ensemble classifiers, and classifier performance evaluation.

Weeks 8 and 9: Introduction to data clustering, Clustering algorithms: Partitional and Hierarchical

Introduction to data clustering, (dis)similarity measures, distribution (model)-based clustering algorithms, types of clustering algorithms, partitional clustering (k-means and its variants), and hierarchical clustering.
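
For the partitional clustering topic above, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It assumes the initial centroids are supplied by the caller so that the run is deterministic; the function name and the sample coordinates are invented for illustration.

```python
def kmeans(points, initial_centroids, max_iter=100):
    """Cluster points with Lloyd's algorithm; returns (centroids, labels)."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = []
        for p in points:
            sq_dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels.append(sq_dists.index(min(sq_dists)))
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels


# Hypothetical data: (x, y) locations of pick-up requests in two areas.
pickups = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels = kmeans(pickups, initial_centroids=[(0, 0), (10, 10)])
```

Real applications usually rely on a library implementation with smarter initialization (for example k-means++), since the result depends on the starting centroids.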

Week 10: Other clustering approaches, Anomaly detection

Density-based clustering algorithms, cluster validity, characteristics of “data, clusters, and clustering algorithms”, introduction to anomaly detection, the anomaly detection problem, and anomaly detection techniques.
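
One of the simplest anomaly detection techniques is a z-score rule, sketched below with Python's standard `statistics` module. The syllabus does not say which techniques are covered, so this is only an assumed example; the threshold of 2.0 and the sensor readings are arbitrary choices made for illustration.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, threshold=2.0):
    """Return indices of values whose absolute z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical data: vehicles per minute counted by a roadside sensor.
readings = [10, 11, 9, 10, 12, 10, 50]
anomalous_indices = zscore_anomalies(readings)
```

Because the mean and standard deviation are themselves affected by outliers, robust variants based on the median are often preferred on small samples.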

Week 11: Review of materials and project check-up, Avoiding false discoveries

Review of materials and project check-up, review of a real-world application, introduction to avoiding false discoveries, statistical significance testing, hypothesis testing, and multiple hypothesis testing.
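
To illustrate why multiple hypothesis testing needs special care, the sketch below applies two standard corrections to a list of p-values: Bonferroni (controls the family-wise error rate) and Benjamini-Hochberg (controls the false discovery rate). The syllabus does not name specific procedures, so these are assumed examples, and the p-values are made up.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H0 for p-values at or below alpha divided by the number of tests."""
    m = len(p_values)
    if m == 0:
        return []
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Reject H0 for the k smallest p-values, where k is the largest rank
    whose sorted p-value satisfies p_(k) <= (k / m) * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_rank = rank
    reject = [False] * m
    for idx in order[:largest_rank]:
        reject[idx] = True
    return reject


# Hypothetical p-values from five separate tests on city sensor data.
p_values = [0.001, 0.012, 0.025, 0.045, 0.2]
bonferroni = bonferroni_reject(p_values)
benjamini_hochberg = benjamini_hochberg_reject(p_values)
```

With these numbers an uncorrected 0.05 cut-off would flag four results, Bonferroni keeps one and Benjamini-Hochberg keeps three, which shows the trade-off between strictness and power.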

Unit 3. Advanced Data Mining Techniques

Week 12: Neural networks, Deep learning

Introduction to neural networks, a neuron model, learning an ANN model, multi-layer feed-forward ANNs, ANN application to land use prediction, introduction to deep learning, deep learning, and deep learning for smart cities.

Week 13: Self-organizing maps (SOM), Hidden Markov models (HMMs)

Introduction to self-organizing maps, SOM components & training, SOM application for transportation, introduction to HMMs, Markov process, graphical models, discrete-state HMMs, and applications of HMMs to Smart Cities.
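
For discrete-state HMMs, a core computation is the forward algorithm, which gives the probability of an observation sequence by summing over all hidden state paths. The sketch below is a minimal plain-Python version; the two traffic states, the speed observations and every probability in it are hypothetical values chosen for illustration.

```python
def forward_likelihood(observations, states, start_prob, trans_prob, emit_prob):
    """Probability of the observation sequence under a discrete-state HMM."""
    if not observations:
        return 1.0  # the empty sequence has probability 1
    # Initialization: start in each state and emit the first observation.
    alpha = {s: start_prob[s] * emit_prob[s][observations[0]] for s in states}
    # Recursion: move one step forward, then emit the next observation.
    for obs in observations[1:]:
        alpha = {
            s: sum(alpha[prev] * trans_prob[prev][s] for prev in states)
            * emit_prob[s][obs]
            for s in states
        }
    # Termination: sum over the possible final states.
    return sum(alpha.values())


# Hypothetical model: hidden traffic state, observed speed category.
states = ["free_flow", "congested"]
start_prob = {"free_flow": 0.6, "congested": 0.4}
trans_prob = {
    "free_flow": {"free_flow": 0.7, "congested": 0.3},
    "congested": {"free_flow": 0.4, "congested": 0.6},
}
emit_prob = {
    "free_flow": {"fast": 0.9, "slow": 0.1},
    "congested": {"fast": 0.2, "slow": 0.8},
}

likelihood = forward_likelihood(
    ["fast", "slow"], states, start_prob, trans_prob, emit_prob
)
```

For long sequences the probabilities become very small, so practical implementations work with scaled values or log-probabilities instead of the raw products used here.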

Week 14: Case studies of Data Science applications for Smart Cities

Week 15: Virtual Exam and Project submission

Meet your instructors

Satish Ukkusuri
Data Science for Smart Cities
Purdue University

Who can take this course?

Unfortunately, learners from one or more of the following countries or regions will not be able to register for this course: Iran, Cuba and the Crimea region of Ukraine. While edX has sought licenses from the U.S. Office of Foreign Assets Control (OFAC) to offer our courses to learners in these countries and regions, the licenses we have received are not broad enough to allow us to offer this course in all locations. EdX truly regrets that U.S. sanctions prevent us from offering all of our courses to everyone, no matter where they live.