PurdueX: Data Science for Smart Cities

Learn scientific techniques for the analysis, inference and prediction of large-scale data (e.g. GPS vehicular data, social media data, mobile phone data, individual social network data) present in city networks.

16 weeks
6–9 hours per week
Instructor-paced
Instructor-led on a course schedule
This course is archived

About this course

The availability of low-cost, ubiquitous sensors in city infrastructure provides highly granular data at unprecedented spatiotemporal scales. “Smart Cities” aim to use this data to provide a healthy, happy and sustainable urban ecosystem by integrating information and communication technology (ICT), the Internet of Things (IoT) and citizen participation to manage and use city infrastructure and services effectively. “Data Science” is an interdisciplinary field of scientific methods, processes, algorithms and systems for extracting knowledge from data in various forms; it offers a fast and efficient way to understand the current dynamics of cities and to improve different services.

This course introduces scientific techniques for the analysis, inference and prediction of large-scale data (e.g. GPS vehicular data, social media data, mobile phone data, individual social network data) present in city networks. It presents the basics of the data science methods used to analyze these datasets and focuses both on the methods and on their application to smart-city problems. Python is used to demonstrate the application of each method on datasets available to the instructor. Example problems include ridesharing platforms, smart and energy-efficient buildings, evacuation modeling, decision making during extreme events, and urban resilience.

At a glance

  • Institution: PurdueX
  • Subject: Engineering
  • Level: Intermediate
  • Prerequisites: Undergraduate calculus and basic knowledge of statistical analysis.

  • Language: English
  • Video Transcript: English
  • Associated skills: Social Media, Scientific Methods, Information And Communications Technology, Decision Making, Internet Of Things (IoT), Energy-Efficient Buildings, Python (Programming Language), Infrastructure, Forecasting, Social Networks, Data Science, Algorithms

What you'll learn

  • Classify the different types of data generated by smart cities.
  • Apply the basics of various data mining techniques.
  • Match the appropriate data mining tool to various smart city applications.
  • Code and apply data mining algorithms using Python.
  • Interpret the results from the data mining tools and connect them to policy making for smart city applications.

Unit 1. Introduction to Data Mining

Week 1: Introduction to the Course & Syllabus, Review of Statistical Methods

Instructor introduction, introduction to data mining, course overview, student introduction, introduction of statistical methods, modeling uncertainty, random variables, population and samples, and statistical inference

Week 2: Optimization, Data Pre-Processing

Introduction to optimization, optimization-basic concepts, optimization problem formulation, optimization algorithms, data and measurement, types of datasets, data quality, data pre-processing, and task identification

Week 3: Project Discussion/Introduction to Python

Introduction to Python, Python for data mining, optimization using Python, and data pre-processing using Python

Unit 2. Data Mining Tasks

Week 4: Regression Analysis, Association Rule Mining

Introduction to regression analysis, linear regression, logistic regression models, Poisson regression models, applications of regression analysis to smart cities, introduction to association rule mining, association rule mining applications to urban systems, and association rule mining approaches

Week 5: Association Rule Mining, Statistical Classification

Apriori algorithm, FP-growth algorithm, ECLAT, evaluation methods, introduction to the classification problem, logistic regression, Naïve Bayes classifier, and Bayesian network classifier
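
As a rough illustration of the evaluation measures behind association rule mining, the sketch below computes support and confidence and lists frequent item pairs by brute force. It is not course material: the transactions (transport modes used together on one trip) and the function names are hypothetical, and a real Apriori implementation would prune candidates level by level rather than enumerate every pair.

```python
from itertools import combinations

# Hypothetical transactions: transport modes used together on one trip.
transactions = [
    {"bus", "metro"},
    {"bus", "bike"},
    {"bus", "metro", "bike"},
    {"metro"},
    {"bus", "metro"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def frequent_pairs(transactions, min_support):
    """Brute-force search for item pairs whose support meets min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        s = support(pair, transactions)
        if s >= min_support:
            result[pair] = s
    return result


if __name__ == "__main__":
    print(frequent_pairs(transactions, min_support=0.4))
    print(confidence({"bus"}, {"metro"}, transactions))
```

A rule such as "bus implies metro" is usually kept only when both its support and its confidence exceed chosen thresholds; the threshold of 0.4 here is arbitrary.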

Weeks 6 and 7: Decision Tree, Support Vector Machines

Introduction to decision trees, decision tree training, decision tree algorithms, practical issues with decision trees, introduction to support vector machines, support vector machines, ensemble classifiers, and classifier performance evaluation

Weeks 8 and 10: Introduction to Data Clustering, Clustering Algorithms: Partitional and Hierarchical

Introduction to data clustering, (dis)similarity measures, distribution (model)-based clustering algorithms, types of clustering algorithms, partitional clustering (k-means and its variants), and hierarchical clustering
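
For a sense of what partitional clustering involves, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is an illustration only, not the course's implementation: the function names, the random initialization, and the toy 2-D points are assumptions, and the dissimilarity measure is squared Euclidean distance.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as a start
    for _ in range(iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centroids.append((mean_x, mean_y))
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # centroids stopped moving, so the assignment is stable
        centroids = new_centroids
    return centroids, assign(points, centroids)


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of locations.
    points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels = kmeans(points, k=2)
    print(centroids, labels)
```

k-means can settle on different local optima depending on the starting centroids, which is one reason variants and cluster validity measures matter in practice.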

Week 11: Other Clustering Approaches

Density-based clustering algorithms, cluster validity, and characteristics of data, clusters, and clustering algorithms

Unit 3. Advanced Data Mining Techniques

Week 12: Neural Networks

Introduction to neural networks, a neuron model, learning an ANN model, multi-layer feed-forward ANNs, and ANN application to land use prediction

Week 13: Deep Learning

Introduction to deep learning, deep learning, and deep learning for smart cities

Week 14: Case studies of Data Science Applications for Smart Cities

Week 15: Virtual Exam and Project Submission

Who can take this course?

Unfortunately, learners residing in one or more of the following countries or regions will not be able to register for this course: Iran, Cuba and the Crimea region of Ukraine. While edX has sought licenses from the U.S. Office of Foreign Assets Control (OFAC) to offer our courses to learners in these countries and regions, the licenses we have received are not broad enough to allow us to offer this course in all locations. edX truly regrets that U.S. sanctions prevent us from offering all of our courses to everyone, no matter where they live.
