Data Science for Construction, Architecture and Engineering
This course introduces data science skills targeting applications in the design, construction, and operations of buildings. You will learn practical coding within this context with an emphasis on basic Python programming and the Pandas library.
About this course
The building industry is exploding with data sources that affect the energy performance of the built environment and the health and well-being of occupants. Spreadsheets just don’t cut it anymore as the sole analytics tool for professionals in this field. Mainstream data science courses can teach skills such as programming and statistics, but they lack the applied context of buildings, which is the most important part for beginners.
This course focuses on the development of data science skills for professionals specifically in the built environment sector. It targets architects, engineers, construction and facilities managers with little or no previous programming experience. An introduction to data science skills is given in the context of the building life cycle phases. Participants will use large, open data sets from the design, construction, and operations of buildings to learn and practice data science techniques.
Essentially, this course is designed to add new tools and skills to supplement spreadsheets. Major technical topics include data loading, processing, visualization, and basic machine learning using the Python programming language, the Pandas data analytics library, the scikit-learn machine learning library, and the web-based Colaboratory environment. In addition, the course will provide numerous learning paths for various built environment-related tasks to facilitate further growth.
What you'll learn
- Why data science is important for the built environment
- Why building industry professionals should learn how to code
- A jump start in the Python Programming Language
- Overview of the Pandas data analysis library
- Guidance in the loading, processing, and merging of data
- Visualization of data from buildings
- Basic machine learning concepts applied to building data
- Examples of parametric analysis for the integrated design process
- Examples of how to process time-series data from IoT sensors
- Examples of analysis of thermal comfort data from occupants
- Numerous starting points for using data science in other building-related tasks
Section 1: Introduction to Course and Python Fundamentals – In this introduction, an overview of key Python concepts is covered as well as the motivating factors for building industry professionals to learn to code. The NZEB at the NUS School of Design and Environment is introduced as an example of a building that uses various data science-related technologies in its design, construction, and operations.
Section 2: Introduction to the Pandas Data Analytics Library and Design Phase Application Example – The foundational functions of Pandas are demonstrated in the context of the integrated design process through the processing of data from parametric EnergyPlus models. Further learning paths are introduced for the Design Phase, including building information modeling (BIM) using Revit or Rhino, spatial analytics, and Python libraries for building performance modeling.
Section 3: Pandas Analysis of Time-Series Data from IoT and Construction Phase Application Example – Pandas time-series functions are demonstrated in the Construction Phase through the analysis of hourly IoT data from electrical energy meters. Further learning paths are introduced for the Construction Phase, including project management, building management system (BMS) data analysis, and digital construction such as robotic fabrication.
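As a rough illustration of the kind of time-series operation this section covers, the sketch below resamples hourly meter readings into daily totals and peaks. The readings are made-up values rather than course data, and the series name `kwh` is an assumption for illustration only.

```python
import pandas as pd

# Hypothetical hourly readings (kWh) from one electrical meter over two days.
# These values are invented for illustration; they are not course data.
index = pd.date_range("2021-01-01", periods=48, freq=pd.Timedelta(hours=1))
readings = pd.Series([1.0] * 24 + [2.0] * 24, index=index, name="kwh")

# Aggregate the hourly values into one value per calendar day.
daily_total = readings.resample("D").sum()
daily_peak = readings.resample("D").max()

print(daily_total)
print(daily_peak)
```

Resampling by calendar day is only one option; the same pattern applies to weekly or monthly aggregation by changing the frequency string.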
Section 4: Statistics and Visualization Basics and Operations Phase Application Example – Various statistical aggregations and visualization techniques using Pandas and the Seaborn library are demonstrated on Operations Phase occupant comfort data from the ASHRAE Thermal Comfort Database II. Further learning paths are introduced for the Operations Phase, including energy auditing, IoT analysis, occupant detection, and reinforcement learning.
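A statistical aggregation of the sort described here might look like the following minimal sketch. The column names and values are hypothetical and do not reflect the actual schema of the ASHRAE Thermal Comfort Database II.

```python
import pandas as pd

# Hypothetical occupant comfort survey rows; the column names are
# assumptions for illustration, not the real database schema.
comfort = pd.DataFrame(
    {
        "building_type": ["office", "office", "classroom", "classroom"],
        "air_temp_c": [23.0, 25.0, 26.0, 28.0],
        "thermal_sensation": [-1.0, 1.0, 0.0, 2.0],
    }
)

# Mean air temperature and mean sensation vote for each building type.
summary = comfort.groupby("building_type")[["air_temp_c", "thermal_sensation"]].mean()

print(summary)
```

A grouped table like `summary` is also a convenient input for plotting libraries, since each row already holds one aggregated value per category.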
Section 5: Introduction to Machine Learning for the Built Environment – This concluding section gives an overview of the motivations and opportunities for the use of prediction in the built environment. Prediction, classification, and clustering using the scikit-learn library are demonstrated on electrical meter and occupant comfort data. The course concludes with suggestions for more in-depth Python, data science, and statistics courses on edX.
Development of this curriculum was led by Dr. Clayton Miller with support from NUS students Ananya Joshi, Charlene Tan, Chun Fu, James Zhan, Mahmoud Abdelrahman, Matias Quintana, Miguel Martin, and Vanessa Neo.
Learner testimonials
“The course broadened my understanding of how data science fundamentals could be integrated into my workflow as a building systems engineer. Working the in-class examples with real building data solidified the concepts in my mind, and the resources provided at the end of each module inspired me to learn more.” – Holly Brink, Senior Engineer at Arup in San Francisco, California, USA
“The overall content of this course is fabulous and I would recommend all the folks across the building industry taking this course. It not only taught me how data science could be implemented in the building science industry with real-world application but also inspired me to systematically increase my day-to-day work efficiency using the data analytical skills.” – Te Qi, Senior Environmental Designer at Atelier Ten in San Francisco, California, USA
“This course does a great job of blending all the basics of data science into a built environment narrative. Although it is concise, excellent references to platforms and learning material are provided when the depth of certain topics extends beyond the scope of the course. The career path overview makes this content especially useful for students to get an idea of how these skills are used in real-world applications. All in all, it is hard to find a more tailor-made course for the AEC industry!” – Justin Zarb, Engineer at BuroHappold Engineering in Berlin, Germany
“Absolutely amazing! Not only is the content extremely relevant for my job, it's also presented in a way that makes even the most difficult concepts understandable. Amazing team behind the course - always ready to answer any question on the discussion forum. Exercises are challenging but doable - just as they should be. I'm going to recommend this course to anyone who mentions, or even thinks of mentioning, the word 'data'.” – Timo Harboe Nielsen, Structural Engineer at Bjarke Ingels Group
“This is hands-down the most insightful and useful course I have taken on edX! I would highly recommend this course to everyone in the energy/architecture/construction fields working with tedious tasks that involve large amounts of data. This course has been very well curated to provide an introduction to managing such data smartly with Python. You will not only be introduced to Pandas, time-series analysis, data cleaning, and visualization, but you will also develop an interest in understanding how machine learning can be used to make smarter predictions. Throughout the course, you will work with real-world relevant datasets. While the machine learning section towards the end of the course is only introductory, it is enough to pique your curiosity. I personally cannot wait to check out all the additional resources and Kaggle competition mentioned and build on all I learnt in this course!” – Revati Deshpande, Energy Engineer, Portland, OR, USA
Frequently Asked Questions
Do I need to know how to program before I start?
This course starts with the basics of Python and the Colaboratory environment so that someone who has never programmed can get started. Most participants without any programming experience do well, but the course moves quickly toward the Pandas package. Some learners will therefore benefit from taking a basic Python introduction course first.
What software will I need?
The only software you need is a recent version of a web browser such as Chrome or Firefox. All of the coding and exercises can be done in an online platform.
What is the passing grade for the course?
An overall average for all assignments of 75% is required to pass the course. You need to be a Verified Certificate participant to do the assignments and receive scores for them.
Do I need to achieve 75% on each assignment?
No, you need an average grade for all assignments of 75%. This means you can do poorly or miss an assignment as long as you do well enough on other assignments to achieve 75% overall.
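The averaging rule can be illustrated with a short sketch. It assumes that all assignments carry equal weight and that a missed assignment counts as 0%; neither detail is stated above, and the scores and the function name `passes_course` are hypothetical.

```python
def passes_course(scores, passing_average=75.0):
    """Return True if the mean of all assignment scores meets the threshold.

    Assumes equal weighting of assignments, which the course page does not confirm.
    """
    return sum(scores) / len(scores) >= passing_average


# Hypothetical learner who missed the third assignment (scored as 0)
# but did well on the others: the average is 380 / 5 = 76.
strong_elsewhere = [95, 90, 0, 100, 95]

# Hypothetical learner who missed one assignment and scored modestly
# on the rest: the average is 295 / 5 = 59.
modest_elsewhere = [80, 70, 0, 75, 70]

print(passes_course(strong_elsewhere))
print(passes_course(modest_elsewhere))
```

Under these assumptions the first learner passes despite the missed assignment, while the second does not.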