Skip to main content

Data Structures Fundamentals

Provided by The University of California, San Diego (UCSanDiegoX)
Intermediate
See prerequisites
Free

$150 USD for graded exams and assignments, plus a certificate

Learn about data structures that are used in computational thinking – both basic and advanced.

Start Date:

Before you start

Basic grasp of:

  • One programming language (C,C++,C#,Haskell,Java,JavaScript,Python2/3,Ruby,Scala)-loop, array, stack, recursion
  • Math-proof by induction and contradiction
  • The Algorithmic Design and Techniques class

What you will learn

  • Basics of data structures including their fundamental building blocks: arrays and linked lists
  • How to use Dynamic arrays
  • A very powerful and widely used technique called hashing and its applications
  • How to use priority queues to efficiently schedule jobs, in the context of a computer operating system or real life
  • Basic structure of binary search trees - AVL trees and Splay trees
  • Applications of data structures

Module 1: Basic Data Structures
In this module, you will learn about the basic data structures used throughout the rest of this course. We start this module by looking in detail at the fundamental building blocks: arrays and linked lists. From there, we build up two important data structures: stacks and queues. Next, we look at trees: examples of how they’re used in Computer Science, how they’re implemented, and the various ways they can be traversed. Once you’ve completed this module, you will be able to implement any of these data structures, as well as have a solid understanding of the costs of the operations, as well as the tradeoffs involved in using each data structure.

Module 2: Dynamic Arrays and Amortized Analysis
In this module, we discuss Dynamic Arrays: a way of using arrays when it is unknown ahead-of-time how many elements will be needed. Here, we also discuss amortized analysis: a method of determining the amortized cost of an operation over a sequence of operations. Amortized analysis is very often used to analyse performance of algorithms when the straightforward analysis produces unsatisfactory results, but amortized analysis helps to show that the algorithm is actually efficient. It is used both for Dynamic Arrays analysis and will also be used in the end of this course to analyze Splay trees.

Module 3: Priority Queues and Disjoint Set Union
We start this module by considering priority queues which are used to efficiently schedule jobs, either in the context of a computer operating system or in real life, to sort huge files, which is the most important building block for any Big Data processing algorithm, and to efficiently compute shortest paths in graphs, which is a topic we will cover in our next course. For this reason, priority queues have built-in implementations in many programming languages, including C++, Java, and Python. We will see that these implementations are based on a beautiful idea of storing a complete binary tree in an array that allows to implement all priority queue methods in just few lines of code. We will then switch to disjoint sets data structure that is used, for example, in dynamic graph connectivity and image processing. We will see again how simple and natural ideas lead to an implementation that is both easy to code and very efficient. By completing this module, you will be able to implement both these data structures efficiently from scratch.

Modules 4 and 5: Hash Tables
In this module you will learn about very powerful and widely used technique called hashing. Its applications include implementation of programming languages, file systems, pattern search, distributed key-value storage and many more. You will learn how to implement data structures to store and modify sets of objects and mappings from one type of objects to another one. You will see that naive implementations either consume huge amount of memory or are slow, and then you will learn to implement hash tables that use linear memory and work in O(1) on average!

Module 6: Binary Search Trees
In this module we study binary search trees, which are a data structure for doing searches on dynamically changing ordered sets. You will learn about many of the difficulties in accomplishing this task and the ways in which we can overcome them. In order to do this you will need to learn the basic structure of binary search trees, how to insert and delete without destroying this structure, and how to ensure that the tree remains balanced.

Overview

A good algorithm usually comes together with a set of good data structures that allow the algorithm to manipulate the data efficiently. In this course, part of the Algorithms and Data Structures MicroMasters program, we consider the common data structures that are used in various computational problems. You will learn how these data structures are implemented in different programming languages and will practice implementing them in our programming assignments. This will help you to understand what is going on inside a particular built-in implementation of a data structure and what to expect from it. You will also learn typical use cases for these data structures.

A few examples of questions that we are going to cover in this course are:

  1. What is a good strategy of resizing a dynamic array?
  2. How priority queues are implemented in C++, Java, and Python?
  3. How to implement a hash table so that the amortized running time of all operations is O(1) on average?
  4. What are good strategies to keep a binary tree balanced?

We look forward to seeing you in this course! We know it will make you a better programmer.

Learner testimonials

“I found the assignments challenging in the absolute best sense of the term, and therefore incredibly rewarding! I've been an educator before, and my own impression of the assignments was that they were extremely well designed: it was impossible to pass them without knowing what you were doing.”
-- Previous Student

Who can take this course?

Unfortunately, learners from one or more of the following countries or regions will not be able to register for this course: Iran, Cuba and the Crimea region of Ukraine. While edX has sought licenses from the U.S. Office of Foreign Assets Control (OFAC) to offer our courses to learners in these countries and regions, the licenses we have received are not broad enough to allow us to offer this course in all locations. EdX truly regrets that U.S. sanctions prevent us from offering all of our courses to everyone, no matter where they live.

Of all edX learners:
73% are employed
Of all edX learners:
45% have children
Based on internal survey results
407,747 people are learning on edX today