Introduction to Designing Data Lakes on AWS
About this course
Designing a data lake is challenging because of the scale and growth of data. Developers need to understand best practices to avoid common mistakes that can be hard to rectify later. In this course we cover the foundations of what a Data Lake is, how to ingest and organize data into the Data Lake, and the data processing that can optimize performance and cost when consuming the data at scale. This course is for professionals (architects, system administrators, and DevOps engineers) who need to design and build an architecture for secure and scalable Data Lake components. Students will learn about the use cases for a Data Lake and contrast them with a traditional infrastructure of servers and storage.
What you'll learn
- Where to start with a Data Lake?
- How to build a secure and scalable Data Lake?
- What are the common components of a Data Lake?
- Why do you need a Data Lake, and what is its value?
Week 1: Hello World, I mean, Hello Data Lakes!
- Video: Meet the Instructors
- Video: Introduction to Week 1
- Video: Why Data Lakes?
- Video: Characteristics of a Data Lake
- Video: Data Lake Components
- Reading: Data Lake Characteristics and Components
- Video: Comparison of a Data Lake to a Data Warehouse
- Reading: Data Lakes and Data Warehouses
- Video: Discussing sample Data Lake Architectures
- Quiz/Assessment: Week 1 quiz
Week 2: AWS data related services
- Video: Introduction to Week 2
- Video: AWS Data Lake related services
- Video: Amazon S3
- Video: AWS Glue Data Catalog
- Reading: S3 and Glue Data Catalog
- Video: AWS Services used for data movement
- Reading: Kinesis, API Gateway, etc.
- Video: AWS Services for Data processing
- Video: AWS Services for Analytics
- Video: AWS Services used for Predictive Analytics and Machine Learning
- Reading: EMR, Glue Jobs, Lambda, Kinesis Analytics, Redshift
- Video: Introduction to AWS Lake Formation
- Reading: Lake Formation
- Lab: Get familiar with AWS Services and create your first simple data lake
Week 3: Ingesting the rivers
- Video: Introduction to Week 3
- Video: Use the right tool for the job
- Video: Understanding Data Structure and when to process data
- Video: Data Streaming ingestion with Amazon Kinesis Services
- Video: Diving Deep on Amazon Kinesis
- Demo: Batch Data Ingestion with AWS Transfer Family
- Reading: Batch Data Ingestion with AWS Services
- Video: Data Cataloging
- Demo: Using Glue Crawlers
- Reading: The importance of data cataloging
- Video: Reviewing the ingestion part of some Data Lake architectures
- Lab: Ingesting Web Logs
Week 4: Processing and Analyzing data that sits in the Data Lake
- Video: Introduction to Week 4
- Video: Data prep and AWS Glue jobs
- Video: File optimizations
- Demo: Using S3, Glue and Athena to get insights about NYC Taxi data
- Reading: Glue Jobs, Data Prep, Athena, Columnar Data Formats and Amazon Athena Optimizations
- Video: Introduction to Data Lake security
- Reading: Security and compliance
- Video: The power of data visualization
- Video: Introduction to Amazon QuickSight
- Demo: Amazon QuickSight
- Reading: Data visualization, Amazon QuickSight
- Video: Registry of Open Data on AWS
- Lab: Create an end-to-end data lake with AWS Services
- Video: Course wrap-up!
Frequently Asked Questions
Q. Are there any costs associated with this course?
A. Learners can register for the course in an Audit track or Verified Certificate track. The Audit track is free, but limits the duration of access to 6 weeks from registration. The Verified Certificate track costs $169 and provides full access to course content for the duration. Please visit edx.org for more information.
In addition to course registration costs, this course provides optional hands-on exercises which may have an associated charge in your AWS account. Please familiarize yourself with the AWS Free Tier at aws.amazon.com/free/.
Please note that the AWS Free Tier also has a limit on the amount of resources that you can consume before you begin accruing charges. If you perform these hands-on exercises, there is a chance you may incur charges on your AWS account. Please visit the AWS Free Tier page for more information.
Q. Do I need a credit card to create an AWS Account?
A. Yes, you will need a credit card to activate your AWS account.
Q. How much time will this course require?
A. If following the weekly schedule, learners should plan to spend 2-4 hours per week on this course. However, learners may complete the course at their own pace.
Q. Will I receive a certificate for this course?
A. Learners enrolled in the Verified Certificate path will receive a certificate upon successful completion of the course.
Q. What is the grading policy for this course?
A. All learners may take weekly quizzes, which are not graded and allow unlimited retries.
Learners in the Verified Certificate track can take the final course assessment. Passing the final assessment is required to obtain the Verified Certificate.
Learners in the Audit track will not have access to the final assessment, and will not be able to earn a certificate.
Q. How are discussions used in this course?
A. This course has discussion groups aligned to each week of the course. We encourage learners to ask questions or offer suggestions and feedback. AWS Instructors will monitor the discussion groups to answer questions specific to the exercises and topics covered in the course.
Q. Will this course help me prepare for an AWS Certification?
A. Earning an AWS Certification typically requires both knowledge and experience. While this course, if taken in isolation, will provide you with relevant information and skills, it likely will not equip you to earn an AWS Certification. For more information about AWS Certifications, including recommended training and experience requirements, visit aws.amazon.com/certification.