
LVx: Fundamentals of Deep Reinforcement Learning

Learn the theoretical foundations of Deep Learning through practical Python code.

Fundamentals of Deep Reinforcement Learning
8 weeks
2–6 hours per week
Self-paced
Progress at your own speed
This course is archived

About this course


This course starts from the very beginnings of Reinforcement Learning and works its way up to a complete understanding of Q-learning, one of the core reinforcement learning algorithms.

In part II of this course, you'll use neural networks to implement Q-learning to produce powerful and effective learning agents (neural nets are the "Deep" in "Deep Reinforcement Learning").

At a glance

  • Institution: LVx
  • Subject: Computer Science
  • Level: Introductory
  • Prerequisites:
    • Requirements:
      • Proficiency with Python
      • Functions, classes, objects, loops
      • Basic familiarity with Jupyter notebooks
    • Recommended Prerequisites:
      • Basic probability
        • Sampling from a normal distribution
        • Conditional probability notation
        • 𝔼 - expectation
      • Σ - the summation operator
  • Language: English
  • Video Transcript: English
  • Associated skills: Artificial Neural Networks, Deep Learning, Reinforcement Learning, Python (Programming Language), Q Learning

What you'll learn

  • The theoretical underpinnings of Reinforcement Learning ("RL").
  • How to implement each piece of theory to solve real problems in Python.
  • The core RL formula: the Bellman Equation.
  • The Q-Learning algorithm, as well as many powerful improvements.
  • Enough background to prepare you to implement Reinforcement Learning algorithms using Deep Neural Networks (Part II).
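
For orientation, the discounted return and the Bellman equation named above are commonly written as follows. This is a sketch in one widespread textbook notation, which is an assumption here and may differ from the notation used in the course notebooks: $\gamma$ is the discount factor, $R_t$, $S_t$ and $A_t$ are the reward, state and action at step $t$, and $Q^*$ is the optimal action-value function. It uses only the 𝔼 and Σ notation listed in the recommended prerequisites.

```latex
% Discounted return from time step t (assumed notation)
G_t = \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1}, \qquad 0 \le \gamma \le 1

% Bellman optimality equation for the action-value function
Q^{*}(s, a) = \mathbb{E}\left[ R_{t+1} + \gamma \max_{a'} Q^{*}(S_{t+1}, a') \,\middle|\, S_t = s,\ A_t = a \right]
```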

Each concept is presented with a video overview and detailed Jupyter notebooks covering each aspect of theory and practice.

  • Introduction to Reinforcement Learning
  • Bandit Problems
    • Epsilon Greedy Agent
  • Markov Decision Processes
    • Episode Returns
    • Returns and Discount Factors
  • The Bellman Equation
  • Iterative Policy Evaluation and Improvement
  • Policy Evaluation and Iteration
  • Dynamic Programming
  • Q-Learning and Sampling Based Methods
  • Monte Carlo Rollouts vs. Temporal Difference Learning
  • On-Policy Learning vs. Off-Policy Learning
  • Q-Learning
  • What's Next
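
To give a flavour of where the syllabus above ends up, here is a minimal sketch of tabular Q-learning with an epsilon-greedy agent. It is not taken from the course materials: the five-state corridor environment, the function names (`step`, `epsilon_greedy`, `q_learning`) and the hyperparameter values are all hypothetical choices made for illustration.

```python
import random

# Hypothetical toy environment (an assumption, not from the course):
# a corridor of 5 states. The agent starts in state 0; entering
# state 4 gives reward 1 and ends the episode.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily.

    Ties between equally valued actions are broken at random.
    """
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table for the corridor with the Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action,
            # regardless of which action the agent takes next.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy action in each non-terminal state after training.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

The agent behaves epsilon-greedily while the update target uses the maximum over next actions, which is the on-policy versus off-policy distinction listed in the outline. In this toy setting the learned greedy policy should move right in every state.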

Who can take this course?

Unfortunately, learners residing in one or more of the following countries or regions will not be able to register for this course: Iran, Cuba and the Crimea region of Ukraine. While edX has sought licenses from the U.S. Office of Foreign Assets Control (OFAC) to offer our courses to learners in these countries and regions, the licenses we have received are not broad enough to allow us to offer this course in all locations. edX truly regrets that U.S. sanctions prevent us from offering all of our courses to everyone, no matter where they live.
