CSCI 316: Reinforcement Learning

Professor: Simon D. Levy
Schedule: MTWRF 1:30-3:30 Parmly 405
Office: Parmly 407B
Office Phone: 458-8419
E-mail: simon.d.levy@gmail.com
Office Hours: 3:30-5:00 daily, and by appointment

Textbooks (optional): Although they are not required for the course, I have drawn on these books in creating the lecture slides. I suggest them to anyone interested in delving further into the material. The links will take you to a copy you can read online by providing your W&L login credentials (contact Prof. Jason Mickel if you need help with that).

Deep Reinforcement Learning Hands-On (2nd Edition). M. Lapan, Packt Publishing 2020. A practical introduction focusing on programming exercises, with minimal mathematical theory.

Foundations of Deep Reinforcement Learning: Theory and Practice in Python. L. Graesser & W.L. Keng, Addison-Wesley Professional 2019. Consistent with its name, this book provides a nice balance between mathematical theory (which can get heavy at times!) and programming.

Objectives

By the end of this course you will be able to

- Use the OpenAI Gym framework for testing Reinforcement Learning models.
- Use the PyTorch machine-learning framework to implement Deep Reinforcement Learning (DRL) models.
- Use the new Isaac Gym physics simulation and learning environment to build and train physically realistic robots.
- Think critically about the current limitations of DRL for robotics, and the prospects for the future.

Approach

As the name of the course suggests, the focus will be on DRL as a solution to problems in robotics, as opposed to a survey of the theory and practice of DRL itself. Each of the DRL algorithms we study will be motivated by the problems left unsolved by the previous algorithm:

1. Discrete observations, discrete control (Q-Learning)
2. Continuous observations, discrete control (Deep Q-Learning)
3. Continuous observations, continuous control (Advantage Actor/Critic)
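To give a feel for the first item (discrete observations, discrete control), here is a minimal sketch of tabular Q-Learning in plain Python. It is only an illustration, not the code we will use in class: the five-state corridor environment, the function names (step, train, greedy_policy), and the hyperparameter values are all hypothetical choices made for this example. The agent starts at the left end of the corridor and earns a reward of 1 for reaching the right end.

```python
import random

N_STATES = 5      # hypothetical toy corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment dynamics: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-Learning target: reward plus discounted best next-state value
            # (no bootstrapping from the terminal state).
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the learned table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling greedy_policy(train()) should select "step right" in every non-terminal state. The table q has one entry per (state, action) pair, which is exactly what stops working once observations become continuous; that limitation is what motivates the second item on the list.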

Prerequisites

To do well in this class, you should have programming experience through the level of CSCI 209. I have found that the best predictor of success in any course is interest and dedication.

Grading

- Weekly quizzes, based on lecture notes: 50%
- Weekly problem sets: 30%
- Final project: 20%

The grading scale will be 93-100 A; 90-92 A-; 87-89 B+; 83-86 B; 80-82 B-; 77-79 C+; 73-76 C; 70-72 C-; 67-69 D+; 63-66 D; 60-62 D-; below 60 F.

Final Project

Thanks to support from the Spring Term Course Enhancement Fund, and help from IQ Center Director David Pfaff, our final project will be building and running the RealAnt robot. You will work in teams of three or four students to build, 3D print, and train your RealAnt. On the last day of spring term, Friday, 20 May, from 12 noon to 2 pm, you will have the opportunity to show off your RealAnt at the Spring Term Festival in the Harte Center (Leyburn Library).

A final project proposal (just an email from one team member with the names of your team members and the project you want to work on) will be due from each team at 11:59 PM Friday 6 May.

Class Format

We will spend each of the first three weeks of the course focusing on a reinforcement learning “model of the week” (standard Q-Learning, Deep Q-Learning, Actor/Critic), and the software needed to support it. Each model will be motivated by the limitations of the previous model. To ensure that you are keeping up with the material, there will be a weekly quiz each Friday these first three weeks, based on the material in the lecture slides.

The format of each class meeting will be an hour of lecture / discussion followed by at least one hour of project work. Starting in the third week (10 May) the focus will switch to the RealAnt final project. We will likely vary this schedule to accommodate additional work on projects as needed.

After-Hours Work

Because of the computational power required by DRL models, you will likely find it impractical to do much of the work in this course on your laptop. Our Advanced Lab (Parmly 413), open 24/7, has workstations adequate for most DRL tasks.

Tentative Schedule

Week of	Monday	Tuesday	Wednesday	Thursday	Friday
25 April (Week 1: Q-Learning)	Course overview; Intro to RL	Intro to RL; OpenAI Gym	Intro to RL; OpenAI Gym	Intro to RL; PyTorch	Reading Quiz; Due: PS1
02 May (Week 2: DQN)	DRL, Part I	DRL, Part I	DRL, Part II	DRL, Part II	Reading Quiz; Due: PS2; Due: Project info
09 May (Week 3: CNN)	Policy Gradient, Part I	Policy Gradient, Part I	Policy Gradient, Part II	Policy Gradient, Part II	Reading Quiz; Due: PS3
16 May (Week 4: Policy Gradient)	Final Project	Final Project	Final Project	Final Project	Final Project at Spring Term Festival; Noon to 2 pm; Harte Center (Leyburn Library)