For a senior thesis, Alex Cai wrote a new textbook for "CS1840: Introduction to Reinforcement Learning."
Fourth-year computer science concentrators at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) have the option to write senior theses. Often taken for course credit through “CS91R: Supervised Reading and Research,” their theses seek to contribute to the general understanding of some problems within computer science.
Introduction to Reinforcement Learning
Alex Cai, A.B. '25, Computer Science
Advisor: Lucas Janson
• Please give a brief summary of your project.
I wrote a course textbook for “CS1840: Introduction to Reinforcement Learning.” Reinforcement learning (RL) is a branch of machine learning that focuses on sequential decision-making. Unlike in other fields of machine learning, the model’s output affects its environment, just like in real life: think robotics, video games, or inventory management. How should an agent learn about the world and plan for the long-term effects of its actions?
• How did you come up with this idea for your final project?
I’ve been an undergraduate course assistant for CS1840 for three years, and I saw the need for an accessible reference tailored to our course material. I’m passionate about the field and about education and thought a course textbook would complement the material well!
• What real-world challenge does this project address?
My thesis differs from existing textbooks in a few ways. It briefly introduces control theory, a “precursor” to RL that solves similar kinds of problems. It balances theory and practice, as our course does, in contrast to existing textbooks that focus exclusively on one or the other. It is designed to be more beginner-friendly than existing textbooks and contains more background material on related areas of machine learning.
• What was the timeline of your project?
I suggested the idea for a course textbook after the first iteration of the course in my sophomore fall in 2022. As a junior the following fall, I released the first few chapters to students. By that spring, most of the core material was complete, and I had the privilege of having my textbook used for Cornell’s undergraduate course on RL. In my senior fall, a full draft was made available to the students of CS1840. I was thrilled that students found it helpful! The rest of my time in senior spring was spent revising.
• What part of the project proved the most challenging?
Staying on time! I kept setting deadlines for myself that were too ambitious, and I couldn’t balance them with other commitments. Also, looking back, it was quite presumptuous of me to think that I understood RL well enough to write an authoritative source on it; many insights came later and gradually, especially as I worked on other RL projects. I also spent a while trying different typesetting tools that would let me produce both a website and a PDF from a single codebase, one that also contained the Python source code for generating visualizations.
• What part of the project did you enjoy the most?
Seeing it in action! I admit I got quite a rush at office hours when I could point to a section in the course notes that clearly answered a student’s question. The project grew a lot thanks to kind feedback from students and I’m really grateful that people found it helpful.
• What did you learn, or what skills did you gain, through this project?
Teaching is one of the best ways to learn. I’ve gained a deep understanding of the fundamentals of RL and the algorithms we cover in the course. My technical communication skills have definitely also improved; I enjoyed coming up with different analogies or stories to convey the ideas of RL more effectively. By collecting bibliographic notes for each chapter, I learned a lot about the history of various RL algorithms and finally read a bunch of “classic” papers that shaped the field.
Topics: Academics, Computer Science
Press Contact
Matt Goisman | mgoisman@g.harvard.edu