HW4 Spec
Introduction
Reinforcement learning is an important topic in machine learning. An agent learns a
policy for achieving its goal by interacting with an environment, and the approach
has achieved outstanding results in computer games. In this assignment, you will
use OpenAI Gym environments (i.e., Taxi-v3 and CartPole-v0) and implement an RL
algorithm, Q learning. Your agent will learn a policy by interacting with the
environment. You will then analyze its performance and results and answer some
questions.
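To make the idea concrete, here is a minimal sketch of tabular Q learning with an epsilon-greedy policy. It is not the code from the assignment template: the five-state chain environment (`chain_step`), the helper names, and the hyperparameter values are all hypothetical and chosen only so the example runs without Gym.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so the untrained agent still explores.
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def q_update(q, s, a, r, s_next, done, alpha, gamma):
    """One Q learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def chain_step(s, a, n_states):
    """Hypothetical toy environment: action 1 moves right, action 0 moves left.

    Reaching the last state ends the episode with reward 1.
    """
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    done = s_next == n_states - 1
    return s_next, (1.0 if done else 0.0), done


def train(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # Q table: states x actions
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = epsilon_greedy(q[s], epsilon, rng)
            s_next, r, done = chain_step(s, a, n_states)
            q_update(q, s, a, r, s_next, done, alpha, gamma)
            s = s_next
    return q


q_table = train()
# Greedy action for every non-terminal state.
greedy_policy = [row.index(max(row)) for row in q_table[:-1]]
```

In this toy setting the greedy policy ends up moving right in every state. The same update rule applies to Taxi-v3, where the table is indexed by the environment's discrete states and actions.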
Setup
We recommend Python 3.7. All the packages you need are listed in
requirements.txt. Run the following command to install them:
pip install -r requirements.txt
File structure
📂HW4
┣ 📜requirements.txt (for setup)
┣ 📜plot.py (for experiment)
┣ 📜taxi.py (part1 source code)
┣ 📜cartpole.py (part2 source code)
┗ 📜DQN.py (part3 source code)
Implementation (50%)
You will implement some key sections of the Q learning algorithm and its variants.
These sections are marked with # Begin your code and # End your code. Please
read all the comments to understand the source code before implementing. Do not
modify the rest of the code.
During training, a .npy/.pt file of the Q table/network is generated and
overwritten at every step. After training, the “test” function loads the .npy/.pt
file and evaluates your agent's performance.
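The save-then-load cycle for a Q table can be sketched as follows. This is only an illustration under stated assumptions: the helper names `save_q_table` and `load_q_table` are hypothetical, and the provided source files may organize this differently. The table shape (500, 6) matches Taxi-v3, which has 500 discrete states and 6 actions.

```python
import os
import tempfile

import numpy as np


def save_q_table(q_table, path):
    """Write the Q table to a .npy file, replacing any earlier copy."""
    np.save(path, q_table)


def load_q_table(path):
    """Read a Q table back from a .npy file, as a test routine might."""
    return np.load(path)


# Taxi-v3 has 500 discrete states and 6 actions.
q = np.zeros((500, 6))
q[0, 1] = 1.5  # pretend one entry was learned

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "taxi_table.npy")
    save_q_table(q, path)
    restored = load_q_table(path)
```

A network checkpoint in .pt format would follow the same pattern with PyTorch's `torch.save` and `torch.load` in place of the NumPy calls.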
After testing, a reward record of the training process is saved as a .npy file.
Run plot.py with the following command to produce four graphs (taxi.png,
cartpole.png, DQN.png, compare.png) showing your training processes:
Report (50%)
● A report is required.
● The report should be written in English.
● Please save the report as a .pdf file. (font size: 12)
● Answer the questions in the report template in detail.
Submission
Please package your source code, tables, networks, reward records, graphs, and
report (.pdf) into {student_id}_hw4.zip.
📂{student_id}_hw4.zip
┣ 📂Plots
┃ ┣📜taxi.png
┃ ┣📜cartpole.png
┃ ┣📜DQN.png
┃ ┗📜compare.png
┣ 📂Rewards
┃ ┣📜taxi_rewards.npy
┃ ┣📜cartpole_rewards.npy
┃ ┗📜DQN_rewards.npy
┣ 📂Tables
┃ ┣📜taxi_table.npy
┃ ┣📜cartpole_table.npy
┃ ┗📜DQN.pt
┣ 📜taxi.py
┣ 📜cartpole.py
┣ 📜DQN.py
┗ 📜report.pdf
e.g. 110123456_hw4.zip
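Before submitting, it may help to check the archive against the layout above. The sketch below is a hypothetical helper, not part of the provided files. It assumes the listed folders and files sit at the root of the zip rather than inside an extra top-level folder.

```python
import io
import zipfile

# Paths taken from the submission layout; assumed to be at the archive root.
REQUIRED = [
    "Plots/taxi.png",
    "Plots/cartpole.png",
    "Plots/DQN.png",
    "Plots/compare.png",
    "Rewards/taxi_rewards.npy",
    "Rewards/cartpole_rewards.npy",
    "Rewards/DQN_rewards.npy",
    "Tables/taxi_table.npy",
    "Tables/cartpole_table.npy",
    "Tables/DQN.pt",
    "taxi.py",
    "cartpole.py",
    "DQN.py",
    "report.pdf",
]


def missing_entries(zip_file):
    """Return the required paths that are absent from the archive."""
    with zipfile.ZipFile(zip_file) as zf:
        names = set(zf.namelist())
    return [path for path in REQUIRED if path not in names]


# Demo: build an in-memory archive that lacks report.pdf, then check it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for path in REQUIRED[:-1]:
        zf.writestr(path, b"")
buffer.seek(0)
missing = missing_entries(buffer)
```

To check a real submission, pass the path of the zip file (for example 110123456_hw4.zip) to `missing_entries` instead of the in-memory buffer; an empty list means every expected file is present.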
Reference