Add A2C Implementation #86

bozo-bud · 2025-02-24T16:54:11Z

Description

Added the files and functions needed to implement a new Advantage Actor-Critic (A2C) algorithm using Stable-Baselines3.
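For context, the actor update in an actor-critic method such as A2C is weighted by an advantage estimate: how much better the observed return was than the critic's value prediction. The following is a minimal plain-Python sketch of that idea only, not the code in rl_scripts/algorithms/a2c.py. The function names `discounted_returns` and `advantages` are hypothetical, and Stable-Baselines3 performs the equivalent computation internally.

```python
from typing import List


def discounted_returns(
    rewards: List[float], bootstrap_value: float, gamma: float = 0.99
) -> List[float]:
    """Accumulate discounted returns backwards from a bootstrap value."""
    returns: List[float] = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(
    rewards: List[float],
    values: List[float],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> List[float]:
    """Advantage = discounted return minus the critic's value estimate."""
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    return [ret - val for ret, val in zip(returns, values)]
```

A positive advantage means the action did better than the critic expected, so the actor's probability of taking it is pushed up; a negative one pushes it down.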

Type of Change

Additions were made to the following files:
- rl_scripts/agents/base_agent.py
- rl_scripts/args/general_args.py
- rl_scripts/args/registry_args.py
- rl_scripts/helpers/setup_helpers.py

The following files were added:
- rl_scripts/algorithms/a2c.py
- sb3_scripts/yml/a2c.yml

New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How has this change been tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Also, list any relevant details for your test configuration.

No test configurations have been added at this time.

Does the change comply with project standards and guidelines?

Standards and guidelines can be found on the Project Homepage.

ryanmccann1024

Overall, I think it's good to merge if there are no errors. Double-check after addressing the comment and merging. Please lint the scripts. If any unit tests fail, it's OK to comment them out temporarily in order to merge.

rl_scripts/algorithms/a2c.py


bozo-bud added 2 commits February 24, 2025 11:36

"Added framework for A2C algorithm"

27c625c

"Added framework for A2C algorithm"

83c5e77

bozo-bud requested a review from ryanmccann1024 February 24, 2025 16:54

ryanmccann1024 force-pushed the ryan_drl_path_agents branch 2 times, most recently from 1f52b86 to 9afabc1 Compare March 4, 2025 18:57

ryanmccann1024 approved these changes Mar 18, 2025

rl_scripts/algorithms/a2c.py (review comment resolved)

ryanmccann1024 and others added 24 commits March 21, 2025 15:05

Bug fix for SB3 training

20c1c95

Requests' status dictionary was not being reset.
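The kind of fix this commit message describes can be sketched as follows. This is an illustration under assumptions, not the project's code: `RequestTracker`, `record`, and `request_status` are hypothetical names standing in for whatever object holds the per-episode request outcomes.

```python
from typing import Dict


class RequestTracker:
    """Hypothetical stand-in for an environment that tracks request outcomes."""

    def __init__(self) -> None:
        self.request_status: Dict[int, str] = {}

    def record(self, request_id: int, status: str) -> None:
        """Store the outcome of a single request for the current episode."""
        self.request_status[request_id] = status

    def reset(self) -> None:
        # Rebind to a fresh dict so entries from the previous episode
        # cannot leak into the next one.
        self.request_status = {}
```

If `reset` skips this step, statistics computed from the dictionary at the end of an episode also count requests from earlier episodes.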

Safety commit

9ab9440

Refactor helper scripts to utils module only

c54fc81

Also, add test and plot modules for RL.

Change rl module name to reinforcement_learning

96f0f96

Also, rename some scripts for clarity or to shorten unnecessarily long names.

Improve structure for RL plotting module.

341da1b

Add sim filters for the RL module exclusively

41494fd

Bug fixes for run_plots.py

be6f213

We were not tracking the algorithm name properly. Also, add an excel module.

Add support for plotting variance by seed or average rewards

54dbf1c

Add state-value heat map

4bdbd67

Remove comments

cd5a190

Update chart DPIs

14554c8

Also change seed vs. reward function.

Add sim times plot

a39343a

Add memory usage plot

6857e91

Add memory usage plot

4375b3f

Update requirements.txt

8b7b9f8

Bug fix for PPO trial reset

dec45b4

We didn't return properly from the SB3 callback, which produced a short 'hidden' episode and, in turn, an impossible reward in the first episode of the next trial.
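Stable-Baselines3 callbacks report back to the training loop through the boolean returned by `_on_step`, and a `False` return aborts training early. The diff is not shown here, so the exact bug is an assumption. One common way a callback can "not return properly" is a missing `return`, which yields `None` and is treated as falsy. The sketch below reproduces that contract with a hypothetical stand-in loop, `run_training`, rather than the SB3 API itself.

```python
from typing import Callable, Optional


def run_training(total_steps: int, on_step: Callable[[int], Optional[bool]]) -> int:
    """Hypothetical training loop: stops as soon as the callback is falsy."""
    steps_done = 0
    for step in range(total_steps):
        steps_done += 1
        if not on_step(step):
            break
    return steps_done


def buggy_callback(step: int) -> None:
    # Missing return value: Python yields None, which the loop reads as "stop".
    pass


def fixed_callback(step: int) -> bool:
    # Explicitly signal that training should continue.
    return True
```

With `buggy_callback`, the loop ends after a single step, which is the kind of truncated run that could leave stale state for the next trial.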

"Updated TODO list"

0ffb712

Update q_learning file output

739dade

Have q_learning save in the same format as the bandits.

Update plots for better readability

c2aef45

Lint plot scripts

3e2d875

Remove environment re-definition

fd3d1b3

Add files for tracking DRL reward variance

e6af9bf

Fix plot state values heat map

f157bd2

We weren't plotting for all algorithms.

Safety commit

2412b26

ryanmccann1024 and others added 13 commits March 21, 2025 15:10

Have PPO save episodic rewards

7464c1a

Have PPO save episodic rewards

1fa6643

Improve styling of all plots

e0656e2

Change plots for memory and sim times

a54afb9

Changed to box plots. Also, remove some commented code.

Add request holding time as a PPO observation

d060ac4

Reimplemented necessary imports.

49b46af

Safety commit

ff0865c

Refactor helper scripts to utils module only

86e3d6b

Also, add test and plot modules for RL.

Change rl module name to reinforcement_learning

9e2bcf7

Also, rename some scripts for clarity or to shorten unnecessarily long names.

Add sim filters for the RL module exclusively

9004e82

Lint plot scripts

d9183c4

Add request holding time as a PPO observation

310e28d

Merge remote-tracking branch 'origin/ryan_drl_path_agents' into a2c_i…

01712f2

…mplement

# Conflicts:
#   reinforcement_learning/agents/base_agent.py
#   reinforcement_learning/args/registry_args.py
#   reinforcement_learning/plot/run_plots.py
#   sb3_scripts/yml/ppo.yml

bozo-bud merged commit f02e502 into ryan_drl_path_agents Mar 22, 2025
2 of 6 checks passed

ryanmccann1024 deleted the a2c_implement branch August 13, 2025 15:46
