
Add Normalization and Action Clipping Wrappers #586


Open · wants to merge 1 commit into main

Conversation

cruz-lucas

I implemented new wrappers for on-device normalization and action clipping, to avoid the CPU–GPU transfers incurred when using Gym's normalization wrappers. This contribution addresses Issue #49. In particular, I added the following (a rough sketch of the running-statistics pattern follows the list):

  • RunningMeanStd: A class to compute running mean and variance for normalization.
  • ClipVecAction: A wrapper that clips continuous actions to a specified range.
  • NormalizeVecObservation: A wrapper that normalizes observations using running statistics computed by RunningMeanStd.
  • NormalizeVecReward: A wrapper that normalizes rewards based on the running statistics of the accumulated (discounted) rewards.

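For reference, here is a minimal sketch of how the running statistics could be maintained entirely on-device. The `RunningMeanStd` name comes from the PR description, but the fields, the `create`/`update`/`normalize` methods, and the use of `flax.struct` are my own assumptions for illustration, not the PR's actual code.

```python
import jax
import jax.numpy as jp
from flax import struct


@struct.dataclass
class RunningMeanStd:
    """Running mean/variance kept as a JAX pytree, updated with a
    Chan-style parallel variance merge so it can run inside jit."""

    mean: jax.Array
    var: jax.Array
    count: jax.Array

    @classmethod
    def create(cls, shape):
        return cls(mean=jp.zeros(shape), var=jp.ones(shape), count=jp.asarray(1e-4))

    def update(self, batch: jax.Array) -> 'RunningMeanStd':
        # Merge per-batch statistics (over the leading vec-env axis)
        # into the running estimates.
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        batch_count = batch.shape[0]

        delta = batch_mean - self.mean
        total = self.count + batch_count
        new_mean = self.mean + delta * batch_count / total
        m_a = self.var * self.count
        m_b = batch_var * batch_count
        m2 = m_a + m_b + jp.square(delta) * self.count * batch_count / total
        return RunningMeanStd(mean=new_mean, var=m2 / total, count=total)

    def normalize(self, x: jax.Array, eps: float = 1e-8) -> jax.Array:
        return (x - self.mean) / jp.sqrt(self.var + eps)
```

An observation wrapper would then carry a `RunningMeanStd` in the environment state, call `update(obs)` on each reset/step, and return the normalized observation; the reward wrapper would track statistics of the accumulated discounted return rather than the raw reward.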
I also added sanity check tests to verify the correct behavior of these wrappers. Please let me know if this doesn't align with Brax's design philosophy.

@btaba
Collaborator

btaba commented Apr 10, 2025

Hi @cruz-lucas , is the running statistic pattern in agents not sufficient for your use-case? Advantages can also be normalized in the PPO agent. ClipVecAction is not implemented, but it's usually simple enough to include the one-line jp.clip in the environment itself. What are your thoughts here?
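For completeness, the in-environment clipping suggested here could look roughly like the sketch below. The `ClipAction` name and the (-1, 1) bounds are illustrative, and I'm assuming the `Wrapper`/`State` classes exposed from `brax.envs.base` in recent Brax versions.

```python
import jax.numpy as jp
from brax.envs.base import State, Wrapper


class ClipAction(Wrapper):
    """Illustrative wrapper: clip continuous actions before stepping the env."""

    def __init__(self, env, low: float = -1.0, high: float = 1.0):
        super().__init__(env)
        self._low = low
        self._high = high

    def step(self, state: State, action: jp.ndarray) -> State:
        # The one-line clip mentioned above, applied before delegating
        # to the wrapped environment.
        return self.env.step(state, jp.clip(action, self._low, self._high))
```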

