Releases: om-ai-lab/VLM-R1
v0.2.0
v0.1.0
What's Changed
- Ruox/main jsonl dataloader by @xrc10 in #14
- docs: update README.md by @eltociear in #27
- fix model torch_dtype setting by @zhangqianqianhzlh in #49
- custom reward by @zhangqianqianhzlh in #55
- multi-node GRPO recipe by @xrc10 in #59
- add epsilon clipping for GRPO by @xrc10 in #61
- formats are not unique by @Amos1109 in #65
- Add num_iterations from original GRPO algorithm by @xrc10 in #78
- Update grpo_trainer.py by @davidluciolu in #58
- add yes_no_reward function by @KingSan666888 in #88
- fix default reward method by @zhangqianqianhzlh in #87
- fix multi match case by @zhangqianqianhzlh in #94
- convert other types of answers by @xrc10 in #95
- fix data loader and batching by @xrc10 in #98
- fix: pin transformers to v4.49.0 to resolve model loading issues by @chaoyuhao in #105
- llm reward by @Amos1109 in #115
- math_reward by @Amos1109 in #127
- support language training data by @zhangqianqianhzlh in #130
New Contributors
- @xrc10 made their first contribution in #14
- @eltociear made their first contribution in #27
- @zhangqianqianhzlh made their first contribution in #49
- @Amos1109 made their first contribution in #65
- @davidluciolu made their first contribution in #58
- @chaoyuhao made their first contribution in #105
Full Changelog: https://github.com/om-ai-lab/VLM-R1/commits/v0.1.0