
GAD-cell
Contributor

@GAD-cell GAD-cell commented Jul 2, 2025

Since get_per_token_logps now always returns None, I implemented GRPO for vision-language models (VLMs) through grpo_accumulated_loss instead.

This PR only changes the inputs passed to grpo_accumulated_loss and makes sure its outputs are sliced correctly.
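
For readers unfamiliar with what "output slicing" usually means in this setting, here is a minimal sketch. It is not the code from this PR; the function name `completion_token_logps` and the parameter `logits_to_keep` are assumptions chosen for illustration, and the example uses plain Python lists instead of tensors. The idea it shows is the common one: logits are computed over the full sequence (prompt plus completion), shifted by one position so each row lines up with the token it predicts, and then cut down to the completion tokens only.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def completion_token_logps(logits, input_ids, logits_to_keep):
    """Per-token log-probs for the last `logits_to_keep` tokens of one sequence.

    logits:     list of seq_len rows, each of vocab_size floats
    input_ids:  list of seq_len token ids (prompt followed by completion)
    logits_to_keep: number of completion tokens (hypothetical parameter name)
    """
    if logits_to_keep < 1:
        # A slice of [-0:] would silently keep the whole sequence.
        raise ValueError("logits_to_keep must be at least 1")
    # The logits at position t predict the token at position t + 1,
    # so drop the last logit row and the first token to align them.
    shifted_logits = logits[:-1]
    targets = input_ids[1:]
    # Keep only the positions that correspond to completion tokens.
    shifted_logits = shifted_logits[-logits_to_keep:]
    targets = targets[-logits_to_keep:]
    return [log_softmax(row)[tok] for row, tok in zip(shifted_logits, targets)]


# Toy input: 4 positions, vocabulary of 3, the last 2 tokens are the completion.
toy_logits = [[0.0, 0.0, 0.0] for _ in range(4)]
toy_ids = [0, 1, 2, 1]
toy_logps = completion_token_logps(toy_logits, toy_ids, logits_to_keep=2)
```

With a VLM, image placeholder tokens lengthen the prompt part of the sequence, so slicing from the end of the sequence rather than from a fixed prompt offset is one way to keep the completion tokens aligned. Whether this PR does it that way is not stated above.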
