Agentic Reward Modeling
-
Research this a bit more, since I'm a bit out of date on the training side:
- What does the dataset look like?
- How is performance evaluated?