# Self Verification
- [x] Investigate this term: the word is mentioned in Autodidact's about section and, without much detail, in the DeepSeek R1 paper, but not in blogs or the codebase. I think this term is important and should be investigated
- Lol, a "Verifier" is basically just a synonym for a **reward function**
- <https://docs.unsloth.ai/basics/reasoning-grpo-and-rl#reward-functions-verifier>
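
To make the "verifier is basically a reward function" point concrete, here is a minimal sketch. It is not taken from Autodidact, the DeepSeek R1 paper, or Unsloth; the names `verify_answer` and `reward_fn` are hypothetical, and the matching rule (compare the last number in the completion to the expected answer) is just an assumption for illustration.

```python
import re


def verify_answer(completion: str, expected: str) -> bool:
    """Hypothetical verifier: True if the last number in the completion equals `expected`."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return bool(numbers) and numbers[-1] == expected


def reward_fn(completions: list[str], answers: list[str]) -> list[float]:
    """Hypothetical reward function: turn each pass/fail verdict into a score."""
    return [
        1.0 if verify_answer(completion, answer) else 0.0
        for completion, answer in zip(completions, answers)
    ]


# Two sampled completions for the same question, whose expected answer is "4".
rewards = reward_fn(["2 + 2 = 4", "I think it is 5"], ["4", "4"])
print(rewards)  # [1.0, 0.0]
```

So the verifier only says correct or incorrect, and the reward function is the thin layer that maps that verdict to a number the RL step can use.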