# Self Verification
- [x] Investigate this term: it is mentioned in AutoDidact's about section and in the DeepSeek-R1 paper (without much detail), but not in blogs or the code base. I think the term is important and should be investigated
- Lol a "Verifier" is just a synonym for **reward function**
- <https://docs.unsloth.ai/basics/reasoning-grpo-and-rl#reward-functions-verifier>
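
To make the "verifier = reward function" point concrete, here is a minimal sketch, assuming completions wrap their final answer in `<answer>...</answer>` tags. The tag format and the names `extract_answer` and `correctness_verifier` are hypothetical and not taken from any code base mentioned above; the idea is only that a verifier checks an output against a known answer and returns a score that RL training can use as the reward.

```python
import re
from typing import List, Optional


def extract_answer(completion: str) -> Optional[str]:
    """Return the text inside the last <answer>...</answer> tag, or None.

    The <answer> tag format is an assumption made for this sketch.
    """
    matches = re.findall(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
    return matches[-1].strip() if matches else None


def correctness_verifier(completions: List[str], answers: List[str]) -> List[float]:
    """Hypothetical verifier: 1.0 for an exact answer match, 0.0 otherwise.

    Used as a reward function, these scores are the training signal.
    """
    rewards = []
    for completion, expected in zip(completions, answers):
        predicted = extract_answer(completion)
        rewards.append(1.0 if predicted == expected.strip() else 0.0)
    return rewards


if __name__ == "__main__":
    completions = [
        "<think>2 + 2 is 4</think><answer> 4 </answer>",
        "I think it is four",  # no <answer> tag, so it cannot be verified
    ]
    print(correctness_verifier(completions, ["4", "4"]))
```

Exact string match is the simplest possible check; a real verifier might normalize numbers, run unit tests, or give partial credit for correct formatting.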