Added logic to return 0 if the assistant's final message does not contain answer tags (no matter how hard you try, you get nothing without a result 💀)
- Added 'logs/' directory to .gitignore to exclude log files.
- Introduced the log_chat_state function to log chat states and rewards to JSONL files.
- Updated reward functions to log chat states with validation results for better tracking and debugging.
- Updated eval.py to streamline model evaluation using vLLM and unsloth.
- Deleted eval.sh as its functionality is now integrated into eval.py.
- Updated .gitignore to exclude eval_logs directory.
- Broke rl_helpers down into smaller modules.
- Removed deprecated rl_helpers module to streamline the codebase.
- Enhanced the initial user prompt template, inspired by Search-R1.
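
For illustration only, the answer-tag check and the JSONL logging described above could look roughly like the sketch below. It is not the code from this change: the `<answer>...</answer>` tag name, the `chat_state` layout (a dict with a `messages` list), the `log_chat_state` signature, the `chat_states.jsonl` file name, and the placeholder reward of 1.0 are all assumptions.

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path

# Assumption: answers are wrapped in <answer>...</answer>; the real tag may differ.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def log_chat_state(log_dir, chat_state, reward, validation):
    """Append one JSON record per call to <log_dir>/chat_states.jsonl (hypothetical layout)."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "chat_state": chat_state,
        "reward": reward,
        "validation": validation,
    }
    # JSONL: one self-contained JSON object per line, opened in append mode.
    with open(log_dir / "chat_states.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def answer_tag_reward(chat_state, log_dir="logs"):
    """Return 0.0 when the final assistant message has no answer tags."""
    assistant_msgs = [m for m in chat_state["messages"] if m["role"] == "assistant"]
    final = assistant_msgs[-1]["content"] if assistant_msgs else ""
    has_answer = ANSWER_RE.search(final) is not None
    # Placeholder: a real reward would also check the answer's correctness.
    reward = 1.0 if has_answer else 0.0
    log_chat_state(log_dir, chat_state, reward, {"has_answer_tags": has_answer})
    return reward
```

Only the last assistant message is inspected, so answer tags in an earlier turn do not count, which matches the behavior described in the first line of this message.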