ReZero-Search-LLM-Agent-Fork

Author	SHA1	Message	Date
thinhlpg	d0e6068055	fix: strengthen reward correctness logic to handle the case where the final message is not an answer from the assistant. Also update logs for reward functions for better debugging - Added 'logs/' directory to .gitignore to exclude log files. - Introduced log_chat_state function to log chat states and rewards to JSONL files. - Updated reward functions to log chat states with validation results for better tracking and debugging.	9 months ago
thinhlpg	1bd609dfae	test: enhance reward correctness tests with validation logic - Updated test cases to include role and tag validation for assistant messages. - Ensured that only properly formatted messages with answer tags are accepted. - Added new test for validating various incorrect formats and their expected outcomes.	9 months ago
thinhlpg	133cb1ab90	test: add Qwen tokenizer adapter tests - Implemented unit tests for the Qwen tokenizer adapter, including format handling, mask generation, and multi-turn conversation support.	9 months ago
thinhlpg	3910ef343a	test: add unit tests for agent, reward functions, and tokenizer adapters	9 months ago