10 Commits (504f0c6c8e137cc56806b3e84760df05bbf8e8f5)

Author SHA1 Message Date
thinhlpg 504f0c6c8e feat: update reward_em_chunk to match only the LAST required paragraph of the reasoning chain and adjust related tests
3 months ago
thinhlpg 358875a035 feat: enhance reward_em_chunk function to match multiple paragraphs, add test
3 months ago
thinhlpg eebf914a81 refactor: moved modules from src/deepsearch to src/
3 months ago
thinhlpg 2fec4f2f42 refactor: change repo structure (move code from src/ to src/deepsearch)
3 months ago
thinhlpg 77f121662f test: add tests for reward_retry function scenarios
3 months ago
thinhlpg 3081d6e36b test: added tests for new reward functions: search strategy and search diversity
3 months ago
thinhlpg d0e6068055 fix: strengthen reward correctness logic to handle the case where the final message is not an answer from the assistant. Also update logs for reward functions for better debugging
3 months ago
thinhlpg 1bd609dfae test: enhance reward correctness tests with validation logic
3 months ago
thinhlpg 133cb1ab90 test: add Qwen tokenizer adapter tests
3 months ago
thinhlpg 3910ef343a test: add unit tests for agent, reward functions, and tokenizer adapters
3 months ago