18 Commits (f6b6cca2ceed4bcbddc988591110aa92459468f0)
 

Author SHA1 Message Date
thinhlpg f6b6cca2ce feat: add multiple reference notebooks for model training and inference
3 months ago
thinhlpg 04593fa8fd style: change line length to 119, organize imports
3 months ago
thinhlpg abb18b10d8 feat: add CLI inference script with search functionality
3 months ago
thinhlpg fe70896023 chore: add Makefile for installation, code quality checks, style formatting, cleanup, and other tasks
3 months ago
thinhlpg 60233f2113 chore: update .gitignore
3 months ago
thinhlpg fd32bcacfd chores: update worklog and research progress
3 months ago
thinhlpg 37730095a9 feat: add eval scripts that compare base model performance with the grpo trained model
3 months ago
thinhlpg 7f2f43aa46 chore: clean up notebooks
3 months ago
thinhlpg 3c2deaced9 refactor: restructure code base, better centralize logging logic
3 months ago
thinhlpg 04d56325bb feat: add new reward functions, add less dumb data generation logic, implement better logging
3 months ago
thinhlpg b22b02ea1d feat: changed `<reasoning>` tags to `<think>`
3 months ago
thinhlpg 7d4de89186 chore: update worklog 250324
3 months ago
thinhlpg 1bdee261b6 feat: add draft data generation and documentation
3 months ago
thinhlpg f19354a8c9 chore: clean up notebook output
3 months ago
thinhlpg f60ab499eb chore: update worklog
3 months ago
thinhlpg a58722e16f feat: add initial project structure and core functionality
3 months ago
Thinh Le 91c2476c28 chore: initial commit - the ugliest code i've ever written 💀
3 months ago
Thinh Le bf32fdd897 Initial commit
3 months ago