ReZero-Search-LLM-Agent-Fork

Author	SHA1	Message	Date
thinhlpg	90b45c62ab	docs: update docs and notebooks for the past few days (observation, debugging) - observation: model hallucinates the search result, docs about debugging and adapting to the R1 distill base model, notebooks detailing how to make R1 distill training work	8 months ago
thinhlpg	fd32bcacfd	chores: update worklog and research progress	8 months ago
thinhlpg	7d4de89186	chore: update worklog 250324 - Added `train_autodidact_1B.py` for quick test. - Updated `00_worklog.md`, `dataset.md`, and `reward-functions.md` to reflect new training strategies and reward functions.	8 months ago
thinhlpg	a58722e16f	feat: add initial project structure and core functionality - Added initial files from AutoDidact as starting point. - Enhanced `README.md` with project overview and setup instructions. - Removed `ugly_code_file.py` as part of cleanup. - Added various documentation files and assets for project clarity. - Included Jupyter notebooks for training and experimentation.	9 months ago