- Observation: the model hallucinates search results. Added docs about debugging and adapting to the R1 distill base model, plus notebooks on the details of making training on R1 distill work.
- Added `train_autodidact_1B.py` for quick test.
- Update `00_worklog.md`, `dataset.md`, and `reward-functions.md` to reflect new training strategies and reward functions.
- Updated `00_worklog.md` to reflect optimizations for speed and quality in dataset generation.
- Introduced new documentation files: `choosing-llm-and-prompt-101.md`, `ds-pipeline-v0.md`, and `paraphrase-prompt.md` for better clarity on LLM choices and dataset pipeline.
- Added a Jupyter notebook `250324_generate_data_anatomy.ipynb` to explore the data generation process.
- Added initial files from AutoDidact as a starting point.
- Enhanced `README.md` with a project overview and setup instructions.
- Removed `ugly_code_file.py` as part of cleanup.
- Added various documentation files and assets for project clarity.
- Included Jupyter notebooks for training and experimentation.