- Added `train_autodidact_1B.py` for quick testing.
- Updated `00_worklog.md`, `dataset.md`, and `reward-functions.md` to reflect new training strategies and reward functions.
- Updated `00_worklog.md` to reflect optimizations for speed and quality in dataset generation.
- Introduced new documentation files: `choosing-llm-and-prompt-101.md`, `ds-pipeline-v0.md`, and `paraphrase-prompt.md` for better clarity on LLM choices and dataset pipeline.
- Added a Jupyter notebook `250324_generate_data_anatomy.ipynb` to explore the data generation process.
- Added initial files from AutoDidact as a starting point.
- Enhanced `README.md` with project overview and setup instructions.
- Removed `ugly_code_file.py` as part of cleanup.
- Added various documentation files and assets for project clarity.
- Included Jupyter notebooks for training and experimentation.
Dropping this absolute disaster of a code file to break the paralysis.
No more overthinking, no more perfectionism—just write, make it work, and refine later.
Starting this repo with the most unreadable, unformatted, and ugly code possible.
The goal? Trick my brain into not caring about style—just build.
This mess exists to remind me that progress > perfection.
Ship first, clean up later.