From f60ab499ebd9a1347650e3531e766eae7106e054 Mon Sep 17 00:00:00 2001
From: thinhlpg
Date: Mon, 24 Mar 2025 06:02:13 +0700
Subject: [PATCH] chore: update worklog

---
 docs/00_worklog.md | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/docs/00_worklog.md b/docs/00_worklog.md
index 0e8a3cc..3b15ca6 100644
--- a/docs/00_worklog.md
+++ b/docs/00_worklog.md
@@ -2,6 +2,7 @@
 
 ## Backlog
 
+- [ ] @thinhlpg transfers the project to @bachvudinh
 - [ ] Modify `generate_dataset.py` (**ONLY AFTER** the simple training and benchmark works):
     - [ ] As a dataset maker, I want to change from Llama 3.1 8B to an API call (Claude, Gemini, or OpenAI). The original work uses Llama 3.1 8B for the `Self-Bootstrapping` demonstration, but the resulting dataset quality is low.
 - [ ] Experimenting with different chunking strategies
@@ -19,17 +20,17 @@
 
 ## 250324
 
-- [ ] @thinhlpg transfers the project to @bachvudinh
-
-## 250323
-
-- [ ] Train the model
-- [ ] Make the dataset
-- [ ] Upload datasets to HF Hub
+- [ ] Train the model v0
+- [ ] Make the dataset v0
+- [ ] Upload dataset v0 to HF Hub
     - Initial dataset from AutoDidact
     - Paraphrased dataset
 - [ ] Make a simple Gradio demo app
 
+## 250323
+
+- brain.exe and back.exe refused to work 😭
+
 ## 250322
 
 - [x] Moving all the scattered and disorganized stuff I've been working on for the past week into this repo.
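
For the `generate_dataset.py` backlog item in the patch above, here is a minimal sketch of what the Llama-3.1-8B-to-API swap could look like, assuming an OpenAI-style client. The patch does not show the script's internals, so the `generate_qa_pair` helper, the prompt, and the `gpt-4o-mini` model choice are all assumptions for illustration, not the project's actual code.

```python
# Hypothetical sketch only: generate_dataset.py's real structure is not shown
# in this patch; the helper name, prompt, and model below are invented.
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate_qa_pair(chunk: str, model: str = "gpt-4o-mini") -> str:
    """Generate one question-answer pair for a text chunk via a hosted API,
    standing in for the local Llama 3.1 8B self-bootstrapping step."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "Write one question and its answer, both grounded "
                "strictly in the text the user provides.",
            },
            {"role": "user", "content": chunk},
        ],
        temperature=0.7,
    )
    return response.choices[0].message.content
```

The same chunk-in, QA-pair-out interface would let the Claude or Gemini SDKs be dropped in behind `generate_qa_pair` without touching the rest of the dataset pipeline.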