Commit Graph

  • d57debe0d4 feat: complete docker compose for windows machine main Artem-Darius Weber 2025-04-26 09:59:03 -0700
  • 3510af1fbd added Docker setup Artem-Darius Weber 2025-04-23 16:22:54 -0700
  • 20fb6779c3 docs: add GGUF thinhlpg 2025-04-17 16:31:13 +0700
  • 7cd4d18ee6 feat: enhance model loading with fallback to Hugging Face repo and improved error handling thinhlpg 2025-04-16 10:29:44 +0700
  • bac5f3b4f7 feat: update config and paths, update data generation script thinhlpg 2025-04-16 03:06:55 +0000
  • bd1d7ced3b docs: update project description and authors in pyproject.toml; reorganize demo section in README thinhlpg 2025-04-16 03:05:32 +0000
  • 647f7781d5 fix: change data to correct format thinhlpg 2025-04-16 03:02:55 +0000
  • 62cc8137bf docs: add Experiments section to README with detailed run information thinhlpg 2025-04-15 08:57:54 +0000
  • c153652856 docs: update README with new image thinhlpg 2025-04-15 08:22:34 +0000
  • ad18169d77 docs: enhance README with demo GIF thinhlpg 2025-04-15 08:01:06 +0000
  • 89e07bc02d chore: remove unused code and dependencies thinhlpg 2025-04-15 05:52:35 +0000
  • 5eabd121a3 docs: update README thinhlpg 2025-04-15 05:48:28 +0000
  • 0b4bf54833 feat: update demo from DeepSearch to ReZero, adjusting related logging and UI components thinhlpg 2025-04-15 05:19:52 +0000
  • 9738b80353 feat: update max generations and output length in evaluation scripts, add memory fraction to server launch thinhlpg 2025-04-15 05:04:33 +0000
  • 7ee65269fb feat: add new evaluation notebook for model testing and checkpoint evaluation thinhlpg 2025-04-15 05:02:29 +0000
  • bec864038b feat: increase max tokens and new tokens in evaluation scripts thinhlpg 2025-04-14 09:09:01 +0000
  • dfa420fa49 feat: expand Makefile with serving and evaluation commands thinhlpg 2025-04-14 07:28:10 +0000
  • 6ba963aca3 feat: streamline data preparation in Makefile with a single command thinhlpg 2025-04-14 06:31:23 +0000
  • 424459d840 feat: update evaluation scripts to enhance model configuration and dataset loading, including increased max tokens and added logging thinhlpg 2025-04-14 05:58:27 +0000
  • bf9f2c4102 docs: update README with setup instructions, quick demo, and data preparation steps for better clarity and usability thinhlpg 2025-04-14 03:13:20 +0000
  • d7cdb6c917 chore: remove unused scripts thinhlpg 2025-04-14 02:51:20 +0000
  • 1e7514f98e chore: remove outdated documentation files to clean up project structure thinhlpg 2025-04-14 02:49:25 +0000
  • 333d1e596e feat: add prepare-dev-data target and script for Musique dev data transformation thinhlpg 2025-04-13 20:04:35 +0000
  • 504f0c6c8e feat: update reward_em_chunk to match only the LAST required paragraph of the reasoning chain and adjust related tests thinhlpg 2025-04-11 18:39:18 +0000
  • 358875a035 feat: enhance reward_em_chunk function to match multiple paragraphs, add test thinhlpg 2025-04-11 17:21:51 +0000
  • 2df9f39fda feat: update model configuration (longer context) and dataset loading logic for improved performance and flexibility thinhlpg 2025-04-11 17:20:57 +0000
  • 4a1d45271d feat: add scripts for musique data processing thinhlpg 2025-04-11 17:18:18 +0000
  • 74aa673866 chores: add cook notebook for musique and model reasoning pattern thinhlpg 2025-04-11 00:59:12 +0000
  • 14ef79a4f5 feat: [WIP] add bench scripts thinhlpg 2025-04-10 06:48:14 +0000
  • bd02305efb chores: add cook notebooks thinhlpg 2025-04-09 07:07:13 +0000
  • d8e949ec7c feat: add Tavily search tab and integrate TavilyClient for web search functionality thinhlpg 2025-04-09 06:11:07 +0000
  • 41b7889a30 feat: integrate QA dataset loading and display gold answers in Gradio interface thinhlpg 2025-04-09 03:28:46 +0000
  • 7376f596a5 feat: add Gradio demo for DeepSearch and update configuration settings thinhlpg 2025-04-09 03:10:43 +0000
  • 7ff3623102 chore: update .gitignore, modify Makefile for installation, and add pyproject.toml for project configuration thinhlpg 2025-04-08 05:58:11 +0000
  • eebf914a81 refactor: moved modules from src/deepsearch to src/ thinhlpg 2025-04-08 05:58:03 +0000
  • 0f662d4330 refactor: moved FlashRAG submodule from src/ to third_party/ thinhlpg 2025-04-08 05:57:51 +0000
  • 55f34b8503 feat: add FlashRAG as submodule thinhlpg 2025-04-06 22:26:09 +0700
  • 2fec4f2f42 refactor: change repo structure (move code from src/ to src/deepsearch) thinhlpg 2025-04-06 22:22:32 +0700
  • e3163081a0 docs: add experiment log for llama-3.2-3b-instruct experiments thinhlpg 2025-04-06 21:55:35 +0700
  • 010957cd99 feat: disable randomization option to get_qa_dataset function by default thinhlpg 2025-04-04 14:56:31 +0700
  • 56911a73f9 Update README.md automaticcat 2025-04-04 11:52:39 +0700
  • 1a18cd7bfd feat: update training and evaluation configurations (editable agent generation scripts) thinhlpg 2025-04-04 10:11:23 +0700
  • 77f121662f test: add tests for reward_retry function scenarios thinhlpg 2025-04-04 09:59:07 +0700
  • c8714e0f6b feat: enhance reward_retry function to handle missing answer tags thinhlpg 2025-04-04 09:58:44 +0700
  • bf480574a2 fix: minor bug thinhlpg 2025-04-04 00:54:40 +0700
  • 3081d6e36b test: added tests for new reward functions: search strategy and search diversity thinhlpg 2025-04-04 00:28:04 +0700
  • 4de31e0f30 feat: expand reward functions with new strategies and diversity checks thinhlpg 2025-04-04 00:27:40 +0700
  • d0e6068055 fix: strengthen reward correctness logic to handle the case where the final message is not an answer from the assistant. Also update logs for reward functions for better debugging thinhlpg 2025-04-03 23:23:42 +0700
  • 1bd609dfae test: enhance reward correctness tests with validation logic thinhlpg 2025-04-03 23:03:09 +0700
  • 338655e563 feat: refine user prompt logic for improved clarity and structure thinhlpg 2025-04-03 22:56:22 +0700
  • 6d994feeb2 feat: enhance evaluation scripts for base and LoRA models thinhlpg 2025-04-03 22:45:06 +0700
  • da60b52bd1 feat: refactor download and upload scripts for improved argument handling (more notebook friendly :D) thinhlpg 2025-04-03 18:19:55 +0700
  • fa3c0562fe feat: add evaluation scripts for base and LoRA models thinhlpg 2025-04-03 18:18:02 +0700
  • 1047e2fa1c chore: update .gitignore and requirements for unsloth versions thinhlpg 2025-04-03 18:17:33 +0700
  • 83f86869f6 chore: update .gitignore and add new toys data files thinhlpg 2025-04-03 16:55:22 +0700
  • 133cb1ab90 test: add Qwen tokenizer adapter tests thinhlpg 2025-04-03 16:53:09 +0700
  • 6efe01d5ff chore: update Makefile and requirements for testing thinhlpg 2025-04-03 16:51:18 +0700
  • af7f38c792 feat: add code for qwen architecture thinhlpg 2025-04-03 14:48:44 +0700
  • e7915a6a8e feat: add util script to upload/download checkpoints thinhlpg 2025-04-03 14:32:25 +0700
  • 9009440663 chore: disable logging, enable torch compile thinhlpg 2025-04-03 13:32:19 +0700
  • d2f03b96ab feat: enhance evaluation script and remove deprecated shell script thinhlpg 2025-04-03 10:37:21 +0700
  • 908768458c chore: update Makefile and requirements for testing thinhlpg 2025-04-03 10:28:32 +0700
  • 90b45c62ab docs: update docs and notebooks for the past few days, (observation, debugging) thinhlpg 2025-04-03 10:27:17 +0700
  • 3910ef343a test: add unit tests for agent, reward functions, and tokenizer adapters thinhlpg 2025-04-03 10:20:40 +0700
  • 31dcbf5d8a feat: refactor whole code base, add logic for training R1 distil base models, change some template and reward logics thinhlpg 2025-04-03 10:19:06 +0700
  • c90c03267e feat: change user prompt template to search-r1 inspired format thinhlpg 2025-04-01 06:55:38 +0700
  • 58dcf9a99d refactor: simplify inference script by removing logger, load 16 bit model instead of raw lora finetuned thinhlpg 2025-04-01 04:52:13 +0700
  • da79e986b6 feat: add new script and functionality in train script to save model in 16 bit format thinhlpg 2025-04-01 04:51:24 +0700
  • f6b6cca2ce feat: add multiple reference notebooks for model training and inference thinhlpg 2025-04-01 04:18:39 +0700
  • 04593fa8fd style: change line length to 119, organize imports thinhlpg 2025-04-01 04:08:31 +0700
  • abb18b10d8 feat: add CLI inference script with search functionality thinhlpg 2025-04-01 04:03:38 +0700
  • fe70896023 chore: add Makefile for installation, code quality checks, style formatting, cleanup, and other tasks thinhlpg 2025-04-01 03:58:55 +0700
  • 60233f2113 chore: update .gitignore thinhlpg 2025-04-01 03:56:48 +0700
  • fd32bcacfd chores: update worklog and research progress thinhlpg 2025-03-27 16:35:20 +0700
  • 37730095a9 feat: add eval scripts that compare base model performance with the grpo trained model thinhlpg 2025-03-27 16:34:29 +0700
  • 7f2f43aa46 chore: clean up notebooks thinhlpg 2025-03-27 16:33:25 +0700
  • 3c2deaced9 refactor: restructure code base, better centralize logging logic thinhlpg 2025-03-27 16:29:24 +0700
  • 04d56325bb feat: add new reward functions, add less dumb data generation logic, implement better logging thinhlpg 2025-03-25 21:47:15 +0700
  • b22b02ea1d feat: changed `<reasoning>` tags to `<think>` thinhlpg 2025-03-25 16:44:51 +0700
  • 7d4de89186 chore: update worklog 250324 thinhlpg 2025-03-25 10:31:10 +0700
  • 1bdee261b6 feat: add draft data generation and documentation thinhlpg 2025-03-24 09:02:54 +0700
  • f19354a8c9 chore: clean up notebook output thinhlpg 2025-03-24 09:01:00 +0700
  • f60ab499eb chore: update worklog thinhlpg 2025-03-24 06:02:13 +0700
  • a58722e16f feat: add initial project structure and core functionality thinhlpg 2025-03-23 02:33:21 +0700
  • 91c2476c28 chore: initial commit - the ugliest code i've ever written 💀 Thinh Le 2025-03-21 18:42:40 +0700
  • bf32fdd897 Initial commit Thinh Le 2025-03-20 14:45:50 +0700