ReZero-Search-LLM-Agent-Fork

d57debe0d4 feat: complete docker compose for windows machine main Artem-Darius Weber 2025-04-26 09:59:03 -0700
3510af1fbd added Docker setup Artem-Darius Weber 2025-04-23 16:22:54 -0700
20fb6779c3 docs: add GGUF thinhlpg 2025-04-17 16:31:13 +0700
7cd4d18ee6 feat: enhance model loading with fallback to Hugging Face repo and improved error handling thinhlpg 2025-04-16 10:29:44 +0700
bac5f3b4f7 feat: update config and paths, update data generation script thinhlpg 2025-04-16 03:06:55 +0000
bd1d7ced3b docs: update project description and authors in pyproject.toml; reorganize demo section in README thinhlpg 2025-04-16 03:05:32 +0000
647f7781d5 fix: change data to correct format thinhlpg 2025-04-16 03:02:55 +0000
62cc8137bf docs: add Experiments section to README with detailed run information thinhlpg 2025-04-15 08:57:54 +0000
c153652856 docs: update README with new image thinhlpg 2025-04-15 08:22:34 +0000
ad18169d77 docs: enhance README with demo GIF thinhlpg 2025-04-15 08:01:06 +0000
89e07bc02d chore: remove unused code and dependencies thinhlpg 2025-04-15 05:52:35 +0000
5eabd121a3 docs: update README thinhlpg 2025-04-15 05:48:28 +0000
0b4bf54833 feat: update demo from DeepSearch to ReZero, adjusting related logging and UI components thinhlpg 2025-04-15 05:19:52 +0000
9738b80353 feat: update max generations and output length in evaluation scripts, add memory fraction to server launch thinhlpg 2025-04-15 05:04:33 +0000
7ee65269fb feat: add new evaluation notebook for model testing and checkpoint evaluation thinhlpg 2025-04-15 05:02:29 +0000
bec864038b feat: increase max tokens and new tokens in evaluation scripts thinhlpg 2025-04-14 09:09:01 +0000
dfa420fa49 feat: expand Makefile with serving and evaluation commands thinhlpg 2025-04-14 07:28:10 +0000
6ba963aca3 feat: streamline data preparation in Makefile with a single command thinhlpg 2025-04-14 06:31:23 +0000
424459d840 feat: update evaluation scripts to enhance model configuration and dataset loading, including increased max tokens and added logging thinhlpg 2025-04-14 05:58:27 +0000
bf9f2c4102 docs: update README with setup instructions, quick demo, and data preparation steps for better clarity and usability thinhlpg 2025-04-14 03:13:20 +0000
d7cdb6c917 chore: remove unused scripts thinhlpg 2025-04-14 02:51:20 +0000
1e7514f98e chore: remove outdated documentation files to clean up project structure thinhlpg 2025-04-14 02:49:25 +0000
333d1e596e feat: add prepare-dev-data target and script for Musique dev data transformation thinhlpg 2025-04-13 20:04:35 +0000
504f0c6c8e feat: update reward_em_chunk to match only the LAST required paragraph of the reasoning chain and adjust related tests thinhlpg 2025-04-11 18:39:18 +0000
358875a035 feat: enhance reward_em_chunk function to match multiple paragraphs, add test thinhlpg 2025-04-11 17:21:51 +0000
2df9f39fda feat: update model configuration (longer context) and dataset loading logic for improved performance and flexibility thinhlpg 2025-04-11 17:20:57 +0000
4a1d45271d feat: add scripts for musique data processing thinhlpg 2025-04-11 17:18:18 +0000
74aa673866 chores: add cook notebook for musique and model reasoning pattern thinhlpg 2025-04-11 00:59:12 +0000
14ef79a4f5 feat: [WIP] add bench scripts thinhlpg 2025-04-10 06:48:14 +0000
bd02305efb chores: add cook notebooks thinhlpg 2025-04-09 07:07:13 +0000
d8e949ec7c feat: add Tavily search tab and integrate TavilyClient for web search functionality thinhlpg 2025-04-09 06:11:07 +0000
41b7889a30 feat: integrate QA dataset loading and display gold answers in Gradio interface thinhlpg 2025-04-09 03:28:46 +0000
7376f596a5 feat: add Gradio demo for DeepSearch and update configuration settings thinhlpg 2025-04-09 03:10:43 +0000
7ff3623102 chore: update .gitignore, modify Makefile for installation, and add pyproject.toml for project configuration thinhlpg 2025-04-08 05:58:11 +0000
eebf914a81 refactor: moved modules from src/deepsearch to src/ thinhlpg 2025-04-08 05:58:03 +0000
0f662d4330 refactor: moved FlashRAG submodule from src/ to third_party/ thinhlpg 2025-04-08 05:57:51 +0000
55f34b8503 feat: add FlashRAG as submodule thinhlpg 2025-04-06 22:26:09 +0700
2fec4f2f42 refactor: change repo structure (move code from src/ to src/deepsearch) thinhlpg 2025-04-06 22:22:32 +0700
e3163081a0 docs: add experiment log for llama-3.2-3b-instruct experiments thinhlpg 2025-04-06 21:55:35 +0700
010957cd99 feat: disable randomization option to get_qa_dataset function by default thinhlpg 2025-04-04 14:56:31 +0700
56911a73f9 Update README.md automaticcat 2025-04-04 11:52:39 +0700
1a18cd7bfd feat: update training and evaluation configurations (editable agent generation scripts) thinhlpg 2025-04-04 10:11:23 +0700
77f121662f test: add tests for reward_retry function scenarios thinhlpg 2025-04-04 09:59:07 +0700
c8714e0f6b feat: enhance reward_retry function to handle missing answer tags thinhlpg 2025-04-04 09:58:44 +0700
bf480574a2 fix: minor bug thinhlpg 2025-04-04 00:54:40 +0700
3081d6e36b test: added tests for new reward functions: search strategy and search diversity thinhlpg 2025-04-04 00:28:04 +0700
4de31e0f30 feat: expand reward functions with new strategies and diversity checks thinhlpg 2025-04-04 00:27:40 +0700
d0e6068055 fix: strengthen reward correctness logic to handle the case where the final message is not an answer from the assistant. Also update logs for reward functions for better debugging thinhlpg 2025-04-03 23:23:42 +0700
1bd609dfae test: enhance reward correctness tests with validation logic thinhlpg 2025-04-03 23:03:09 +0700
338655e563 feat: refine user prompt logic for improved clarity and structure thinhlpg 2025-04-03 22:56:22 +0700
6d994feeb2 feat: enhance evaluation scripts for base and LoRA models thinhlpg 2025-04-03 22:45:06 +0700
da60b52bd1 feat: refactor download and upload scripts for improved argument handling (more notebook friendly :D) thinhlpg 2025-04-03 18:19:55 +0700
fa3c0562fe feat: add evaluation scripts for base and LoRA models thinhlpg 2025-04-03 18:18:02 +0700
1047e2fa1c chore: update .gitignore and requirements for unsloth versions thinhlpg 2025-04-03 18:17:33 +0700
83f86869f6 chore: update .gitignore and add new toys data files thinhlpg 2025-04-03 16:55:22 +0700
133cb1ab90 test: add Qwen tokenizer adapter tests thinhlpg 2025-04-03 16:53:09 +0700
6efe01d5ff chore: update Makefile and requirements for testing thinhlpg 2025-04-03 16:51:18 +0700
af7f38c792 feat: add code for qwen architecture thinhlpg 2025-04-03 14:48:44 +0700
e7915a6a8e feat: add util script to upload/download checkpoints thinhlpg 2025-04-03 14:32:25 +0700
9009440663 chore: disable logging, enable torch compile thinhlpg 2025-04-03 13:32:19 +0700
d2f03b96ab feat: enhance evaluation script and remove deprecated shell script thinhlpg 2025-04-03 10:37:21 +0700
908768458c chore: update Makefile and requirements for testing thinhlpg 2025-04-03 10:28:32 +0700
90b45c62ab docs: update docs and notebooks for the past few days, (observation, debugging) thinhlpg 2025-04-03 10:27:17 +0700
3910ef343a test: add unit tests for agent, reward functions, and tokenizer adapters thinhlpg 2025-04-03 10:20:40 +0700
31dcbf5d8a feat: refactor whole code base, add logic for training R1 distil base models, change some template and reward logics thinhlpg 2025-04-03 10:19:06 +0700
c90c03267e feat: change user prompt template to search-r1 inspired format thinhlpg 2025-04-01 06:55:38 +0700
58dcf9a99d refactor: simplify inference script by removing logger, load 16 bit model instead of raw lora finetuned thinhlpg 2025-04-01 04:52:13 +0700
da79e986b6 feat: add new script and functionality in train script to save model in 16 bit format thinhlpg 2025-04-01 04:51:24 +0700
f6b6cca2ce feat: add multiple reference notebooks for model training and inference thinhlpg 2025-04-01 04:18:39 +0700
04593fa8fd style: change line length to 119, organize imports thinhlpg 2025-04-01 04:08:31 +0700
abb18b10d8 feat: add CLI inference script with search functionality thinhlpg 2025-04-01 04:03:38 +0700
fe70896023 chore: add Makefile for installation, code quality checks, style formatting, cleanup, and other tasks thinhlpg 2025-04-01 03:58:55 +0700
60233f2113 chore: update .gitignore thinhlpg 2025-04-01 03:56:48 +0700
fd32bcacfd chores: update worklog and research progress thinhlpg 2025-03-27 16:35:20 +0700
37730095a9 feat: add eval scripts that compare base model performance with the grpo trained model thinhlpg 2025-03-27 16:34:29 +0700
7f2f43aa46 chore: clean up notebooks thinhlpg 2025-03-27 16:33:25 +0700
3c2deaced9 refactor: restructure code base, better centralize logging logic thinhlpg 2025-03-27 16:29:24 +0700
04d56325bb feat: add new reward functions, add less dumb data generation logic, implement better logging thinhlpg 2025-03-25 21:47:15 +0700
b22b02ea1d feat: changed `<reasoning>` tags to `<think>` thinhlpg 2025-03-25 16:44:51 +0700
7d4de89186 chore: update worklog 250324 thinhlpg 2025-03-25 10:31:10 +0700
1bdee261b6 feat: add draft data generation and documentation thinhlpg 2025-03-24 09:02:54 +0700
f19354a8c9 chore: clean up notebook output thinhlpg 2025-03-24 09:01:00 +0700
f60ab499eb chore: update worklog thinhlpg 2025-03-24 06:02:13 +0700
a58722e16f feat: add initial project structure and core functionality thinhlpg 2025-03-23 02:33:21 +0700
91c2476c28 chore: initial commit - the ugliest code i've ever written 💀 Thinh Le 2025-03-21 18:42:40 +0700
bf32fdd897 Initial commit Thinh Le 2025-03-20 14:45:50 +0700
