You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
 
thinhlpg 5eabd121a3
docs: update README
4 weeks ago
data chore: update .gitignore and add new toys data files 1 month ago
notebooks feat: add new evaluation notebook for model testing and checkpoint evaluation 4 weeks ago
scripts feat: update max generations and output length in evaluation scripts, add memory fraction to server launch 4 weeks ago
src feat: increase max tokens and new tokens in evaluation scripts 4 weeks ago
tests feat: update reward_em_chunk to match only the LAST required paragraph of the reasoning chain and adjust related tests 4 weeks ago
third_party refactor: moved FlashRAG submodule from src/ to third_party/ 1 month ago
.env.example refactor: restructure code base, better centralize logging logic 1 month ago
.gitignore feat: update max generations and output length in evaluation scripts, add memory fraction to server launch 4 weeks ago
.gitmodules refactor: moved FlashRAG submodule from src/ to third_party/ 1 month ago
Makefile feat: expand Makefile with serving and evaluation commands 4 weeks ago
README.md docs: update README 4 weeks ago
app.py feat: update demo from DeepSearch to ReZero, adjusting related logging and UI components 4 weeks ago
config.py feat: update model configuration (longer context) and dataset loading logic for improved performance and flexibility 4 weeks ago
pyproject.toml feat: add scripts for musique data processing 4 weeks ago
train_grpo.py feat: update model configuration (longer context) and dataset loading logic for improved performance and flexibility 4 weeks ago

README.md

ReZero: Enhancing LLM search ability by trying one-more-time

ReZeroer

ReZero trains a small language model to develop effective search behaviors instead of memorizing static data. It interacts with multiple synthetic search engines, each with unique retrieval mechanisms, to refine queries and persist in searching until it finds exact answers. The project focuses on reinforcement learning, preventing overfitting, and optimizing for efficiency in real-world search applications.

Quick Demo | Setup | Data and Training | Models | References | Acknowledgements

Quick Demo 🚀

Run the interactive web interface to see ReZero in action:

python app.py

This will launch a Gradio interface where you can interact with the model and test different search behaviors.

Setup 🛠️

Clone and install:

git clone https://github.com/menloresearch/ReZero
cd ReZero
pip install -e .

Data and Training 🧠

All necessary training data is included in the data/ folder. To train:

python train_grpo.py

Models 🤖

You can find our models on Hugging Face 🤗! We're committed to open-source and easy access for the research community.

Model Backbone Size Link
ReZero-v0.1 Llama-3.2-3B 3B 🤗 Menlo/ReZero-v0.1-llama-3.2-3b-it-grpo-250404

References 📖

Acknowledgements 🤝

  • This project is kickstarted from the source code of AutoDidact