From 5eabd121a3deae0c07213bd588abfb625b3939c8 Mon Sep 17 00:00:00 2001
From: thinhlpg
Date: Tue, 15 Apr 2025 05:48:28 +0000
Subject: [PATCH] docs: update README

---
 README.md | 157 +++++++------------------------------------------------
 1 file changed, 20 insertions(+), 137 deletions(-)

diff --git a/README.md b/README.md
index 7353e82..5393fac 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,18 @@
-# DeepSearch - A Hard Working Search Engine 🔍
+
 
-DeepSearcher
+# ReZero: Enhancing LLM search ability by trying one-more-time
 
-DeepSearch trains a small language model to develop effective search behaviors instead of memorizing static data. It interacts with multiple synthetic search engines, each with unique retrieval mechanisms, to refine queries and persist in searching until it finds exact answers. The project focuses on reinforcement learning, preventing overfitting, and optimizing for efficiency in real-world search applications.
+ReZeroer
+
+ReZero trains a small language model to develop effective search behaviors instead of memorizing static data. It interacts with multiple synthetic search engines, each with unique retrieval mechanisms, to refine queries and persist in searching until it finds exact answers. The project focuses on reinforcement learning, preventing overfitting, and optimizing for efficiency in real-world search applications.
+
+[**Quick Demo**](#quick-demo-) | [**Setup**](#setup-️) | [**Data and Training**](#data-and-training-) | [**Models**](#models-) | [**References**](#references-) | [**Acknowledgements**](#acknowledgements-)
+
+
 
 ## Quick Demo 🚀
 
-Run the interactive web interface to see DeepSearch in action:
+Run the interactive web interface to see ReZero in action:
 
 ```bash
 python app.py
@@ -14,157 +20,34 @@ python app.py
 
 This will launch a Gradio interface where you can interact with the model and test different search behaviors.
 
-You can also evaluate model performance:
-
-```bash
-# Using the evaluation scripts
-python scripts/eval_lora.py --lora_path "/path/to/lora"
-python scripts/eval_base.py
-```
-
 ## Setup 🛠️
 
-1. Clone the repository with submodules:
-
-```bash
-git clone --recurse-submodules [repository-url]
-cd DeepSearch
-```
-
-2. Set up your environment variables:
-
-```bash
-cp .env.example .env
-# Edit .env to add your HuggingFace token and OpenRouter API key
-```
-
-3. Install dependencies using the development setup:
+Clone and install:
 
 ```bash
-make install
+git clone https://github.com/menloresearch/ReZero
+cd ReZero
+pip install -e .
 ```
 
-This installs the project in editable mode along with all dependencies specified in pyproject.toml, including:
-
-- transformers
-- unsloth
-- gradio
-- langchain
-- and other required packages
+## Data and Training 🧠
 
-## Data Preparation 📊
-
-DeepSearch uses the Musique dataset for training and evaluation.
-
-### Download and prepare all data in one step
-
-```bash
-make prepare-all-musique
-```
-
-### Step-by-step data preparation
-
-1. Download the Musique dataset:
-
-   ```bash
-   make download-musique
-   ```
-
-2. Prepare the JSONL files for training:
-
-   ```bash
-   make prepare-musique-jsonl
-   ```
-
-3. Extract paragraphs for indexing:
-
-   ```bash
-   make extract-musique-paragraphs
-   ```
-
-4. Build the FAISS index:
-
-   ```bash
-   make build-musique-index
-   ```
-
-5. Prepare development data:
-
-   ```bash
-   make prepare-dev-data
-   ```
-
-6. Validate data preparation:
-
-   ```bash
-   make check-data
-   ```
-
-## Training 🧠
-
-Train the model using the GRPO (General Reinforcement Learning from Outer Preferences) approach:
+All necessary training data is included in the `data/` folder. To train:
 
 ```bash
 python train_grpo.py
 ```
 
-You can monitor training progress with TensorBoard:
-
-```bash
-make tensorboard
-```
-
-List available training runs:
-
-```bash
-make list-runs
-```
-
-## Development 💻
-
-### Run tests
-
-```bash
-make test
-```
-
-### Code quality and style
-
-```bash
-# Format code
-make style
-
-# Check code quality
-make quality
-
-# Auto-fix issues
-make fix
-```
-
-### Clean up
-
-```bash
-make clean
-```
-
 ## Models 🤖
 
 You can find our models on Hugging Face 🤗! We're committed to open-source and easy access for the research community.
 
 | Model | Backbone | Size | Link |
 |-------|----------|------|------|
-| - | - | - | - |
-
-## Datasets 📚
-
-We've released our datasets on Hugging Face 🤗 to support reproducibility and further research.
-
-| Dataset | Description | Size | Link |
-|---------|-------------|------|------|
-| - | - | - | - |
-| - | - | - | - |
-| - | - | - | - |
+| ReZero-v0.1 | Llama-3.2-3B | 3B | [🤗 Menlo/ReZero-v0.1-llama-3.2-3b-it-grpo-250404](https://huggingface.co/Menlo/ReZero-v0.1-llama-3.2-3b-it-grpo-250404) |
 
 ## References 📖
 
-- This project is kickstarted from [AutoDidact](https://github.com/dCaples/AutoDidact)
+## Acknowledgements 🤝
+
+- This project is kickstarted from the source code of [AutoDidact](https://github.com/dCaples/AutoDidact)