thinhlpg 9009440663
chore: disable logging, enable torch complie
1 month ago
data feat: add initial project structure and core functionality 2 months ago
docs docs: update docs and notebooks for the past few days, (observation, debugging) 1 month ago
notebooks docs: update docs and notebooks for the past few days, (observation, debugging) 1 month ago
scripts feat: add new script and functionality in train script to save model in 16 bit format 1 month ago
src chore: disable logging, enable torch complie 1 month ago
tests test: add unit tests for agent, reward functions, and tokenizer adapters 1 month ago
.env.example refactor: restructure code base, better centralize logging logic 1 month ago
.gitignore feat: enhance evaluation script and remove deprecated shell script 1 month ago
Makefile chore: update Makefile and requirements for testing 1 month ago
README.md feat: add eval scripts that compare base model performance with the grpo trained model 1 month ago
eval.py feat: enhance evaluation script and remove deprecated shell script 1 month ago
inference.py feat: enhance evaluation script and remove deprecated shell script 1 month ago
requirements.txt chore: update Makefile and requirements for testing 1 month ago
train.sh refactor: restructure code base, better centralize logging logic 1 month ago
train_grpo.py feat: refactor whole code base, add logic for training R1 distil base models, change some template and reward logics 1 month ago

README.md

DeepSearch - A Hard Working Search Engine 🔍

DeepSearch trains a small language model to develop effective search behaviors instead of memorizing static data. The model interacts with multiple synthetic search engines, each with its own retrieval mechanism, and learns to refine its queries and keep searching until it finds the exact answer. The project focuses on reinforcement learning, preventing overfitting, and efficiency in real-world search applications.

Project Whiteboard

Setup

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Evaluation

Compare the performance of the base model with that of the LoRA-enhanced model:

# Quick run with defaults
./eval.sh

# Custom run
./eval.sh --lora_path "/path/to/lora" --temperature 0.7

Direct Python usage:

python eval.py --lora_path "/path/to/lora" --temperature 0.7

The tool generates a results file with accuracy metrics and improvement statistics.
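As a rough illustration of what "accuracy metrics and improvement statistics" could mean here, the sketch below scores two sets of predictions against reference answers and reports the absolute and relative gain. It is not taken from eval.py: the names `accuracy`, `summarize`, and `EvalSummary` are hypothetical, and case-insensitive exact match is an assumed scoring rule.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class EvalSummary:
    """Hypothetical container for the comparison; not the format eval.py writes."""
    base_accuracy: float
    lora_accuracy: float
    absolute_gain: float            # lora_accuracy - base_accuracy
    relative_gain: Optional[float]  # absolute_gain / base_accuracy, None if base is 0


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that match their reference.

    Assumption: case-insensitive exact match after trimming whitespace.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)


def summarize(
    base_preds: Sequence[str],
    lora_preds: Sequence[str],
    references: Sequence[str],
) -> EvalSummary:
    """Compare base and LoRA predictions on the same reference answers."""
    base = accuracy(base_preds, references)
    lora = accuracy(lora_preds, references)
    gain = lora - base
    # Relative gain is undefined when the base model gets nothing right.
    relative = gain / base if base > 0 else None
    return EvalSummary(base, lora, gain, relative)


if __name__ == "__main__":
    refs = ["paris", "4", "blue", "mars"]
    base = ["Paris", "5", "red", "venus"]
    lora = ["paris", "4", "blue", "venus"]
    print(summarize(base, lora, refs))
```

Reporting both gains is deliberate: the absolute gain is easy to read, while the relative gain shows how large the change is compared to where the base model started.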

Models

You can find our models on Hugging Face 🤗! We're committed to open-source and easy access for the research community.

| Model | Backbone | Size | Link |
| --- | --- | --- | --- |
| - | - | - | - |

Datasets

We've released our datasets on Hugging Face 🤗 to support reproducibility and further research.

| Dataset | Description | Size | Link |
| --- | --- | --- | --- |
| - | - | - | - |
| - | - | - | - |
| - | - | - | - |

References

Personal Notes

  • This is research code, so I'm prioritizing speed over code quality for now. Expect things to be messy—both the code and commit history. Roasting is welcome, but don't judge me too hard; I'll clean it up later. I don't know what I don't know, but I'm eager (and desperate) to learn and improve, so any constructive feedback is highly appreciated! 💖