fixed the best practices !

pull/1052/head
harshalmore31 6 days ago
parent cedc7a0062
commit 1fecc98118

@ -383,30 +383,28 @@ print(result)
 ## Best Practices
-1. **Document Processing Strategy**:
-- **Chunking**: Split large documents into 200-500 token chunks for optimal retrieval
-- **Overlap**: Use 20-50 token overlap between chunks to maintain context
-- **Preprocessing**: Clean and normalize text before indexing
-
-2. **Collection Organization**:
-- Use separate collections for different document types (technical docs, policies, etc.)
-- Implement consistent naming conventions for collections
-- Consider document lifecycle and update strategies
-
-3. **Embedding Model Selection**:
-- **Development**: Use `all-MiniLM-L6-v2` for fast iteration
-- **Production**: Use `text-embedding-3-small` or `text-embedding-3-large` for quality
-- **Specialized Domains**: Consider domain-specific embedding models
-
-4. **Performance Optimization**:
-- **Retrieval Count**: Start with 3-5 results, adjust based on performance testing
-- **Batch Operations**: Use `batch_add()` for efficient bulk document ingestion
-- **Metadata Strategy**: Store relevant metadata for enhanced filtering and context
-
-5. **Production Deployment**:
-- Use Qdrant Cloud or self-hosted server for persistent storage
-- Implement proper error handling and retry mechanisms
-- Monitor performance metrics and embedding quality
+| Category | Best Practice | Details |
+|----------|---------------|---------|
+| **Document Processing Strategy** | | |
+| | Chunking | Split large documents into 200-500 token chunks for optimal retrieval |
+| | Overlap | Use 20-50 token overlap between chunks to maintain context |
+| | Preprocessing | Clean and normalize text before indexing |
+| **Collection Organization** | | |
+| | Separation | Use separate collections for different document types (technical docs, policies, etc.) |
+| | Naming | Implement consistent naming conventions for collections |
+| | Lifecycle | Consider document lifecycle and update strategies |
+| **Embedding Model Selection** | | |
+| | Development | Use `all-MiniLM-L6-v2` for fast iteration |
+| | Production | Use `text-embedding-3-small` or `text-embedding-3-large` for quality |
+| | Specialized | Consider domain-specific embedding models for specialized domains |
+| **Performance Optimization** | | |
+| | Retrieval Count | Start with 3-5 results, adjust based on performance testing |
+| | Batch Operations | Use `batch_add()` for efficient bulk document ingestion |
+| | Metadata Strategy | Store relevant metadata for enhanced filtering and context |
+| **Production Deployment** | | |
+| | Storage | Use Qdrant Cloud or self-hosted server for persistent storage |
+| | Error Handling | Implement proper error handling and retry mechanisms |
+| | Monitoring | Monitor performance metrics and embedding quality |
 ## Performance Tips
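
Review note: the Chunking and Overlap rows in this hunk give token ranges but no example. A minimal sketch of what they describe could look like the following. It is not part of this commit or of the library's API: the names `chunk_tokens` and `chunk_text` are hypothetical, and whitespace splitting stands in for a real tokenizer, so the counts are only approximate token counts.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence, chunk_size: int = 300, overlap: int = 30) -> List[list]:
    """Split a token sequence into windows of `chunk_size` tokens that
    share `overlap` tokens with the previous window (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        # Stop once a window reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(tokens):
            break
    return chunks


def chunk_text(text: str, chunk_size: int = 300, overlap: int = 30) -> List[str]:
    """Chunk raw text. Assumption: whitespace-separated words approximate
    tokens; swap in the embedding model's tokenizer for accurate counts."""
    tokens = text.split()
    return [" ".join(chunk) for chunk in chunk_tokens(tokens, chunk_size, overlap)]
```

The defaults (300 and 30) sit inside the 200-500 and 20-50 ranges from the table. Each string returned by `chunk_text` would then be indexed as its own document, ideally with metadata recording the source file and chunk position.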
