You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
643 lines
22 KiB
643 lines
22 KiB
# Milvus Cloud RAG Integration with Swarms
|
|
|
|
## Overview
|
|
|
|
Milvus Cloud (also known as Zilliz Cloud) is a fully managed cloud service for Milvus, the world's most advanced open-source vector database. It provides enterprise-grade vector database capabilities with automatic scaling, high availability, and comprehensive security features. Milvus Cloud is designed for production-scale RAG applications that require robust performance, reliability, and minimal operational overhead.
|
|
|
|
## Key Features
|
|
|
|
- **Fully Managed Service**: No infrastructure management required
|
|
- **Auto-scaling**: Automatic scaling based on workload demands
|
|
- **High Availability**: Built-in redundancy and disaster recovery
|
|
- **Multiple Index Types**: Support for various indexing algorithms (IVF, HNSW, ANNOY, etc.)
|
|
- **Rich Metadata Filtering**: Advanced filtering capabilities with complex expressions
|
|
- **Multi-tenancy**: Secure isolation between different applications
|
|
- **Global Distribution**: Available in multiple cloud regions worldwide
|
|
- **Enterprise Security**: End-to-end encryption and compliance certifications
|
|
|
|
## Architecture
|
|
|
|
Milvus Cloud integrates with Swarms agents as a scalable, managed vector database solution:
|
|
|
|
```
|
|
[Agent] -> [Milvus Cloud Memory] -> [Managed Vector DB] -> [Similarity Search] -> [Retrieved Context]
|
|
```
|
|
|
|
The system leverages Milvus Cloud's distributed architecture to provide high-performance vector operations with enterprise-grade reliability.
|
|
|
|
## Setup & Configuration
|
|
|
|
### Installation
|
|
|
|
```bash
|
|
pip install pymilvus[cloud]
|
|
pip install swarms
|
|
pip install litellm
|
|
```
|
|
|
|
### Environment Variables
|
|
|
|
```bash
|
|
# Milvus Cloud credentials
|
|
export MILVUS_CLOUD_URI="https://your-cluster.api.milvuscloud.com"
|
|
export MILVUS_CLOUD_TOKEN="your-api-token"
|
|
|
|
# Optional: Database name (default: "default")
|
|
export MILVUS_DATABASE="your-database"
|
|
|
|
# OpenAI API key for LLM
|
|
export OPENAI_API_KEY="your-openai-api-key"
|
|
```
|
|
|
|
### Dependencies
|
|
|
|
- `pymilvus>=2.3.0`
|
|
- `swarms`
|
|
- `litellm`
|
|
- `numpy`
|
|
|
|
## Code Example
|
|
|
|
```python
|
|
"""
|
|
Milvus Cloud RAG Integration with Swarms Agent
|
|
|
|
This example demonstrates how to integrate Milvus Cloud as a managed vector database
|
|
for RAG operations with Swarms agents using LiteLLM embeddings.
|
|
"""
|
|
|
|
import os
|
|
from typing import List, Dict, Any, Optional
|
|
import numpy as np
|
|
from pymilvus import (
|
|
connections, Collection, FieldSchema, CollectionSchema,
|
|
DataType, utility, MilvusClient
|
|
)
|
|
from swarms import Agent
|
|
from litellm import embedding
|
|
|
|
class MilvusCloudMemory:
|
|
"""Milvus Cloud-based memory system for RAG operations"""
|
|
|
|
def __init__(self,
|
|
collection_name: str = "swarms_knowledge_base",
|
|
embedding_model: str = "text-embedding-3-small",
|
|
dimension: int = 1536,
|
|
index_type: str = "HNSW",
|
|
metric_type: str = "COSINE"):
|
|
"""
|
|
Initialize Milvus Cloud memory system
|
|
|
|
Args:
|
|
collection_name: Name of the Milvus collection
|
|
embedding_model: LiteLLM embedding model name
|
|
dimension: Vector dimension (1536 for text-embedding-3-small)
|
|
index_type: Index type (HNSW, IVF_FLAT, IVF_SQ8, etc.)
|
|
metric_type: Distance metric (COSINE, L2, IP)
|
|
"""
|
|
self.collection_name = collection_name
|
|
self.embedding_model = embedding_model
|
|
self.dimension = dimension
|
|
self.index_type = index_type
|
|
self.metric_type = metric_type
|
|
|
|
# Initialize Milvus Cloud connection
|
|
self.client = self._connect_to_cloud()
|
|
|
|
# Create collection if it doesn't exist
|
|
self.collection = self._create_or_get_collection()
|
|
|
|
def _connect_to_cloud(self):
|
|
"""Connect to Milvus Cloud using credentials"""
|
|
uri = os.getenv("MILVUS_CLOUD_URI")
|
|
token = os.getenv("MILVUS_CLOUD_TOKEN")
|
|
|
|
if not uri or not token:
|
|
raise ValueError("MILVUS_CLOUD_URI and MILVUS_CLOUD_TOKEN must be set")
|
|
|
|
# Using MilvusClient for simplified operations
|
|
client = MilvusClient(
|
|
uri=uri,
|
|
token=token
|
|
)
|
|
|
|
print(f"Connected to Milvus Cloud: {uri}")
|
|
return client
|
|
|
|
def _create_or_get_collection(self):
|
|
"""Create or get the collection with appropriate schema"""
|
|
|
|
# Check if collection exists
|
|
if self.client.has_collection(collection_name=self.collection_name):
|
|
print(f"Collection '{self.collection_name}' already exists")
|
|
return self.client.get_collection(self.collection_name)
|
|
|
|
# Define collection schema
|
|
schema = self.client.create_schema(
|
|
auto_id=True,
|
|
enable_dynamic_field=True
|
|
)
|
|
|
|
# Add fields
|
|
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
|
|
schema.add_field(field_name="embedding", datatype=DataType.FLOAT_VECTOR, dim=self.dimension)
|
|
schema.add_field(field_name="text", datatype=DataType.VARCHAR, max_length=65535)
|
|
schema.add_field(field_name="metadata", datatype=DataType.JSON)
|
|
|
|
# Create collection
|
|
self.client.create_collection(
|
|
collection_name=self.collection_name,
|
|
schema=schema
|
|
)
|
|
|
|
# Create index on vector field
|
|
index_params = {
|
|
"index_type": self.index_type,
|
|
"metric_type": self.metric_type,
|
|
"params": self._get_index_params()
|
|
}
|
|
|
|
self.client.create_index(
|
|
collection_name=self.collection_name,
|
|
field_name="embedding",
|
|
index_params=index_params
|
|
)
|
|
|
|
print(f"Created collection '{self.collection_name}' with {self.index_type} index")
|
|
return self.client.get_collection(self.collection_name)
|
|
|
|
def _get_index_params(self):
|
|
"""Get index parameters based on index type"""
|
|
if self.index_type == "HNSW":
|
|
return {"M": 16, "efConstruction": 200}
|
|
elif self.index_type == "IVF_FLAT":
|
|
return {"nlist": 128}
|
|
elif self.index_type == "IVF_SQ8":
|
|
return {"nlist": 128}
|
|
elif self.index_type == "ANNOY":
|
|
return {"n_trees": 8}
|
|
else:
|
|
return {}
|
|
|
|
def _get_embeddings(self, texts: List[str]) -> List[List[float]]:
|
|
"""Generate embeddings using LiteLLM"""
|
|
response = embedding(
|
|
model=self.embedding_model,
|
|
input=texts
|
|
)
|
|
return [item["embedding"] for item in response["data"]]
|
|
|
|
def add_documents(self, documents: List[str], metadata: List[Dict] = None) -> List[int]:
|
|
"""Add multiple documents to Milvus Cloud"""
|
|
if metadata is None:
|
|
metadata = [{}] * len(documents)
|
|
|
|
# Generate embeddings
|
|
embeddings = self._get_embeddings(documents)
|
|
|
|
# Prepare data for insertion
|
|
data = [
|
|
{
|
|
"embedding": emb,
|
|
"text": doc,
|
|
"metadata": meta
|
|
}
|
|
for emb, doc, meta in zip(embeddings, documents, metadata)
|
|
]
|
|
|
|
# Insert data
|
|
result = self.client.insert(
|
|
collection_name=self.collection_name,
|
|
data=data
|
|
)
|
|
|
|
# Flush to ensure data is written
|
|
self.client.flush(collection_name=self.collection_name)
|
|
|
|
print(f"Added {len(documents)} documents to Milvus Cloud")
|
|
return result["ids"] if "ids" in result else []
|
|
|
|
def add_document(self, document: str, metadata: Dict = None) -> int:
|
|
"""Add a single document to Milvus Cloud"""
|
|
result = self.add_documents([document], [metadata or {}])
|
|
return result[0] if result else None
|
|
|
|
def search(self,
|
|
query: str,
|
|
limit: int = 3,
|
|
filter_expr: str = None,
|
|
output_fields: List[str] = None) -> Dict[str, Any]:
|
|
"""Search for similar documents in Milvus Cloud"""
|
|
|
|
# Generate query embedding
|
|
query_embedding = self._get_embeddings([query])[0]
|
|
|
|
# Set default output fields
|
|
if output_fields is None:
|
|
output_fields = ["text", "metadata"]
|
|
|
|
# Prepare search parameters
|
|
search_params = {
|
|
"metric_type": self.metric_type,
|
|
"params": self._get_search_params()
|
|
}
|
|
|
|
# Perform search
|
|
results = self.client.search(
|
|
collection_name=self.collection_name,
|
|
data=[query_embedding],
|
|
anns_field="embedding",
|
|
search_params=search_params,
|
|
limit=limit,
|
|
expr=filter_expr,
|
|
output_fields=output_fields
|
|
)[0] # Get first (and only) query result
|
|
|
|
# Format results
|
|
formatted_results = {
|
|
"documents": [],
|
|
"metadata": [],
|
|
"scores": [],
|
|
"ids": []
|
|
}
|
|
|
|
for result in results:
|
|
formatted_results["documents"].append(result.get("text", ""))
|
|
formatted_results["metadata"].append(result.get("metadata", {}))
|
|
formatted_results["scores"].append(float(result["distance"]))
|
|
formatted_results["ids"].append(result["id"])
|
|
|
|
return formatted_results
|
|
|
|
def _get_search_params(self):
|
|
"""Get search parameters based on index type"""
|
|
if self.index_type == "HNSW":
|
|
return {"ef": 100}
|
|
elif self.index_type in ["IVF_FLAT", "IVF_SQ8"]:
|
|
return {"nprobe": 16}
|
|
else:
|
|
return {}
|
|
|
|
def delete_documents(self, filter_expr: str) -> int:
|
|
"""Delete documents matching the filter expression"""
|
|
result = self.client.delete(
|
|
collection_name=self.collection_name,
|
|
filter=filter_expr
|
|
)
|
|
print(f"Deleted documents matching: {filter_expr}")
|
|
return result
|
|
|
|
def get_collection_stats(self) -> Dict[str, Any]:
|
|
"""Get collection statistics"""
|
|
stats = self.client.get_collection_stats(collection_name=self.collection_name)
|
|
return {
|
|
"row_count": stats["row_count"],
|
|
"data_size": stats.get("data_size", "N/A"),
|
|
"index_size": stats.get("index_size", "N/A")
|
|
}
|
|
|
|
# Initialize Milvus Cloud memory
|
|
memory = MilvusCloudMemory(
|
|
collection_name="swarms_rag_demo",
|
|
embedding_model="text-embedding-3-small",
|
|
dimension=1536,
|
|
index_type="HNSW", # High performance for similarity search
|
|
metric_type="COSINE"
|
|
)
|
|
|
|
# Sample documents for the knowledge base
|
|
documents = [
|
|
"Milvus Cloud is a fully managed vector database service with enterprise-grade features.",
|
|
"RAG combines retrieval and generation to provide more accurate and contextual AI responses.",
|
|
"Vector embeddings enable semantic search across unstructured data like text and images.",
|
|
"The Swarms framework integrates with multiple vector databases including Milvus Cloud.",
|
|
"LiteLLM provides a unified interface for different embedding models and providers.",
|
|
"Milvus supports various index types including HNSW, IVF, and ANNOY for different use cases.",
|
|
"Auto-scaling in Milvus Cloud ensures optimal performance without manual intervention.",
|
|
"Enterprise security features include end-to-end encryption and compliance certifications.",
|
|
]
|
|
|
|
# Document metadata with rich attributes for filtering
|
|
metadatas = [
|
|
{"category": "database", "topic": "milvus_cloud", "difficulty": "beginner", "type": "overview"},
|
|
{"category": "ai", "topic": "rag", "difficulty": "intermediate", "type": "concept"},
|
|
{"category": "ai", "topic": "embeddings", "difficulty": "intermediate", "type": "concept"},
|
|
{"category": "framework", "topic": "swarms", "difficulty": "beginner", "type": "integration"},
|
|
{"category": "library", "topic": "litellm", "difficulty": "beginner", "type": "tool"},
|
|
{"category": "indexing", "topic": "algorithms", "difficulty": "advanced", "type": "technical"},
|
|
{"category": "scaling", "topic": "cloud", "difficulty": "intermediate", "type": "feature"},
|
|
{"category": "security", "topic": "enterprise", "difficulty": "advanced", "type": "feature"},
|
|
]
|
|
|
|
# Add documents to Milvus Cloud
|
|
print("Adding documents to Milvus Cloud...")
|
|
doc_ids = memory.add_documents(documents, metadatas)
|
|
print(f"Successfully added {len(doc_ids)} documents")
|
|
|
|
# Display collection statistics
|
|
stats = memory.get_collection_stats()
|
|
print(f"Collection stats: {stats}")
|
|
|
|
# Create Swarms agent with Milvus Cloud RAG
|
|
agent = Agent(
|
|
agent_name="MilvusCloud-RAG-Agent",
|
|
agent_description="Enterprise agent with Milvus Cloud-powered RAG for scalable knowledge retrieval",
|
|
model_name="gpt-4o",
|
|
max_loops=1,
|
|
dynamic_temperature_enabled=True,
|
|
)
|
|
|
|
def query_with_milvus_rag(query_text: str,
|
|
limit: int = 3,
|
|
filter_expr: str = None):
|
|
"""Query with RAG using Milvus Cloud for enterprise-scale retrieval"""
|
|
print(f"\nQuerying: {query_text}")
|
|
if filter_expr:
|
|
print(f"Filter: {filter_expr}")
|
|
|
|
# Retrieve relevant documents using Milvus Cloud
|
|
results = memory.search(
|
|
query=query_text,
|
|
limit=limit,
|
|
filter_expr=filter_expr
|
|
)
|
|
|
|
if not results["documents"]:
|
|
print("No relevant documents found")
|
|
return agent.run(query_text)
|
|
|
|
# Prepare context from retrieved documents
|
|
context = "\n".join([
|
|
f"Document {i+1}: {doc}"
|
|
for i, doc in enumerate(results["documents"])
|
|
])
|
|
|
|
# Display retrieved documents with metadata
|
|
print("Retrieved documents:")
|
|
for i, (doc, score, meta) in enumerate(zip(
|
|
results["documents"], results["scores"], results["metadata"]
|
|
)):
|
|
print(f" {i+1}. (Score: {score:.4f}) Category: {meta.get('category', 'N/A')}")
|
|
print(f" {doc[:100]}...")
|
|
|
|
# Enhanced prompt with context
|
|
enhanced_prompt = f"""
|
|
Based on the following retrieved context from our knowledge base, please answer the question:
|
|
|
|
Context:
|
|
{context}
|
|
|
|
Question: {query_text}
|
|
|
|
Please provide a comprehensive answer based primarily on the context provided.
|
|
"""
|
|
|
|
# Run agent with enhanced prompt
|
|
response = agent.run(enhanced_prompt)
|
|
return response
|
|
|
|
# Example usage and testing
|
|
if __name__ == "__main__":
|
|
# Test basic queries
|
|
queries = [
|
|
"What is Milvus Cloud and what makes it enterprise-ready?",
|
|
"How does RAG improve AI responses?",
|
|
"What are the different index types supported by Milvus?",
|
|
"What security features does Milvus Cloud provide?",
|
|
]
|
|
|
|
print("=== Basic RAG Queries ===")
|
|
for query in queries:
|
|
response = query_with_milvus_rag(query, limit=3)
|
|
print(f"Answer: {response}\n")
|
|
print("-" * 80)
|
|
|
|
# Test filtered queries using metadata
|
|
print("\n=== Filtered Queries ===")
|
|
|
|
# Query only advanced topics
|
|
response = query_with_milvus_rag(
|
|
"What are some advanced features?",
|
|
limit=2,
|
|
filter_expr='metadata["difficulty"] == "advanced"'
|
|
)
|
|
print(f"Advanced features: {response}\n")
|
|
|
|
# Query only concepts
|
|
response = query_with_milvus_rag(
|
|
"Explain key AI concepts",
|
|
limit=2,
|
|
filter_expr='metadata["type"] == "concept"'
|
|
)
|
|
print(f"AI concepts: {response}\n")
|
|
|
|
# Query database-related documents
|
|
response = query_with_milvus_rag(
|
|
"Tell me about database capabilities",
|
|
limit=3,
|
|
filter_expr='metadata["category"] == "database" or metadata["category"] == "indexing"'
|
|
)
|
|
print(f"Database capabilities: {response}\n")
|
|
|
|
# Demonstrate adding new documents with metadata
|
|
print("=== Adding New Document ===")
|
|
new_doc = "Milvus Cloud provides automatic backup and disaster recovery for enterprise data protection."
|
|
new_metadata = {
|
|
"category": "backup",
|
|
"topic": "disaster_recovery",
|
|
"difficulty": "intermediate",
|
|
"type": "feature"
|
|
}
|
|
memory.add_document(new_doc, new_metadata)
|
|
|
|
# Query about the new document
|
|
response = query_with_milvus_rag("What backup features are available?")
|
|
print(f"Backup features: {response}\n")
|
|
|
|
# Display final collection statistics
|
|
final_stats = memory.get_collection_stats()
|
|
print(f"Final collection stats: {final_stats}")
|
|
|
|
# Example of deleting documents (use with caution)
|
|
# memory.delete_documents('metadata["category"] == "test"')
|
|
```
|
|
|
|
## Use Cases
|
|
|
|
### 1. Enterprise Knowledge Management
|
|
- **Scenario**: Large-scale corporate knowledge bases with millions of documents
|
|
- **Benefits**: Auto-scaling, high availability, enterprise security
|
|
- **Best For**: Fortune 500 companies, global organizations
|
|
|
|
### 2. Production RAG Applications
|
|
- **Scenario**: Customer-facing AI applications requiring 99.9% uptime
|
|
- **Benefits**: Managed infrastructure, automatic scaling, disaster recovery
|
|
- **Best For**: SaaS platforms, customer support systems
|
|
|
|
### 3. Multi-tenant Applications
|
|
- **Scenario**: Serving multiple customers with isolated data
|
|
- **Benefits**: Built-in multi-tenancy, secure data isolation
|
|
- **Best For**: AI platform providers, B2B SaaS solutions
|
|
|
|
### 4. Global AI Applications
|
|
- **Scenario**: Applications serving users worldwide
|
|
- **Benefits**: Global distribution, edge optimization
|
|
- **Best For**: International companies, global services
|
|
|
|
## Performance Characteristics
|
|
|
|
### Scaling
|
|
- **Auto-scaling**: Automatic compute and storage scaling based on workload
|
|
- **Horizontal Scaling**: Support for billions of vectors across multiple nodes
|
|
- **Vertical Scaling**: On-demand resource allocation for compute-intensive tasks
|
|
|
|
### Performance Metrics
|
|
- **Query Latency**: < 10ms for 95th percentile queries
|
|
- **Throughput**: 10,000+ QPS depending on configuration
|
|
- **Availability**: 99.9% uptime SLA
|
|
- **Consistency**: Tunable consistency levels
|
|
|
|
### Index Types Performance
|
|
|
|
| Index Type | Use Case | Performance | Memory | Accuracy |
|
|
|------------|----------|-------------|---------|----------|
|
|
| **HNSW** | High-performance similarity search | Ultra-fast | Medium | Very High |
|
|
| **IVF_FLAT** | Large datasets with exact results | Fast | High | Perfect |
|
|
| **IVF_SQ8** | Memory-efficient large datasets | Fast | Low | High |
|
|
| **ANNOY** | Read-heavy workloads | Very Fast | Low | High |
|
|
|
|
## Cloud vs Local Deployment
|
|
|
|
### Milvus Cloud Advantages
|
|
- **Fully Managed**: Zero infrastructure management
|
|
- **Enterprise Features**: Advanced security, compliance, monitoring
|
|
- **Global Scale**: Multi-region deployment capabilities
|
|
- **Cost Optimization**: Pay-per-use pricing model
|
|
- **Professional Support**: 24/7 technical support
|
|
|
|
### Configuration Options
|
|
```python
|
|
# Production configuration with advanced features
|
|
memory = MilvusCloudMemory(
|
|
collection_name="production_knowledge_base",
|
|
embedding_model="text-embedding-3-small",
|
|
dimension=1536,
|
|
index_type="HNSW", # Best for similarity search
|
|
metric_type="COSINE"
|
|
)
|
|
|
|
# Development configuration
|
|
memory = MilvusCloudMemory(
|
|
collection_name="dev_knowledge_base",
|
|
embedding_model="text-embedding-3-small",
|
|
dimension=1536,
|
|
index_type="IVF_FLAT", # Balanced performance
|
|
metric_type="L2"
|
|
)
|
|
```
|
|
|
|
## Advanced Features
|
|
|
|
### Rich Metadata Filtering
|
|
```python
|
|
# Complex filter expressions
|
|
filter_expr = '''
|
|
(metadata["category"] == "ai" and metadata["difficulty"] == "advanced")
|
|
or (metadata["topic"] == "embeddings" and metadata["type"] == "concept")
|
|
'''
|
|
|
|
results = memory.search(
|
|
query="advanced AI concepts",
|
|
limit=5,
|
|
filter_expr=filter_expr
|
|
)
|
|
```
|
|
|
|
### Hybrid Search
|
|
```python
|
|
# Combine vector similarity with metadata filtering
|
|
results = memory.search(
|
|
query="machine learning algorithms",
|
|
limit=10,
|
|
filter_expr='metadata["category"] in ["ai", "ml"] and metadata["difficulty"] != "beginner"'
|
|
)
|
|
```
|
|
|
|
### Collection Management
|
|
```python
|
|
# Create multiple collections for different domains
|
|
medical_memory = MilvusCloudMemory(
|
|
collection_name="medical_knowledge",
|
|
embedding_model="text-embedding-3-small"
|
|
)
|
|
|
|
legal_memory = MilvusCloudMemory(
|
|
collection_name="legal_documents",
|
|
embedding_model="text-embedding-3-small"
|
|
)
|
|
```
|
|
|
|
## Best Practices
|
|
|
|
1. **Index Selection**: Choose HNSW for similarity search, IVF for large datasets
|
|
2. **Metadata Design**: Design rich metadata schema for effective filtering
|
|
3. **Batch Operations**: Use batch operations for better throughput
|
|
4. **Connection Pooling**: Implement connection pooling for production applications
|
|
5. **Error Handling**: Implement robust error handling and retry logic
|
|
6. **Monitoring**: Set up monitoring and alerting for performance metrics
|
|
7. **Cost Optimization**: Monitor usage and optimize collection configurations
|
|
8. **Security**: Follow security best practices for authentication and data access
|
|
|
|
## Monitoring and Observability
|
|
|
|
### Key Metrics to Monitor
|
|
- Query latency percentiles (p50, p95, p99)
|
|
- Query throughput (QPS)
|
|
- Error rates and types
|
|
- Collection size and growth
|
|
- Resource utilization
|
|
|
|
### Alerting Setup
|
|
```python
|
|
# Example monitoring integration
|
|
import logging
|
|
|
|
logger = logging.getLogger("milvus_rag")
|
|
|
|
def monitored_search(memory, query, **kwargs):
|
|
start_time = time.time()
|
|
try:
|
|
results = memory.search(query, **kwargs)
|
|
duration = time.time() - start_time
|
|
logger.info(f"Search completed in {duration:.3f}s, found {len(results['documents'])} results")
|
|
return results
|
|
except Exception as e:
|
|
logger.error(f"Search failed: {e}")
|
|
raise
|
|
```
|
|
|
|
## Troubleshooting
|
|
|
|
### Common Issues
|
|
|
|
1. **Connection Errors**
|
|
- Verify MILVUS_CLOUD_URI and MILVUS_CLOUD_TOKEN
|
|
- Check network connectivity and firewall settings
|
|
- Confirm cloud region accessibility
|
|
|
|
2. **Performance Issues**
|
|
- Monitor collection size and index type appropriateness
|
|
- Check query complexity and filter expressions
|
|
- Review auto-scaling configuration
|
|
|
|
3. **Search Accuracy Issues**
|
|
- Verify embedding model consistency
|
|
- Check vector normalization if using cosine similarity
|
|
- Review index parameters and search parameters
|
|
|
|
4. **Quota and Billing Issues**
|
|
- Monitor usage against plan limits
|
|
- Review auto-scaling settings
|
|
- Check billing alerts and notifications
|
|
|
|
This comprehensive guide provides everything needed to integrate Milvus Cloud with Swarms agents for enterprise-scale RAG applications using the unified LiteLLM embeddings approach. |