# Smart Database Swarm

A fully autonomous database management system powered by a hierarchical multi-agent workflow built on the Swarms framework.

## Overview

The Smart Database Swarm is an intelligent database management system that uses specialized AI agents to handle different aspects of database operations. The system follows a hierarchical architecture where a Database Director coordinates specialized worker agents to execute complex database tasks.

## Architecture

### Hierarchical Structure

```
Database Director (Coordinator)
├── Database Creator (Creates databases)
├── Table Manager (Manages table schemas)
├── Data Operations (Handles data insertion/updates)
└── Query Specialist (Executes queries and retrieval)
```

### Agent Specializations

1. **Database Director**: Orchestrates all database operations and coordinates specialist agents
2. **Database Creator**: Specializes in creating and initializing databases
3. **Table Manager**: Expert in table creation, schema design, and structure management
4. **Data Operations**: Handles data insertion, updates, and manipulation
5. **Query Specialist**: Manages database queries, data retrieval, and optimization

## Features

- **Autonomous Database Management**: Complete database lifecycle management
- **Intelligent Task Distribution**: Automatic assignment of tasks to appropriate specialists
- **Schema Validation**: Ensures proper table structures and data integrity
- **Security**: Built-in SQL injection prevention and query validation
- **Performance Optimization**: Query optimization and efficient data operations
- **Comprehensive Error Handling**: Robust error management and reporting
- **Multi-format Data Support**: JSON-based data insertion and flexible query parameters

## Database Tools

### Core Functions

1. **`create_database(database_name, database_path)`**: Creates new SQLite databases
2. **`create_table(database_path, table_name, schema)`**: Creates tables with specified schemas
3. **`insert_data(database_path, table_name, data)`**: Inserts data into tables
4. **`query_database(database_path, query, params)`**: Executes SELECT queries
5. **`update_table_data(database_path, table_name, update_data, where_clause)`**: Updates existing data
6. **`get_database_schema(database_path)`**: Retrieves comprehensive schema information
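
To give a feel for what a tool like `get_database_schema` might do internally, here is a minimal sketch that uses only the standard `sqlite3` module. The helper name `sketch_get_schema` and the shape of the returned dictionary are assumptions made for illustration; the actual tool in `smart_database_swarm.py` may return different fields.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def sketch_get_schema(database_path):
    """Hypothetical helper: map each user table to a list of column descriptions."""
    schema = {}
    with closing(sqlite3.connect(database_path)) as conn:
        tables = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        ).fetchall()
        for (table_name,) in tables:
            # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
            quoted = table_name.replace('"', '""')
            columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
            schema[table_name] = [
                {
                    "name": col[1],
                    "type": col[2],
                    "not_null": bool(col[3]),
                    "primary_key": bool(col[5]),
                }
                for col in columns
            ]
    return schema


# Demonstration against a throwaway database file
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.db")
    with closing(sqlite3.connect(demo_path)) as demo_conn:
        demo_conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )
        demo_conn.commit()
    demo_schema = sketch_get_schema(demo_path)

print(demo_schema)
```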

## Usage Examples

### Basic Usage

```python
from smart_database_swarm import smart_database_swarm

# Simple database creation and setup
task = """
Create a user management database:
1. Create database 'user_system'
2. Create users table with id, username, email, created_at
3. Insert 5 sample users
4. Query all users ordered by creation date
"""

result = smart_database_swarm.run(task=task)
print(result)
```

### E-commerce System

```python
# Complex e-commerce database system
ecommerce_task = """
Create a comprehensive e-commerce database system:

1. Create database 'ecommerce_store'
2. Create tables:
   - customers (id, name, email, phone, address, created_at)
   - products (id, name, description, price, category, stock, created_at)
   - orders (id, customer_id, order_date, total_amount, status)
   - order_items (id, order_id, product_id, quantity, unit_price)

3. Insert sample data:
   - 10 customers with realistic information
   - 20 products across different categories
   - 15 orders with multiple items each

4. Execute analytical queries:
   - Top selling products by quantity
   - Customer lifetime value analysis
   - Monthly sales trends
   - Inventory levels by category
"""

result = smart_database_swarm.run(task=ecommerce_task)
```

### Data Analysis and Reporting

```python
# Advanced data analysis
analysis_task = """
Analyze the existing databases and provide insights:

1. Get schema information for all databases
2. Generate data quality reports
3. Identify optimization opportunities
4. Create performance metrics dashboard
5. Suggest database improvements

Query patterns:
- Customer segmentation analysis
- Product performance metrics
- Order fulfillment statistics
- Revenue analysis by time periods
"""

result = smart_database_swarm.run(task=analysis_task)
```

## Data Formats

### Table Schema Definition

```python
# Column definitions with types and constraints
schema = "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL, email TEXT UNIQUE, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP"
```
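
As a rough illustration of how such a schema string ends up in SQLite, the sketch below places it inside a `CREATE TABLE` statement on an in-memory database. The table name `users` and the direct use of `sqlite3` are assumptions for the example; `create_table` may validate or assemble the statement differently.

```python
import sqlite3

schema = (
    "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL, "
    "email TEXT UNIQUE, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP"
)

conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE IF NOT EXISTS users ({schema})")

# id and created_at are filled in by the column defaults
conn.execute(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    ("John Doe", "john@example.com"),
)
row = conn.execute("SELECT id, name, email, created_at FROM users").fetchone()

# The UNIQUE constraint on email rejects a second row with the same address
try:
    conn.execute(
        "INSERT INTO users (name, email) VALUES (?, ?)",
        ("Another John", "john@example.com"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(row, duplicate_rejected)
```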

### Data Insertion Formats

#### Format 1: List of Dictionaries
```json
[
  {"name": "John Doe", "email": "john@example.com"},
  {"name": "Jane Smith", "email": "jane@example.com"}
]
```

#### Format 2: Columns and Values
```json
{
  "columns": ["name", "email"],
  "values": [
    ["John Doe", "john@example.com"],
    ["Jane Smith", "jane@example.com"]
  ]
}
```
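
One way to handle both formats with a single code path is to normalize them into a column list plus a list of row tuples before inserting. The sketch below shows that idea; `normalize_rows` and `insert_rows` are hypothetical names, not functions from `smart_database_swarm.py`, and the real `insert_data` tool may behave differently.

```python
import json
import sqlite3


def normalize_rows(data):
    """Hypothetical helper: return (columns, rows) for either insertion format."""
    if isinstance(data, str):
        data = json.loads(data)  # accept a JSON string as well as parsed objects
    if isinstance(data, dict) and "columns" in data and "values" in data:
        # Format 2: explicit column list plus a list of value rows
        return list(data["columns"]), [tuple(row) for row in data["values"]]
    if isinstance(data, list) and data and all(isinstance(r, dict) for r in data):
        # Format 1: column order comes from the first dictionary;
        # keys missing from later rows become None (SQL NULL)
        columns = list(data[0])
        return columns, [tuple(row.get(col) for col in columns) for row in data]
    raise ValueError("Unsupported data format")


def insert_rows(conn, table_name, data):
    """Hypothetical helper: bulk-insert normalized rows with bound values."""
    columns, rows = normalize_rows(data)
    column_sql = ", ".join(f'"{col}"' for col in columns)
    placeholders = ", ".join("?" for _ in columns)
    # Identifiers cannot be bound as parameters, so they are quoted; values are bound
    conn.executemany(
        f'INSERT INTO "{table_name}" ({column_sql}) VALUES ({placeholders})', rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")

format_1 = '[{"name": "John Doe", "email": "john@example.com"}]'
format_2 = {"columns": ["name", "email"], "values": [["Jane Smith", "jane@example.com"]]}

inserted = insert_rows(conn, "users", format_1) + insert_rows(conn, "users", format_2)
stored = conn.execute("SELECT name, email FROM users ORDER BY name").fetchall()
print(inserted, stored)
```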

### Update Operations

```json
{
  "salary": 75000,
  "department": "Engineering",
  "last_updated": "2024-01-15"
}
```
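
An update dictionary like this maps naturally onto a parameterized `UPDATE` statement. The following sketch shows one possible translation; `build_update`, the `employees` table, and the `where_params` argument are assumptions for illustration (the documented `update_table_data` signature only lists `where_clause`).

```python
import sqlite3


def build_update(table_name, update_data, where_clause, where_params=()):
    """Hypothetical helper: turn an update dictionary into SQL plus bound values."""
    assignments = ", ".join(f'"{column}" = ?' for column in update_data)
    sql = f'UPDATE "{table_name}" SET {assignments} WHERE {where_clause}'
    return sql, tuple(update_data.values()) + tuple(where_params)


update_data = {"salary": 75000, "department": "Engineering", "last_updated": "2024-01-15"}
sql, params = build_update("employees", update_data, "id = ?", (1,))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, salary INTEGER, department TEXT, last_updated TEXT)"
)
conn.execute("INSERT INTO employees (id, salary, department) VALUES (1, 60000, 'Sales')")

cursor = conn.execute(sql, params)
conn.commit()
updated = conn.execute(
    "SELECT salary, department, last_updated FROM employees WHERE id = 1"
).fetchone()
print(sql, updated)
```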

## Advanced Features

### Security

- **SQL Injection Prevention**: Parameterized queries and input validation
- **Query Validation**: `query_database` accepts only SELECT statements
- **Input Sanitization**: Automatic cleaning and validation of inputs
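
The points above can be sketched with plain `sqlite3`: a SELECT-only guard combined with bound parameters. `run_select` is a hypothetical helper written for this example, and its checks are deliberately conservative (for instance, it rejects `WITH ... SELECT` queries and any semicolon inside a string literal); the project's own validation may be stricter or more permissive.

```python
import sqlite3


def run_select(conn, query, params=()):
    """Hypothetical helper: run a single SELECT statement with bound parameters."""
    statement = query.strip().rstrip(";").strip()
    if not statement.lower().startswith("select"):
        raise ValueError("Only SELECT queries are allowed")
    if ";" in statement:
        # Conservative check: also rejects semicolons inside string literals
        raise ValueError("Multiple statements are not allowed")
    return conn.execute(statement, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("John Doe",))

# Bound parameters are treated as data, so this input cannot alter the statement
hostile_input = "x'; DROP TABLE users; --"
matches = run_select(conn, "SELECT id FROM users WHERE name = ?", (hostile_input,))
remaining = run_select(conn, "SELECT name FROM users")

# Non-SELECT statements are refused before they reach the database
try:
    run_select(conn, "DROP TABLE users")
    rejected = False
except ValueError:
    rejected = True

print(matches, remaining, rejected)
```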

### Performance

- **Connection Management**: Efficient database connection handling
- **Query Optimization**: Intelligent query planning and execution
- **Batch Operations**: Support for bulk data operations

### Error Handling

- **Comprehensive Error Messages**: Detailed error reports with suggested fixes
- **Graceful Degradation**: System continues operating despite individual failures
- **Transaction Safety**: Atomic operations with rollback capabilities
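
To illustrate what atomic operations with rollback look like in SQLite, the sketch below uses the `sqlite3` connection as a context manager, which commits on success and rolls back if an exception escapes the block. The `accounts` table is invented for the example and is not part of the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100), (2, 50)")
conn.commit()

error_message = None
try:
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("UPDATE accounts SET balance = balance + 200 WHERE id = 2")
        # This statement violates the CHECK constraint, so the whole block is undone
        conn.execute("UPDATE accounts SET balance = balance - 200 WHERE id = 1")
except sqlite3.IntegrityError as exc:
    error_message = str(exc)

balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
print(error_message, balances)
```

Because both updates run in one transaction, the first update is undone together with the failing one, and the balances stay at their original values.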

## Best Practices

### Database Design

1. **Use Proper Data Types**: Choose appropriate SQL data types for your data
2. **Implement Constraints**: Use PRIMARY KEY, FOREIGN KEY, and CHECK constraints
3. **Normalize Data**: Follow database normalization principles
4. **Index Strategy**: Create indexes for frequently queried columns
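
The design guidelines above can be made concrete with a small schema. This is an illustrative sketch using two tables loosely modeled on the e-commerce example, not the schema the agents would necessarily generate. Note that SQLite enforces foreign keys only when `PRAGMA foreign_keys = ON` is set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # foreign keys are off by default in SQLite
conn.executescript(
    """
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total_amount REAL NOT NULL CHECK (total_amount >= 0),
        status TEXT NOT NULL DEFAULT 'pending'
    );
    -- Index the column that joins and filters use most often
    CREATE INDEX idx_orders_customer_id ON orders(customer_id);
    """
)

# An order that points at a non-existent customer is rejected
try:
    conn.execute("INSERT INTO orders (customer_id, total_amount) VALUES (999, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'idx_%'"
    )
]

# Inspect how SQLite plans a lookup on the indexed column
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = ?", (1,)
).fetchall()
for step in plan:
    print(step[-1])
print(orphan_rejected, index_names)
```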

### Agent Coordination

1. **Clear Task Definitions**: Provide specific, actionable task descriptions
2. **Sequential Operations**: Order steps so that agents finish dependencies before the steps that rely on them
3. **Comprehensive Requirements**: Include all necessary details in task descriptions
4. **Result Validation**: Review agent outputs for completeness and accuracy

### Data Operations

1. **Back Up Before Updates**: Always back up data before major modifications
2. **Test Queries**: Validate queries on sample data before production execution
3. **Monitor Performance**: Track query execution times and optimize as needed
4. **Validate Data**: Ensure data integrity through proper validation
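
For the backup step, Python's `sqlite3` module provides `Connection.backup` (available since Python 3.7), which copies a database into another connection. The sketch below uses in-memory databases so it runs anywhere; in practice the target would more likely be a file path of your choosing. This is a manual approach, not a feature the document attributes to the swarm itself.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
source.execute("INSERT INTO users (name) VALUES (?)", ("John Doe",))
source.commit()

# Copy the current state before making a risky change
backup = sqlite3.connect(":memory:")
source.backup(backup)

source.execute("UPDATE users SET name = ?", ("Changed",))
source.commit()

source_rows = source.execute("SELECT name FROM users").fetchall()
backup_rows = backup.execute("SELECT name FROM users").fetchall()
print(source_rows, backup_rows)
```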

## File Structure

```
examples/guides/smart_database/
├── smart_database_swarm.py    # Main implementation
├── README.md                  # This documentation
└── databases/                 # Generated databases (auto-created)
```

## Dependencies

- `swarms`: Core framework for multi-agent systems
- `sqlite3`: Database operations (Python standard library)
- `json`: Data serialization (Python standard library)
- `pathlib`: File path operations (Python standard library)
- `loguru`: Minimal logging functionality

## Running the System

```bash
# Navigate to the smart_database directory
cd examples/guides/smart_database

# Run the demonstration
python smart_database_swarm.py

# The system will create databases in the ./databases/ directory
# Check the generated databases and results
```

## Expected Output

The system will create:

1. **Databases**: SQLite database files in the `./databases/` directory
2. **Detailed Results**: JSON-formatted operation results
3. **Agent Coordination**: Logs showing how tasks are distributed
4. **Performance Metrics**: Execution times and success statistics

## Troubleshooting

### Common Issues

1. **Database Not Found**: Ensure database path is correct and accessible
2. **Schema Errors**: Verify SQL syntax in table creation statements
3. **Data Format Issues**: Check JSON formatting for data insertion
4. **Permission Errors**: Ensure write permissions for database directory
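
A quick way to rule out the path and permission issues above is to check the location before handing it to the swarm. `diagnose_database_path` is a hypothetical helper written for this guide, not part of the project; it relies only on `pathlib` and `os.access`.

```python
import os
import tempfile
from pathlib import Path


def diagnose_database_path(database_path):
    """Hypothetical helper: list likely path or permission problems."""
    path = Path(database_path)
    problems = []
    if not path.parent.is_dir():
        problems.append(f"directory does not exist: {path.parent}")
    elif not os.access(path.parent, os.W_OK):
        problems.append(f"directory is not writable: {path.parent}")
    if not path.is_file():
        problems.append(f"database file not found: {path}")
    return problems


# Demonstration with a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    existing = Path(tmp_dir) / "demo.db"
    existing.touch()
    ok_report = diagnose_database_path(existing)
    missing_report = diagnose_database_path(Path(tmp_dir) / "missing_dir" / "demo.db")

print(ok_report)
print(missing_report)
```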

### Debug Mode

Enable verbose logging to see detailed agent interactions:

```python
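your_task = "Get schema information for all databases"  # placeholder: use your own task
smart_database_swarm.verbose = True  # print detailed agent interactions
result = smart_database_swarm.run(task=your_task)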
```

## Contributing

To extend the Smart Database Swarm:

1. **Add New Tools**: Create additional database operation functions
2. **Enhance Agents**: Improve agent prompts and capabilities
3. **Add Database Types**: Support PostgreSQL, MySQL, and other database engines
4. **Performance Optimization**: Implement caching and connection pooling

## License

This project is part of the Swarms framework and follows the same licensing terms.