
# PineconeMemory Documentation

The `PineconeMemory` class is an interface for building Retrieval-Augmented Generation (RAG) systems on top of Pinecone. It adds documents to a Pinecone index and queries that index for similar documents. Custom embedding models, embedding functions, preprocessing and postprocessing functions, and logger settings can be supplied to suit different use cases.


#### Parameters

| Parameter            | Type                                          | Default                           | Description                                                                                          |
|----------------------|-----------------------------------------------|-----------------------------------|------------------------------------------------------------------------------------------------------|
| `api_key`            | `str`                                         | -                                 | Pinecone API key.                                                                                    |
| `environment`        | `str`                                         | -                                 | Pinecone environment.                                                                                |
| `index_name`         | `str`                                         | -                                 | Name of the Pinecone index to use.                                                                   |
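| `dimension`          | `int`                                         | `768`                             | Dimension of the document embeddings. Must match the output size of the embedding model in use; `all-MiniLM-L6-v2` produces 384-dimensional vectors, so pass `dimension=384` when relying on the default model. |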
| `embedding_model`    | `Optional[Any]`                               | `None`                            | Custom embedding model. Defaults to `SentenceTransformer('all-MiniLM-L6-v2')`.                       |
| `embedding_function` | `Optional[Callable[[str], List[float]]]`      | `None`                            | Custom embedding function. Defaults to `_default_embedding_function`.                                |
| `preprocess_function`| `Optional[Callable[[str], str]]`              | `None`                            | Custom preprocessing function. Defaults to `_default_preprocess_function`.                           |
| `postprocess_function`| `Optional[Callable[[List[Dict[str, Any]]], List[Dict[str, Any]]]]`| `None`              | Custom postprocessing function. Defaults to `_default_postprocess_function`.                         |
| `metric`             | `str`                                         | `'cosine'`                        | Distance metric for Pinecone index.                                                                  |
| `pod_type`           | `str`                                         | `'p1'`                            | Pinecone pod type.                                                                                   |
| `namespace`          | `str`                                         | `''`                              | Pinecone namespace.                                                                                  |
| `logger_config`      | `Optional[Dict[str, Any]]`                    | `None`                            | Configuration for the logger. Defaults to logging to `rag_wrapper.log` and console output.           |
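
Since the index `dimension` has to equal the length of the vectors the embedding function returns, it can help to probe the function once before creating the memory. The following is a minimal sketch; `check_embedding_dimension` and `toy_embedding` are hypothetical names used for illustration and are not part of `PineconeMemory`.

```python
from typing import Callable, List


def check_embedding_dimension(
    embedding_function: Callable[[str], List[float]],
    dimension: int,
    probe_text: str = "dimension probe",
) -> int:
    """Embed a short probe string and verify the vector length.

    Hypothetical helper for illustration; not part of PineconeMemory.
    """
    vector = embedding_function(probe_text)
    if len(vector) != dimension:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, "
            f"but the index expects {dimension}"
        )
    return len(vector)


def toy_embedding(text: str) -> List[float]:
    # Stand-in for a real model: always returns a 4-dimensional vector.
    return [float(len(text)), 0.0, 1.0, 2.0]


# Passes because the toy function returns vectors of length 4.
check_embedding_dimension(toy_embedding, dimension=4)
```

Running the same check with the real embedding function and the intended `dimension` surfaces a mismatch before any document is sent to the index.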

### Methods

#### `_setup_logger`

```python
def _setup_logger(self, config: Optional[Dict[str, Any]] = None)
```

Sets up the logger with the given configuration.

#### `_default_embedding_function`

```python
def _default_embedding_function(self, text: str) -> List[float]
```

Generates embeddings using the default SentenceTransformer model.

#### `_default_preprocess_function`

```python
def _default_preprocess_function(self, text: str) -> str
```

Preprocesses the input text by stripping whitespace.
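
As an illustration of the documented behavior, the sketch below shows a plain whitespace strip next to a slightly stricter variant that could be passed as `preprocess_function`. Both function names are hypothetical, and the first is only an approximation of the default based on the description above, not the library's actual source.

```python
import re


def strip_whitespace(text: str) -> str:
    # Mirrors the documented default: trim leading and trailing whitespace.
    return text.strip()


def collapse_whitespace(text: str) -> str:
    # Hypothetical custom variant: also squeeze internal runs of
    # spaces, tabs, and newlines into a single space.
    return re.sub(r"\s+", " ", text).strip()
```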

#### `_default_postprocess_function`

```python
def _default_postprocess_function(self, results: List[Dict[str, Any]]) -> List[Dict[str, Any]]
```

Postprocesses the query results.

#### `add`

Adds a document to the Pinecone index.

| Parameter | Type                  | Default | Description                                   |
|-----------|-----------------------|---------|-----------------------------------------------|
| `doc`     | `str`                 | -       | The document to be added.                     |
| `metadata`| `Optional[Dict[str, Any]]` | `None`  | Additional metadata for the document.         |

#### `query`

Queries the Pinecone index for similar documents.

| Parameter | Type                    | Default | Description                                   |
|-----------|-------------------------|---------|-----------------------------------------------|
| `query`   | `str`                   | -       | The query string.                             |
| `top_k`   | `int`                   | `5`     | The number of top results to return.          |
| `filter`  | `Optional[Dict[str, Any]]` | `None`  | Metadata filter for the query.                |

## Usage


Initialize `PineconeMemory` with your Pinecone API key, environment, and index name. The remaining parameters are optional and fall back to the defaults listed in the parameter table.

#### Example

```python
from swarms_memory import PineconeMemory

# Initialize PineconeMemory
memory = PineconeMemory(
    api_key="your-api-key",
    environment="us-west1-gcp",
    index_name="example-index",
    dimension=384,  # all-MiniLM-L6-v2, the default model, produces 384-dimensional embeddings
)
```

### Adding Documents

Documents can be added to the Pinecone index using the `add` method. The method accepts a document string and optional metadata.

#### Example

```python
doc = "This is a sample document to be added to the Pinecone index."
metadata = {"author": "John Doe", "date": "2024-07-08"}

memory.add(doc, metadata)
```
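
Because `add` takes one document per call, loading several documents means looping over them. The helper below is a small sketch of that loop; `add_many` is a hypothetical name, and it only assumes that the object it receives exposes the `add(doc, metadata)` method described above.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


def add_many(
    memory: Any,
    documents: Iterable[Tuple[str, Optional[Dict[str, Any]]]],
) -> int:
    """Call memory.add once per (doc, metadata) pair.

    Blank documents are skipped. Returns the number of documents added.
    Hypothetical helper for illustration; not part of PineconeMemory.
    """
    added = 0
    for doc, metadata in documents:
        if not doc.strip():
            continue  # nothing useful to embed
        memory.add(doc, metadata)
        added += 1
    return added
```

With the `memory` object from the initialization example, `add_many(memory, [("First document.", {"author": "John Doe"}), ("Second document.", None)])` would add two documents.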

### Querying Documents

The `query` method searches the Pinecone index for documents similar to a query string and returns the `top_k` most similar results. An optional `filter` restricts the search to documents whose metadata matches.

#### Example

```python
query = "Sample query to find similar documents."
results = memory.query(query, top_k=5)

for result in results:
    print(result)
```
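
The `filter` parameter can narrow a query by metadata. The sketch below builds a filter with Pinecone's metadata operators (`$eq`, `$gte`), on the assumption that `PineconeMemory` passes the dictionary to Pinecone unchanged. The `build_filter` helper and the numeric `year` field are hypothetical; Pinecone's range operators apply to numbers, so a date stored as a string such as `"2024-07-08"` cannot be range-filtered this way.

```python
from typing import Any, Dict


def build_filter(author: str, min_year: int) -> Dict[str, Any]:
    """Build a Pinecone-style metadata filter.

    Top-level keys are combined with a logical AND by Pinecone.
    Hypothetical helper for illustration; not part of PineconeMemory.
    """
    return {
        "author": {"$eq": author},    # exact match on a string field
        "year": {"$gte": min_year},   # numeric comparison
    }


metadata_filter = build_filter("John Doe", 2024)

# Assumed usage with the `memory` object from the examples above:
# results = memory.query("Sample query", top_k=5, filter=metadata_filter)
```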

## Additional Information and Tips

### Custom Embedding and Preprocessing Functions

Custom embedding and preprocessing functions can be provided during initialization to tailor the document processing to specific requirements.

#### Example

```python
from typing import List

from swarms_memory import PineconeMemory

def custom_embedding_function(text: str) -> List[float]:
    # Placeholder embedding logic: replace with a call to a real model.
    return [0.1, 0.2, 0.3]

def custom_preprocess_function(text: str) -> str:
    # Custom preprocessing logic: lowercase the text.
    return text.lower()

memory = PineconeMemory(
    api_key="your-api-key",
    environment="us-west1-gcp",
    index_name="example-index",
    dimension=3,  # must equal the length of the vectors returned above
    embedding_function=custom_embedding_function,
    preprocess_function=custom_preprocess_function,
)
```
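
A custom `postprocess_function` can be supplied in the same way. The sketch below drops low-scoring results and sorts the rest. It assumes each result is a dictionary with a numeric `"score"` key, which this document does not confirm, so adjust the key to whatever `query` actually returns. `keep_confident_results` is a hypothetical name.

```python
from typing import Any, Dict, List


def keep_confident_results(
    results: List[Dict[str, Any]],
    min_score: float = 0.5,
) -> List[Dict[str, Any]]:
    """Keep results scoring at least min_score, best first.

    Assumes each result carries a numeric "score" key (an assumption,
    not confirmed by the PineconeMemory documentation).
    """
    kept = [r for r in results if r.get("score", 0.0) >= min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)
```

Since the signature matches `Callable[[List[Dict[str, Any]]], List[Dict[str, Any]]]` when `min_score` is left at its default, the function can be passed directly as `postprocess_function=keep_confident_results`.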

### Logger Configuration

The logger can be configured to suit different logging needs. The default configuration logs to a file and the console.

#### Example

```python
logger_config = {
    "handlers": [
        {"sink": "custom_log.log", "rotation": "1 MB"},
        {"sink": lambda msg: print(msg, end="")},
    ]
}

memory = PineconeMemory(
    api_key="your-api-key",
    environment="us-west1-gcp",
    index_name="example-index",
    logger_config=logger_config
)
```
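
Handlers can also differ in level and format. The variant below is a sketch that assumes each handler dictionary is forwarded to Loguru as-is, so the standard Loguru handler options `level` and `format` apply. The file name `errors.log` is only an example.

```python
import sys

logger_config = {
    "handlers": [
        # Only errors and above go to a rotating file.
        {"sink": "errors.log", "level": "ERROR", "rotation": "1 MB"},
        # Everything from INFO upward goes to stderr in a compact format.
        {
            "sink": sys.stderr,
            "level": "INFO",
            "format": "{time:HH:mm:ss} | {level} | {message}",
        },
    ]
}
```

Pass the dictionary as `logger_config=logger_config`, exactly as in the example above.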

## References and Resources

- [Pinecone Documentation](https://docs.pinecone.io/)
- [SentenceTransformers Documentation](https://www.sbert.net/)
- [Loguru Documentation](https://loguru.readthedocs.io/en/stable/)

For further exploration and examples, refer to the official documentation and resources provided by Pinecone, SentenceTransformers, and Loguru.

In summary, `PineconeMemory` wraps Pinecone for use in retrieval-augmented generation systems, and its hooks for custom embedding, preprocessing, and postprocessing functions let it be adapted to a wide range of applications.