@ -0,0 +1,11 @@
|
||||
---
|
||||
version: 2
|
||||
build:
|
||||
os: ubuntu-22.04
|
||||
tools:
|
||||
python: "3.11"
|
||||
mkdocs:
|
||||
configuration: docs/mkdocs.yml
|
||||
python:
|
||||
install:
|
||||
- requirements: docs/requirements.txt
|
@ -0,0 +1,978 @@
|
||||
## Building Analyst Agents with Swarms to write Business Reports
|
||||
|
||||
> Jupyter Notebook accompanying this post is accessible at: [Business Analyst Agent Notebook](https://github.com/kyegomez/swarms/blob/master/playground/business-analyst-agent.ipynb)
|
||||
|
||||
Solving a business problem often involves preparing a Business Case Report. This report comprehensively analyzes the problem, evaluates potential solutions, and provides evidence-based recommendations and an implementation plan to address the issue and drive business value. While preparing one requires an experienced business analyst, the workflow can be augmented using AI agents. Two parts of the workflow stand out as candidates:
|
||||
|
||||
- Developing an outline to solve the problem
|
||||
- Doing background research and gathering data
|
||||
|
||||
In this post, we will explore how Swarms agents can be used to tackle a business problem by outlining the solution, conducting background research, and generating a preliminary report.
|
||||
|
||||
Before we proceed, note that this post uses three APIs. Please obtain the following keys and store them in a `.env` file in the same folder as this file.
|
||||
|
||||
- **[OpenAI API](https://openai.com/blog/openai-api)** as `OPENAI_API_KEY`
|
||||
- **[TavilyAI API](https://app.tavily.com/home)** as `TAVILY_API_KEY`
|
||||
- **[KayAI API](https://www.kay.ai/)** as `KAY_API_KEY`
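
Before running anything, it can help to confirm that all three keys were actually loaded. The following is a minimal sketch that assumes the variable names listed above; `missing_keys` is a hypothetical helper and is not part of Swarms or python-dotenv.

```python
import os

# Key names taken from the list above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "TAVILY_API_KEY", "KAY_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required key names that are absent or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    absent = missing_keys()
    if absent:
        print("Missing keys:", ", ".join(absent))
```

Calling `missing_keys()` after `dotenv.load_dotenv()` gives an early, readable failure instead of an authentication error deep inside an agent run.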
|
||||
|
||||
```python
|
||||
import dotenv
|
||||
dotenv.load_dotenv() # Load environment variables from .env file
|
||||
```
|
||||
|
||||
### Developing an Outline to solve the problem
|
||||
|
||||
Assume the business problem is: **How do we improve Nike's revenue in Q3 2024?** We first create a planning agent to break down the problem into dependent sub-problems.
|
||||
|
||||
|
||||
#### Step 1. Defining the Data Model and Tool Schema
|
||||
|
||||
Using Pydantic, we define a structure to help the agent generate sub-problems.
|
||||
|
||||
- **QueryType:** Questions are either standalone or involve a combination of multiple others
|
||||
- **Query:** Defines structure of a question.
|
||||
- **QueryPlan:** Allows generation of a dependency graph of sub-questions
|
||||
|
||||
|
||||
```python
|
||||
import enum
|
||||
from typing import List
|
||||
from pydantic import Field, BaseModel
|
||||
|
||||
class QueryType(str, enum.Enum):
|
||||
"""Enumeration representing the types of queries that can be asked to a question answer system."""
|
||||
|
||||
SINGLE_QUESTION = "SINGLE"
|
||||
MERGE_MULTIPLE_RESPONSES = "MERGE_MULTIPLE_RESPONSES"
|
||||
|
||||
class Query(BaseModel):
|
||||
"""Class representing a single question in a query plan."""
|
||||
|
||||
id: int = Field(..., description="Unique id of the query")
|
||||
question: str = Field(
|
||||
...,
|
||||
description="Question asked using a question answering system",
|
||||
)
|
||||
dependencies: List[int] = Field(
|
||||
default_factory=list,
|
||||
description="List of sub questions that need to be answered before asking this question",
|
||||
)
|
||||
node_type: QueryType = Field(
|
||||
default=QueryType.SINGLE_QUESTION,
|
||||
description="Type of question, either a single question or a multi-question merge",
|
||||
)
|
||||
|
||||
class QueryPlan(BaseModel):
|
||||
"""Container class representing a tree of questions to ask a question answering system."""
|
||||
|
||||
query_graph: List[Query] = Field(
|
||||
..., description="The query graph representing the plan"
|
||||
)
|
||||
|
||||
def _dependencies(self, ids: List[int]) -> List[Query]:
|
||||
"""Returns the dependencies of a query given their ids."""
|
||||
|
||||
return [q for q in self.query_graph if q.id in ids]
|
||||
```
|
||||
|
||||
Also, a `tool_schema` needs to be defined. It is an instance of `QueryPlan` and is used to initialize the agent.
|
||||
|
||||
```python
|
||||
tool_schema = QueryPlan(
|
||||
query_graph = [query.dict() for query in [
|
||||
Query(
|
||||
id=1,
|
||||
question="How do we improve Nike's revenue in Q3 2024?",
|
||||
dependencies=[2],
|
||||
node_type=QueryType('SINGLE')
|
||||
),
|
||||
# ... other queries ...
|
||||
]]
|
||||
)
|
||||
```
|
||||
|
||||
#### Step 2. Defining the Planning Agent
|
||||
|
||||
We specify the query, task specification and an appropriate system prompt.
|
||||
|
||||
```python
|
||||
from swarms import OpenAIChat
|
||||
from swarms import Agent
|
||||
|
||||
query = "How do we improve Nike's revenue in Q3 2024?"
|
||||
task = f"Consider: {query}. Generate just the correct query plan in JSON format."
|
||||
system_prompt = (
|
||||
"You are a world class query planning algorithm "
|
||||
"capable of breaking apart questions into its "
|
||||
"dependency queries such that the answers can be "
|
||||
"used to inform the parent question. Do not answer "
|
||||
"the questions, simply provide a correct compute "
|
||||
"graph with good specific questions to ask and relevant "
|
||||
"dependencies. Before you call the function, think "
|
||||
"step-by-step to get a better understanding of the problem."
|
||||
)
|
||||
llm = OpenAIChat(
|
||||
temperature=0.0, model_name="gpt-4", max_tokens=4000
|
||||
)
|
||||
```
|
||||
|
||||
Then, we proceed with agent definition.
|
||||
|
||||
```python
|
||||
# Initialize the agent
|
||||
agent = Agent(
|
||||
agent_name="Query Planner",
|
||||
system_prompt=system_prompt,
|
||||
# Set the tool schema to the JSON string -- this is the key difference
|
||||
tool_schema=tool_schema,
|
||||
llm=llm,
|
||||
max_loops=1,
|
||||
autosave=True,
|
||||
dashboard=False,
|
||||
streaming_on=True,
|
||||
verbose=True,
|
||||
interactive=False,
|
||||
# Set the output type to the tool schema which is a BaseModel
|
||||
output_type=tool_schema, # or dict, or str
|
||||
metadata_output_type="json",
|
||||
# List of schemas that the agent can handle
|
||||
list_tool_schemas=[tool_schema],
|
||||
function_calling_format_type="OpenAI",
|
||||
function_calling_type="json", # or soon yaml
|
||||
)
|
||||
```
|
||||
|
||||
#### Step 3. Obtaining Outline from Planning Agent
|
||||
|
||||
We now run the agent, and since its output is in JSON format, we can load it as a dictionary.
|
||||
|
||||
```python
|
||||
generated_data = agent.run(task)
|
||||
```
|
||||
|
||||
At times the agent may return extra content in addition to the JSON. The function below filters it out.
|
||||
|
||||
```python
|
||||
def process_json_output(content):
    # Find the index of the first occurrence of '```json\n'
    start_index = content.find('```json\n')
    if start_index == -1:
        # If '```json\n' is not found, return the original content
        return content
    # Return the part of the content after '```json\n', dropping trailing
    # whitespace and the closing '```'
    return content[start_index + len('```json\n'):].rstrip().rstrip('`')
|
||||
|
||||
# Use the function to clean up the output
|
||||
json_content = process_json_output(generated_data.content)
|
||||
|
||||
import json
|
||||
|
||||
# Load the JSON string into a Python object
|
||||
json_object = json.loads(json_content)
|
||||
|
||||
# Convert the Python object back to a JSON string
|
||||
json_content = json.dumps(json_object, indent=2)
|
||||
|
||||
# Print the JSON string
|
||||
print(json_content)
|
||||
```
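
Model replies do not always follow the exact fence format that `process_json_output` expects: the `json` tag may be missing, or text may follow the closing fence. The sketch below is one possible, more tolerant variant. `extract_json` is a hypothetical helper, not part of Swarms, and the fence marker is built from single backticks only to keep this example easy to embed.

```python
import json
import re

FENCE = "`" * 3  # three backticks, the Markdown code-fence marker

# Matches a fenced block, with or without a "json" language tag.
_FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(content: str):
    """Parse the first JSON object found in an LLM reply (hypothetical helper)."""
    match = _FENCE_RE.search(content)
    candidate = match.group(1) if match else content
    # Fall back to the outermost braces if stray text surrounds the object.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    return json.loads(candidate[start:end + 1])
```

Because it returns the parsed object directly, a caller would not need the separate `json.loads` step. Malformed JSON still raises an error, which is usually preferable to continuing with a partial plan.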
|
||||
|
||||
Below is the output this produces:
|
||||
|
||||
```json
|
||||
{
|
||||
"main_query": "How do we improve Nike's revenue in Q3 2024?",
|
||||
"sub_queries": [
|
||||
{
|
||||
"id": "1",
|
||||
"query": "What is Nike's current revenue trend?"
|
||||
},
|
||||
{
|
||||
"id": "2",
|
||||
"query": "What are the projected market trends for the sports apparel industry in 2024?"
|
||||
},
|
||||
{
|
||||
"id": "3",
|
||||
"query": "What are the current successful strategies being used by Nike's competitors?",
|
||||
"dependencies": [
|
||||
"2"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "4",
|
||||
"query": "What are the current and projected economic conditions in Nike's major markets?",
|
||||
"dependencies": [
|
||||
"2"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "5",
|
||||
"query": "What are the current consumer preferences in the sports apparel industry?",
|
||||
"dependencies": [
|
||||
"2"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "6",
|
||||
"query": "What are the potential areas of improvement in Nike's current business model?",
|
||||
"dependencies": [
|
||||
"1"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "7",
|
||||
"query": "What are the potential new markets for Nike to explore in 2024?",
|
||||
"dependencies": [
|
||||
"2",
|
||||
"4"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "8",
|
||||
"query": "What are the potential new products or services Nike could introduce in 2024?",
|
||||
"dependencies": [
|
||||
"5"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "9",
|
||||
"query": "What are the potential marketing strategies Nike could use to increase its revenue in Q3 2024?",
|
||||
"dependencies": [
|
||||
"3",
|
||||
"5",
|
||||
"7",
|
||||
"8"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "10",
|
||||
"query": "What are the potential cost-saving strategies Nike could implement to increase its net revenue in Q3 2024?",
|
||||
"dependencies": [
|
||||
"6"
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
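
Since the plan is generated by a model, a quick sanity check on the dependency references can catch problems before any agents are spawned. This is a minimal sketch that assumes the `sub_queries` shape shown above (string ids, optional `dependencies` list); `validate_plan` is a hypothetical helper.

```python
def validate_plan(sub_queries):
    """Return a list of problems found in a sub-query plan (empty if none)."""
    ids = {q["id"] for q in sub_queries}
    problems = []
    for q in sub_queries:
        for dep in q.get("dependencies", []):
            if dep == q["id"]:
                problems.append(f"query {q['id']} depends on itself")
            elif dep not in ids:
                problems.append(f"query {q['id']} depends on unknown id {dep}")
    return problems


# A shortened copy of the plan above, used only to exercise the check.
sample = [
    {"id": "1", "query": "What is Nike's current revenue trend?"},
    {
        "id": "6",
        "query": "What are the potential areas of improvement in Nike's current business model?",
        "dependencies": ["1"],
    },
]

print(validate_plan(sample))
```

This only checks references. It does not detect longer dependency cycles, which would need a graph traversal.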
|
||||
|
||||
The JSON dictionary is not convenient for humans to process. We make a directed graph out of it.
|
||||
|
||||
```python
|
||||
import networkx as nx
|
||||
import matplotlib.pyplot as plt
|
||||
import textwrap
|
||||
import random
|
||||
|
||||
# Create a directed graph
|
||||
G = nx.DiGraph()
|
||||
|
||||
# Define a color map
|
||||
color_map = {}
|
||||
|
||||
# Add nodes and edges to the graph
|
||||
for sub_query in json_object['sub_queries']:
|
||||
# Check if 'dependencies' key exists in sub_query, if not, initialize it as an empty list
|
||||
if 'dependencies' not in sub_query:
|
||||
sub_query['dependencies'] = []
|
||||
# Assign a random color for each node
|
||||
color_map[sub_query['id']] = "#{:06x}".format(random.randint(0, 0xFFFFFF))
|
||||
G.add_node(sub_query['id'], label=textwrap.fill(sub_query['query'], width=20))
|
||||
for dependency in sub_query['dependencies']:
|
||||
G.add_edge(dependency, sub_query['id'])
|
||||
|
||||
# Draw the graph
|
||||
pos = nx.spring_layout(G)
|
||||
nx.draw(G, pos, with_labels=True, node_size=800, node_color=[color_map[node] for node in G.nodes()], node_shape="o", alpha=0.5, linewidths=40)
|
||||
|
||||
# Prepare labels for legend
|
||||
labels = nx.get_node_attributes(G, 'label')
|
||||
handles = [plt.Line2D([0], [0], marker='o', color=color_map[node], label=f"{node}: {label}", markersize=10, linestyle='None') for node, label in labels.items()]
|
||||
|
||||
# Create a legend
|
||||
plt.legend(handles=handles, title="Queries", bbox_to_anchor=(1.05, 1), loc='upper left')
|
||||
|
||||
plt.show()
|
||||
```
|
||||
|
||||
This produces the diagram below, which makes the plan much easier to understand.
|
||||
|
||||

|
||||
|
||||
### Doing Background Research and Gathering Data
|
||||
|
||||
At this point, we have solved the first half of the problem. We have an outline consisting of sub-problems to be tackled to solve our business problem. This will form the overall structure of our report. We now need to research information for each sub-problem in order to write an informed report. This is mechanically intensive and is the aspect that will benefit most from agentic intervention.
|
||||
|
||||
Essentially, we can spawn parallel agents to gather the data. Each agent will have 2 tools:
|
||||
|
||||
- Internet access
|
||||
- Financial data retrieval
|
||||
|
||||
As they run in parallel, they will add their knowledge to a common long-term memory. We will then spawn a separate report-writing agent with access to this memory to generate our business case report.
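
The pattern of parallel workers writing to one shared store can be sketched with the standard library alone. Everything here is a stand-in: `research` is a placeholder for an agent run, and a lock-protected list plays the role of the long-term memory. It illustrates the pattern only and is not how Swarms implements it.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

shared_memory = []              # stand-in for the long-term memory
memory_lock = threading.Lock()  # guards concurrent appends


def research(sub_query: str) -> str:
    """Placeholder worker: a real one would call the search and retrieval tools."""
    finding = f"notes on: {sub_query}"
    with memory_lock:
        shared_memory.append(finding)
    return finding


def run_workers(sub_queries, max_workers=5):
    """Run one placeholder worker per sub-query and wait for all of them."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(research, sub_queries))
```

`pool.map` returns results in input order, while the order of entries in `shared_memory` depends on thread scheduling. That is why the report is later assembled from an explicit ordering rather than from insertion order.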
|
||||
|
||||
#### Step 4. Defining Tools for Worker Agents
|
||||
|
||||
Let us first define the 2 tools.
|
||||
|
||||
```python
|
||||
import os
|
||||
from typing import List, Dict
|
||||
|
||||
from swarms import tool
|
||||
|
||||
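# load_dotenv() has already placed the keys in os.environ;
# fail early with a clear message if either one is missing
for key in ("TAVILY_API_KEY", "KAY_API_KEY"):
    if not os.getenv(key):
        raise RuntimeError(f"{key} is not set; add it to the .env file")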
|
||||
|
||||
from langchain_community.tools.tavily_search import TavilySearchResults
|
||||
from langchain_core.pydantic_v1 import BaseModel, Field
|
||||
|
||||
from kay.rag.retrievers import KayRetriever
|
||||
|
||||
@tool
|
||||
def browser(query: str) -> str:
|
||||
"""
|
||||
Search the query in the browser with the Tavily API tool.
|
||||
Args:
|
||||
query (str): The query to search in the browser.
|
||||
Returns:
|
||||
str: The search results
|
||||
"""
|
||||
internet_search = TavilySearchResults()
|
||||
results = internet_search.invoke({"query": query})
|
||||
response = ''
|
||||
for result in results:
|
||||
response += (result['content'] + '\n')
|
||||
return response
|
||||
|
||||
@tool
|
||||
def kay_retriever(query: str) -> str:
|
||||
"""
|
||||
Search the financial data query with the KayAI API tool.
|
||||
Args:
|
||||
query (str): The query to search in the KayRetriever.
|
||||
Returns:
|
||||
str: The first context retrieved as a string.
|
||||
"""
|
||||
# Initialize the retriever
|
||||
retriever = KayRetriever(dataset_id = "company", data_types=["10-K", "10-Q", "8-K", "PressRelease"])
|
||||
# Query the retriever
|
||||
context = retriever.query(query=query,num_context=1)
|
||||
return context[0]['chunk_embed_text']
|
||||
```
|
||||
|
||||
#### Step 5. Defining Long-Term Memory
|
||||
|
||||
As mentioned previously, the worker agents, running in parallel, will pool their knowledge into a common memory. Let us define that.
|
||||
|
||||
```python
|
||||
import logging
|
||||
import os
|
||||
import uuid
|
||||
from typing import Callable, List, Optional
|
||||
|
||||
import chromadb
|
||||
import numpy as np
|
||||
from dotenv import load_dotenv
|
||||
|
||||
from swarms.utils.data_to_text import data_to_text
|
||||
from swarms.utils.markdown_message import display_markdown_message
|
||||
from swarms.memory.base_vectordb import AbstractVectorDatabase
|
||||
|
||||
|
||||
# Results storage using local ChromaDB
|
||||
class ChromaDB(AbstractVectorDatabase):
|
||||
"""
|
||||
|
||||
ChromaDB database
|
||||
|
||||
Args:
|
||||
metric (str): The similarity metric to use.
|
||||
output (str): The name of the collection to store the results in.
|
||||
limit_tokens (int, optional): The maximum number of tokens to use for the query. Defaults to 1000.
|
||||
n_results (int, optional): The number of results to retrieve. Defaults to 2.
|
||||
|
||||
Methods:
|
||||
add: _description_
|
||||
query: _description_
|
||||
|
||||
Examples:
|
||||
>>> chromadb = ChromaDB(
|
||||
>>> metric="cosine",
|
||||
>>> output="results",
|
||||
>>> llm="gpt3",
|
||||
>>> openai_api_key=OPENAI_API_KEY,
|
||||
>>> )
|
||||
>>> chromadb.add(task, result, result_id)
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
metric: str = "cosine",
|
||||
output_dir: str = "swarms",
|
||||
limit_tokens: Optional[int] = 1000,
|
||||
n_results: int = 3,
|
||||
embedding_function: Callable = None,
|
||||
docs_folder: str = None,
|
||||
verbose: bool = False,
|
||||
*args,
|
||||
**kwargs,
|
||||
):
|
||||
self.metric = metric
|
||||
self.output_dir = output_dir
|
||||
self.limit_tokens = limit_tokens
|
||||
self.n_results = n_results
|
||||
self.docs_folder = docs_folder
|
||||
self.verbose = verbose
|
||||
|
||||
# Enable ChromaDB info-level logging when verbose
|
||||
if verbose:
|
||||
logging.getLogger("chromadb").setLevel(logging.INFO)
|
||||
|
||||
# Create Chroma collection
|
||||
chroma_persist_dir = "chroma"
|
||||
chroma_client = chromadb.PersistentClient(
|
||||
settings=chromadb.config.Settings(
|
||||
persist_directory=chroma_persist_dir,
|
||||
),
|
||||
*args,
|
||||
**kwargs,
|
||||
)
|
||||
|
||||
# Embedding model
|
||||
if embedding_function:
|
||||
self.embedding_function = embedding_function
|
||||
else:
|
||||
self.embedding_function = None
|
||||
|
||||
# Create ChromaDB client
|
||||
self.client = chromadb.Client()
|
||||
|
||||
# Create Chroma collection
|
||||
self.collection = chroma_client.get_or_create_collection(
|
||||
name=output_dir,
|
||||
metadata={"hnsw:space": metric},
|
||||
embedding_function=self.embedding_function,
|
||||
# data_loader=self.data_loader,
|
||||
*args,
|
||||
**kwargs,
|
||||
)
|
||||
display_markdown_message(
|
||||
"ChromaDB collection created:"
|
||||
f" {self.collection.name} with metric: {self.metric} and"
|
||||
f" output directory: {self.output_dir}"
|
||||
)
|
||||
|
||||
# If docs
|
||||
if docs_folder:
|
||||
display_markdown_message(
|
||||
f"Traversing directory: {docs_folder}"
|
||||
)
|
||||
self.traverse_directory()
|
||||
|
||||
def add(
|
||||
self,
|
||||
document: str,
|
||||
*args,
|
||||
**kwargs,
|
||||
):
|
||||
"""
|
||||
Add a document to the ChromaDB collection.
|
||||
|
||||
Args:
|
||||
document (str): The document to be added.
|
||||
condition (bool, optional): The condition to check before adding the document. Defaults to True.
|
||||
|
||||
Returns:
|
||||
str: The ID of the added document.
|
||||
"""
|
||||
try:
|
||||
doc_id = str(uuid.uuid4())
|
||||
self.collection.add(
|
||||
ids=[doc_id],
|
||||
documents=[document],
|
||||
*args,
|
||||
**kwargs,
|
||||
)
|
||||
print('-----------------')
|
||||
print("Document added successfully")
|
||||
print('-----------------')
|
||||
return doc_id
|
||||
except Exception as e:
|
||||
raise Exception(f"Failed to add document: {str(e)}")
|
||||
|
||||
def query(
|
||||
self,
|
||||
query_text: str,
|
||||
*args,
|
||||
**kwargs,
|
||||
):
|
||||
"""
|
||||
Query documents from the ChromaDB collection.
|
||||
|
||||
Args:
|
||||
query (str): The query string.
|
||||
n_docs (int, optional): The number of documents to retrieve. Defaults to 1.
|
||||
|
||||
Returns:
|
||||
dict: The retrieved documents.
|
||||
"""
|
||||
try:
|
||||
docs = self.collection.query(
|
||||
query_texts=[query_text],
|
||||
n_results=self.n_results,
|
||||
*args,
|
||||
**kwargs,
|
||||
)["documents"]
|
||||
return docs[0]
|
||||
except Exception as e:
|
||||
raise Exception(f"Failed to query documents: {str(e)}")
|
||||
|
||||
def traverse_directory(self):
|
||||
"""
|
||||
Traverse through every file in the given directory and its subdirectories,
|
||||
and return the paths of all files.
|
||||
Parameters:
|
||||
- directory_name (str): The name of the directory to traverse.
|
||||
Returns:
|
||||
- list: A list of paths to each file in the directory and its subdirectories.
|
||||
"""
|
||||
added_to_db = False
|
||||
|
||||
for root, dirs, files in os.walk(self.docs_folder):
|
||||
for file in files:
|
||||
file = os.path.join(root, file)  # join with the directory currently being walked
|
||||
_, ext = os.path.splitext(file)
|
||||
data = data_to_text(file)
|
||||
added_to_db = self.add(data)  # add() wraps the document in a list itself
|
||||
print(f"{file} added to Database")
|
||||
|
||||
return added_to_db
|
||||
```
|
||||
|
||||
We can now proceed to initialize the memory.
|
||||
|
||||
```python
|
||||
from chromadb.utils import embedding_functions
|
||||
default_ef = embedding_functions.DefaultEmbeddingFunction()
|
||||
|
||||
memory = ChromaDB(
|
||||
metric="cosine",
|
||||
n_results=3,
|
||||
output_dir="results",
|
||||
embedding_function=default_ef
|
||||
)
|
||||
```
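
To make the `add`/`query` contract concrete without a running ChromaDB instance, here is a toy stand-in with the same two methods. It ranks documents by shared words rather than embeddings, so it only illustrates the interface; `ToyMemory` is a hypothetical class and not part of Swarms or Chroma.

```python
class ToyMemory:
    """In-memory stand-in exposing the same add/query shape as the class above."""

    def __init__(self, n_results: int = 3):
        self.n_results = n_results
        self.documents = []

    def add(self, document: str) -> int:
        self.documents.append(document)
        return len(self.documents) - 1  # list position doubles as an id

    def query(self, query_text: str):
        # Rank by number of shared lowercase words; a vector store would
        # compare embeddings instead.
        words = set(query_text.lower().split())
        ranked = sorted(
            self.documents,
            key=lambda doc: len(words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[: self.n_results]
```

A stand-in like this can be handy for testing agent logic offline, since anything that only calls `add` and `query` should work with either object.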
|
||||
|
||||
#### Step 6. Defining Worker Agents
|
||||
|
||||
The Worker Agent subclasses the `Agent` class. The only difference between the two is in how the `run()` method works. In the `Agent` class, `run()` simply returns the set of tool commands to run but does not execute them. We want the commands to be executed. In addition, the tools return the relevant information as output, and we want to add this information to our memory. To incorporate these two changes, we define `WorkerAgent` as follows.
|
||||
|
||||
```python
|
||||
class WorkerAgent(Agent):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def run(self, task, *args, **kwargs):
        response = super().run(task, *args, **kwargs)
        # Nothing to execute if the base agent returned no response
        if response is None:
            return None
        print(response.content)

        json_dict = json.loads(process_json_output(response.content))

        # The plan holds either a list under "commands" or a single "command"
        if "commands" in json_dict:
            commands = json_dict["commands"]
        else:
            commands = [json_dict["command"]]

        for command in commands:
            tool_name = command["name"]

            if tool_name not in ['browser', 'kay_retriever']:
                continue

            query = command["args"]["query"]

            # Get the tool by its name
            tool = globals()[tool_name]
            tool_response = tool(query)

            # Add tool's output to long term memory
            self.long_term_memory.add(tool_response)
|
||||
```
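
The `run()` override above looks tools up through `globals()`. One alternative is an explicit registry that maps tool names to callables, which keeps the set of callable tools in one place. The sketch below shows that dispatch step in isolation, using plain functions and a list as memory. `normalize_commands` and `execute_commands` are hypothetical names, not Swarms APIs.

```python
def normalize_commands(plan: dict) -> list:
    """Accept either a "commands" list or a single "command" object."""
    if "commands" in plan:
        return list(plan["commands"])
    if "command" in plan:
        return [plan["command"]]
    return []


def execute_commands(plan: dict, registry: dict, memory: list) -> int:
    """Run each known tool named in the plan and store its output.

    Returns the number of tool calls executed.
    """
    executed = 0
    for command in normalize_commands(plan):
        tool_fn = registry.get(command.get("name"))
        if tool_fn is None:
            continue  # unknown tool name: skip rather than fail
        memory.append(tool_fn(command["args"]["query"]))
        executed += 1
    return executed
```

With a registry such as `{"browser": browser, "kay_retriever": kay_retriever}`, adding a tool becomes a one-line change and the allow-list check happens in the same lookup.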
|
||||
|
||||
We can then instantiate an object of the `WorkerAgent` class.
|
||||
|
||||
```python
|
||||
worker_agent = WorkerAgent(
|
||||
agent_name="Worker Agent",
|
||||
system_prompt=(
|
||||
"Autonomous agent that can interact with browser, "
|
||||
"financial data retriever and other agents. Be Helpful "
|
||||
"and Kind. Use the tools provided to assist the user. "
|
||||
"Generate the plan with list of commands in JSON format."
|
||||
),
|
||||
llm=OpenAIChat(
|
||||
temperature=0.0, model_name="gpt-4", max_tokens=4000
|
||||
),
|
||||
max_loops="auto",
|
||||
autosave=True,
|
||||
dashboard=False,
|
||||
streaming_on=True,
|
||||
verbose=True,
|
||||
stopping_token="<DONE>",
|
||||
interactive=True,
|
||||
tools=[browser, kay_retriever],
|
||||
long_term_memory=memory,
|
||||
code_interpreter=True,
|
||||
)
|
||||
```
|
||||
|
||||
#### Step 7. Running the Worker Agents
|
||||
|
||||
At this point, we need to set up a concurrent workflow. While the order of adding tasks to the workflow doesn't matter (since they will all run concurrently later when executed), we can take some time to define an order for these tasks. This order will come in handy later when writing the report using our Writer Agent.
|
||||
|
||||
The order we will follow is a Breadth First Traversal (BFT) of the sub-queries in the graph we made earlier (shown below again for reference). BFT fits here because we want every parent question that a child depends on to be answered before the child question. Since the graph can contain independent subgraphs, we perform BFT separately on each subgraph.
|
||||
|
||||

|
||||
|
||||
Below is the code that produces the order of processing sub-queries.
|
||||
|
||||
```python
|
||||
from collections import deque, defaultdict
|
||||
|
||||
# Define the graph nodes
|
||||
nodes = json_object['sub_queries']
|
||||
|
||||
# Create a graph from the nodes
|
||||
graph = defaultdict(list)
|
||||
for node in nodes:
|
||||
for dependency in node['dependencies']:
|
||||
graph[dependency].append(node['id'])
|
||||
|
||||
# Find all nodes with no dependencies (potential starting points)
|
||||
start_nodes = [node['id'] for node in nodes if not node['dependencies']]
|
||||
|
||||
# Adjust the BFT function to handle dependencies correctly
|
||||
def bft_corrected(start, graph, nodes_info):
    visited = set()
    queue = deque([start])
    order = []

    while queue:
        node = queue.popleft()
        if node in visited:
            continue
        # A node is only queued once all of its dependencies are visited
        # (start nodes have none), so it can be processed right away
        visited.add(node)
        order.append(node)
        # Add only nodes to the queue whose dependencies are fully met
        for next_node in graph[node]:
            if all(dep in visited for dep in nodes_info[next_node]['dependencies']):
                queue.append(next_node)

    return order
|
||||
|
||||
# Dictionary to access node information quickly
|
||||
nodes_info = {node['id']: node for node in nodes}
|
||||
|
||||
# Perform BFT for each unvisited start node using the corrected BFS function
|
||||
visited_global = set()
|
||||
bfs_order = []
|
||||
|
||||
for start in start_nodes:
|
||||
if start not in visited_global:
|
||||
order = bft_corrected(start, graph, nodes_info)
|
||||
bfs_order.extend(order)
|
||||
visited_global.update(order)
|
||||
|
||||
print("BFT Order:", bfs_order)
|
||||
```
|
||||
|
||||
This produces the following output.
|
||||
|
||||
```python
|
||||
BFT Order: ['1', '6', '10', '2', '3', '4', '5', '7', '8', '9']
|
||||
```
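
The BFT order above can be cross-checked with a topological sort from Python's standard library (`graphlib`, available from Python 3.9). This is a sketch on a subset of the plan. The helper names are hypothetical, and the exact order returned may differ from the BFT order while still respecting every dependency.

```python
from graphlib import TopologicalSorter

# Node -> set of nodes it depends on (a subset of the plan shown earlier).
dependencies = {
    "1": set(),
    "2": set(),
    "3": {"2"},
    "6": {"1"},
    "10": {"6"},
}


def dependency_order(deps):
    """Return one ordering in which every node follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def respects_dependencies(order, deps):
    """Check that each node appears after everything it depends on."""
    position = {node: i for i, node in enumerate(order)}
    return all(position[d] < position[n] for n, ds in deps.items() for d in ds)
```

`respects_dependencies(bfs_order, ...)` can be applied to the order computed earlier as well. If the plan contained a cycle, `static_order()` would raise `graphlib.CycleError` instead of silently dropping nodes.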
|
||||
|
||||
Now, let's define our `ConcurrentWorkflow` and run it.
|
||||
|
||||
```python
|
||||
import os
|
||||
from dotenv import load_dotenv
|
||||
from swarms import Agent, ConcurrentWorkflow, OpenAIChat, Task
|
||||
|
||||
# Create a workflow
|
||||
workflow = ConcurrentWorkflow(max_workers=5)
|
||||
task_list = []
|
||||
|
||||
for node in bfs_order:
|
||||
sub_query =nodes_info[node]['query']
|
||||
task = Task(worker_agent, sub_query)
|
||||
print('-----------------')
|
||||
print("Added task: ", sub_query)
|
||||
print('-----------------')
|
||||
task_list.append(task)
|
||||
|
||||
workflow.add(tasks=task_list)
|
||||
|
||||
# Run the workflow
|
||||
workflow.run()
|
||||
```
|
||||
|
||||
Below is part of the output this workflow produces. We clearly see the thought process of the agent and the plan it came up with to solve a particular sub-query. In addition, we see the tool-calling schema it produces in `"command"`.
|
||||
|
||||
```python
|
||||
...
|
||||
...
|
||||
content='\n{\n "thoughts": {\n "text": "To find out Nike\'s current revenue trend, I will use the financial data retriever tool to search for \'Nike revenue trend\'.",\n "reasoning": "The financial data retriever tool allows me to search for specific financial data, so I can look up the current revenue trend of Nike.", \n "plan": "Use the financial data retriever tool to search for \'Nike revenue trend\'. Parse the result to get the current revenue trend and format that into a readable report."\n },\n "command": {\n "name": "kay_retriever", \n "args": {\n "query": "Nike revenue trend"\n }\n }\n}\n```' response_metadata={'token_usage': {'completion_tokens': 152, 'prompt_tokens': 1527, 'total_tokens': 1679}, 'model_name': 'gpt-4', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}
|
||||
Saved agent state to: Worker Agent_state.json
|
||||
|
||||
{
|
||||
"thoughts": {
|
||||
"text": "To find out Nike's current revenue trend, I will use the financial data retriever tool to search for 'Nike revenue trend'.",
|
||||
"reasoning": "The financial data retriever tool allows me to search for specific financial data, so I can look up the current revenue trend of Nike.",
|
||||
"plan": "Use the financial data retriever tool to search for 'Nike revenue trend'. Parse the result to get the current revenue trend and format that into a readable report."
|
||||
},
|
||||
"command": {
|
||||
"name": "kay_retriever",
|
||||
"args": {
|
||||
"query": "Nike revenue trend"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
-----------------
|
||||
Document added successfully
|
||||
-----------------
|
||||
...
|
||||
...
|
||||
```
|
||||
|
||||
Here, `"name"` is the name of the tool to be called and `"args"` holds the arguments to be passed to the tool call. As mentioned before, we modify `Agent`'s default behaviour in `WorkerAgent`. Hence, the tool call is executed here and its results (information from web pages and the Kay Retriever API) are added to long-term memory. The message `Document added successfully` confirms this.
|
||||
|
||||
|
||||
#### Step 8. Generating the report using Writer Agent
|
||||
|
||||
At this point, our Worker Agents have gathered all the background information required to generate the report. We have also defined a coherent structure for the report, which follows the BFT order of answering the sub-queries. Now it's time to define a Writer Agent and call it sequentially in the order of the sub-queries.
|
||||
|
||||
```python
|
||||
from swarms import Agent, OpenAIChat, tool
|
||||
|
||||
agent = Agent(
|
||||
agent_name="Writer Agent",
|
||||
agent_description=(
|
||||
"This agent writes reports based on information in long-term memory"
|
||||
),
|
||||
system_prompt=(
"You are a world-class financial report writer. "
"Write analytical and accurate responses using memory to answer the query. "
"Do not mention use of long-term memory in the report. "
"Do not mention Writer Agent in response. "
"Return only response content in strict markdown format."
),
|
||||
llm=OpenAIChat(temperature=0.2, model='gpt-3.5-turbo'),
|
||||
max_loops=1,
|
||||
autosave=True,
|
||||
verbose=True,
|
||||
long_term_memory=memory,
|
||||
)
|
||||
```
|
||||
|
||||
The individual sections of the report will be collected in a list.
|
||||
|
||||
```python
|
||||
report = []
|
||||
```
|
||||
|
||||
Let us now run the writer agent.
|
||||
|
||||
```python
|
||||
for node in bfs_order:
|
||||
sub_query =nodes_info[node]['query']
|
||||
print("Running task: ", sub_query)
|
||||
out = agent.run(f"Consider: {sub_query}. Write response in strict markdown format using long-term memory. Do not mention Writer Agent in response.")
|
||||
print(out)
|
||||
try:
|
||||
report.append(out.content)
|
||||
except:
|
||||
pass
|
||||
```
|
||||
|
||||
Now, we need to clean up the report a bit to make it render professionally.
|
||||
|
||||
```python
|
||||
# Remove any content before the first "#" as that signals start of heading
|
||||
# Anything before this usually contains filler content
|
||||
stripped_report = [entry[entry.find('#'):] if '#' in entry else entry for entry in report]
|
||||
report = stripped_report
|
||||
|
||||
# At times the LLM outputs \\n instead of \n
|
||||
cleaned_report = [entry.replace("\\n", "\n") for entry in report]
|
||||
import re
|
||||
|
||||
# Function to clean up unnecessary metadata from the report entries
|
||||
def clean_report(report):
|
||||
cleaned_report = []
|
||||
for entry in report:
|
||||
# This pattern matches 'response_metadata={' followed by any characters that are not '}' (non-greedy),
|
||||
# possibly nested inside other braces, until the closing '}'.
|
||||
cleaned_entry = re.sub(r"response_metadata=\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}", "", entry, flags=re.DOTALL)
|
||||
cleaned_report.append(cleaned_entry)
|
||||
return cleaned_report
|
||||
|
||||
# Apply the cleaning function to the markdown report
|
||||
cleaned_report = clean_report(cleaned_report)
|
||||
```
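
To see what the three clean-up steps do, it helps to run them on a single made-up entry. The sketch below folds them into one hypothetical helper, `clean_entry`, using the same regular expression as above. The sample string is invented for illustration, and the final `.strip()` is an addition not present in the original code.

```python
import re

# Same pattern as above: matches response_metadata={...} with at most one
# level of nested braces.
METADATA_RE = re.compile(
    r"response_metadata=\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}", re.DOTALL
)


def clean_entry(entry: str) -> str:
    """Apply the three clean-up steps to one report section."""
    if "#" in entry:
        entry = entry[entry.find("#"):]  # drop filler before the heading
    entry = entry.replace("\\n", "\n")   # literal backslash-n -> newline
    return METADATA_RE.sub("", entry).strip()  # remove metadata, trim ends


# Invented sample: filler text, an escaped newline, and trailing metadata.
sample = (
    "Sure, here it is:\n# Revenue\\nGrowing. "
    "response_metadata={'token_usage': {'total_tokens': 10}, 'model_name': 'gpt-4'}"
)
print(clean_entry(sample))
```

Note that the pattern handles only one level of nested braces, so metadata with deeper nesting would be left partly in place.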
|
||||
|
||||
After cleaning, we join the parts of the report together to get our final report.
|
||||
|
||||
```python
|
||||
final_report = ' \n '.join(cleaned_report)
|
||||
```
|
||||
|
||||
In a Jupyter Notebook, the code below renders it as Markdown.
|
||||
|
||||
```python
|
||||
from IPython.display import display, Markdown
|
||||
|
||||
display(Markdown(final_report))
|
||||
```
|
||||
|
||||
|
||||
## Final Generated Report
|
||||
|
||||
|
||||
### Nike's Current Revenue Trend
|
||||
|
||||
Nike's current revenue trend has been steadily increasing over the past few years. In the most recent fiscal year, Nike reported a revenue of $37.4 billion, which was a 7% increase from the previous year. This growth can be attributed to strong sales in key markets, successful marketing campaigns, and a focus on innovation in product development. Overall, Nike continues to demonstrate strong financial performance and is well-positioned for future growth.
|
||||
### Potential Areas of Improvement in Nike's Business Model
|
||||
|
||||
1. **Sustainability Practices**: Nike could further enhance its sustainability efforts by reducing its carbon footprint, using more eco-friendly materials, and ensuring ethical labor practices throughout its supply chain.
|
||||
|
||||
2. **Diversification of Product Portfolio**: While Nike is known for its athletic footwear and apparel, diversifying into new product categories or expanding into untapped markets could help drive growth and mitigate risks associated with a single product line.
|
||||
|
||||
3. **E-commerce Strategy**: Improving the online shopping experience, investing in digital marketing, and leveraging data analytics to personalize customer interactions could boost online sales and customer loyalty.
|
||||
|
||||
4. **Innovation and R&D**: Continuously investing in research and development to stay ahead of competitors, introduce new technologies, and enhance product performance could help maintain Nike's competitive edge in the market.
|
||||
|
||||
5. **Brand Image and Reputation**: Strengthening brand image through effective marketing campaigns, community engagement, and transparent communication with stakeholders can help build trust and loyalty among consumers.
|
||||
### Potential Cost-Saving Strategies for Nike to Increase Net Revenue in Q3 2024
|
||||
|
||||
1. **Supply Chain Optimization**: Streamlining the supply chain, reducing transportation costs, and improving inventory management can lead to significant cost savings for Nike.
|
||||
|
||||
2. **Operational Efficiency**: Implementing lean manufacturing practices, reducing waste, and optimizing production processes can help lower production costs and improve overall efficiency.
|
||||
|
||||
3. **Outsourcing Non-Core Functions**: Outsourcing non-core functions such as IT services, customer support, or logistics can help reduce overhead costs and focus resources on core business activities.
|
||||
|
||||
4. **Energy Efficiency**: Investing in energy-efficient technologies, renewable energy sources, and sustainable practices can lower utility costs and demonstrate a commitment to environmental responsibility.
|
||||
|
||||
5. **Negotiating Supplier Contracts**: Negotiating better terms with suppliers, leveraging economies of scale, and exploring alternative sourcing options can help lower procurement costs and improve margins.
|
||||
|
||||
By implementing these cost-saving strategies, Nike can improve its bottom line and increase net revenue in Q3 2024.
|
||||
### Projected Market Trends for the Sports Apparel Industry in 2024
|
||||
|
||||
1. **Sustainable Fashion**: Consumers are increasingly demanding eco-friendly and sustainable products, leading to a rise in sustainable sportswear options in the market.
|
||||
|
||||
2. **Digital Transformation**: The sports apparel industry is expected to continue its shift towards digital platforms, with a focus on e-commerce, personalized shopping experiences, and digital marketing strategies.
|
||||
|
||||
3. **Athleisure Wear**: The trend of athleisure wear, which combines athletic and leisure clothing, is projected to remain popular in 2024 as consumers seek comfort and versatility in their apparel choices.
|
||||
|
||||
4. **Innovative Materials**: Advances in technology and material science are likely to drive the development of innovative fabrics and performance-enhancing materials in sports apparel, catering to the demand for high-quality and functional products.
|
||||
|
||||
5. **Health and Wellness Focus**: With a growing emphasis on health and wellness, sports apparel brands are expected to incorporate features that promote comfort, performance, and overall well-being in their products.
|
||||
|
||||
Overall, the sports apparel industry in 2024 is anticipated to be characterized by sustainability, digitalization, innovation, and a focus on consumer health and lifestyle trends.
|
||||
### Current Successful Strategies Used by Nike's Competitors
|
||||
|
||||
1. **Adidas**: Adidas has been successful in leveraging collaborations with celebrities and designers to create limited-edition collections that generate hype and drive sales. They have also focused on sustainability initiatives, such as using recycled materials in their products, to appeal to environmentally conscious consumers.
|
||||
|
||||
2. **Under Armour**: Under Armour has differentiated itself by targeting performance-driven athletes and emphasizing technological innovation in their products. They have also invested heavily in digital marketing and e-commerce to reach a wider audience and enhance the customer shopping experience.
|
||||
|
||||
3. **Puma**: Puma has successfully capitalized on the athleisure trend by offering stylish and versatile sportswear that can be worn both in and out of the gym. They have also focused on building partnerships with influencers and sponsoring high-profile athletes to increase brand visibility and credibility.
|
||||
|
||||
4. **Lululemon**: Lululemon has excelled in creating a strong community around its brand, hosting events, classes, and collaborations to engage with customers beyond just selling products. They have also prioritized customer experience by offering personalized services and creating a seamless omnichannel shopping experience.
|
||||
|
||||
5. **New Balance**: New Balance has carved out a niche in the market by emphasizing quality craftsmanship, heritage, and authenticity in their products. They have also focused on customization and personalization options for customers, allowing them to create unique and tailored footwear and apparel.
|
||||
|
||||
Overall, Nike's competitors have found success through a combination of innovative product offerings, strategic marketing initiatives, and a focus on customer engagement and experience.
|
||||
### Current and Projected Economic Conditions in Nike's Major Markets
|
||||
|
||||
1. **United States**: The United States, being one of Nike's largest markets, is currently experiencing moderate economic growth driven by consumer spending, low unemployment rates, and a rebound in manufacturing. However, uncertainties surrounding trade policies, inflation, and interest rates could impact consumer confidence and spending in the near future.
|
||||
|
||||
2. **China**: China remains a key market for Nike, with a growing middle class and increasing demand for sportswear and athletic footwear. Despite recent trade tensions with the U.S., China's economy is projected to continue expanding, driven by domestic consumption, infrastructure investments, and technological advancements.
|
||||
|
||||
3. **Europe**: Economic conditions in Europe vary across countries, with some experiencing sluggish growth due to Brexit uncertainties, political instability, and trade tensions. However, overall consumer confidence is improving, and the sports apparel market is expected to grow, driven by e-commerce and sustainability trends.
|
||||
|
||||
4. **Emerging Markets**: Nike's presence in emerging markets such as India, Brazil, and Southeast Asia provides opportunities for growth, given the rising disposable incomes, urbanization, and increasing focus on health and fitness. However, challenges such as currency fluctuations, regulatory changes, and competition from local brands could impact Nike's performance in these markets.
|
||||
|
||||
Overall, Nike's major markets exhibit a mix of opportunities and challenges, with economic conditions influenced by global trends, geopolitical factors, and consumer preferences.
|
||||
### Current Consumer Preferences in the Sports Apparel Industry
|
||||
|
||||
1. **Sustainability**: Consumers are increasingly seeking eco-friendly and sustainable options in sports apparel, driving brands to focus on using recycled materials, reducing waste, and promoting ethical practices.
|
||||
|
||||
2. **Athleisure**: The trend of athleisure wear continues to be popular, with consumers looking for versatile and comfortable clothing that can be worn both during workouts and in everyday life.
|
||||
|
||||
3. **Performance and Functionality**: Consumers prioritize performance-enhancing features in sports apparel, such as moisture-wicking fabrics, breathable materials, and ergonomic designs that enhance comfort and mobility.
|
||||
|
||||
4. **Personalization**: Customization options, personalized fit, and unique design elements are appealing to consumers who seek individuality and exclusivity in their sports apparel choices.
|
||||
|
||||
5. **Brand Transparency**: Consumers value transparency in brand practices, including supply chain transparency, ethical sourcing, and clear communication on product quality and manufacturing processes.
|
||||
|
||||
Overall, consumer preferences in the sports apparel industry are shifting towards sustainability, versatility, performance, personalization, and transparency, influencing brand strategies and product offerings.
|
||||
### Potential New Markets for Nike to Explore in 2024
|
||||
|
||||
1. **India**: With a growing population, increasing disposable incomes, and a rising interest in health and fitness, India presents a significant opportunity for Nike to expand its presence and tap into a large consumer base.
|
||||
|
||||
2. **Africa**: The African market, particularly countries with emerging economies and a young population, offers potential for Nike to introduce its products and capitalize on the growing demand for sportswear and athletic footwear.
|
||||
|
||||
3. **Middle East**: Countries in the Middle East, known for their luxury shopping destinations and a growing interest in sports and fitness activities, could be strategic markets for Nike to target and establish a strong foothold.
|
||||
|
||||
4. **Latin America**: Markets in Latin America, such as Brazil, Mexico, and Argentina, present opportunities for Nike to cater to a diverse consumer base and leverage the region's passion for sports and active lifestyles.
|
||||
|
||||
5. **Southeast Asia**: Rapid urbanization, increasing urban middle-class population, and a trend towards health and wellness in countries like Indonesia, Thailand, and Vietnam make Southeast Asia an attractive region for Nike to explore and expand its market reach.
|
||||
|
||||
By exploring these new markets in 2024, Nike can diversify its geographical presence, reach untapped consumer segments, and drive growth in emerging economies.
|
||||
### Potential New Products or Services Nike Could Introduce in 2024
|
||||
|
||||
1. **Smart Apparel**: Nike could explore the integration of technology into its apparel, such as smart fabrics that monitor performance metrics, provide feedback, or enhance comfort during workouts.
|
||||
|
||||
2. **Athletic Accessories**: Introducing a line of athletic accessories like gym bags, water bottles, or fitness trackers could complement Nike's existing product offerings and provide additional value to customers.
|
||||
|
||||
3. **Customization Platforms**: Offering personalized design options for footwear and apparel through online customization platforms could appeal to consumers seeking unique and tailored products.
|
||||
|
||||
4. **Athletic Recovery Gear**: Developing recovery-focused products like compression wear, recovery sandals, or massage tools could cater to athletes and fitness enthusiasts looking to enhance post-workout recovery.
|
||||
|
||||
5. **Sustainable Collections**: Launching sustainable collections made from eco-friendly materials, recycled fabrics, or biodegradable components could align with consumer preferences for environmentally conscious products.
|
||||
|
||||
By introducing these new products or services in 2024, Nike can innovate its product portfolio, cater to evolving consumer needs, and differentiate itself in the competitive sports apparel market.
|
||||
### Potential Marketing Strategies for Nike to Increase Revenue in Q3 2024
|
||||
|
||||
1. **Influencer Partnerships**: Collaborating with popular athletes, celebrities, or social media influencers to promote Nike products can help reach a wider audience and drive sales.
|
||||
|
||||
2. **Interactive Campaigns**: Launching interactive marketing campaigns, contests, or events that engage customers and create buzz around new product releases can generate excitement and increase brand visibility.
|
||||
|
||||
3. **Social Media Engagement**: Leveraging social media platforms to connect with consumers, share user-generated content, and respond to feedback can build brand loyalty and encourage repeat purchases.
|
||||
|
||||
4. **Localized Marketing**: Tailoring marketing messages, promotions, and product offerings to specific regions or target demographics can enhance relevance and appeal to diverse consumer groups.
|
||||
|
||||
5. **Customer Loyalty Programs**: Implementing loyalty programs, exclusive offers, or rewards for repeat customers can incentivize brand loyalty, increase retention rates, and drive higher lifetime customer value.
|
||||
|
||||
By employing these marketing strategies in Q3 2024, Nike can enhance its brand presence, attract new customers, and ultimately boost revenue growth.
|
||||
|
@ -0,0 +1,42 @@
|
||||
## **Applications of Swarms: Revolutionizing Customer Support**
|
||||
|
||||
---
|
||||
|
||||
**Introduction**:
|
||||
In today's fast-paced digital world, responsive and efficient customer support is a linchpin for business success. The introduction of AI-driven swarms in the customer support domain can transform the way businesses interact with and assist their customers. By leveraging the combined power of multiple AI agents working in concert, businesses can achieve unprecedented levels of efficiency, customer satisfaction, and operational cost savings.
|
||||
|
||||
---
|
||||
|
||||
### **The Benefits of Using Swarms for Customer Support:**
|
||||
|
||||
1. **24/7 Availability**: Swarms never sleep. Customers receive instantaneous support at any hour, ensuring constant satisfaction and loyalty.
|
||||
|
||||
2. **Infinite Scalability**: Whether it's ten inquiries or ten thousand, swarms can handle fluctuating volumes with ease, eliminating the need for vast human teams and minimizing response times.
|
||||
|
||||
3. **Adaptive Intelligence**: Swarms learn collectively, meaning that a solution found for one customer can be instantly applied to benefit all. This leads to constantly improving support experiences, evolving with every interaction.
|
||||
|
||||
---
|
||||
|
||||
### **Features - Reinventing Customer Support**:
|
||||
|
||||
- **AI Inbox Monitor**: Continuously scans email inboxes, identifying and categorizing support requests for swift responses.
|
||||
|
||||
- **Intelligent Debugging**: Proactively helps customers by diagnosing and troubleshooting underlying issues.
|
||||
|
||||
- **Automated Refunds & Coupons**: Seamless integration with payment systems like Stripe allows for instant issuance of refunds or coupons if a problem remains unresolved.
|
||||
|
||||
- **Full System Integration**: Holistically connects with CRM, email systems, and payment portals, ensuring a cohesive and unified support experience.
|
||||
|
||||
- **Conversational Excellence**: With advanced LLMs (large language models), the swarm agents can engage in natural, human-like conversations, enhancing customer comfort and trust.
|
||||
|
||||
- **Rule-based Operation**: By working with rule engines, swarms ensure that all actions adhere to company guidelines, ensuring consistent, error-free support.
|
||||
|
||||
- **Turing Test Ready**: Crafted to meet and exceed the Turing Test standards, ensuring that every customer interaction feels genuine and personal.
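
As a rough illustration of the "Rule-based Operation" and "Automated Refunds & Coupons" ideas, the sketch below gates a proposed refund behind simple company rules before anything is issued. Everything here (the `SupportAction` type, the `MAX_AUTO_REFUND` limit, the rule functions) is hypothetical and not part of the Swarms API; a real deployment would call a payment provider rather than return a string.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

MAX_AUTO_REFUND = 50.0  # hypothetical policy limit, in dollars


@dataclass
class SupportAction:
    kind: str             # e.g. "refund" or "coupon"
    amount: float
    issue_resolved: bool


# A rule returns None when the action is allowed, or a reason string when not.
Rule = Callable[[SupportAction], Optional[str]]


def only_if_unresolved(action: SupportAction) -> Optional[str]:
    if action.issue_resolved:
        return "issue already resolved"
    return None


def within_refund_limit(action: SupportAction) -> Optional[str]:
    if action.kind == "refund" and action.amount > MAX_AUTO_REFUND:
        return "refund exceeds automatic limit"
    return None


RULES: List[Rule] = [only_if_unresolved, within_refund_limit]


def review(action: SupportAction) -> str:
    """Approve the action only if every rule allows it; otherwise escalate."""
    for rule in RULES:
        reason = rule(action)
        if reason is not None:
            return f"escalate to human: {reason}"
    return f"approved: {action.kind} of ${action.amount:.2f}"


print(review(SupportAction("refund", 20.0, False)))
print(review(SupportAction("refund", 500.0, False)))
```

Keeping the rules as plain functions in a list means a new guideline can be added without touching the review loop.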
|
||||
|
||||
---
|
||||
|
||||
**Conclusion**:
|
||||
Swarms are not just another technological advancement; they represent the future of customer support. Their ability to provide round-the-clock, scalable, and continuously improving support can redefine customer experience standards. By adopting swarms, businesses can stay ahead of the curve, ensuring unparalleled customer loyalty and satisfaction.
|
||||
|
||||
**Experience the future of customer support. Dive into the swarm revolution.**
|
||||
|
@ -0,0 +1,105 @@
|
||||
## Usage Documentation: Discord Bot with Advanced Features
|
||||
|
||||
---
|
||||
|
||||
### Overview:
|
||||
|
||||
This code provides a structure for a Discord bot with advanced features such as voice channel interactions, image generation, and text-based interactions using OpenAI models.
|
||||
|
||||
---
|
||||
|
||||
### Setup:
|
||||
|
||||
1. Ensure that the necessary libraries are installed:
|
||||
```bash
pip install discord.py python-dotenv dalle3 invoke openai
```
|
||||
|
||||
2. Create a `.env` file in the same directory as your bot script and add the following:
|
||||
```
DISCORD_TOKEN=your_discord_bot_token
STORAGE_SERVICE=your_storage_service_endpoint
SAVE_DIRECTORY=path_to_save_generated_images
```
|
||||
|
||||
---
|
||||
|
||||
### Bot Class and its Methods:
|
||||
|
||||
#### `__init__(self, agent, llm, command_prefix="!")`:
|
||||
|
||||
Initializes the bot with the given agent, language model (`llm`), and a command prefix (default is `!`).
|
||||
|
||||
#### `add_command(self, name, func)`:
|
||||
|
||||
Allows you to dynamically add new commands to the bot. The `name` is the command's name and `func` is the function to execute when the command is called.
|
||||
|
||||
#### `run(self)`:
|
||||
|
||||
Starts the bot using the `DISCORD_TOKEN` from the `.env` file.
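
The dynamic-command idea behind `add_command` can be sketched without Discord at all. The `CommandRegistry` class below is a hypothetical stand-in, not the actual `Bot` implementation; it only shows how a name-to-function mapping and a prefix check might fit together.

```python
from typing import Callable, Dict


class CommandRegistry:
    """Hypothetical stand-in for the Bot's command handling."""

    def __init__(self, command_prefix: str = "!") -> None:
        self.command_prefix = command_prefix
        self._commands: Dict[str, Callable[..., str]] = {}

    def add_command(self, name: str, func: Callable[..., str]) -> None:
        # Register (or overwrite) the function to run for this command name.
        self._commands[name] = func

    def dispatch(self, message: str) -> str:
        # Ignore anything that does not start with the command prefix.
        if not message.startswith(self.command_prefix):
            return ""
        parts = message[len(self.command_prefix):].split()
        if not parts:
            return ""
        name, args = parts[0], parts[1:]
        func = self._commands.get(name)
        if func is None:
            return f"Unknown command: {name}"
        return func(*args)


def greet(user: str = "there") -> str:
    return f"Hello, {user}!"


registry = CommandRegistry()
registry.add_command("greet", greet)
print(registry.dispatch("!greet Ada"))
print(registry.dispatch("!dance"))
```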
|
||||
|
||||
---
|
||||
|
||||
### Commands:
|
||||
|
||||
1. **!greet**: Greets the user.
|
||||
|
||||
2. **!help_me**: Provides a list of commands and their descriptions.
|
||||
|
||||
3. **!join**: Joins the voice channel the user is in.
|
||||
|
||||
4. **!leave**: Leaves the voice channel the bot is currently in.
|
||||
|
||||
5. **!listen**: Starts listening to voice in the current voice channel and records the audio.
|
||||
|
||||
6. **!generate_image [prompt]**: Generates images based on the provided prompt using the DALL-E3 model.
|
||||
|
||||
7. **!send_text [text] [use_agent=True]**: Sends the provided text to the worker (either the agent or the LLM) and returns the response.
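
The `use_agent` flag on `!send_text` suggests a simple routing step. The sketch below assumes, hypothetically, that both the agent and the LLM are plain callables taking a string; the real `Bot` may invoke them differently. The `parse_flag` helper and the fake workers are illustrative names only.

```python
from typing import Callable

Worker = Callable[[str], str]


def parse_flag(raw: str) -> bool:
    """Interpret 'True'/'False' style text (case-insensitive) as a boolean."""
    return raw.strip().lower() not in {"false", "0", "no"}


def send_text(text: str, agent: Worker, llm: Worker, use_agent: bool = True) -> str:
    """Route the text to the agent by default, or to the bare LLM otherwise."""
    worker = agent if use_agent else llm
    return worker(text)


# Fake workers so the example runs without any API key.
def fake_agent(text: str) -> str:
    return f"[agent] {text}"


def fake_llm(text: str) -> str:
    return f"[llm] {text}"


print(send_text("hello", fake_agent, fake_llm))
print(send_text("hello", fake_agent, fake_llm, use_agent=parse_flag("False")))
```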
|
||||
|
||||
---
|
||||
|
||||
### Usage:
|
||||
|
||||
Initialize the `llm` (large language model) with your OpenAI API key:
|
||||
|
||||
```python
from swarms.models import OpenAIChat

llm = OpenAIChat(
    openai_api_key="Your_OpenAI_API_Key",
    temperature=0.5,
)
```
|
||||
|
||||
Initialize the bot with the `llm`:
|
||||
|
||||
```python
from apps.discord import Bot

bot = Bot(llm=llm)
```
|
||||
|
||||
Send a task to the bot:
|
||||
|
||||
```python
task = "What were the winning Boston Marathon times for the past 5 years (ending in 2022)? Generate a table of the year, name, country of origin, and times."
bot.send_text(task)
```
|
||||
|
||||
Start the bot:
|
||||
|
||||
```python
|
||||
bot.run()
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Additional Notes:
|
||||
|
||||
- The bot makes use of the `dalle3` library for image generation. Ensure you have the model and necessary setup for it.
|
||||
|
||||
- For the storage service, you might want to integrate with a cloud service like Google Cloud Storage or AWS S3 to store and retrieve generated images. The given code assumes a method `.upload()` for the storage service to upload files.
|
||||
|
||||
- Ensure that you've granted the bot necessary permissions on Discord, especially if you want to use voice channel features.
|
||||
|
||||
- Handle API keys and tokens securely. Avoid hardcoding them directly into your code. Use environment variables or secure secret management tools.
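
One way to follow the last point is to read secrets from the environment and fail early when one is missing. This sketch uses only the standard library; the variable name matches the `.env` example in the Setup section, and the `require_env` helper is hypothetical.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demo only: set a placeholder so the example runs on its own.
# In real use the value would come from the shell or a loaded .env file.
os.environ.setdefault("DISCORD_TOKEN", "placeholder-token")
token = require_env("DISCORD_TOKEN")
print("token loaded:", bool(token))
```

Failing at startup with a clear message is usually easier to debug than a bot that connects with an empty token.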
|
@ -0,0 +1,7 @@
|
||||
.md-typeset__table {
|
||||
min-width: 100%;
|
||||
}
|
||||
|
||||
.md-typeset table:not([class]) {
|
||||
display: table;
|
||||
}
|
@ -0,0 +1,70 @@
|
||||
|
||||
# Swarm Ecosystem
|
||||
|
||||
Welcome to the Swarm Ecosystem, a comprehensive suite of tools and frameworks designed to empower developers to orchestrate swarms of autonomous agents for a variety of applications. Dive into our ecosystem below:
|
||||
|
||||
| Project | Description | Link |
| ------- | ----------- | ---- |
| **Swarms Framework** | A Python-based framework that enables the creation, deployment, and scaling of reliable swarms of autonomous agents aimed at automating complex workflows. | [Swarms Framework](https://github.com/kyegomez/swarms) |
| **Swarms Cloud** | A cloud-based service offering Swarms-as-a-Service with guaranteed 100% uptime, cutting-edge performance, and enterprise-grade reliability for seamless scaling and management of swarms. | [Swarms Cloud](https://github.com/kyegomez/swarms-core) |
| **Swarms Core** | Provides backend utilities focusing on concurrency, multi-threading, and advanced execution strategies, developed in Rust for maximum efficiency and performance. | [Swarms Core](https://github.com/kyegomez/swarms-core) |
| **Swarm Foundation Models** | A dedicated repository for the creation, optimization, and training of groundbreaking swarming models. Features innovative models like PSO with transformers, ant colony optimizations, and more, aiming to surpass traditional architectures like Transformers and SSMs. Open for community contributions and ideas. | [Swarm Foundation Models](https://github.com/kyegomez/swarms-pytorch) |
| **Swarm Platform** | The Swarms dashboard Platform | [Swarm Platform](https://swarms.world/) |
| **Swarms JS** | Swarms Framework in JS. Orchestrate any agents and enable multi-agent collaboration between various agents! | [Swarm JS](https://github.com/kyegomez/swarms-js) |
|
||||
|
||||
|
||||
|
||||
----
|
||||
|
||||
## 🫶 Contributions:
|
||||
|
||||
The easiest way to contribute is to pick any issue with the `good first issue` tag 💪. Read the Contributing guidelines [here](/CONTRIBUTING.md). Bug Report? [File here](https://github.com/swarms/gateway/issues) | Feature Request? [File here](https://github.com/swarms/gateway/issues)
|
||||
|
||||
Swarms is an open-source project, and contributions are VERY welcome. If you want to contribute, you can create new features, fix bugs, or improve the infrastructure. Please refer to the [CONTRIBUTING.md](https://github.com/kyegomez/swarms/blob/master/CONTRIBUTING.md) and our [contributing board](https://github.com/users/kyegomez/projects/1) to participate in Roadmap discussions!
|
||||
|
||||
<a href="https://github.com/kyegomez/swarms/graphs/contributors">
|
||||
<img src="https://contrib.rocks/image?repo=kyegomez/swarms" />
|
||||
</a>
|
||||
|
||||
<a href="https://github.com/kyegomez/swarms/graphs/contributors">
|
||||
<img src="https://contrib.rocks/image?repo=kyegomez/swarms-cloud" />
|
||||
</a>
|
||||
|
||||
<a href="https://github.com/kyegomez/swarms/graphs/contributors">
|
||||
<img src="https://contrib.rocks/image?repo=kyegomez/swarms-platform" />
|
||||
</a>
|
||||
|
||||
<a href="https://github.com/kyegomez/swarms/graphs/contributors">
|
||||
<img src="https://contrib.rocks/image?repo=kyegomez/swarms-js" />
|
||||
</a>
|
||||
|
||||
|
||||
|
||||
|
||||
----
|
||||
|
||||
## Community
|
||||
|
||||
Join our growing community around the world, for real-time support, ideas, and discussions on Swarms 😊
|
||||
|
||||
- View our official [Blog](https://swarms.apac.ai)
|
||||
- Chat live with us on [Discord](https://discord.gg/kS3rwKs3ZC)
|
||||
- Follow us on [Twitter](https://twitter.com/kyegomez)
|
||||
- Connect with us on [LinkedIn](https://www.linkedin.com/company/the-swarm-corporation)
|
||||
- Visit us on [YouTube](https://www.youtube.com/channel/UC9yXyitkbU_WSy7bd_41SqQ)
|
||||
- [Join the Swarms community on Discord!](https://discord.gg/AJazBmhKnr)
|
||||
- Join our Swarms Community Gathering every Thursday at 1pm NYC Time to unlock the potential of autonomous agents in automating your daily tasks [Sign up here](https://lu.ma/5p2jnc2v)
|
||||
|
||||
---
|
||||
|
||||
## Discovery Call
|
||||
Book a discovery call to learn how Swarms can lower your operating costs by 40% with swarms of autonomous agents in lightspeed. [Click here to book a time that works for you!](https://calendly.com/swarm-corp/30min?month=2023-11)
|
||||
|
||||
|
||||
|
||||
## Accelerate Backlog
|
||||
Help us accelerate our backlog by supporting us financially! Note, we're an open source corporation and so all the revenue we generate is through donations at the moment ;)
|
||||
|
||||
<a href="https://polar.sh/kyegomez"><img src="https://polar.sh/embed/fund-our-backlog.svg?org=kyegomez" /></a>
|
||||
|
||||
---
|
@ -0,0 +1,123 @@
|
||||
# Contributing
|
||||
|
||||
Thank you for your interest in contributing to Swarms! We welcome contributions from the community to help improve usability and readability. By contributing, you can be a part of creating a dynamic and interactive AI system.
|
||||
|
||||
To get started, please follow the guidelines below.
|
||||
|
||||
|
||||
## Optimization Priorities
|
||||
|
||||
To continuously improve Swarms, we prioritize the following design objectives:
|
||||
|
||||
1. **Usability**: Increase the ease of use and user-friendliness of the swarm system to facilitate adoption and interaction with basic input.
|
||||
|
||||
2. **Reliability**: Improve the swarm's ability to obtain the desired output even with basic and un-detailed input.
|
||||
|
||||
3. **Speed**: Reduce the time it takes for the swarm to accomplish tasks by improving the communication layer, critiquing, and self-alignment with meta prompting.
|
||||
|
||||
4. **Scalability**: Ensure that the system is asynchronous, concurrent, and self-healing to support scalability.
|
||||
|
||||
Our goal is to continuously improve Swarms by following this roadmap while also being adaptable to new needs and opportunities as they arise.
|
||||
|
||||
## Join the Swarms Community
|
||||
|
||||
Join the Swarms community on Discord to connect with other contributors, coordinate work, and receive support.
|
||||
|
||||
- [Join the Swarms Discord Server](https://discord.gg/qUtxnK2NMf)
|
||||
|
||||
|
||||
## Report an Issue
|
||||
The easiest way to contribute to our docs is through our public [issue tracker](https://github.com/kyegomez/swarms-docs/issues). Feel free to submit bugs, request features or changes, or contribute to the project directly.
|
||||
|
||||
## Pull Requests
|
||||
|
||||
Swarms docs are built using [Material for MkDocs](https://squidfunk.github.io/mkdocs-material/getting-started/).
|
||||
|
||||
To directly contribute to Swarms documentation, first fork the [swarms-docs](https://github.com/kyegomez/swarms-docs) repository to your GitHub account. Then clone your repository to your local machine.
|
||||
|
||||
From inside the directory run:
|
||||
|
||||
```pip install -r requirements.txt```
|
||||
|
||||
To run `swarms-docs` locally run:
|
||||
|
||||
```mkdocs serve```
|
||||
|
||||
You should see something similar to the following:
|
||||
|
||||
```
INFO     -  Building documentation...
INFO     -  Cleaning site directory
INFO     -  Documentation built in 0.19 seconds
INFO     -  [09:28:33] Watching paths for changes: 'docs', 'mkdocs.yml'
INFO     -  [09:28:33] Serving on http://127.0.0.1:8000/
INFO     -  [09:28:37] Browser connected: http://127.0.0.1:8000/
```
|
||||
|
||||
Follow the typical PR process to contribute changes.
|
||||
|
||||
* Create a feature branch.
|
||||
* Commit changes.
|
||||
* Submit a PR.
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Taking on Tasks
|
||||
|
||||
We have a growing list of tasks and issues that you can contribute to. To get started, follow these steps:
|
||||
|
||||
1. Visit the [Swarms GitHub repository](https://github.com/kyegomez/swarms) and browse through the existing issues.
|
||||
|
||||
2. Find an issue that interests you and make a comment stating that you would like to work on it. Include a brief description of how you plan to solve the problem and any questions you may have.
|
||||
|
||||
3. Once a project coordinator assigns the issue to you, you can start working on it.
|
||||
|
||||
If you come across an issue that is unclear but still interests you, please post in the Discord server mentioned above. Someone from the community will be able to help clarify the issue in more detail.
|
||||
|
||||
We also welcome contributions to documentation, such as updating markdown files, adding docstrings, creating system architecture diagrams, and other related tasks.
|
||||
|
||||
## Submitting Your Work
|
||||
|
||||
To contribute your changes to Swarms, please follow these steps:
|
||||
|
||||
1. Fork the Swarms repository to your GitHub account. You can do this by clicking on the "Fork" button on the repository page.
|
||||
|
||||
2. Clone the forked repository to your local machine using the `git clone` command.
|
||||
|
||||
3. Before making any changes, make sure to sync your forked repository with the original repository to keep it up to date. You can do this by following the instructions [here](https://docs.github.com/en/github/collaborating-with-pull-requests/syncing-a-fork).
|
||||
|
||||
4. Create a new branch for your changes. This branch should have a descriptive name that reflects the task or issue you are working on.
|
||||
|
||||
5. Make your changes in the branch, keeping each change small and focused so that it touches only a few files.
|
||||
|
||||
6. Run any necessary formatting or linting tools to ensure that your changes adhere to the project's coding standards.
|
||||
|
||||
7. Once your changes are ready, commit them to your branch with descriptive commit messages.
|
||||
|
||||
8. Push the branch to your forked repository.
|
||||
|
||||
9. Create a pull request (PR) from your branch to the main Swarms repository. Provide a clear and concise description of your changes in the PR.
|
||||
|
||||
10. Request a review from the project maintainers. They will review your changes, provide feedback, and suggest any necessary improvements.
|
||||
|
||||
11. Make any required updates or address any feedback provided during the review process.
|
||||
|
||||
12. Once your changes have been reviewed and approved, they will be merged into the main branch of the Swarms repository.
|
||||
|
||||
13. Congratulations! You have successfully contributed to Swarms.
|
||||
|
||||
Please note that during the review process, you may be asked to make changes or address certain issues. It is important to engage in open and constructive communication with the project maintainers to ensure the quality of your contributions.
|
||||
|
||||
## Developer Setup
|
||||
|
||||
If you are interested in setting up the Swarms development environment, please follow the instructions provided in the [developer setup guide](docs/developer-setup.md). This guide provides an overview of the different tools and technologies used in the project.
|
||||
|
||||
## Join the Agora Community
|
||||
|
||||
Swarms is brought to you by Agora, the open-source AI research organization. Join the Agora community to connect with other researchers and developers working on AI projects.
|
||||
|
||||
- [Join the Agora Discord Server](https://discord.gg/qUtxnK2NMf)
|
||||
|
||||
Thank you for your contributions and for being a part of the Swarms and Agora community! Together, we can advance Humanity through the power of AI.
|
@ -0,0 +1,358 @@
|
||||
# Architecture

## **1. Introduction**

In today's rapidly evolving digital world, harnessing the collaborative power of multiple computational agents is more crucial than ever. 'Swarms' represents a bold stride in this direction: a scalable and dynamic framework designed to enable swarms of agents to function in harmony and tackle complex tasks. This document describes the underlying architecture and the strategies needed to realize the Swarms vision.

---

## **2. The Vision**

At its heart, the Swarms framework seeks to emulate the collaborative efficiency witnessed in natural systems, like ant colonies or bird flocks. These entities, though individually simple, achieve remarkable outcomes through collaboration. Similarly, Swarms will unleash the collective potential of numerous agents, operating cohesively.

---

## **3. Architecture Overview**

### **3.1 Agent Level**
The base level that serves as the building block for all further complexity.

#### Mechanics:
* **Model**: At its core, each agent harnesses a powerful model like OpenAI's GPT.
* **Vectorstore**: A memory structure allowing agents to store and retrieve information.
* **Tools**: Utilities and functionalities that aid in the agent's task execution.

#### Interaction:
Agents interact with the external world through their model and tools. The Vectorstore aids in retaining knowledge and facilitating inter-agent communication.
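
The agent level described above can be sketched in a few lines. This is a minimal illustration, not the Swarms API: the `Agent` class, the `echo_model` placeholder, the plain list used in place of a real vector store, and the `tool_name: argument` routing convention are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Hypothetical agent: a model plus memory plus tools (not the Swarms API)."""

    model: Callable[[str], str]                      # stand-in for an LLM call
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    memory: List[str] = field(default_factory=list)  # stand-in for a vector store

    def run(self, task: str) -> str:
        # Convention assumed for this sketch: "tool_name: argument" routes to a tool.
        name, _, argument = task.partition(":")
        if name in self.tools:
            result = self.tools[name](argument.strip())
        else:
            result = self.model(task)
        self.memory.append(f"{task} -> {result}")  # retain what happened
        return result


def echo_model(prompt: str) -> str:
    """Placeholder model that avoids any network call."""
    return f"model answer to: {prompt}"


agent = Agent(model=echo_model, tools={"upper": str.upper})
print(agent.run("upper: hello"))        # handled by the tool
print(agent.run("Summarize the plan"))  # handled by the model
```

A real agent would replace `echo_model` with a call to a hosted model and the list with an embedding-backed store; the point here is only how the three parts fit together.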

### **3.2 Worker Infrastructure Level**
Building on the agent foundation, enhancing capability and readiness for swarm integration.

#### Mechanics:
* **Human Input Integration**: Enables agents to accept and understand human-provided instructions.
* **Unique Identifiers**: Assigns each agent a unique ID to facilitate tracking and communication.
* **Asynchronous Tools**: Bolsters agents' capability to multitask and interact in real-time.

#### Interaction:
Each worker is an enhanced agent, capable of operating independently or in sync with its peers, allowing for dynamic, scalable operations.

### **3.3 Swarm Level**
Multiple Worker Nodes orchestrated into a synchronized, collaborative entity.

#### Mechanics:
* **Orchestrator**: The maestro, responsible for directing the swarm, task allocation, and communication.
* **Scalable Communication Layer**: Facilitates interactions among nodes and between nodes and the orchestrator.
* **Task Assignment & Completion Protocols**: Structured procedures ensuring tasks are efficiently distributed and concluded.

#### Interaction:
Nodes collaborate under the orchestrator's guidance, ensuring tasks are partitioned appropriately, executed, and results consolidated.
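
As a rough sketch of that partition, execute, consolidate cycle, the example below assigns subtasks to workers in round-robin order and collects the results. The `Orchestrator` class and the worker callables are hypothetical names chosen for illustration; the source does not prescribe an assignment policy, and round-robin is simply the easiest one to show.

```python
from typing import Callable, Dict, List

Worker = Callable[[str], str]  # assumed worker interface: subtask in, result out


class Orchestrator:
    """Hypothetical orchestrator that assigns subtasks round-robin."""

    def __init__(self, workers: Dict[str, Worker]) -> None:
        if not workers:
            raise ValueError("at least one worker is required")
        self.workers = workers

    def run(self, subtasks: List[str]) -> Dict[str, str]:
        worker_ids = list(self.workers)
        results: Dict[str, str] = {}
        for index, subtask in enumerate(subtasks):
            worker_id = worker_ids[index % len(worker_ids)]        # task assignment
            results[subtask] = self.workers[worker_id](subtask)    # execution
        return results                                             # consolidated results


workers = {
    "worker-1": lambda task: f"worker-1 handled {task}",
    "worker-2": lambda task: f"worker-2 handled {task}",
}
swarm_results = Orchestrator(workers).run(["collect data", "clean data", "write summary"])
for subtask, result in swarm_results.items():
    print(subtask, "->", result)
```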

### **3.4 Hivemind Level**
Envisioned as a 'Swarm of Swarms'. An upper echelon of collaboration.

#### Mechanics:
* **Hivemind Orchestrator**: Oversees multiple swarm orchestrators, ensuring harmony on a grand scale.
* **Inter-Swarm Communication Protocols**: Dictate how swarms interact, exchange information, and co-execute tasks.

#### Interaction:
Multiple swarms, each a formidable force, combine their prowess under the Hivemind. This level tackles monumental tasks by dividing them among swarms.

---

## **4. Building the Framework: A Task Checklist**

### **4.1 Foundations: Agent Level**
* Define and standardize agent properties.
* Integrate desired model (e.g., OpenAI's GPT) with agent.
* Implement Vectorstore mechanisms: storage, retrieval, and communication protocols.
* Incorporate essential tools and utilities.
* Conduct preliminary testing: Ensure agents can execute basic tasks and utilize the Vectorstore.

### **4.2 Enhancements: Worker Infrastructure Level**
* Interface agents with human input mechanisms.
* Assign and manage unique identifiers for each worker.
* Integrate asynchronous capabilities: Ensure real-time response and multitasking.
* Test worker nodes for both solitary and collaborative tasks.

### **4.3 Cohesion: Swarm Level**
* Design and develop the orchestrator: Ensure it can manage multiple worker nodes.
* Establish a scalable and efficient communication layer.
* Implement task distribution and retrieval protocols.
* Test swarms for efficiency, scalability, and robustness.

### **4.4 Apex Collaboration: Hivemind Level**
* Build the Hivemind Orchestrator: Ensure it can oversee multiple swarms.
* Define inter-swarm communication, prioritization, and task-sharing protocols.
* Develop mechanisms to balance loads and optimize resource utilization across swarms.
* Thoroughly test the Hivemind level for macro-task execution.

---

## **5. Integration and Communication Mechanisms**

### **5.1 Vectorstore as the Universal Communication Layer**
Serving as the memory and communication backbone, the Vectorstore must:
* Facilitate rapid storage and retrieval of high-dimensional vectors.
* Enable similarity-based lookups: Crucial for recognizing patterns or finding similar outputs.
* Scale seamlessly as agent count grows.
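
The similarity-based lookup named in the list above can be illustrated with a small in-memory store that ranks entries by cosine similarity. This is a sketch under stated assumptions: `InMemoryVectorStore` is a hypothetical class, the vectors are hand-written rather than produced by an embedding model, and a linear scan like this would not scale the way a production vector database must.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Hypothetical store: keeps vectors in a dict and scans them on every query."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[float]] = {}

    def add(self, key: str, vector: Sequence[float]) -> None:
        self._vectors[key] = list(vector)

    def query(self, vector: Sequence[float], top_k: int = 1) -> List[Tuple[str, float]]:
        scored = [(key, cosine_similarity(vector, stored))
                  for key, stored in self._vectors.items()]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


store = InMemoryVectorStore()
store.add("pricing notes", [1.0, 0.0, 0.0])
store.add("marketing notes", [0.0, 1.0, 0.0])
best_key, best_score = store.query([0.9, 0.1, 0.0])[0]
print(best_key)
```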

### **5.2 Orchestrator-Driven Communication**
* Orchestrators, both at the swarm and hivemind level, should employ adaptive algorithms to optimally distribute tasks.
* Ensure real-time monitoring of task execution and worker node health.
* Integrate feedback loops: Allow for dynamic task reassignment in case of node failures or inefficiencies.
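
One simple form of the feedback loop mentioned above is to hand a task to the next worker when the current one fails. The sketch below assumes workers are plain callables and that any raised exception counts as a node failure; `run_with_reassignment` and both example workers are hypothetical names, and a real orchestrator would also want timeouts and health checks.

```python
from typing import Callable, Dict, List, Tuple

Worker = Callable[[str], str]  # assumed worker interface


def run_with_reassignment(task: str, workers: Dict[str, Worker]) -> Tuple[str, str]:
    """Try each worker in order and return (worker_id, result) from the first success."""
    failures: List[str] = []
    for worker_id, worker in workers.items():
        try:
            return worker_id, worker(task)
        except Exception as error:  # broad on purpose: any failure triggers reassignment
            failures.append(f"{worker_id}: {error}")
    raise RuntimeError("all workers failed: " + "; ".join(failures))


def broken_worker(task: str) -> str:
    raise ConnectionError("node unreachable")


def healthy_worker(task: str) -> str:
    return f"done: {task}"


worker_id, output = run_with_reassignment(
    "index documents", {"node-1": broken_worker, "node-2": healthy_worker}
)
print(worker_id, output)
```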

---

## **6. Conclusion & Forward Path**

The Swarms framework, once realized, will usher in a new era of computational efficiency and collaboration. While the roadmap ahead is intricate, with diligent planning, development, and testing, Swarms will redefine the boundaries of collaborative computing.

--------

# Overview

### 1. Model

**Overview:**
The foundational level where a trained model (e.g., OpenAI GPT model) is initialized. It is the base on which further abstraction levels build. It provides the core capabilities to perform tasks, answer queries, etc.

**Diagram:**
```
[ Model (openai) ]
```

### 2. Agent Level

**Overview:**
At the agent level, the raw model is coupled with tools and a vector store, allowing it to be more than just a model. The agent can now remember, use tools, and become a more versatile entity ready for integration into larger systems.

**Diagram:**
```
+-------------------+
|       Agent       |
|  +-------------+  |
|  |    Model    |  |
|  +-------------+  |
|  +-------------+  |
|  | VectorStore |  |
|  +-------------+  |
|  +-------------+  |
|  |    Tools    |  |
|  +-------------+  |
+-------------------+
```

### 3. Worker Infrastructure Level

**Overview:**
The worker infrastructure is a step above individual agents. Here, an agent is paired with additional utilities like human input and other tools, making it a more advanced, responsive unit capable of complex tasks.

**Diagram:**
```
+----------------+
|  WorkerNode    |
| +-----------+  |
| |   Agent   |  |
| | +-------+ |  |
| | | Model | |  |
| | +-------+ |  |
| | +-------+ |  |
| | | Tools | |  |
| | +-------+ |  |
| +-----------+  |
|                |
| +-----------+  |
| |Human Input|  |
| +-----------+  |
|                |
| +-------+      |
| | Tools |      |
| +-------+      |
+----------------+
```

### 4. Swarm Level

**Overview:**
At the swarm level, the orchestrator is central. It's responsible for assigning tasks to worker nodes, monitoring their completion, and handling the communication layer (for example, through a vector store or another universal communication mechanism) between worker nodes.

**Diagram:**
```
                    +--------------+
                    | Orchestrator |
                    +--------------+
                           |
             +---------------------------+
             |                           |
             | Swarm-level Communication |
             |        Layer (e.g.        |
             |       Vector Store)       |
             +---------------------------+
        /                  |                  \
+---------------+  +---------------+  +---------------+
| WorkerNode 1  |  | WorkerNode 2  |  | WorkerNode n  |
|               |  |               |  |               |
+---------------+  +---------------+  +---------------+
| Task Assigned |  | Task Completed|  | Communication |
```

### 5. Hivemind Level

**Overview:**
The Hivemind level is a multi-swarm setup, with an upper-layer orchestrator managing multiple swarm-level orchestrators. The Hivemind orchestrator is responsible for broader tasks like assigning macro-tasks to swarms, handling inter-swarm communications, and ensuring the overall system is functioning smoothly.

**Diagram:**
```
                     +--------+
                     |Hivemind|
                     +--------+
                          |
                  +--------------+
                  |   Hivemind   |
                  | Orchestrator |
                  +--------------+
              /           |           \
 +------------+    +------------+    +------------+
 |Orchestrator|    |Orchestrator|    |Orchestrator|
 +------------+    +------------+    +------------+
        |                 |                 |
+--------------+  +--------------+  +--------------+
| Swarm-level  |  | Swarm-level  |  | Swarm-level  |
|Communication |  |Communication |  |Communication |
|    Layer     |  |    Layer     |  |    Layer     |
+--------------+  +--------------+  +--------------+
     /    \            /    \            /    \
+-------+  +-------+  +-------+  +-------+  +-------+
|Worker |  |Worker |  |Worker |  |Worker |  |Worker |
| Node  |  | Node  |  | Node  |  | Node  |  | Node  |
+-------+  +-------+  +-------+  +-------+  +-------+
```

This setup allows the Hivemind level to operate at a grander scale, with the capability to manage hundreds or even thousands of worker nodes across multiple swarms efficiently.

-------
# **Swarms Framework Development Strategy Checklist**

## **Introduction**

The development of the Swarms framework requires a systematic and granular approach to ensure that each component is robust and that the overall framework is efficient and scalable. This checklist will serve as a guide to building Swarms from the ground up, breaking down tasks into small, manageable pieces.

---

## **1. Agent Level Development**

### **1.1 Model Integration**
- [ ] Research the most suitable models (e.g., OpenAI's GPT).
- [ ] Design an API for the agent to call the model.
- [ ] Implement error handling when model calls fail.
- [ ] Test the model with sample data for accuracy and speed.

### **1.2 Vectorstore Implementation**
- [ ] Design the schema for the vector storage system.
- [ ] Implement storage methods to add, delete, and update vectors.
- [ ] Develop retrieval methods with optimization for speed.
- [ ] Create protocols for vector-based communication between agents.
- [ ] Conduct stress tests to ascertain storage and retrieval speed.

### **1.3 Tools & Utilities Integration**
- [ ] List the essential tools required for agent functionality.
- [ ] Develop or integrate APIs for each tool.
- [ ] Implement error handling and logging for tool interactions.
- [ ] Validate tool integration with unit tests.

---

## **2. Worker Infrastructure Level Development**

### **2.1 Human Input Integration**
- [ ] Design a UI/UX for human interaction with worker nodes.
- [ ] Create APIs for input collection.
- [ ] Implement input validation and error handling.
- [ ] Test human input methods for clarity and ease of use.

### **2.2 Unique Identifier System**
- [ ] Research optimal formats for unique ID generation.
- [ ] Develop methods for generating and assigning IDs to agents.
- [ ] Implement a tracking system to manage and monitor agents via IDs.
- [ ] Validate the uniqueness and reliability of the ID system.
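
For the ID items above, one readily available option is Python's standard `uuid` module. The sketch below uses random version 4 UUIDs and a small registry; the `AgentRegistry` class is a hypothetical helper for illustration, and the checklist itself leaves the ID format open.

```python
import uuid
from typing import Dict


class AgentRegistry:
    """Hypothetical registry that assigns a UUID to each agent and tracks it."""

    def __init__(self) -> None:
        self._agents: Dict[str, str] = {}

    def register(self, name: str) -> str:
        agent_id = str(uuid.uuid4())  # random 128-bit identifier
        self._agents[agent_id] = name
        return agent_id

    def lookup(self, agent_id: str) -> str:
        return self._agents[agent_id]

    def __len__(self) -> int:
        return len(self._agents)


registry = AgentRegistry()
ids = [registry.register(f"agent-{n}") for n in range(1000)]
print(len(set(ids)) == len(ids))  # a basic uniqueness check over the generated IDs
```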

### **2.3 Asynchronous Operation Tools**
- [ ] Incorporate libraries/frameworks to enable asynchrony.
- [ ] Ensure tasks within an agent can run in parallel without conflict.
- [ ] Test asynchronous operations for efficiency improvements.
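
A minimal way to try out the asynchrony items above is the standard library's `asyncio`. In this sketch `call_tool` is a hypothetical stand-in for a slow tool or model call, simulated with `asyncio.sleep`. Note that `asyncio` gives concurrency on a single thread (useful for I/O-bound waits), which is an assumption about the workload; CPU-bound work would need processes or threads instead.

```python
import asyncio
import time
from typing import List


async def call_tool(name: str, delay: float) -> str:
    """Stand-in for a slow tool or model call (hypothetical)."""
    await asyncio.sleep(delay)  # yields control so other calls can proceed
    return f"{name} finished"


async def run_all() -> List[str]:
    # gather() runs the three calls concurrently and preserves argument order.
    return await asyncio.gather(
        call_tool("search", 0.05),
        call_tool("summarize", 0.05),
        call_tool("rank", 0.05),
    )


start = time.perf_counter()
results = asyncio.run(run_all())
elapsed = time.perf_counter() - start
print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Comparing `elapsed` with the sum of the individual delays is a quick check that the calls overlapped rather than ran back to back.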

---

## **3. Swarm Level Development**

### **3.1 Orchestrator Design & Development**
- [ ] Draft a blueprint of orchestrator functionalities.
- [ ] Implement methods for task distribution among worker nodes.
- [ ] Develop communication protocols for the orchestrator to monitor workers.
- [ ] Create feedback systems to detect and address worker node failures.
- [ ] Test orchestrator with a mock swarm to ensure efficient task allocation.

### **3.2 Communication Layer Development**
- [ ] Select a suitable communication protocol/framework (e.g., gRPC, WebSockets).
- [ ] Design the architecture for scalable, low-latency communication.
- [ ] Implement methods for sending, receiving, and broadcasting messages.
- [ ] Test communication layer for reliability, speed, and error handling.

### **3.3 Task Management Protocols**
- [ ] Develop a system to queue, prioritize, and allocate tasks.
- [ ] Implement methods for real-time task status tracking.
- [ ] Create a feedback loop for completed tasks.
- [ ] Test task distribution, execution, and feedback systems for efficiency.
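
The queue, prioritize, and track items above could start from something as small as a heap-backed queue with a status table. The `TaskQueue` class below is a hypothetical sketch built on the standard `heapq` module; lower numbers mean higher priority, and a counter keeps tasks of equal priority in submission order. Both conventions are choices made for this example.

```python
import heapq
import itertools
from typing import Dict, List, Optional, Tuple


class TaskQueue:
    """Hypothetical priority queue with simple status tracking."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._counter = itertools.count()  # tie-breaker: keeps FIFO order per priority
        self.status: Dict[str, str] = {}

    def submit(self, name: str, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), name))
        self.status[name] = "queued"

    def next_task(self) -> Optional[str]:
        if not self._heap:
            return None
        _, _, name = heapq.heappop(self._heap)  # smallest priority number first
        self.status[name] = "running"
        return name

    def complete(self, name: str) -> None:
        self.status[name] = "done"


queue = TaskQueue()
queue.submit("write report", priority=2)
queue.submit("fetch data", priority=1)
queue.submit("send email", priority=2)

first = queue.next_task()
queue.complete(first)
print(first, queue.status)
```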

---

## **4. Hivemind Level Development**

### **4.1 Hivemind Orchestrator Development**
- [ ] Extend swarm orchestrator functionalities to manage multiple swarms.
- [ ] Create inter-swarm communication protocols.
- [ ] Implement load balancing mechanisms to distribute tasks across swarms.
- [ ] Validate hivemind orchestrator functionalities with multi-swarm setups.

### **4.2 Inter-Swarm Communication Protocols**
- [ ] Design methods for swarms to exchange data.
- [ ] Implement data reconciliation methods for swarms working on shared tasks.
- [ ] Test inter-swarm communication for efficiency and data integrity.

---

## **5. Scalability & Performance Testing**

- [ ] Simulate heavy loads to test the limits of the framework.
- [ ] Identify and address bottlenecks in both communication and computation.
- [ ] Conduct speed tests under different conditions.
- [ ] Test the system's responsiveness under various levels of stress.

---

## **6. Documentation & User Guide**

- [ ] Develop detailed documentation covering architecture, setup, and usage.
- [ ] Create user guides with step-by-step instructions.
- [ ] Incorporate visual aids, diagrams, and flowcharts for clarity.
- [ ] Update documentation regularly with new features and improvements.

---

## **7. Continuous Integration & Deployment**

- [ ] Set up CI/CD pipelines for automated testing and deployment.
- [ ] Ensure automatic rollback in case of deployment failures.
- [ ] Integrate code quality and security checks in the pipeline.
- [ ] Document deployment strategies and best practices.

---

## **Conclusion**

The Swarms framework represents a monumental leap in agent-based computation. This checklist provides a thorough roadmap for the framework's development, ensuring that every facet is addressed in depth. Through diligent adherence to this guide, the Swarms vision can be realized as a powerful, scalable, and robust system ready to tackle the challenges of tomorrow.

# Bounty Program

Our bounty program is an exciting opportunity for contributors to help us build the future of Swarms. By participating, you can earn rewards while contributing to a project that aims to revolutionize digital activity.

Here's how it works:

1. **Check out our Roadmap**: We've shared our roadmap detailing our short- and long-term goals. These are the areas where we're seeking contributions.

2. **Pick a Task**: Choose a task from the roadmap that aligns with your skills and interests. If you're unsure, you can reach out to our team for guidance.

3. **Get to Work**: Once you've chosen a task, start working on it. Remember, quality is key. We're looking for contributions that truly make a difference.

4. **Submit your Contribution**: Once your work is complete, submit it for review. We'll evaluate your contribution based on its quality, relevance, and the value it brings to Swarms.

5. **Earn Rewards**: If your contribution is approved, you'll earn a bounty. The amount of the bounty depends on the complexity of the task, the quality of your work, and the value it brings to Swarms.

## The Three Phases of Our Bounty Program

### Phase 1: Building the Foundation
In the first phase, our focus is on building the basic infrastructure of Swarms. This includes developing key components like the Swarms class, integrating essential tools, and establishing task completion and evaluation logic. We'll also start developing our testing and evaluation framework during this phase. If you're interested in foundational work and have a knack for building robust, scalable systems, this phase is for you.

### Phase 2: Enhancing the System
In the second phase, we'll focus on enhancing Swarms by integrating more advanced features, improving the system's efficiency, and refining our testing and evaluation framework. This phase involves more complex tasks, so if you enjoy tackling challenging problems and contributing to the development of innovative features, this is the phase for you.

### Phase 3: Towards Super-Intelligence
The third phase of our bounty program is the most exciting: this is where we aim to achieve super-intelligence. In this phase, we'll be working on improving the swarm's capabilities, expanding its skills, and fine-tuning the system based on real-world testing and feedback. If you're excited about the future of AI and want to contribute to a project that could potentially transform the digital world, this is the phase for you.

Remember, our roadmap is a guide, and we encourage you to bring your own ideas and creativity to the table. We believe that every contribution, no matter how small, can make a difference. So join us on this exciting journey and help us create the future of Swarms.

**To participate in our bounty program, visit the [Swarms Bounty Program Page](https://swarms.ai/bounty).** Let's build the future together!

## Bounties for Roadmap Items

To accelerate the development of Swarms and to encourage more contributors to join our journey towards automating every digital activity in existence, we are announcing a Bounty Program for specific roadmap items. Each bounty will be rewarded based on the complexity and importance of the task. Below are the items available for bounty:

1. **Multi-Agent Debate Integration**: $2000
2. **Meta Prompting Integration**: $1500
3. **Swarms Class**: $1500
4. **Integration of Additional Tools**: $1000
5. **Task Completion and Evaluation Logic**: $2000
6. **Ocean Integration**: $2500
7. **Improved Communication**: $2000
8. **Testing and Evaluation**: $1500
9. **Worker Swarm Class**: $2000
10. **Documentation**: $500

For each bounty task, there will be a strict evaluation process to ensure the quality of the contribution. This process includes a thorough review of the code and extensive testing to ensure it meets our standards.

# 3-Phase Testing Framework

To ensure the quality and efficiency of the Swarm, we will introduce a 3-phase testing framework which will also serve as our evaluation criteria for each of the bounty tasks.

## Phase 1: Unit Testing
In this phase, individual modules will be tested to ensure that they work correctly in isolation. Unit tests will be designed for all functions and methods, with an emphasis on edge cases.
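
To make the emphasis on edge cases concrete, here is a small sketch using the standard `unittest` module. The function under test, `chunk_tasks`, is a hypothetical helper invented for this example (it is not part of Swarms); the tests cover the normal path plus an empty input, an oversized chunk size, and an invalid size.

```python
import unittest
from typing import List


def chunk_tasks(tasks: List[str], size: int) -> List[List[str]]:
    """Hypothetical helper: split a task list into chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [tasks[i:i + size] for i in range(0, len(tasks), size)]


class ChunkTasksTest(unittest.TestCase):
    def test_even_split(self):
        self.assertEqual(chunk_tasks(["a", "b", "c", "d"], 2), [["a", "b"], ["c", "d"]])

    def test_empty_input(self):
        self.assertEqual(chunk_tasks([], 3), [])

    def test_size_larger_than_input(self):
        self.assertEqual(chunk_tasks(["a"], 5), [["a"]])

    def test_invalid_size(self):
        with self.assertRaises(ValueError):
            chunk_tasks(["a"], 0)


# Run the suite explicitly so the example also works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ChunkTasksTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```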

## Phase 2: Integration Testing
After passing unit tests, we will test the integration of different modules to ensure they work correctly together. This phase will also test the interoperability of the Swarm with external systems and libraries.

## Phase 3: Benchmarking & Stress Testing
In the final phase, we will perform benchmarking and stress tests. We'll push the limits of the Swarm under extreme conditions to ensure it performs well in real-world scenarios. This phase will measure the performance, speed, and scalability of the Swarm under high load conditions.

By following this 3-phase testing framework, we aim to develop a reliable, high-performing, and scalable Swarm that can automate all digital activities.

# Reverse Engineering to Reach Phase 3

To reach the Phase 3 level, we need to reverse engineer the tasks we need to complete. Here's an example of what this might look like:

1. **Set Clear Expectations**: Define what success looks like for each task. Be clear about the outputs and outcomes we expect. This will guide our testing and development efforts.

2. **Develop Testing Scenarios**: Create a comprehensive list of testing scenarios that cover both common and edge cases. This will help us ensure that our Swarm can handle a wide range of situations.

3. **Write Test Cases**: For each scenario, write detailed test cases that outline the exact steps to be followed, the inputs to be used, and the expected outputs.

4. **Execute the Tests**: Run the test cases on our Swarm, making note of any issues or bugs that arise.

5. **Iterate and Improve**: Based on the results of our tests, iterate and improve our Swarm. This may involve fixing bugs, optimizing code, or redesigning parts of our system.

6. **Repeat**: Repeat this process until our Swarm meets our expectations and passes all test cases.

By following these steps, we will systematically build, test, and improve our Swarm until it reaches the Phase 3 level. This methodical approach will help us ensure that we create a reliable, high-performing, and scalable Swarm that can truly automate all digital activities.

Let's shape the future of digital automation together!


# Cost Structure of Deploying Autonomous Agents

## Table of Contents

1. Introduction
2. Our Time: Generating System Prompts and Custom Tools
3. Consultancy Fees
4. Model Inference Infrastructure
5. Deployment and Continual Maintenance
6. Output Metrics: Blogs Generation Rates

---

## 1. Introduction

Autonomous agents are revolutionizing various industries, from self-driving cars to chatbots and customer service solutions. The prospect of automation and improved efficiency makes these agents attractive investments. However, like any other technological solution, deploying autonomous agents involves several cost elements that organizations need to consider carefully. This guide outlines the costs associated with deploying autonomous agents.

---

## 2. Our Time: Generating System Prompts and Custom Tools

### Description

The deployment of autonomous agents often requires a substantial investment of time to develop system prompts and custom tools tailored to specific operational needs.

### Costs

| Task                     | Time Required (Hours) | Cost per Hour ($) | Total Cost ($) |
| ------------------------ | --------------------- | ----------------- | -------------- |
| System Prompts Design    | 50                    | 100               | 5,000          |
| Custom Tools Development | 100                   | 100               | 10,000         |
| **Total**                | **150**               |                   | **15,000**     |

---

## 3. Consultancy Fees

### Description

Consultation is often necessary for navigating the complexities of autonomous agents. This includes system assessment, customization, and other essential services.

### Costs

| Service              | Fees ($)   |
| -------------------- | ---------- |
| Initial Assessment   | 5,000      |
| System Customization | 7,000      |
| Training             | 3,000      |
| **Total**            | **15,000** |

---

## 4. Model Inference Infrastructure
|
||||
|
||||
### Description
|
||||
|
||||
The hardware and software needed for the agent's functionality, known as the model inference infrastructure, form a significant part of the costs.
|
||||
|
||||
### Costs
|
||||
|
||||
| Component            | Cost ($)   |
| -------------------- | ---------- |
| Hardware             | 10,000     |
| Software Licenses    | 2,000      |
| Cloud Services       | 3,000      |
| **Total**            | **15,000** |
|
||||
|
||||
---
|
||||
|
||||
## 5. Deployment and Continual Maintenance
|
||||
|
||||
### Description
|
||||
|
||||
Once everything is in place, deploying the autonomous agents and their ongoing maintenance are the next major cost factors.
|
||||
|
||||
### Costs
|
||||
|
||||
| Task                | Monthly Cost ($) | Annual Cost ($) |
| ------------------- | ---------------- | --------------- |
| Deployment          | 5,000            | 60,000          |
| Ongoing Maintenance | 1,000            | 12,000          |
| **Total**           | **6,000**        | **72,000**      |
|
||||
|
||||
---
|
||||
|
||||
## 6. Output Metrics: Blogs Generation Rates
|
||||
|
||||
### Description
|
||||
|
||||
To provide a sense of what an investment in autonomous agents can yield, we offer the following data regarding blogs that can be generated as an example of output.
|
||||
|
||||
### Blogs Generation Rates
|
||||
|
||||
| Timeframe | Number of Blogs |
|-----------|-----------------|
| Per Day   | 20              |
| Per Week  | 140             |
| Per Month | 600             |
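
To relate the cost tables to the output figures, the arithmetic can be sketched as below. This is only an illustration under stated assumptions that the guide itself does not make: the totals from sections 2 to 4 are treated as one-time costs, the section 5 figure as the annual running cost, and output is assumed to hold steady at 600 blogs per month for twelve months.

```python
# Totals copied from the cost tables in sections 2-5 of this guide.
# Assumption (not stated in the guide): sections 2-4 are one-time costs.
one_time_costs = {
    "prompts_and_tools": 15_000,
    "consultancy": 15_000,
    "inference_infrastructure": 15_000,
}
annual_deployment_and_maintenance = 72_000

# Assumption: the monthly output rate from section 6 holds for a full year.
blogs_per_month = 600

first_year_total = sum(one_time_costs.values()) + annual_deployment_and_maintenance
blogs_per_year = blogs_per_month * 12
cost_per_blog = first_year_total / blogs_per_year

print(f"First-year cost: ${first_year_total:,}")   # First-year cost: $117,000
print(f"Blogs per year:  {blogs_per_year:,}")      # Blogs per year:  7,200
print(f"Cost per blog:   ${cost_per_blog:.2f}")    # Cost per blog:   $16.25
```

Under these assumptions the first year works out to roughly $16 per blog; in later years only the running cost would apply, so the per-blog figure would drop.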
|
||||
|
||||
|
@ -0,0 +1,112 @@
|
||||
# Swarms Data Room
|
||||
|
||||
## Table of Contents

**Introduction**

- Overview of the Company
- Vision and Mission Statement
- Executive Summary

**Corporate Documents**

- Articles of Incorporation
- Bylaws
- Shareholder Agreements
- Board Meeting Minutes
- Company Structure and Org Chart

**Financial Information**

- Historical Financial Statements
    - Income Statements
    - Balance Sheets
    - Cash Flow Statements
- Financial Projections and Forecasts
- Cap Table
- Funding History and Use of Funds

**Products and Services**

- Detailed Descriptions of Products/Services
- Product Development Roadmap
- User Manuals and Technical Specifications
- Case Studies and Use Cases
|
||||
|
||||
|
||||
## **Introduction**
|
||||
Swarms provides automation-as-a-service through swarms of autonomous agents that work together as a team. We enable our customers to build, deploy, and scale production-grade multi-agent applications to automate real-world tasks.
|
||||
|
||||
### **Vision**
|
||||
Our vision for 2024 is to provide the most reliable infrastructure for deploying autonomous agents into the real world through the Swarm Cloud, our premier cloud platform for the scalable deployment of Multi-Modal Autonomous Agents. The platform focuses on delivering maximum value to users by charging only a small fee for the hosted compute that running the agents requires.
|
||||
|
||||
### **Executive Summary**
|
||||
The Swarm Corporation aims to enable AI models to automate complex workflows and operations, not just singular low-value tasks. We believe collaboration between multiple agents can overcome limitations of individual agents for reasoning, planning, etc. This will allow automation of processes in mission-critical industries like security, logistics, and manufacturing where AI adoption is currently low.
|
||||
|
||||
We provide an open source framework to deploy production-grade multi-modal agents in just a few lines of code. The framework builds our user base, helps us recruit talent, brings in customer feedback that improves our products, and earns awareness and trust.
|
||||
|
||||
Our business model focuses on customer satisfaction, openness, integration with other tools/platforms, and production-grade reliability.
|
||||
|
||||
Our go-to-market strategy is to get the framework to product-market fit with over 50K weekly recurring users, then secure high-value contracts in target industries. Long-term monetization will come from microtransactions, usage-based pricing, and subscriptions.
|
||||
|
||||
The team has spent thousands of hours building and optimizing autonomous agents. Leadership includes AI engineers, product experts, open source contributors, and community builders.
|
||||
|
||||
Key milestones: get 80K framework users in January 2024, start contracts in target verticals, introduce commercial products in 2025 with various pricing models.
|
||||
|
||||
### **Resources**
|
||||
- [Swarm Pre-Seed Deck](https://drive.google.com/file/d/1n8o2mjORbG96uDfx4TabjnyieludYaZz/view?usp=sharing)
|
||||
- [Swarm Memo](https://docs.google.com/document/d/1hS_nv_lFjCqLfnJBoF6ULY9roTbSgSuCkvXvSUSc7Lo/edit?usp=sharing)
|
||||
|
||||
|
||||
|
||||
|
||||
## **Financial Documents**
|
||||
This section is dedicated entirely to financial documents.
|
||||
|
||||
- [Cap Table](https://docs.google.com/spreadsheets/d/1wuTWbfhYaY5Xp6nSQ9R0wDtSpwSS9coHxsjKd0UbIDc/edit?usp=sharing)
|
||||
|
||||
- [Cashflow Prediction Sheet](https://docs.google.com/spreadsheets/d/1HQEHCIXXMHajXMl5sj8MEfcQtWfOnD7GjHtNiocpD60/edit?usp=sharing)
|
||||
|
||||
|
||||
------
|
||||
|
||||
## **Product**
|
||||
Swarms is an open source Python framework that gives developers seamless, reliable, and scalable multi-agent orchestration through modularity, customization, and precision.
|
||||
|
||||
- [Swarms GitHub Page](https://github.com/kyegomez/swarms)
|
||||
- [Swarms Memo](https://docs.google.com/document/d/1hS_nv_lFjCqLfnJBoF6ULY9roTbSgSuCkvXvSUSc7Lo/edit)
|
||||
- [Swarms Project Board](https://github.com/users/kyegomez/projects/1)
|
||||
- [Swarms Website](https://www.swarms.world/g)
|
||||
- [Swarm Ecosystem](https://github.com/kyegomez/swarm-ecosystem)
|
||||
- [Swarm Core](https://github.com/kyegomez/swarms-core)
|
||||
|
||||
### Product Growth Metrics
|
||||
| Name                          | Description                                                                                                   | Link                                                                              |
|-------------------------------|---------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------|
| Total Downloads of all time   | Total number of downloads for the product over its entire lifespan.                                           | [pepy.tech](https://pepy.tech/project/swarms)                                     |
| Downloads this month          | Number of downloads for the product in the current month.                                                     | [pepy.tech](https://pepy.tech/project/swarms)                                     |
| Total Downloads this week     | Total number of downloads for the product in the current week.                                                | [pepy.tech](https://pepy.tech/project/swarms)                                     |
| GitHub Forks                  | Number of times the product's codebase has been copied for optimization, contribution, or usage.              | [Forks](https://github.com/kyegomez/swarms/network)                               |
| GitHub Stars                  | Number of users who have 'liked' the project.                                                                 | [Stargazers](https://github.com/kyegomez/swarms/stargazers)                       |
| Pip Module Metrics            | Various project statistics such as watchers, number of contributors, date repository was created, and more.   | [libraries.io](https://libraries.io/github/kyegomez/swarms)                       |
| Contribution Based Statistics | Statistics like number of contributors, lines of code changed, etc.                                           | [Contributors](https://github.com/kyegomez/swarms/graphs/contributors)            |
| GitHub Community insights     | Insights into the GitHub community around the product.                                                        | [GitHub Community insights](https://github.com/kyegomez/swarms/graphs/community)  |
| GitHub Traffic Metrics        | Metrics related to traffic, such as views and clones on GitHub.                                               | [GitHub Traffic Metrics](https://github.com/kyegomez/swarms/graphs/traffic)       |
| Issues with the framework     | Current open issues for the product on GitHub.                                                                | [Issues](https://github.com/kyegomez/swarms/issues)                               |
|
||||
|
||||
|
@ -0,0 +1,9 @@
|
||||
# Demo Ideas
|
||||
|
||||
* We could also try to create an AI influencer run by a swarm, let it create a whole identity and generate images, memes, and other content for Twitter, Reddit, etc.
|
||||
|
||||
* A community calendar tool, a swarm, or both: something that connects the calendars, events, and initiatives of all the AI communities, such as LangChain, LAION, EleutherAI, LessWrong, Gato, Rob Miles, ChatGPT Hackers, and others.
|
||||
|
||||
* Swarm of AI influencers to spread marketing
|
||||
|
||||
* Delegation System to better organize teams: Start with a team of passionate humans and let them self-report their skills and strengths so the agent knows who to delegate to. Then feed the agent a large task list that it breaks down into actionable steps, prompting specific team members to complete each task. It could even suggest breakout teams of a few people with complementary skills to tackle more complex tasks. A live board that updates each time a team member completes something would encourage momentum and keep track of progress.
|
@ -0,0 +1,152 @@
|
||||
# Design Philosophy Document for Swarms
|
||||
|
||||
## Usable
|
||||
|
||||
### Objective
|
||||
|
||||
Our goal is to ensure that Swarms is intuitive and easy to use for all users, regardless of their level of technical expertise. This includes the developers who implement Swarms in their applications, as well as end users who interact with the implemented systems.
|
||||
|
||||
### Tactics
|
||||
|
||||
- Clear and Comprehensive Documentation: We will provide well-written and easily accessible documentation that guides users through using and understanding Swarms.
|
||||
- User-Friendly APIs: We'll design clean and self-explanatory APIs that help developers to understand their purpose quickly.
|
||||
- Prompt and Effective Support: We will ensure that support is readily available to assist users when they encounter problems or need help with Swarms.
|
||||
|
||||
## Reliable
|
||||
|
||||
### Objective
|
||||
|
||||
Swarms should be dependable and trustworthy. Users should be able to count on Swarms to perform consistently and without error or failure.
|
||||
|
||||
### Tactics
|
||||
|
||||
- Robust Error Handling: We will focus on error prevention, detection, and recovery to minimize failures in Swarms.
|
||||
- Comprehensive Testing: We will apply various testing methodologies such as unit testing, integration testing, and stress testing to validate the reliability of our software.
|
||||
- Continuous Integration/Continuous Delivery (CI/CD): We will use CI/CD pipelines to ensure that all changes are tested and validated before they're merged into the main branch.
|
||||
|
||||
## Fast
|
||||
|
||||
### Objective
|
||||
|
||||
Swarms should offer high performance and rapid response times. The system should be able to handle requests and tasks swiftly.
|
||||
|
||||
### Tactics
|
||||
|
||||
- Efficient Algorithms: We will focus on optimizing our algorithms and data structures to ensure they run as quickly as possible.
|
||||
- Caching: Where appropriate, we will use caching techniques to speed up response times.
|
||||
- Profiling and Performance Monitoring: We will regularly analyze the performance of Swarms to identify bottlenecks and opportunities for improvement.
|
||||
|
||||
## Scalable
|
||||
|
||||
### Objective
|
||||
|
||||
Swarms should be able to grow in capacity and complexity without compromising performance or reliability. It should be able to handle increased workloads gracefully.
|
||||
|
||||
### Tactics
|
||||
|
||||
- Modular Architecture: We will design Swarms using a modular architecture that allows for easy scaling and modification.
|
||||
- Load Balancing: We will distribute tasks evenly across available resources to prevent overload and maximize throughput.
|
||||
- Horizontal and Vertical Scaling: We will design Swarms to be capable of both horizontal (adding more machines) and vertical (adding more power to an existing machine) scaling.
|
||||
|
||||
### Philosophy
|
||||
|
||||
Swarms is designed with a philosophy of simplicity and reliability. We believe that software should be a tool that empowers users, not a hurdle that they need to overcome. Therefore, our focus is on usability, reliability, speed, and scalability. We want our users to find Swarms intuitive and dependable, fast and adaptable to their needs. This philosophy guides all of our design and development decisions.
|
||||
|
||||
# Swarm Architecture Design Document
|
||||
|
||||
## Overview
|
||||
|
||||
The goal of the Swarm Architecture is to provide a flexible and scalable system to build swarm intelligence models that can solve complex problems. This document details the proposed design to create a plug-and-play system, which makes it easy to create custom swarms, and provides pre-configured swarms with multi-modal agents.
|
||||
|
||||
## Design Principles
|
||||
|
||||
- **Modularity**: The system will be built in a modular fashion, allowing various components to be easily swapped or upgraded.
|
||||
- **Interoperability**: Different swarm classes and components should be able to work together seamlessly.
|
||||
- **Scalability**: The design should support the growth of the system by adding more components or swarms.
|
||||
- **Ease of Use**: Users should be able to easily create their own swarms or use pre-configured ones with minimal configuration.
|
||||
|
||||
## Design Components
|
||||
|
||||
### BaseSwarm
|
||||
|
||||
The BaseSwarm is an abstract base class which defines the basic structure of a swarm and the methods that need to be implemented. Any new swarm should inherit from this class and implement the required methods.
|
||||
|
||||
### Swarm Classes
|
||||
|
||||
Various Swarm classes can be implemented inheriting from the BaseSwarm class. Each swarm class should implement the required methods for initializing the components, worker nodes, and boss node, and running the swarm.
|
||||
|
||||
Pre-configured swarm classes with multi-modal agents can be provided for ease of use. These classes come with a default configuration of tools and agents, which can be used out of the box.
|
||||
|
||||
### Tools and Agents
|
||||
|
||||
Tools and agents are the components that provide the actual functionality to the swarms. They can be language models, AI assistants, vector stores, or any other components that can help in problem solving.
|
||||
|
||||
To make the system plug-and-play, a standard interface should be defined for these components. Any new tool or agent should implement this interface, so that it can be easily plugged into the system.
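
One way such a standard interface could look is sketched below. The names `SwarmComponent`, `ComponentRegistry`, `EchoTool`, and `WordCountAgent` are hypothetical and chosen only for illustration; the design above does not prescribe them. The sketch uses a structural `Protocol`, so a tool or agent qualifies by having a `name` and a `run` method rather than by inheriting from a common class.

```python
from typing import Dict, Protocol, runtime_checkable


@runtime_checkable
class SwarmComponent(Protocol):
    """Hypothetical standard interface: anything with a name and a run method."""

    name: str

    def run(self, task: str) -> str: ...


class EchoTool:
    """Toy tool that returns its input unchanged, with a prefix."""

    name = "echo"

    def run(self, task: str) -> str:
        return f"echo: {task}"


class WordCountAgent:
    """Toy agent that reports how many words the task contains."""

    name = "word_count"

    def run(self, task: str) -> str:
        return str(len(task.split()))


class ComponentRegistry:
    """Accepts any object that satisfies SwarmComponent and dispatches by name."""

    def __init__(self) -> None:
        self._components: Dict[str, SwarmComponent] = {}

    def register(self, component: object) -> None:
        # isinstance works here because the protocol is runtime_checkable.
        if not isinstance(component, SwarmComponent):
            raise TypeError("component must provide 'name' and 'run(task)'")
        self._components[component.name] = component

    def run(self, name: str, task: str) -> str:
        return self._components[name].run(task)


registry = ComponentRegistry()
registry.register(EchoTool())
registry.register(WordCountAgent())
print(registry.run("echo", "hello"))
print(registry.run("word_count", "plan the product launch"))
```

Structural typing keeps third-party tools pluggable without forcing them to import a base class; an abstract base class would work as well if stricter enforcement is preferred.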
|
||||
|
||||
## Usage
|
||||
|
||||
Users can either use pre-configured swarms or create their own custom swarms.
|
||||
|
||||
To use a pre-configured swarm, they can simply instantiate the corresponding swarm class and call the run method with the required objective.
|
||||
|
||||
To create a custom swarm, they need to:
|
||||
|
||||
1. Define a new swarm class inheriting from BaseSwarm.
|
||||
2. Implement the required methods for the new swarm class.
|
||||
3. Instantiate the swarm class and call the run method.
|
||||
|
||||
### Example
|
||||
|
||||
```python
# Sketch only: PreConfiguredSwarm and BaseSwarm are the classes proposed in
# this design document; openai_api_key and objective are supplied by the caller.

# Using a pre-configured swarm
swarm = PreConfiguredSwarm(openai_api_key)
swarm.run_swarms(objective)


# Creating a custom swarm
class CustomSwarm(BaseSwarm):
    def run_swarms(self, objective):
        # Implement the required methods here
        raise NotImplementedError


swarm = CustomSwarm(openai_api_key)
swarm.run_swarms(objective)
```
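
For readers who want something that runs end to end, here is a minimal, self-contained sketch of the pattern. It is an assumption-laden illustration, not the framework's actual implementation: this `BaseSwarm` is defined locally, workers are plain functions instead of language models, and `plan` and `SentenceSwarm` are hypothetical names. Only the method name `run_swarms` is taken from the example above.

```python
from abc import ABC, abstractmethod
from typing import Callable, List

Worker = Callable[[str], str]  # a worker node: takes a sub-task, returns a result


class BaseSwarm(ABC):
    """Hypothetical base class: stores worker nodes and fixes the run contract."""

    def __init__(self, workers: List[Worker]) -> None:
        self.workers = workers

    @abstractmethod
    def plan(self, objective: str) -> List[str]:
        """Boss-node step: split the objective into sub-tasks."""

    def run_swarms(self, objective: str) -> List[str]:
        tasks = self.plan(objective)
        # Delegate sub-tasks to worker nodes in round-robin order.
        return [
            self.workers[i % len(self.workers)](task)
            for i, task in enumerate(tasks)
        ]


class SentenceSwarm(BaseSwarm):
    """Custom swarm whose boss splits the objective on semicolons."""

    def plan(self, objective: str) -> List[str]:
        return [part.strip() for part in objective.split(";") if part.strip()]


def upper_worker(task: str) -> str:
    return task.upper()


def tag_worker(task: str) -> str:
    return f"[done] {task}"


swarm = SentenceSwarm([upper_worker, tag_worker])
results = swarm.run_swarms("collect data; draft report; review")
print(results)  # ['COLLECT DATA', '[done] draft report', 'REVIEW']
```

Because `plan` is abstract, instantiating `BaseSwarm` directly raises `TypeError`, which enforces the rule that every new swarm implements the required methods.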
|
||||
|
||||
## Conclusion
|
||||
|
||||
This Swarm Architecture design provides a scalable and flexible system for building swarm intelligence models. The plug-and-play design allows users to easily use pre-configured swarms or create their own custom swarms.
|
||||
|
||||
|
||||
# Swarming Architectures
|
||||
Below are ten swarm architectures and their base requirements:
|
||||
|
||||
1. **Hierarchical Swarm**: This architecture is characterized by a boss/worker relationship. The boss node takes high-level decisions and delegates tasks to the worker nodes. The worker nodes perform tasks and report back to the boss node.
|
||||
- Requirements: Boss node (can be a large language model), worker nodes (can be smaller language models), and a task queue for task management.
|
||||
|
||||
2. **Homogeneous Swarm**: In this architecture, all nodes in the swarm are identical and contribute equally to problem-solving. Each node has the same capabilities.
|
||||
- Requirements: Homogeneous nodes (can be language models of the same size), communication protocol for nodes to share information.
|
||||
|
||||
3. **Heterogeneous Swarm**: This architecture contains different types of nodes, each with its specific capabilities. This diversity can lead to more robust problem-solving.
|
||||
- Requirements: Different types of nodes (can be different types and sizes of language models), a communication protocol, and a mechanism to delegate tasks based on node capabilities.
|
||||
|
||||
4. **Competitive Swarm**: In this architecture, nodes compete with each other to find the best solution. The system may use a selection process to choose the best solutions.
|
||||
- Requirements: Nodes (can be language models), a scoring mechanism to evaluate node performance, a selection mechanism.
|
||||
|
||||
5. **Cooperative Swarm**: In this architecture, nodes work together and share information to find solutions. The focus is on cooperation rather than competition.
|
||||
- Requirements: Nodes (can be language models), a communication protocol, a consensus mechanism to agree on solutions.
|
||||
|
||||
|
||||
6. **Grid-based Swarm**: This architecture positions agents on a grid, where they can only interact with their neighbors. This is useful for simulations, especially in fields like ecology or epidemiology.
|
||||
- Requirements: Agents (can be language models), a grid structure, and a neighborhood definition (i.e., how to identify neighboring agents).
|
||||
|
||||
7. **Particle Swarm Optimization (PSO) Swarm**: In this architecture, each agent represents a potential solution to an optimization problem. Agents move in the solution space based on their own and their neighbors' past performance. PSO is especially useful for continuous numerical optimization problems.
|
||||
- Requirements: Agents (each representing a solution), a definition of the solution space, an evaluation function to rate the solutions, a mechanism to adjust agent positions based on performance.
|
||||
|
||||
8. **Ant Colony Optimization (ACO) Swarm**: Inspired by ant behavior, this architecture has agents leave a pheromone trail that other agents follow, reinforcing the best paths. It's useful for problems like the traveling salesperson problem.
|
||||
- Requirements: Agents (can be language models), a representation of the problem space, a pheromone updating mechanism.
|
||||
|
||||
9. **Genetic Algorithm (GA) Swarm**: In this architecture, agents represent potential solutions to a problem. They can 'breed' to create new solutions and can undergo 'mutations'. GA swarms are good for search and optimization problems.
|
||||
- Requirements: Agents (each representing a potential solution), a fitness function to evaluate solutions, a crossover mechanism to breed solutions, and a mutation mechanism.
|
||||
|
||||
10. **Stigmergy-based Swarm**: In this architecture, agents communicate indirectly by modifying the environment, and other agents react to such modifications. It's a decentralized method of coordinating tasks.
|
||||
- Requirements: Agents (can be language models), an environment that agents can modify, a mechanism for agents to perceive environment changes.
|
||||
|
||||
These architectures all have unique features and requirements, but they share the need for agents (often implemented as language models) and a mechanism for agents to communicate or interact, whether it's directly through messages, indirectly through the environment, or implicitly through a shared solution space. Some also require specific data structures, like a grid or problem space, and specific algorithms, like for evaluating solutions or updating agent positions.
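
As a concrete instance of "updating agent positions", the following is a minimal Particle Swarm Optimization sketch. It makes several assumptions that the list above does not specify: every agent's neighborhood is the whole swarm (the global-best variant), the inertia and attraction coefficients are common textbook-style values rather than tuned ones, and the objective is a toy sphere function instead of anything involving language models.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(
    objective: Callable[[Sequence[float]], float],
    dim: int,
    n_agents: int = 20,
    n_steps: int = 100,
    bounds: Tuple[float, float] = (-5.0, 5.0),
    seed: int = 0,
) -> Tuple[List[float], float, float]:
    """Return (best position, best value, best value before any movement)."""
    rng = random.Random(seed)
    lo, hi = bounds
    inertia, c_self, c_swarm = 0.7, 1.5, 1.5  # assumed, untuned coefficients

    positions = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    velocities = [[0.0] * dim for _ in range(n_agents)]
    best_pos = [p[:] for p in positions]           # each agent's own best position
    best_val = [objective(p) for p in positions]   # and the value found there

    g = min(range(n_agents), key=best_val.__getitem__)
    swarm_best_pos, swarm_best_val = best_pos[g][:], best_val[g]
    initial_best = swarm_best_val

    for _ in range(n_steps):
        for i in range(n_agents):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the agent's own best,
                # and pull toward the best position found by the swarm.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_self * r1 * (best_pos[i][d] - positions[i][d])
                    + c_swarm * r2 * (swarm_best_pos[d] - positions[i][d])
                )
                # Move, then clamp to the solution-space bounds.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            value = objective(positions[i])
            if value < best_val[i]:
                best_val[i], best_pos[i] = value, positions[i][:]
                if value < swarm_best_val:
                    swarm_best_val, swarm_best_pos = value, positions[i][:]

    return swarm_best_pos, swarm_best_val, initial_best


def sphere(x: Sequence[float]) -> float:
    """Toy evaluation function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_position, best_value, start_value = pso_minimize(sphere, dim=2)
print(f"best value improved from {start_value:.4f} to {best_value:.6f}")
```

The four requirements listed for the PSO swarm map directly onto the code: agents are the rows of `positions`, the solution space is `bounds`, the evaluation function is `objective`, and the velocity update is the position-adjustment mechanism.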
|
@ -0,0 +1,469 @@
|
||||
|
||||
|
||||
# Swarms Monetization Strategy
|
||||
|
||||
This strategy includes a variety of business models, potential revenue streams, cashflow structures, and customer identification methods. Let's explore these further.
|
||||
|
||||
## Business Models
|
||||
|
||||
1. **Platform as a Service (PaaS):** Provide the Swarms AI platform on a subscription basis, charged monthly or annually. This could be tiered based on usage and access to premium features.
|
||||
|
||||
2. **API Usage-based Pricing:** Charge customers based on their usage of the Swarms API. The more requests made, the higher the fee.
|
||||
|
||||
3. **Managed Services:** Offer complete end-to-end solutions where you manage the entire AI infrastructure for the clients. This could be on a contract basis with a recurring fee.
|
||||
|
||||
4. **Training and Certification:** Provide Swarms AI training and certification programs for interested developers and businesses. These could be monetized as separate courses or subscription-based access.
|
||||
|
||||
5. **Partnerships:** Collaborate with large enterprises and offer them dedicated Swarm AI services. These could be performance-based contracts, ensuring a mutually beneficial relationship.
|
||||
|
||||
6. **Data as a Service (DaaS):** Leverage the data generated by Swarms for insights and analytics, providing valuable business intelligence to clients.
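
The first two models can be combined into a single bill: a flat subscription fee that includes a request allowance, plus a per-request charge beyond it. The sketch below is purely illustrative; the function name `monthly_bill` and every rate in the usage example are placeholder assumptions, not committed pricing.

```python
def monthly_bill(
    requests: int,
    base_fee: float,
    included_requests: int,
    per_request: float,
) -> float:
    """Subscription fee plus usage-based charges for requests over the allowance."""
    billable = max(0, requests - included_requests)
    return round(base_fee + billable * per_request, 2)


# Placeholder rates for illustration only.
light_user = monthly_bill(5_000, base_fee=50.0, included_requests=10_000, per_request=0.01)
heavy_user = monthly_bill(12_000, base_fee=50.0, included_requests=10_000, per_request=0.01)
print(light_user, heavy_user)  # 50.0 70.0
```

A customer under the allowance pays only the subscription, while heavier usage raises the fee in proportion to the extra requests.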
|
||||
|
||||
## Potential Revenue Streams
|
||||
|
||||
1. **Subscription Fees:** This would be the main revenue stream from providing the Swarms platform as a service.
|
||||
|
||||
2. **Usage Fees:** Additional revenue can come from usage fees for businesses that have high demand for Swarms API.
|
||||
|
||||
3. **Contract Fees:** From offering managed services and bespoke solutions to businesses.
|
||||
|
||||
4. **Training Fees:** Revenue from providing training and certification programs to developers and businesses.
|
||||
|
||||
5. **Partnership Contracts:** Large-scale projects with enterprises, involving dedicated Swarm AI services, could provide substantial income.
|
||||
|
||||
6. **Data Insights:** Revenue from selling valuable business intelligence derived from Swarm's aggregated and anonymized data.
|
||||
|
||||
## Potential Customers
|
||||
|
||||
1. **Businesses Across Sectors:** Any business seeking to leverage AI for automation, efficiency, and data insights could be a potential customer. This includes sectors like finance, eCommerce, logistics, healthcare, and more.
|
||||
|
||||
2. **Developers:** Both freelance and those working in organizations could use Swarms to enhance their projects and services.
|
||||
|
||||
3. **Enterprises:** Large enterprises looking to automate and optimize their operations could greatly benefit from Swarms.
|
||||
|
||||
4. **Educational Institutions:** Universities and research institutions could leverage Swarms for research and teaching purposes.
|
||||
|
||||
## Roadmap
|
||||
|
||||
1. **Landing Page Creation:** Develop a dedicated product page on apac.ai for Swarms.
|
||||
|
||||
2. **Hosted Swarms API:** Launch a cloud-based Swarms API service. It should be highly reliable, with robust documentation to attract daily users.
|
||||
|
||||
3. **Consumer and Enterprise Subscription Service:** Launch a comprehensive subscription service on The Domain. This would provide users with access to a wide array of APIs and data streams.
|
||||
|
||||
4. **Dedicated Capacity Deals:** Partner with large enterprises to offer them dedicated Swarm AI solutions for automating their operations.
|
||||
|
||||
5. **Enterprise Partnerships:** Develop partnerships with large enterprises for extensive contract-based projects.
|
||||
|
||||
6. **Integration with Collaboration Platforms:** Develop Swarms bots for platforms like Discord and Slack, charging users a subscription fee for access.
|
||||
|
||||
7. **Personal Data Instances:** Offer users dedicated instances of all their data that the Swarm can query as needed.
|
||||
|
||||
8. **Browser Extension:** Develop a browser extension that integrates with the Swarms platform, offering users a more seamless experience.
|
||||
|
||||
Remember, customer satisfaction and a value-centric approach are at the core of any successful monetization strategy. It's essential to continuously iterate and improve the product based on customer feedback and evolving market needs.
|
||||
|
||||
----
|
||||
|
||||
# Other ideas
|
||||
|
||||
1. **Platform as a Service (PaaS):** Create a cloud-based platform that allows users to build, run, and manage applications without the complexity of maintaining the infrastructure. You could charge users a subscription fee for access to the platform and provide different pricing tiers based on usage levels. This could be an attractive solution for businesses that do not have the capacity to build or maintain their own swarm intelligence solutions.
|
||||
|
||||
2. **Professional Services:** Offer consultancy and implementation services to businesses looking to utilize the Swarm technology. This could include assisting with integration into existing systems, offering custom development services, or helping customers to build specific solutions using the framework.
|
||||
|
||||
3. **Education and Training:** Create a certification program for developers or companies looking to become proficient with the Swarms framework. This could be sold as standalone courses, or bundled with other services.
|
||||
|
||||
4. **Managed Services:** Some companies may prefer to outsource the management of their Swarm-based systems. A managed services solution could take care of all the technical aspects, from hosting the solution to ensuring it runs smoothly, allowing the customer to focus on their core business.
|
||||
|
||||
5. **Data Analysis and Insights:** Swarm intelligence can generate valuable data and insights. By anonymizing and aggregating this data, you could provide industry reports, trend analysis, and other valuable insights to businesses.
|
||||
|
||||
As for the type of platform, Swarms can be offered as a cloud-based solution given its scalability and flexibility. This would also allow you to apply a SaaS/PaaS type monetization model, which provides recurring revenue.
|
||||
|
||||
Potential customers could range from small to large enterprises in various sectors such as logistics, eCommerce, finance, and technology, who are interested in leveraging artificial intelligence and machine learning for complex problem solving, optimization, and decision-making.
|
||||
|
||||
**Product Brief Monetization Strategy:**
|
||||
|
||||
Product Name: Swarms.AI Platform
|
||||
|
||||
Product Description: A cloud-based AI and ML platform harnessing the power of swarm intelligence.
|
||||
|
||||
1. **Platform as a Service (PaaS):** Offer tiered subscription plans (Basic, Premium, Enterprise) to accommodate different usage levels and business sizes.
|
||||
|
||||
2. **Professional Services:** Offer consultancy and custom development services to tailor the Swarms solution to the specific needs of the business.
|
||||
|
||||
3. **Education and Training:** Launch an online Swarms.AI Academy with courses and certifications for developers and businesses.
|
||||
|
||||
4. **Managed Services:** Provide a premium, fully-managed service offering that includes hosting, maintenance, and 24/7 support.
|
||||
|
||||
5. **Data Analysis and Insights:** Offer industry reports and customized insights generated from aggregated and anonymized Swarm data.
|
||||
|
||||
Potential Customers: Enterprises in sectors such as logistics, eCommerce, finance, and technology. As a cloud-based product, the platform can be sold globally to any customer with internet access.
|
||||
|
||||
Marketing Channels: Online marketing (SEO, Content Marketing, Social Media), Partnerships with tech companies, Direct Sales to Enterprises.
|
||||
|
||||
This strategy is designed to provide multiple revenue streams, while ensuring the Swarms.AI platform is accessible and useful to a range of potential customers.
|
||||
|
||||
1. **AI Solution as a Service:** By offering the Swarms framework as a service, businesses can access and utilize the power of multiple LLM agents without the need to maintain the infrastructure themselves. Subscription can be tiered based on usage and additional features.
|
||||
|
||||
2. **Integration and Custom Development:** Offer integration services to businesses wanting to incorporate the Swarms framework into their existing systems. Also, you could provide custom development for businesses with specific needs not met by the standard framework.
|
||||
|
||||
3. **Training and Certification:** Develop an educational platform offering courses, webinars, and certifications on using the Swarms framework. This can serve both developers seeking to broaden their skills and businesses aiming to train their in-house teams.
|
||||
|
||||
4. **Managed Swarms Solutions:** For businesses that prefer to outsource their AI needs, provide a complete solution which includes the development, maintenance, and continuous improvement of swarms-based applications.
|
||||
|
||||
5. **Data Analytics Services:** Leveraging the aggregated insights from the AI swarms, you could offer data analytics services. Businesses can use these insights to make informed decisions and predictions.
|
||||
|
||||
**Type of Platform:**
|
||||
|
||||
Cloud-based platform or Software as a Service (SaaS) will be a suitable model. It offers accessibility, scalability, and ease of updates.
|
||||
|
||||
**Target Customers:**
|
||||
|
||||
The technology can be beneficial for businesses across sectors like eCommerce, technology, logistics, finance, healthcare, and education, among others.
|
||||
|
||||
**Product Brief Monetization Strategy:**
|
||||
|
||||
Product Name: Swarms.AI
|
||||
|
||||
1. **AI Solution as a Service:** Offer different tiered subscriptions (Standard, Premium, and Enterprise) each with varying levels of usage and features.
|
||||
|
||||
2. **Integration and Custom Development:** Offer custom development and integration services, priced based on the scope and complexity of the project.
|
||||
|
||||
3. **Training and Certification:** Launch the Swarms.AI Academy with courses and certifications, available for a fee.
|
||||
|
||||
4. **Managed Swarms Solutions:** Offer fully managed solutions tailored to business needs, priced based on scope and service level agreements.
|
||||
|
||||
5. **Data Analytics Services:** Provide insightful reports and data analyses, which can be purchased on a one-off basis or through a subscription.
|
||||
|
||||
By offering a variety of services and payment models, Swarms.AI will be able to cater to a diverse range of business needs, from small start-ups to large enterprises. Marketing channels would include digital marketing, partnerships with technology companies, presence in tech events, and direct sales to targeted industries.
|
||||
|
||||
|
||||
|
||||
# Roadmap
|
||||
|
||||
* Create a landing page for swarms apac.ai/product/swarms
|
||||
|
||||
* Create Hosted Swarms API for anybody to just use without need for mega gpu infra, charge usage based pricing. Prerequisites for success => Swarms has to be extremely reliable + we need world class documentation and many daily users => how do we get many daily users? We provide a seamless and fluid experience, how do we create a seamless and fluid experience? We write good code that is modular, provides feedback to the user in times of distress, and ultimately accomplishes the user's tasks.
|
||||
|
||||
* Hosted consumer and enterprise subscription as a service on The Domain, where users can interact with 1000s of APIs and ingest 1000s of different data streams.
|
||||
|
||||
* Hosted dedicated capacity deals with mega enterprises on automating many operations with Swarms for monthly subscription 300,000+$
|
||||
|
||||
* Partnerships with enterprises: large contracts with performance-based fees
|
||||
|
||||
* Offer a Discord and/or Slack bot that works with users' personal data, charged by subscription, plus a browser extension
|
||||
|
||||
* Each user gets a dedicated ocean instance of all their data so the swarm can query it as needed.
|
||||
|
||||
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
|
||||
# Swarms Monetization Strategy: A Revolutionary AI-powered Future
|
||||
|
||||
Swarms is a powerful AI platform leveraging the transformative potential of Swarm Intelligence. Our ambition is to monetize this groundbreaking technology in ways that generate significant cashflow while providing extraordinary value to our customers.
|
||||
|
||||
Here we outline our strategic monetization pathways and provide a roadmap that plots our course to future success.
|
||||
|
||||
---
|
||||
|
||||
## I. Business Models
|
||||
|
||||
1. **Platform as a Service (PaaS):** We provide the Swarms platform as a service, billed on a monthly or annual basis. Subscriptions can range from $50 for basic access, to $500+ for premium features and extensive usage.
|
||||
|
||||
2. **API Usage-based Pricing:** Customers are billed according to their use of the Swarms API. Starting at $0.01 per request, this creates a cashflow model that rewards extensive platform usage.
|
||||
|
||||
3. **Managed Services:** We offer end-to-end solutions, managing clients' entire AI infrastructure. Contract fees start from $100,000 per month, offering both a sustainable cashflow and considerable savings for our clients.
|
||||
|
||||
4. **Training and Certification:** A Swarms AI training and certification program is available for developers and businesses. Course costs can range from $200 to $2,000, depending on course complexity and duration.
|
||||
|
||||
5. **Partnerships:** We forge collaborations with large enterprises, offering dedicated Swarm AI services. These performance-based contracts start from $1,000,000, creating a potentially lucrative cashflow stream.
|
||||
|
||||
6. **Data as a Service (DaaS):** Swarms generated data are mined for insights and analytics, with business intelligence reports offered from $500 each.
|
||||
|
||||
---
|
||||
|
||||
## II. Potential Revenue Streams
|
||||
|
||||
1. **Subscription Fees:** From $50 to $500+ per month for platform access.
|
||||
|
||||
2. **Usage Fees:** From $0.01 per API request, generating income from high platform usage.
|
||||
|
||||
3. **Contract Fees:** Starting from $100,000 per month for managed services.
|
||||
|
||||
4. **Training Fees:** From $200 to $2,000 for individual courses or subscription access.
|
||||
|
||||
5. **Partnership Contracts:** Contracts starting from $100,000, offering major income potential.
|
||||
|
||||
6. **Data Insights:** Business intelligence reports starting from $500.
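
To make the arithmetic behind the subscription and usage streams concrete, here is a minimal sketch. The prices come from the list above; the subscriber counts and request volume are hypothetical numbers chosen only for illustration, not projections.

```python
from dataclasses import dataclass

# Prices taken from the revenue streams listed above.
PRICE_PER_API_REQUEST = 0.01  # USD per request
BASIC_SUBSCRIPTION = 50       # USD per month
PREMIUM_SUBSCRIPTION = 500    # USD per month


@dataclass
class MonthlyUsage:
    """Hypothetical monthly volumes, for illustration only."""
    basic_subscribers: int
    premium_subscribers: int
    api_requests: int


def monthly_revenue(usage: MonthlyUsage) -> float:
    """Sum subscription and usage-based revenue for one month, in USD."""
    subscriptions = (
        usage.basic_subscribers * BASIC_SUBSCRIPTION
        + usage.premium_subscribers * PREMIUM_SUBSCRIPTION
    )
    usage_fees = usage.api_requests * PRICE_PER_API_REQUEST
    return subscriptions + usage_fees


# Assumed volumes: 200 basic plans, 20 premium plans, 1M API requests.
example = MonthlyUsage(basic_subscribers=200, premium_subscribers=20, api_requests=1_000_000)
print(f"${monthly_revenue(example):,.2f}")
```

With these assumed volumes, each of the three components contributes $10,000, which shows how usage fees can match subscription income once request volume is high.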
|
||||
|
||||
---
|
||||
|
||||
## III. Potential Customers
|
||||
|
||||
1. **Businesses Across Sectors:** Our offerings cater to businesses across finance, eCommerce, logistics, healthcare, and more.
|
||||
|
||||
2. **Developers:** Both freelancers and organization-based developers can leverage Swarms for their projects.
|
||||
|
||||
3. **Enterprises:** Swarms offers large enterprises solutions for optimizing operations.
|
||||
|
||||
4. **Educational Institutions:** Universities and research institutions can use Swarms for research and teaching.
|
||||
|
||||
---
|
||||
|
||||
## IV. Roadmap
|
||||
|
||||
1. **Landing Page Creation:** Develop a dedicated Swarms product page on apac.ai.
|
||||
|
||||
2. **Hosted Swarms API:** Launch a reliable, well-documented cloud-based Swarms API service.
|
||||
|
||||
3. **Consumer and Enterprise Subscription Service:** Launch an extensive subscription service on The Domain, providing wide-ranging access to APIs and data streams.
|
||||
|
||||
4. **Dedicated Capacity Deals:** Offer large enterprises dedicated Swarm AI solutions, starting from $300,000 monthly subscription.
|
||||
|
||||
5. **Enterprise Partnerships:** Develop performance-based contracts with large enterprises.
|
||||
|
||||
6. **Integration with Collaboration Platforms:** Develop Swarms bots for platforms like Discord and Slack, charging a subscription fee for access.
|
||||
|
||||
7. **Personal Data Instances:** Offer users dedicated data instances that the Swarm can query as needed.
|
||||
|
||||
8. **Browser Extension:** Develop a browser extension that integrates with the Swarms platform for seamless user experience.
|
||||
|
||||
---
|
||||
|
||||
Our North Star remains customer satisfaction and value provision.
|
||||
As we embark on this journey, we continuously refine our product based on customer feedback and evolving market needs, ensuring we lead in the age of AI-driven solutions.
|
||||
|
||||
## **Platform Distribution Strategy for Swarms**
|
||||
|
||||
*Note: This strategy aims to diversify the presence of 'Swarms' across various platforms and mediums while focusing on monetization and value creation for its users.*
|
||||
|
||||
---
|
||||
|
||||
### **1. Framework:**
|
||||
|
||||
#### **Objective:**
|
||||
To offer Swarms as an integrated solution within popular frameworks to ensure that developers and businesses can seamlessly incorporate its functionalities.
|
||||
|
||||
#### **Strategy:**
|
||||
|
||||
* **Language/Framework Integration:**
|
||||
* Target popular frameworks like Django, Flask for Python, Express.js for Node, etc.
|
||||
* Create SDKs or plugins for easy integration.
|
||||
|
||||
* **Monetization:**
|
||||
* Freemium Model: Offer basic integration for free, and charge for additional features or advanced integrations.
|
||||
* Licensing: Allow businesses to purchase licenses for enterprise-level integrations.
|
||||
|
||||
* **Promotion:**
|
||||
* Engage in partnerships with popular online learning platforms like Udemy, Coursera, etc., offering courses and tutorials on integrating Swarms.
|
||||
* Host webinars and write technical blogs to promote the integration benefits.
|
||||
|
||||
---
|
||||
|
||||
### **2. Paid API:**
|
||||
|
||||
#### **Objective:**
|
||||
To provide a scalable solution for developers and businesses that want direct access to Swarms' functionalities without integrating the entire framework.
|
||||
|
||||
#### **Strategy:**
|
||||
|
||||
* **API Endpoints:**
|
||||
* Offer various endpoints catering to different functionalities.
|
||||
* Maintain robust documentation to ensure ease of use.
|
||||
|
||||
* **Monetization:**
|
||||
* Usage-based Pricing: Charge based on the number of API calls.
|
||||
* Subscription Tiers: Provide tiered packages based on usage limits and advanced features.
|
||||
|
||||
* **Promotion:**
|
||||
* List on API marketplaces like RapidAPI.
|
||||
* Engage in SEO to make the API documentation discoverable.
|
||||
|
||||
---
|
||||
|
||||
### **3. Domain Hosted:**
|
||||
|
||||
#### **Objective:**
|
||||
To provide a centralized web platform where users can directly access and engage with Swarms' offerings.
|
||||
|
||||
#### **Strategy:**
|
||||
|
||||
* **User-Friendly Interface:**
|
||||
* Ensure a seamless user experience with intuitive design.
|
||||
* Incorporate features like real-time chat support, tutorials, and an FAQ section.
|
||||
|
||||
* **Monetization:**
|
||||
* Subscription Model: Offer monthly/annual subscriptions for premium features.
|
||||
* Affiliate Marketing: Partner with related tech products/services and earn through referrals.
|
||||
|
||||
* **Promotion:**
|
||||
* Invest in PPC advertising on platforms like Google Ads.
|
||||
* Engage in content marketing, targeting keywords related to Swarms' offerings.
|
||||
|
||||
---
|
||||
|
||||
### **4. Build Your Own (No-Code Platform):**
|
||||
|
||||
#### **Objective:**
|
||||
To cater to the non-developer audience, allowing them to leverage Swarms' features without any coding expertise.
|
||||
|
||||
#### **Strategy:**
|
||||
|
||||
* **Drag-and-Drop Interface:**
|
||||
* Offer customizable templates.
|
||||
* Ensure integration with popular platforms and apps.
|
||||
|
||||
* **Monetization:**
|
||||
* Freemium Model: Offer basic features for free, and charge for advanced functionalities.
|
||||
* Marketplace for Plugins: Allow third-party developers to sell their plugins/extensions on the platform.
|
||||
|
||||
* **Promotion:**
|
||||
* Partner with no-code communities and influencers.
|
||||
* Offer promotions and discounts to early adopters.
|
||||
|
||||
---
|
||||
|
||||
### **5. Marketplace for the No-Code Platform:**
|
||||
|
||||
#### **Objective:**
|
||||
To create an ecosystem where third-party developers can contribute, and users can enhance their Swarms experience.
|
||||
|
||||
#### **Strategy:**
|
||||
|
||||
* **Open API for Development:**
|
||||
* Offer robust documentation and developer support.
|
||||
* Ensure a strict quality check for marketplace additions.
|
||||
|
||||
* **Monetization:**
|
||||
* Revenue Sharing: Take a percentage cut from third-party sales.
|
||||
* Featured Listings: Charge developers for premium listings.
|
||||
|
||||
* **Promotion:**
|
||||
* Host hackathons and competitions to boost developer engagement.
|
||||
* Promote top plugins/extensions through email marketing and on the main platform.
|
||||
|
||||
---
|
||||
|
||||
### **Future Outlook & Expansion:**
|
||||
|
||||
* **Hosted Dedicated Capacity:** Hosted dedicated capacity deals for enterprises starting at $399,999
|
||||
* **Decentralized Free Peer-to-Peer Endpoint Hosted on The Grid:** A hosted endpoint by the people, for the people.
|
||||
* **Browser Extension:** Athena browser extension for deep browser automation, monetized through subscription and usage fees.
|
||||
|
||||
|
||||
* **Mobile Application:** Develop a mobile app version for Swarms to tap into the vast mobile user base.
|
||||
* **Global Expansion:** Localize the platform for non-English speaking regions to tap into global markets.
|
||||
* **Continuous Learning:** Regularly collect user feedback and iterate on the product features.
|
||||
|
||||
---
|
||||
|
||||
|
||||
|
||||
### **50 Creative Distribution Platforms for Swarms**
|
||||
|
||||
1. **E-commerce Integrations:** Platforms like Shopify, WooCommerce, where Swarms can add value to sellers.
|
||||
|
||||
2. **Web Browser Extensions:** Chrome, Firefox, and Edge extensions that bring Swarms features directly to users.
|
||||
|
||||
3. **Podcasting Platforms:** Swarms-themed content on platforms like Spotify, Apple Podcasts to reach aural learners.
|
||||
|
||||
4. **Virtual Reality (VR) Platforms:** Integration with VR experiences on Oculus or Viveport.
|
||||
|
||||
5. **Gaming Platforms:** Tools or plugins for game developers on Steam, Epic Games.
|
||||
|
||||
6. **Decentralized Platforms:** Using blockchain, create decentralized apps (DApps) versions of Swarms.
|
||||
|
||||
7. **Chat Applications:** Integrate with popular messaging platforms like WhatsApp, Telegram, Slack.
|
||||
|
||||
8. **AI Assistants:** Integration with Siri, Alexa, Google Assistant to provide Swarms functionalities via voice commands.
|
||||
|
||||
9. **Freelancing Websites:** Offer tools or services for freelancers on platforms like Upwork, Fiverr.
|
||||
|
||||
10. **Online Forums:** Platforms like Reddit, Quora, where users can discuss or access Swarms.
|
||||
|
||||
11. **Educational Platforms:** Sites like Khan Academy, Udacity where Swarms can enhance learning experiences.
|
||||
|
||||
12. **Digital Art Platforms:** Integrate with platforms like DeviantArt, Behance.
|
||||
|
||||
13. **Open-source Repositories:** Hosting Swarms on GitHub, GitLab, Bitbucket with open-source plugins.
|
||||
|
||||
14. **Augmented Reality (AR) Apps:** Create AR experiences powered by Swarms.
|
||||
|
||||
15. **Smart Home Devices:** Integrate Swarms' functionalities into smart home devices.
|
||||
|
||||
16. **Newsletters:** Platforms like Substack, where Swarms insights can be shared.
|
||||
|
||||
17. **Interactive Kiosks:** In malls, airports, and other public places.
|
||||
|
||||
18. **IoT Devices:** Incorporate Swarms in devices like smart fridges, smartwatches.
|
||||
|
||||
19. **Collaboration Tools:** Platforms like Trello, Notion, offering Swarms-enhanced productivity.
|
||||
|
||||
20. **Dating Apps:** An AI-enhanced matching algorithm powered by Swarms.
|
||||
|
||||
21. **Music Platforms:** Integrate with Spotify, SoundCloud for music-related AI functionalities.
|
||||
|
||||
22. **Recipe Websites:** Platforms like AllRecipes, Tasty with AI-recommended recipes.
|
||||
|
||||
23. **Travel & Hospitality:** Integrate with platforms like Airbnb, Tripadvisor for AI-based recommendations.
|
||||
|
||||
24. **Language Learning Apps:** Duolingo, Rosetta Stone integrations.
|
||||
|
||||
25. **Virtual Events Platforms:** Websites like Hopin, Zoom where Swarms can enhance the virtual event experience.
|
||||
|
||||
26. **Social Media Management:** Tools like Buffer, Hootsuite with AI insights by Swarms.
|
||||
|
||||
27. **Fitness Apps:** Platforms like MyFitnessPal, Strava with AI fitness insights.
|
||||
|
||||
28. **Mental Health Apps:** Integration into apps like Calm, Headspace for AI-driven wellness.
|
||||
|
||||
29. **E-books Platforms:** Amazon Kindle, Audible with AI-enhanced reading experiences.
|
||||
|
||||
30. **Sports Analysis Tools:** Websites like ESPN, Sky Sports where Swarms can provide insights.
|
||||
|
||||
31. **Financial Tools:** Integration into platforms like Mint, Robinhood for AI-driven financial advice.
|
||||
|
||||
32. **Public Libraries:** Digital platforms of public libraries for enhanced reading experiences.
|
||||
|
||||
33. **3D Printing Platforms:** Websites like Thingiverse, Shapeways with AI customization.
|
||||
|
||||
34. **Meme Platforms:** Websites like Memedroid, 9GAG where Swarms can suggest memes.
|
||||
|
||||
35. **Astronomy Apps:** Platforms like Star Walk, NASA's Eyes with AI-driven space insights.
|
||||
|
||||
36. **Weather Apps:** Integration into Weather.com, AccuWeather for predictive analysis.
|
||||
|
||||
37. **Sustainability Platforms:** Websites like Ecosia, GoodGuide with AI-driven eco-tips.
|
||||
|
||||
38. **Fashion Apps:** Platforms like ASOS, Zara with AI-based style recommendations.
|
||||
|
||||
39. **Pet Care Apps:** Integration into PetSmart, Chewy for AI-driven pet care tips.
|
||||
|
||||
40. **Real Estate Platforms:** Websites like Zillow, Realtor with AI-enhanced property insights.
|
||||
|
||||
41. **DIY Platforms:** Websites like Instructables, DIY.org with AI project suggestions.
|
||||
|
||||
42. **Genealogy Platforms:** Ancestry, MyHeritage with AI-driven family tree insights.
|
||||
|
||||
43. **Car Rental & Sale Platforms:** Integration into AutoTrader, Turo for AI-driven vehicle suggestions.
|
||||
|
||||
44. **Wedding Planning Websites:** Platforms like Zola, The Knot with AI-driven planning.
|
||||
|
||||
45. **Craft Platforms:** Websites like Etsy, Craftsy with AI-driven craft suggestions.
|
||||
|
||||
46. **Gift Recommendation Platforms:** AI-driven gift suggestions for websites like Gifts.com.
|
||||
|
||||
47. **Study & Revision Platforms:** Websites like Chegg, Quizlet with AI-driven study guides.
|
||||
|
||||
48. **Local Business Directories:** Yelp, Yellow Pages with AI-enhanced reviews.
|
||||
|
||||
49. **Networking Platforms:** LinkedIn, Meetup with AI-driven connection suggestions.
|
||||
|
||||
50. **Lifestyle Magazines' Digital Platforms:** Websites like Vogue, GQ with AI-curated fashion and lifestyle insights.
|
||||
|
||||
---
|
||||
|
||||
*Endnote: Leveraging these diverse platforms ensures that Swarms becomes an integral part of multiple ecosystems, enhancing its visibility and user engagement.*
|
@ -0,0 +1,110 @@
|
||||
### FAQ on Swarm Intelligence and Multi-Agent Systems
|
||||
|
||||
#### What is an agent in the context of AI and swarm intelligence?
|
||||
|
||||
In artificial intelligence (AI), an agent refers to an LLM with some objective to accomplish.
|
||||
|
||||
In swarm intelligence, each agent interacts with other agents and possibly the environment to achieve complex collective behaviors or solve problems more efficiently than individual agents could on their own.
|
||||
|
||||
|
||||
#### Why do you need Swarms at all?
|
||||
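Individual agents are limited by a range of issues, such as losing information that falls outside the context window, executing only a single task at a time, hallucination, and an inability to collaborate. A swarm addresses these limits by letting multiple agents share work and check one another's output.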
|
||||
|
||||
|
||||
#### How does a swarm work?
|
||||
|
||||
A swarm works through the principles of decentralized control, local interactions, and simple rules followed by each agent. Unlike centralized systems, where a single entity dictates the behavior of all components, in a swarm, each agent makes its own decisions based on local information and interactions with nearby agents. These local interactions lead to the emergence of complex, organized behaviors or solutions at the collective level, enabling the swarm to tackle tasks efficiently.
|
||||
|
||||
#### Why do you need more agents in a swarm?
|
||||
|
||||
More agents in a swarm can enhance its problem-solving capabilities, resilience, and efficiency. With more agents:
|
||||
|
||||
- **Diversity and Specialization**: The swarm can leverage a wider range of skills, knowledge, and perspectives, allowing for more creative and effective solutions to complex problems.
|
||||
- **Scalability**: Adding more agents can increase the swarm's capacity to handle larger tasks or multiple tasks simultaneously.
|
||||
- **Robustness**: A larger number of agents enhances the system's redundancy and fault tolerance, as the failure of a few agents has a minimal impact on the overall performance of the swarm.
|
||||
|
||||
#### Isn't it more expensive to use more agents?
|
||||
|
||||
While deploying more agents can initially increase costs, especially in terms of computational resources, hosting, and potentially API usage, there are several factors and strategies that can mitigate these expenses:
|
||||
|
||||
- **Efficiency at Scale**: Larger swarms can often solve problems more quickly or effectively, reducing the overall computational time and resources required.
|
||||
- **Optimization and Caching**: Implementing optimizations and caching strategies can reduce redundant computations, lowering the workload on individual agents and the overall system.
|
||||
- **Dynamic Scaling**: Utilizing cloud services that offer dynamic scaling can ensure you only pay for the resources you need when you need them, optimizing cost-efficiency.
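
As a rough illustration of the caching point above, the sketch below memoizes an agent call with the standard library's `functools.lru_cache`. The `run_agent` function is a hypothetical stand-in for a paid LLM call, not part of the Swarms API, and exact-match caching like this only helps when identical prompts recur.

```python
from functools import lru_cache

call_count = 0  # counts how many times the "expensive" call actually runs


@lru_cache(maxsize=1024)
def run_agent(prompt: str) -> str:
    """Stand-in for an expensive LLM call (hypothetical; swap in a real client)."""
    global call_count
    call_count += 1
    return f"answer to: {prompt}"


prompts = ["summarize Q3 sales", "summarize Q3 sales", "list top risks"]
answers = [run_agent(p) for p in prompts]

# The repeated prompt is served from the cache, so only two calls are paid for.
print(call_count)  # 2
```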
|
||||
|
||||
#### Can swarms make decisions better than individual agents?
|
||||
|
||||
Yes, swarms can make better decisions than individual agents for several reasons:
|
||||
|
||||
- **Collective Intelligence**: Swarms combine the knowledge and insights of multiple agents, leading to more informed and well-rounded decision-making processes.
|
||||
- **Error Correction**: The collaborative nature of swarms allows for error checking and correction among agents, reducing the likelihood of mistakes.
|
||||
- **Adaptability**: Swarms are highly adaptable to changing environments or requirements, as the collective can quickly reorganize or shift strategies based on new information.
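
One simple way to picture the error-correction idea is majority voting over the answers of several agents. The sketch below is an assumption about how such aggregation could be done, not a description of how Swarms implements it; the sample answers are made up.

```python
from collections import Counter
from typing import List, Tuple


def majority_vote(answers: List[str]) -> Tuple[str, float]:
    """Return the most common answer and the fraction of agents that gave it."""
    if not answers:
        raise ValueError("need at least one answer to vote on")
    # Normalize so trivial differences in case or spacing do not split the vote.
    normalized = [a.strip().lower() for a in answers]
    winner, votes = Counter(normalized).most_common(1)[0]
    return winner, votes / len(normalized)


# Three hypothetical agents answer the same question; one of them is wrong.
winner, agreement = majority_vote(["Paris", "paris ", "Lyon"])
print(winner, round(agreement, 2))  # paris 0.67
```

The agreement ratio can double as a crude confidence signal: a low value suggests the question should be retried or escalated to a human.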
|
||||
|
||||
#### How do agents in a swarm communicate?
|
||||
|
||||
Communication in a swarm can vary based on the design and purpose of the system but generally involves either direct or indirect interactions:
|
||||
|
||||
- **Direct Communication**: Agents exchange information directly through messaging, signals, or other communication protocols designed for the system.
|
||||
- **Indirect Communication**: Agents influence each other through the environment, a method known as stigmergy. Actions by one agent alter the environment, which in turn influences the behavior of other agents.
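
Indirect communication can be sketched with a shared environment that agents write markers into and read from, without ever messaging each other. The class and function names below (`Environment`, `scout`, `worker`) are hypothetical and exist only to illustrate stigmergy.

```python
from typing import Dict, List, Optional


class Environment:
    """Shared space that agents modify instead of messaging each other."""

    def __init__(self) -> None:
        self.markers: Dict[str, int] = {}

    def deposit(self, location: str, amount: int = 1) -> None:
        """Strengthen the marker at a location."""
        self.markers[location] = self.markers.get(location, 0) + amount

    def strongest(self) -> Optional[str]:
        """Return the location with the strongest marker, if any exist."""
        if not self.markers:
            return None
        return max(self.markers, key=self.markers.get)


def scout(env: Environment, findings: List[str]) -> None:
    """A scout leaves one marker for each promising location it finds."""
    for location in findings:
        env.deposit(location)


def worker(env: Environment) -> Optional[str]:
    """A worker never talks to scouts; it only follows the strongest marker."""
    return env.strongest()


env = Environment()
scout(env, ["docs", "billing", "docs"])
scout(env, ["docs", "search"])
print(worker(env))  # docs
```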
|
||||
|
||||
#### Are swarms only useful in computational tasks?
|
||||
|
||||
While swarms are often associated with computational tasks, their applications extend far beyond. Swarms can be utilized in:
|
||||
|
||||
- **Robotics**: Coordinating multiple robots for tasks like search and rescue, exploration, or surveillance.
|
||||
- **Environmental Monitoring**: Using sensor networks to monitor pollution, wildlife, or climate conditions.
|
||||
- **Social Sciences**: Modeling social behaviors or economic systems to understand complex societal dynamics.
|
||||
- **Healthcare**: Coordinating care strategies in hospital settings or managing pandemic responses through distributed data analysis.
|
||||
|
||||
#### How do you ensure the security of a swarm system?
|
||||
|
||||
Security in swarm systems involves:
|
||||
|
||||
- **Encryption**: Ensuring all communications between agents are encrypted to prevent unauthorized access or manipulation.
|
||||
- **Authentication**: Implementing strict authentication mechanisms to verify the identity of each agent in the swarm.
|
||||
- **Resilience to Attacks**: Designing the swarm to continue functioning effectively even if some agents are compromised or attacked, utilizing redundancy and fault tolerance strategies.
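
The authentication point can be illustrated with a message authentication code from Python's standard `hmac` module. This is a minimal sketch that assumes agents share a secret key distributed out of band; it verifies who sent a message and that it was not altered, but it does not encrypt the content, which would need a separate layer such as TLS.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for a message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, signature: str) -> bool:
    """Check a tag in constant time to avoid timing side channels."""
    return hmac.compare_digest(sign(key, message), signature)


# Hypothetical shared secret; in practice it would come from a secrets manager.
shared_key = b"example-shared-secret"
message = b'{"task": "summarize report"}'
tag = sign(shared_key, message)

print(verify(shared_key, message, tag))                # True
print(verify(shared_key, b'{"task": "delete"}', tag))  # False
```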
|
||||
|
||||
#### How do individual agents within a swarm share insights without direct learning mechanisms like reinforcement learning?
|
||||
|
||||
In the context of pre-trained Large Language Models (LLMs) that operate within a swarm, sharing insights typically involves explicit communication and data exchange protocols rather than direct learning mechanisms like reinforcement learning. Here's how it can work:
|
||||
|
||||
- **Shared Databases and Knowledge Bases**: Agents can write to and read from a shared database or knowledge base where insights, generated content, and relevant data are stored. This allows agents to benefit from the collective experience of the swarm by accessing information that other agents have contributed.
|
||||
|
||||
- **APIs for Information Exchange**: Custom APIs can facilitate the exchange of information between agents. Through these APIs, agents can request specific information or insights from others within the swarm, effectively sharing knowledge without direct learning.
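
The shared knowledge base idea can be sketched as a small in-memory store that any agent can write to and read from. `SharedKnowledgeBase` is a hypothetical class written for this example; a real deployment would more likely use a database or vector store behind the same kind of interface.

```python
import threading
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class SharedKnowledgeBase:
    """In-memory store of (agent, insight) pairs grouped by topic."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # agents may write from different threads
        self._entries: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)

    def write(self, agent: str, topic: str, insight: str) -> None:
        """Record an insight contributed by an agent under a topic."""
        with self._lock:
            self._entries[topic].append((agent, insight))

    def read(self, topic: str) -> List[Tuple[str, str]]:
        """Return a copy of everything known about a topic."""
        with self._lock:
            return list(self._entries.get(topic, []))


kb = SharedKnowledgeBase()
kb.write("researcher", "pricing", "Competitors charge per API call.")
kb.write("analyst", "pricing", "Tiered plans reduce churn.")

# A third agent benefits from both insights without talking to either author.
for agent, insight in kb.read("pricing"):
    print(f"{agent}: {insight}")
```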
|
||||
|
||||
#### How do you balance the autonomy of individual LLMs with the need for coherent collective behavior in a swarm?
|
||||
|
||||
Balancing autonomy with collective coherence in a swarm of LLMs involves:
|
||||
|
||||
- **Central Coordination Mechanism**: Implementing a lightweight central coordination mechanism that can assign tasks, distribute information, and collect outputs from individual LLMs. This ensures that while each LLM operates autonomously, their actions are aligned with the swarm's overall objectives.
|
||||
|
||||
- **Standardized Communication Protocols**: Developing standardized protocols for how LLMs communicate and share information ensures that even though each agent works autonomously, the information exchange remains coherent and aligned with the collective goals.
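
A lightweight coordinator with a standardized result format might look like the sketch below. The round-robin assignment, the `TaskResult` record, and the lambda agents are all assumptions made for illustration; they are not the Swarms API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# An agent is modeled as any callable that maps a task string to an output string.
AgentFn = Callable[[str], str]


@dataclass
class TaskResult:
    """Standardized record every agent's output is wrapped in."""
    agent: str
    task: str
    output: str


def coordinate(agents: Dict[str, AgentFn], tasks: List[str]) -> List[TaskResult]:
    """Assign tasks round-robin and collect outputs in one common format."""
    if not agents:
        raise ValueError("at least one agent is required")
    names = list(agents)
    results = []
    for i, task in enumerate(tasks):
        name = names[i % len(names)]
        results.append(TaskResult(agent=name, task=task, output=agents[name](task)))
    return results


# Hypothetical agents; real ones would wrap LLM calls.
agents = {
    "writer": lambda task: f"draft for {task}",
    "reviewer": lambda task: f"review of {task}",
}
results = coordinate(agents, ["intro", "pricing", "summary"])
for r in results:
    print(r.agent, "->", r.output)
```

Each agent decides on its own how to produce its output, while the coordinator only controls assignment and the shape of what comes back.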
|
||||
|
||||
#### How do LLM swarms adapt to changing environments or tasks without machine learning techniques?
|
||||
|
||||
Adaptation in LLM swarms, without relying on machine learning techniques for dynamic learning, can be achieved through:
|
||||
|
||||
- **Dynamic Task Allocation**: A central system or distributed algorithm can dynamically allocate tasks to different LLMs based on the changing environment or requirements. This ensures that the most suitable LLMs are addressing tasks for which they are best suited as conditions change.
|
||||
|
||||
- **Pre-trained Versatility**: Utilizing a diverse set of pre-trained LLMs with different specialties or training data allows the swarm to select the most appropriate agent for a task as the requirements evolve.
|
||||
|
||||
- **In-Context Learning**: In-context learning is another mechanism LLM swarms can use to adapt to changing environments or tasks. Relevant examples, instructions, or outputs previously produced by the swarm are placed directly in an agent's prompt, so the agent can adjust its behavior without any change to its model weights.
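
In-context adaptation can be pictured as prompt assembly: recent solved examples from the swarm are placed ahead of the new task. The `build_prompt` helper and its prompt layout are hypothetical, a sketch of the idea rather than a prescribed format.

```python
from typing import List, Tuple


def build_prompt(task: str, examples: List[Tuple[str, str]], max_examples: int = 3) -> str:
    """Prepend the most recent solved examples to a new task."""
    # Guard against max_examples <= 0, where a negative slice would keep everything.
    recent = examples[-max_examples:] if max_examples > 0 else []
    parts = ["Use the solved examples below as guidance."]
    for question, answer in recent:
        parts.append(f"Q: {question}\nA: {answer}")
    parts.append(f"Q: {task}\nA:")
    return "\n\n".join(parts)


# Hypothetical examples that other agents in the swarm solved earlier.
solved = [
    ("Classify: 'refund not received'", "billing"),
    ("Classify: 'app crashes on login'", "bug"),
]
print(build_prompt("Classify: 'charged twice this month'", solved))
```

When the task mix changes, only the example list needs to change; the underlying model stays the same.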
|
||||
|
||||
|
||||
#### Can LLM swarms operate in physical environments, or are they limited to digital spaces?
|
||||
|
||||
LLM swarms primarily operate in digital spaces, given their nature as software entities. However, they can interact with physical environments indirectly through interfaces with sensors, actuators, or other devices connected to the Internet of Things (IoT). For example, LLMs can process data from physical sensors and control devices based on their outputs, enabling applications like smart home management or autonomous vehicle navigation.
|
||||
|
||||
#### Without direct learning from each other, how do agents in a swarm improve over time?
|
||||
|
||||
Improvement over time in a swarm of pre-trained LLMs, without direct learning from each other, can be achieved through:
|
||||
|
||||
- **Human Feedback**: Incorporating feedback from human operators or users can guide adjustments to the usage patterns or selection criteria of LLMs within the swarm, optimizing performance based on observed outcomes.
|
||||
|
||||
- **Periodic Re-training and Updating**: The individual LLMs can be periodically re-trained or updated by their developers based on collective insights and feedback from their deployment within swarms. While this does not involve direct learning from each encounter, it allows the LLMs to improve over time based on aggregated experiences.
|
||||
|
||||
|
||||
|
||||
|
||||
#### Conclusion
|
||||
|
||||
Swarms represent a powerful paradigm in AI, offering innovative solutions to complex, dynamic problems through collective intelligence and decentralized control. While challenges exist, particularly regarding cost and security, strategic design and management can leverage the strengths of swarm intelligence to achieve remarkable efficiency, adaptability, and robustness in a wide range of applications.
|
@ -0,0 +1,101 @@
|
||||
# The Swarms Flywheel
|
||||
|
||||
1. **Building a Supportive Community:** Begin by establishing an engaging and inclusive open-source community around Swarms for both developers and sales freelancers. Regular online meetups, webinars, tutorials, and sales training can make them feel welcome and encourage contributions and sales efforts.
|
||||
|
||||
2. **Increased Contributions and Sales Efforts:** The more engaged the community, the more developers will contribute to Swarms and the more effort sales freelancers will put into selling Swarms.
|
||||
|
||||
3. **Improvement in Quality and Market Reach:** More developer contributions mean better quality, reliability, and feature offerings from Swarms. Simultaneously, increased sales efforts from freelancers boost Swarms' market penetration and visibility.
|
||||
|
||||
4. **Rise in User Base:** As Swarms becomes more robust and better known, the user base grows, driving more revenue.
|
||||
|
||||
5. **Greater Financial Incentives:** Increased revenue can be redirected to offer more significant financial incentives to both developers and salespeople. Developers can be incentivized based on their contribution to Swarms, and salespeople can be rewarded with higher commissions.
|
||||
|
||||
6. **Attract More Developers and Salespeople:** These financial incentives, coupled with the recognition and experience from participating in a successful project, attract more developers and salespeople to the community.
|
||||
|
||||
7. **Wider Adoption of Swarms:** An ever-improving product, a growing user base, and an increasing number of passionate salespeople accelerate the adoption of Swarms.
|
||||
|
||||
8. **Return to Step 1:** As the community, user base, and sales network continue to grow, the cycle repeats, each time speeding up the flywheel.
|
||||
|
||||
|
||||
```markdown
|
||||
+---------------------+
|
||||
| Building a |
|
||||
| Supportive | <--+
|
||||
| Community | |
|
||||
+--------+-----------+ |
|
||||
| |
|
||||
v |
|
||||
+--------+-----------+ |
|
||||
| Increased | |
|
||||
| Contributions & | |
|
||||
| Sales Efforts | |
|
||||
+--------+-----------+ |
|
||||
| |
|
||||
v |
|
||||
+--------+-----------+ |
|
||||
| Improvement in | |
|
||||
| Quality & Market | |
|
||||
| Reach | |
|
||||
+--------+-----------+ |
|
||||
| |
|
||||
v |
|
||||
+--------+-----------+ |
|
||||
| Rise in User | |
|
||||
| Base | |
|
||||
+--------+-----------+ |
|
||||
| |
|
||||
v |
|
||||
+--------+-----------+ |
|
||||
| Greater Financial | |
|
||||
| Incentives | |
|
||||
+--------+-----------+ |
|
||||
| |
|
||||
v |
|
||||
+--------+-----------+ |
|
||||
| Attract More | |
|
||||
| Developers & | |
|
||||
| Salespeople | |
|
||||
+--------+-----------+ |
|
||||
| |
|
||||
v |
|
||||
+--------+-----------+ |
|
||||
| Wider Adoption of | |
|
||||
| Swarms |----+
|
||||
+---------------------+
|
||||
```
|
||||
|
||||
|
||||
# Potential Risks and Mitigations:
|
||||
|
||||
1. **Insufficient Contributions or Quality of Work**: Open-source efforts rely on individuals being willing and able to spend time contributing. If not enough people participate, or the work they produce is of poor quality, the product development could stall.
|
||||
* **Mitigation**: Create a robust community with clear guidelines, support, and resources. Provide incentives for quality contributions, such as a reputation system, swag, or financial rewards. Conduct thorough code reviews to ensure the quality of contributions.
|
||||
|
||||
2. **Lack of Sales Results**: Commission-based salespeople will only continue to sell the product if they're successful. If they aren't making enough sales, they may lose motivation and cease their efforts.
|
||||
* **Mitigation**: Provide adequate sales training and resources. Ensure the product-market fit is strong, and adjust messaging or sales tactics as necessary. Consider implementing a minimum commission or base pay to reduce risk for salespeople.
|
||||
|
||||
3. **Poor User Experience or User Adoption**: If users don't find the product useful or easy to use, they won't adopt it, and the user base won't grow. This could also discourage salespeople and contributors.
|
||||
* **Mitigation**: Prioritize user experience in the product development process. Regularly gather and incorporate user feedback. Ensure robust user support is in place.
|
||||
|
||||
4. **Inadequate Financial Incentives**: If the financial rewards don't justify the time and effort contributors and salespeople are putting in, they will likely disengage.
|
||||
* **Mitigation**: Regularly review and adjust financial incentives as needed. Ensure that the method for calculating and distributing rewards is transparent and fair.
|
||||
|
||||
5. **Security and Compliance Risks**: As the user base grows and the software becomes more complex, the risk of security issues increases. Moreover, as contributors from various regions join, compliance with various international laws could become an issue.
|
||||
* **Mitigation**: Establish strong security practices from the start. Regularly conduct security audits. Seek legal counsel to understand and adhere to international laws and regulations.
|
||||
|
||||
## Activation Plan for the Flywheel:
|
||||
|
||||
1. **Community Building**: Begin by fostering a supportive community around Swarms. Encourage early adopters to contribute and provide feedback. Create comprehensive documentation, community guidelines, and a forum for discussion and support.
|
||||
|
||||
2. **Sales and Development Training**: Provide resources and training for salespeople and developers. Make sure they understand the product, its value, and how to effectively contribute or sell.
|
||||
|
||||
3. **Increase Contributions and Sales Efforts**: Encourage increased participation by highlighting successful contributions and sales, rewarding top contributors and salespeople, and regularly communicating about the project's progress and impact.
|
||||
|
||||
4. **Iterate and Improve**: Continually gather and implement feedback to improve Swarms and its market reach. The better the product and its alignment with the market, the more the user base will grow.
|
||||
|
||||
5. **Expand User Base**: As the product improves and sales efforts continue, the user base should grow. Ensure you have the infrastructure to support this growth and maintain a positive user experience.
|
||||
|
||||
6. **Increase Financial Incentives**: As the user base and product grow, so too should the financial incentives. Make sure rewards continue to be competitive and attractive.
|
||||
|
||||
7. **Attract More Contributors and Salespeople**: As the financial incentives and success of the product increase, this should attract more contributors and salespeople, further feeding the flywheel.
|
||||
|
||||
Throughout this process, it's important to regularly reassess and adjust your strategy as necessary. Stay flexible and responsive to changes in the market, user feedback, and the evolving needs of the community.
|
@ -0,0 +1,40 @@
|
||||
# Frontend Contributor Guide
|
||||
|
||||
## Mission
|
||||
At the heart of Swarms is the mission to democratize multi-agent technology, making it accessible to businesses of all sizes around the globe. This technology, which allows for the orchestration of multiple autonomous agents to achieve complex goals, has the potential to revolutionize industries by enhancing efficiency, scalability, and innovation. Swarms is committed to leading this charge by developing a platform that empowers businesses and individuals to harness the power of multi-agent systems without the need for specialized knowledge or resources.
|
||||
|
||||
|
||||
## Understanding Your Impact as a Frontend Engineer
|
||||
Crafting User Experiences: As a frontend engineer at Swarms, you play a crucial role in making multi-agent technology understandable and usable for businesses worldwide. Your work involves translating complex systems into intuitive interfaces, ensuring users can easily navigate, manage, and benefit from multi-agent solutions. By focusing on user-centric design and seamless integration, you help bridge the gap between advanced technology and practical business applications.
|
||||
|
||||
Skills and Attributes for Success: Successful frontend engineers at Swarms combine technical expertise with a passion for innovation and a deep understanding of user needs. Proficiency in modern frontend technologies, such as React, Next.js, and Tailwind CSS, is just the beginning. You also need a strong grasp of usability principles and accessibility standards, and the ability to work collaboratively with cross-functional teams. Creativity, problem-solving skills, and a commitment to continuous learning are essential for developing solutions that meet diverse business needs.
|
||||
|
||||
|
||||
## Joining the Team
|
||||
As you contribute to Swarms, you become part of a collaborative effort to change the world. We value each contribution and provide constructive feedback to help you grow. Outstanding contributors who share our vision and demonstrate exceptional skill and dedication are invited to join our team, where they can have an even greater impact on our mission.
|
||||
|
||||
|
||||
### Becoming a Full-Time Swarms Engineer:
|
||||
Swarms is radically devoted to open source and transparency. To join the full-time team, you must first contribute to the open source repository so we can assess your technical capability and general way of working. After a series of quality contributions, we'll offer you a full-time position!
|
||||
|
||||
Joining Swarms full-time means more than just a job. It's an opportunity to be at the forefront of technological innovation, working alongside passionate professionals dedicated to making a difference. We look for individuals who are not only skilled but also driven by the desire to make multi-agent technology accessible and beneficial to businesses worldwide.
|
||||
|
||||
|
||||
## Resources
|
||||
- **Project Management Details**
|
||||
- **Linear**: Our projects and tasks at a glance. Get a sense of our workflow and priorities.
|
||||
- [View on Linear](https://linear.app/swarms/join/e7f4c6c560ffa0e1395820682f4e110a?s=1)
|
||||
|
||||
- **Design System and UI/UX Guidelines**
|
||||
- **Figma**: Dive into our design system to grasp the aesthetics and user experience objectives of Swarms.
|
||||
- [View on Figma](https://www.figma.com/file/KL4VIXfZKwwLgAes2WbGNa/Swarms-Cloud-Platform?type=design&node-id=0%3A1&mode=design&t=MkrM0mBQa6qsTDtJ-1)
|
||||
|
||||
- **Swarms Platform Repository**
|
||||
- **GitHub**: The hub of our development activities. Familiarize yourself with our codebase and current projects.
|
||||
- [Visit GitHub Repository](https://github.com/kyegomez/swarms-platform)
|
||||
|
||||
- **[Swarms Community](https://discord.gg/pSTSxqDk)**
|
||||
|
||||
|
||||
### Design Style & User Experience
|
||||
- [How to build great products with game design, not gamification](https://blog.superhuman.com/game-design-not-gamification/)
|
@ -0,0 +1,66 @@
|
||||
def calculate_monthly_charge(
    development_time_hours: float,
    hourly_rate: float,
    amortization_months: int,
    api_calls_per_month: int,
    cost_per_api_call: float,
    monthly_maintenance: float,
    additional_monthly_costs: float,
    profit_margin_percentage: float,
) -> float:
    """
    Calculate the monthly charge for a service based on various cost factors.

    Parameters:
    - development_time_hours (float): The total number of hours spent on development and setup.
    - hourly_rate (float): The rate per hour for development and setup.
    - amortization_months (int): The number of months over which to amortize the development and setup costs.
    - api_calls_per_month (int): The number of API calls made per month.
    - cost_per_api_call (float): The cost per API call.
    - monthly_maintenance (float): The monthly maintenance cost.
    - additional_monthly_costs (float): Any additional monthly costs.
    - profit_margin_percentage (float): The desired profit margin as a percentage.

    Returns:
    - monthly_charge (float): The calculated monthly charge for the service.

    Raises:
    - ValueError: If amortization_months is not positive.
    """
    # Guard against division by zero or a negative amortization period
    if amortization_months <= 0:
        raise ValueError("amortization_months must be a positive integer")

    # Calculate Development and Setup Costs (amortized monthly)
    development_and_setup_costs_monthly = (
        development_time_hours * hourly_rate
    ) / amortization_months

    # Calculate Operational Costs per Month
    operational_costs_monthly = (
        (api_calls_per_month * cost_per_api_call)
        + monthly_maintenance
        + additional_monthly_costs
    )

    # Calculate Total Monthly Costs
    total_monthly_costs = (
        development_and_setup_costs_monthly
        + operational_costs_monthly
    )

    # Calculate Pricing with Profit Margin (applied as a markup on total costs)
    monthly_charge = total_monthly_costs * (
        1 + profit_margin_percentage / 100
    )

    return monthly_charge


# Example usage:
monthly_charge = calculate_monthly_charge(
    development_time_hours=100,
    hourly_rate=500,
    amortization_months=12,
    api_calls_per_month=500000,
    cost_per_api_call=0.002,
    monthly_maintenance=1000,
    additional_monthly_costs=300,
    profit_margin_percentage=10000,  # 1 + 10000 / 100 = 101x total monthly costs
)

print(f"Monthly Charge: ${monthly_charge:.2f}")
|
@ -0,0 +1,14 @@
|
||||
|
||||
## Purpose
|
||||
Artificial Intelligence has grown at an exponential rate over the past decade. Yet, we are far from fully harnessing its potential. Today's AI models operate in isolation, each working separately in its own corner. But life doesn't work like that. The world doesn't work like that. Success isn't built in silos; it's built in teams.
|
||||
|
||||
Imagine a world where AI models work in unison. Where they can collaborate, interact, and pool their collective intelligence to achieve more than any single model could. This is the future we envision. But today, we lack a framework for AI to collaborate effectively, to form a true swarm of intelligent agents.
|
||||
|
||||
|
||||
This is a difficult problem, one that has eluded solution. It requires sophisticated systems that can allow individual models to not just communicate but also understand each other, pool knowledge and resources, and create collective intelligence. This is the next frontier of AI.
|
||||
|
||||
But here at Swarms, we have a secret sauce. It's not just a technology or a breakthrough invention. It's a way of thinking - the philosophy of rapid iteration. With each cycle, we make massive progress. We experiment, we learn, and we grow. We have developed a pioneering framework that can enable AI models to work together as a swarm, combining their strengths to create richer, more powerful outputs.
|
||||
|
||||
We are uniquely positioned to take on this challenge with 1,500+ devoted researchers in Agora. We have assembled a team of world-class experts, experienced and driven, united by a shared vision. Our commitment to breaking barriers and pushing boundaries, together with our belief in the power of collective intelligence, makes us the best team to usher in this future and fundamentally advance our species, humanity.
|
||||
|
||||
---
|
@ -0,0 +1,82 @@
|
||||
# Research Lists
|
||||
A compilation of projects, papers, and blog posts on autonomous agents.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Introduction](#introduction)
|
||||
- [Projects](#projects)
|
||||
- [Articles](#articles)
|
||||
- [Talks](#talks)
|
||||
|
||||
|
||||
## Projects
|
||||
|
||||
### Developer tools
|
||||
- [2023/08/10] [ModelScope-Agent](https://github.com/modelscope/modelscope-agent) - An Agent Framework Connecting Models in ModelScope with the World
|
||||
- [2023/05/25] [Gorilla](https://github.com/ShishirPatil/gorilla) - An API store for LLMs
|
||||
- [2023/03/31] [BMTools](https://github.com/OpenBMB/BMTools) - Tool Learning for Big Models, Open-Source Solutions of ChatGPT-Plugins
|
||||
- [2023/03/09] [LMQL](https://github.com/eth-sri/lmql) - A query language for programming (large) language models.
|
||||
- [2022/10/25] [Langchain](https://github.com/hwchase17/langchain) - ⚡ Building applications with LLMs through composability ⚡
|
||||
|
||||
### Applications
|
||||
- [2023/07/08] [ShortGPT](https://github.com/RayVentura/ShortGPT) - 🚀🎬 ShortGPT - An experimental AI framework for automated short/video content creation. Enables creators to rapidly produce, manage, and deliver content using AI and automation.
|
||||
- [2023/07/05] [gpt-researcher](https://github.com/assafelovic/gpt-researcher) - GPT based autonomous agent that does online comprehensive research on any given topic
|
||||
- [2023/07/04] [DemoGPT](https://github.com/melih-unsal/DemoGPT) - 🧩DemoGPT enables you to create quick demos by just using prompts. [[demo]](https://demogpt.io)
|
||||
- [2023/06/30] [MetaGPT](https://github.com/geekan/MetaGPT) - 🌟 The Multi-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo
|
||||
- [2023/06/11] [gpt-engineer](https://github.com/AntonOsika/gpt-engineer) - Specify what you want it to build, the AI asks for clarification, and then builds it.
|
||||
- [2023/05/16] [SuperAGI](https://github.com/TransformerOptimus/SuperAGI) - <⚡️> SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
|
||||
- [2023/05/13] [Developer](https://github.com/smol-ai/developer) - Human-centric & Coherent Whole Program Synthesis aka your own personal junior developer
|
||||
- [2023/04/07] [AgentGPT](https://github.com/reworkd/AgentGPT) - 🤖 Assemble, configure, and deploy autonomous AI Agents in your browser. [[demo]](https://agentgpt.reworkd.ai)
|
||||
- [2023/04/03] [BabyAGI](https://github.com/yoheinakajima/babyagi) - an example of an AI-powered task management system
|
||||
- [2023/03/30] [AutoGPT](https://github.com/Significant-Gravitas/Auto-GPT) - An experimental open-source attempt to make GPT-4 fully autonomous.
|
||||
|
||||
### Benchmarks
|
||||
- [2023/08/07] [AgentBench](https://github.com/THUDM/AgentBench) - A Comprehensive Benchmark to Evaluate LLMs as Agents. [paper](https://arxiv.org/abs/2308.03688)
|
||||
- [2023/06/18] [Auto-GPT-Benchmarks](https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks) - A repo built for the purpose of benchmarking the performance of agents, regardless of how they are set up and how they work.
|
||||
- [2023/05/28] [ToolBench](https://github.com/OpenBMB/ToolBench) - An open platform for training, serving, and evaluating large language model for tool learning.
|
||||
|
||||
## Articles
|
||||
### Research Papers
|
||||
- [2023/08/11] [BOLAA: Benchmarking and Orchestrating LLM-Augmented Autonomous Agents](https://arxiv.org/pdf/2308.05960v1.pdf), Zhiwei Liu, et al.
|
||||
- [2023/07/31] [ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs](https://arxiv.org/abs/2307.16789), Yujia Qin, et al.
|
||||
- [2023/07/16] [Communicative Agents for Software Development](https://arxiv.org/abs/2307.07924), Chen Qian, et al.
|
||||
- [2023/06/09] [Mind2Web: Towards a Generalist Agent for the Web](https://arxiv.org/pdf/2306.06070.pdf), Xiang Deng, et al. [[code]](https://github.com/OSU-NLP-Group/Mind2Web) [[demo]](https://osu-nlp-group.github.io/Mind2Web/)
|
||||
- [2023/06/05] [Orca: Progressive Learning from Complex Explanation Traces of GPT-4](https://arxiv.org/pdf/2306.02707.pdf), Subhabrata Mukherjee, et al.
|
||||
- [2023/05/25] [Voyager: An Open-Ended Embodied Agent with Large Language Models](https://arxiv.org/pdf/2305.16291.pdf), Guanzhi Wang, et al. [[code]](https://github.com/MineDojo/Voyager) [[website]](https://voyager.minedojo.org/)
|
||||
- [2023/05/23] [ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models](https://arxiv.org/pdf/2305.18323.pdf), Binfeng Xu, et al. [[code]](https://github.com/billxbf/ReWOO)
|
||||
- [2023/05/17] [Tree of Thoughts: Deliberate Problem Solving with Large Language Models](https://arxiv.org/abs/2305.10601), Shunyu Yao, et al. [[code]](https://github.com/kyegomez/tree-of-thoughts) [[code-orig]](https://github.com/ysymyth/tree-of-thought-llm)
|
||||
- [2023/05/12] [MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers](https://arxiv.org/abs/2305.07185), Lili Yu, et al.
|
||||
- [2023/05/09] [FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance](https://arxiv.org/abs/2305.05176), Lingjiao Chen, et al.
|
||||
- [2023/05/06] [Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models](https://arxiv.org/abs/2305.04091), Lei Wang, et al.
|
||||
- [2023/05/01] [Learning to Reason and Memorize with Self-Notes](https://arxiv.org/abs/2305.00833), Jack Lanchantin, et al.
|
||||
- [2023/04/24] [WizardLM: Empowering Large Language Models to Follow Complex Instructions](https://arxiv.org/abs/2304.12244), Can Xu, et al.
|
||||
- [2023/04/22] [LLM+P: Empowering Large Language Models with Optimal Planning Proficiency](https://arxiv.org/abs/2304.11477), Bo Liu, et al.
|
||||
- [2023/04/07] [Generative Agents: Interactive Simulacra of Human Behavior](https://arxiv.org/abs/2304.03442), Joon Sung Park, et al. [[code]](https://github.com/mkturkcan/generative-agents)
|
||||
- [2023/03/30] [Self-Refine: Iterative Refinement with Self-Feedback](https://arxiv.org/abs/2303.17651), Aman Madaan, et al. [[code]](https://github.com/madaan/self-refine)
|
||||
- [2023/03/30] [HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace](https://arxiv.org/pdf/2303.17580.pdf), Yongliang Shen, et al. [[code]](https://github.com/microsoft/JARVIS) [[demo]](https://huggingface.co/spaces/microsoft/HuggingGPT)
|
||||
- [2023/03/20] [Reflexion: Language Agents with Verbal Reinforcement Learning](https://arxiv.org/pdf/2303.11366.pdf), Noah Shinn, et al. [[code]](https://github.com/noahshinn024/reflexion)
|
||||
- [2023/03/04] [Towards A Unified Agent with Foundation Models](https://openreview.net/pdf?id=JK_B1tB6p-), Norman Di Palo, et al.
|
||||
- [2023/02/23] [Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection](https://arxiv.org/abs/2302.12173), Sahar Abdelnabi, et al.
|
||||
- [2023/02/09] [Toolformer: Language Models Can Teach Themselves to Use Tools](https://arxiv.org/pdf/2302.04761.pdf), Timo Schick, et al. [[code]](https://github.com/lucidrains/toolformer-pytorch)
|
||||
- [2022/12/12] [LMQL: Prompting Is Programming: A Query Language for Large Language Models](https://arxiv.org/abs/2212.06094), Luca Beurer-Kellner, et al.
|
||||
- [2022/10/06] [ReAct: Synergizing Reasoning and Acting in Language Models](https://arxiv.org/pdf/2210.03629.pdf), Shunyu Yao, et al. [[code]](https://github.com/ysymyth/ReAct)
|
||||
- [2022/07/20] [Inner Monologue: Embodied Reasoning through Planning with Language Models](https://arxiv.org/pdf/2207.05608.pdf), Wenlong Huang, et al. [[demo]](https://innermonologue.github.io/)
|
||||
- [2022/04/04] [Do As I Can, Not As I Say: Grounding Language in Robotic Affordances](), Michael Ahn, et al. [[demo]](https://say-can.github.io/)
|
||||
- [2021/12/17] [WebGPT: Browser-assisted question-answering with human feedback](https://arxiv.org/pdf/2112.09332.pdf), Reiichiro Nakano, et al.
|
||||
- [2021/06/17] [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685), Edward J. Hu, et al.
|
||||
|
||||
|
||||
### Blog Articles
|
||||
|
||||
- [2023/08/14] [A Roadmap of AI Agents (Chinese)](https://zhuanlan.zhihu.com/p/649916692) By Haojie Pan
|
||||
- [2023/06/23] [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) By Lilian Weng
|
||||
- [2023/06/11] [A CRITICAL LOOK AT AI-GENERATED SOFTWARE](https://spectrum.ieee.org/ai-software) By Jaideep Vaidya and Hafiz Asif
|
||||
- [2023/04/29] [AUTO-GPT: UNLEASHING THE POWER OF AUTONOMOUS AI AGENTS](https://www.leewayhertz.com/autogpt/) By Akash Takyar
|
||||
- [2023/04/20] [Conscious Machines: Experiments, Theory, and Implementations (Chinese)](https://pattern.swarma.org/article/230) By Jiang Zhang
|
||||
- [2023/04/18] [Autonomous Agents & Agent Simulations](https://blog.langchain.dev/agents-round/) By Langchain
|
||||
- [2023/04/16] [4 Autonomous AI Agents you need to know](https://towardsdatascience.com/4-autonomous-ai-agents-you-need-to-know-d612a643fa92) By Sophia Yang
|
||||
- [2023/03/31] [ChatGPT that learns to use tools (Chinese)](https://zhuanlan.zhihu.com/p/618448188) By Haojie Pan
|
||||
|
||||
### Talks
|
||||
- [2023/06/05] [Two Paths to Intelligence](https://www.youtube.com/watch?v=rGgGOccMEiY&t=1497s) by Geoffrey Hinton
|
||||
- [2023/05/24] [State of GPT](https://www.youtube.com/watch?v=bZQun8Y4L2A) by Andrej Karpathy | OpenAI
|
@ -0,0 +1,13 @@
|
||||
## The Plan
|
||||
|
||||
### Phase 1: Building the Foundation
|
||||
In the first phase, our focus is on building the basic infrastructure of Swarms. This includes developing key components like the Swarms class, integrating essential tools, and establishing task completion and evaluation logic. We'll also start developing our testing and evaluation framework during this phase. If you're interested in foundational work and have a knack for building robust, scalable systems, this phase is for you.
|
||||
|
||||
### Phase 2: Optimizing the System
|
||||
In the second phase, we'll focus on optimizing Swarms by integrating more advanced features, improving the system's efficiency, and refining our testing and evaluation framework. This phase involves more complex tasks, so if you enjoy tackling challenging problems and contributing to the development of innovative features, this is the phase for you.
|
||||
|
||||
### Phase 3: Towards Super-Intelligence
|
||||
The third phase of our bounty program is the most exciting - this is where we aim to achieve super-intelligence. In this phase, we'll be working on improving the swarm's capabilities, expanding its skills, and fine-tuning the system based on real-world testing and feedback. If you're excited about the future of AI and want to contribute to a project that could potentially transform the digital world, this is the phase for you.
|
||||
|
||||
Remember, our roadmap is a guide, and we encourage you to bring your own ideas and creativity to the table. We believe that every contribution, no matter how small, can make a difference. So join us on this exciting journey and help us create the future of Swarms.
|
||||
|
@ -0,0 +1,21 @@
|
||||
# Go To Market Strategy (GTM)
|
||||
|
||||
Our vision is to become the world leader in real-world, production-grade autonomous agent deployment through open-source product development, deep verticalization, and unmatched value delivery to the end user.
|
||||
|
||||
We will first focus on bringing the open-source framework to product-market fit (PMF), where it will serve as the backend for products and services built on top of it. These include the Swarm Cloud, which will enable enterprises to deploy autonomous agents with long-term memory and tools in the cloud, and a no-code platform where users build their own swarm by dragging and dropping blocks.
|
||||
|
||||
Our target user segment for the framework is AI engineers looking to deploy agents into high-risk environments where reliability is crucial.
|
||||
|
||||
Once PMF has been achieved and the framework has been extensively benchmarked, we aim to establish high-value contracts with customers in security, logistics, manufacturing, health, and various other untapped industries.
|
||||
|
||||
Our growth strategy for the open-source framework can be summarized by:
|
||||
|
||||
- Educating developers on the value of autonomous agent usage.
|
||||
- Publishing tutorial walkthroughs of various applications, like deploying multi-modal agents through cameras or building custom swarms for a specific business operation.
|
||||
- Demonstrating unmatched reliability by delighting users.
|
||||
- Staying up to date with trends and integrating the latest models, frameworks, and methodologies.
|
||||
- Building a loyal and devoted community for long term user retention. [Join here](https://codex.apac.ai)
|
||||
|
||||
As we continuously deliver value with the open framework, we will strategically position ourselves to acquire leads for high-value contracts by demonstrating the power, reliability, and performance of our framework openly.
|
||||
|
||||
Get full access to the memo here: [TSC Memo](https://docs.google.com/document/d/1hS_nv_lFjCqLfnJBoF6ULY9roTbSgSuCkvXvSUSc7Lo/edit?usp=sharing)
|
@ -0,0 +1,187 @@
|
||||
|
||||
# Swarm Alpha: Data Cruncher
|
||||
**Overview**: Processes large datasets.
|
||||
**Strengths**: Efficient data handling.
|
||||
**Weaknesses**: Requires structured data.
|
||||
|
||||
**Pseudo Code**:
|
||||
```sql
|
||||
FOR each data_entry IN dataset:
|
||||
result = PROCESS(data_entry)
|
||||
STORE(result)
|
||||
END FOR
|
||||
RETURN aggregated_results
|
||||
```
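
The Data Cruncher pseudocode above can be sketched in Python as follows. This is a minimal illustration rather than Swarms code: `process_entry` and `crunch` are hypothetical names, and doubling each value is only a stand-in for whatever processing a real swarm would perform.

```python
from typing import Callable, Iterable, List


def process_entry(entry: float) -> float:
    """Hypothetical stand-in for PROCESS: doubles the value."""
    return entry * 2


def crunch(
    dataset: Iterable[float],
    process: Callable[[float], float] = process_entry,
) -> List[float]:
    """Apply `process` to every entry and return the collected results."""
    results: List[float] = []
    for entry in dataset:  # FOR each data_entry IN dataset
        results.append(process(entry))  # PROCESS, then STORE
    return results  # RETURN aggregated_results
```

Passing the processing step in as a function keeps the loop independent of what "processing" means for a given dataset.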
|
||||
|
||||
# Swarm Beta: Artistic Ally
|
||||
**Overview**: Generates art pieces.
|
||||
**Strengths**: Creativity.
|
||||
**Weaknesses**: Somewhat unpredictable.
|
||||
|
||||
**Pseudo Code**:
|
||||
```scss
|
||||
INITIATE canvas_parameters
|
||||
SELECT art_style
|
||||
DRAW(canvas_parameters, art_style)
|
||||
RETURN finished_artwork
|
||||
```
|
||||
|
||||
# Swarm Gamma: Sound Sculptor
|
||||
**Overview**: Crafts audio sequences.
|
||||
**Strengths**: Diverse audio outputs.
|
||||
**Weaknesses**: Complexity in refining outputs.
|
||||
|
||||
**Pseudo Code**:
|
||||
```sql
|
||||
DEFINE sound_parameters
|
||||
SELECT audio_style
|
||||
GENERATE_AUDIO(sound_parameters, audio_style)
|
||||
RETURN audio_sequence
|
||||
```
|
||||
|
||||
# Swarm Delta: Web Weaver
|
||||
**Overview**: Constructs web designs.
|
||||
**Strengths**: Modern design sensibility.
|
||||
**Weaknesses**: Limited to web interfaces.
|
||||
|
||||
**Pseudo Code**:
|
||||
```scss
|
||||
SELECT template
|
||||
APPLY user_preferences(template)
|
||||
DESIGN_web(template, user_preferences)
|
||||
RETURN web_design
|
||||
```
|
||||
|
||||
# Swarm Epsilon: Code Compiler
|
||||
**Overview**: Writes and compiles code snippets.
|
||||
**Strengths**: Quick code generation.
|
||||
**Weaknesses**: Limited to certain programming languages.
|
||||
|
||||
**Pseudo Code**:
|
||||
```scss
|
||||
DEFINE coding_task
|
||||
WRITE_CODE(coding_task)
|
||||
COMPILE(code)
|
||||
RETURN executable
|
||||
```
|
||||
|
||||
# Swarm Zeta: Security Shield
|
||||
**Overview**: Detects system vulnerabilities.
|
||||
**Strengths**: High threat detection rate.
|
||||
**Weaknesses**: Potential false positives.
|
||||
|
||||
**Pseudo Code**:
|
||||
```sql
|
||||
MONITOR system_activity
|
||||
IF suspicious_activity_detected:
|
||||
ANALYZE threat_level
|
||||
INITIATE mitigation_protocol
|
||||
END IF
|
||||
RETURN system_status
|
||||
```
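
As a rough Python sketch of the Security Shield pseudocode above, the example below flags sources with repeated failed logins. Everything here is assumed for illustration: the event format, the `check_system` name, and the `FAILED_LOGIN_THRESHOLD` cutoff are hypothetical, and a real detector would look at far more than login failures.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical cutoff: failed logins from one source before it is flagged.
# Raising it trades detection rate for fewer false positives.
FAILED_LOGIN_THRESHOLD = 3


def check_system(events: List[Dict[str, str]]) -> Dict[str, object]:
    """Flag sources whose failed-login count reaches the threshold."""
    # MONITOR system_activity
    failures = Counter(
        event["source"] for event in events if event["type"] == "login_failed"
    )
    # IF suspicious_activity_detected
    flagged = sorted(
        source
        for source, count in failures.items()
        if count >= FAILED_LOGIN_THRESHOLD
    )
    # ANALYZE threat_level
    status = "threat_detected" if flagged else "ok"
    # A real system would INITIATE mitigation_protocol here.
    return {"status": status, "flagged_sources": flagged}  # RETURN system_status
```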
|
||||
|
||||
# Swarm Eta: Researcher Relay
|
||||
**Overview**: Gathers and synthesizes research data.
|
||||
**Strengths**: Access to vast databases.
|
||||
**Weaknesses**: Depth of research can vary.
|
||||
|
||||
**Pseudo Code**:
|
||||
```sql
|
||||
DEFINE research_topic
|
||||
SEARCH research_sources(research_topic)
|
||||
SYNTHESIZE findings
|
||||
RETURN research_summary
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Swarm Theta: Sentiment Scanner
|
||||
**Overview**: Analyzes text for sentiment and emotional tone.
|
||||
**Strengths**: Accurate sentiment detection.
|
||||
**Weaknesses**: Contextual nuances might be missed.
|
||||
|
||||
**Pseudo Code**:
|
||||
```arduino
|
||||
INPUT text_data
|
||||
ANALYZE text_data FOR emotional_tone
|
||||
DETERMINE sentiment_value
|
||||
RETURN sentiment_value
|
||||
```
|
||||
|
||||
# Swarm Iota: Image Interpreter
|
||||
**Overview**: Processes and categorizes images.
|
||||
**Strengths**: High image recognition accuracy.
|
||||
**Weaknesses**: Can struggle with abstract visuals.
|
||||
|
||||
**Pseudo Code**:
|
||||
```objective-c
|
||||
LOAD image_data
|
||||
PROCESS image_data FOR features
|
||||
CATEGORIZE image_based_on_features
|
||||
RETURN image_category
|
||||
```
|
||||
|
||||
# Swarm Kappa: Language Learner
|
||||
**Overview**: Translates and interprets multiple languages.
|
||||
**Strengths**: Supports multiple languages.
|
||||
**Weaknesses**: Nuances in dialects might pose challenges.
|
||||
|
||||
**Pseudo Code**:
|
||||
```vbnet
|
||||
RECEIVE input_text, target_language
|
||||
TRANSLATE input_text TO target_language
|
||||
RETURN translated_text
|
||||
```
|
||||
|
||||
# Swarm Lambda: Trend Tracker
|
||||
**Overview**: Monitors and predicts trends based on data.
|
||||
**Strengths**: Proactive trend identification.
|
||||
**Weaknesses**: Requires continuous data stream.
|
||||
|
||||
**Pseudo Code**:
|
||||
```sql
|
||||
COLLECT data_over_time
|
||||
ANALYZE data_trends
|
||||
PREDICT upcoming_trends
|
||||
RETURN trend_forecast
|
||||
```
|
||||
|
||||
# Swarm Mu: Financial Forecaster
|
||||
**Overview**: Analyzes financial data to predict market movements.
|
||||
**Strengths**: In-depth financial analytics.
|
||||
**Weaknesses**: Market volatility can affect predictions.
|
||||
|
||||
**Pseudo Code**:
|
||||
```sql
|
||||
GATHER financial_data
|
||||
COMPUTE statistical_analysis
|
||||
FORECAST market_movements
|
||||
RETURN financial_projections
|
||||
```
|
||||
|
||||
# Swarm Nu: Network Navigator
|
||||
**Overview**: Optimizes and manages network traffic.
|
||||
**Strengths**: Efficient traffic management.
|
||||
**Weaknesses**: Depends on network infrastructure.
|
||||
|
||||
**Pseudo Code**:
|
||||
```sql
|
||||
MONITOR network_traffic
|
||||
IDENTIFY congestion_points
|
||||
OPTIMIZE traffic_flow
|
||||
RETURN network_status
|
||||
```
|
||||
|
||||
# Swarm Xi: Content Curator
|
||||
**Overview**: Gathers and presents content based on user preferences.
|
||||
**Strengths**: Personalized content delivery.
|
||||
**Weaknesses**: Limited by available content sources.
|
||||
|
||||
**Pseudo Code**:
|
||||
```sql
|
||||
DEFINE user_preferences
|
||||
SEARCH content_sources
|
||||
FILTER content_matching_preferences
|
||||
DISPLAY curated_content
|
||||
```
|
||||
|
@ -0,0 +1,50 @@
|
||||
# Swarms Multi-Agent Permissions System (SMAPS)
|
||||
|
||||
## Description
|
||||
SMAPS is a robust permissions management system designed to integrate seamlessly with Swarms' multi-agent AI framework. Drawing inspiration from AWS Identity and Access Management (IAM), SMAPS ensures secure, granular control over agent actions while allowing for collaborative human-in-the-loop interventions.
|
||||
|
||||
## Technical Specification
|
||||
|
||||
### 1. Components
|
||||
|
||||
- **User Management**: Handle user registrations, roles, and profiles.
|
||||
- **Agent Management**: Register, monitor, and manage AI agents.
|
||||
- **Permissions Engine**: Define and enforce permissions based on roles.
|
||||
- **Multiplayer Interface**: Allows multiple human users to intervene, guide, or collaborate on tasks being executed by AI agents.
|
||||
|
||||
### 2. Features
|
||||
|
||||
- **Role-Based Access Control (RBAC)**:
|
||||
- Users can be assigned predefined roles (e.g., Admin, Agent Supervisor, Collaborator).
|
||||
- Each role has specific permissions associated with it, defining what actions can be performed on AI agents or tasks.
|
||||
|
||||
- **Dynamic Permissions**:
|
||||
- Create custom roles with specific permissions.
|
||||
- Permissions granularity: From broad (e.g., view all tasks) to specific (e.g., modify parameters of a particular agent).
|
||||
|
||||
- **Multiplayer Collaboration**:
|
||||
- Multiple users can join a task in real-time.
|
||||
- Collaborators can provide real-time feedback or guidance to AI agents.
|
||||
- A voting system for decision-making when human intervention is required.
|
||||
|
||||
- **Agent Supervision**:
|
||||
- Monitor agent actions in real-time.
|
||||
- Intervene, if necessary, to guide agent actions based on permissions.
|
||||
|
||||
- **Audit Trail**:
|
||||
- All actions, whether performed by humans or AI agents, are logged.
|
||||
- Review historical actions, decisions, and interventions for accountability and improvement.
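
To make the role-based access control and audit trail features above more concrete, here is a minimal sketch of how a permissions check might log every decision. It is not the SMAPS implementation: the `PermissionsEngine` class, the role names, and the permission strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# Hypothetical role-to-permission table, ordered from broad to narrow access.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "admin": {"view_tasks", "modify_agent", "manage_users"},
    "agent_supervisor": {"view_tasks", "modify_agent"},
    "collaborator": {"view_tasks"},
}


@dataclass
class PermissionsEngine:
    """Checks actions against a user's role and records every decision."""

    user_roles: Dict[str, str]
    audit_log: List[Tuple[str, str, bool]] = field(default_factory=list)

    def is_allowed(self, user: str, action: str) -> bool:
        role = self.user_roles.get(user, "")
        # Unknown users or roles get an empty permission set (deny by default).
        allowed = action in ROLE_PERMISSIONS.get(role, set())
        # Audit trail: log the attempt whether or not it was permitted.
        self.audit_log.append((user, action, allowed))
        return allowed
```

Denying by default and logging denied attempts as well as granted ones keeps the audit trail useful for reviewing interventions after the fact.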
|
||||
|
||||
### 3. Security
|
||||
|
||||
- **Authentication**: Secure login mechanisms with multi-factor authentication options.
|
||||
- **Authorization**: Ensure users and agents can only perform actions they are permitted to.
|
||||
- **Data Encryption**: All data, whether at rest or in transit, is encrypted using industry-standard protocols.
|
||||
|
||||
### 4. Integration
|
||||
|
||||
- **APIs**: Expose APIs for integrating SMAPS with other systems or for extending its capabilities.
|
||||
- **SDK**: Provide software development kits for popular programming languages to facilitate integration and extension.
|
||||
|
||||
## Documentation Description
|
||||
Swarms Multi-Agent Permissions System (SMAPS) offers a sophisticated permissions management mechanism tailored for multi-agent AI frameworks. It combines the robustness of Amazon IAM-like permissions with a unique "multiplayer" feature, allowing multiple humans to collaboratively guide AI agents in real-time. This ensures not only that tasks are executed efficiently but also that they uphold the highest standards of accuracy and ethics. With SMAPS, businesses can harness the power of swarms with confidence, knowing that they have full control and transparency over their AI operations.
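
One way the collaborative decision-making described in this document could work is a strict-majority vote among the humans supervising a task. The sketch below is an assumption, not a documented SMAPS behavior: `tally_votes` is a hypothetical helper, and the tie-handling rule (no decision) is a design choice made for the example.

```python
from collections import Counter
from typing import Dict, Optional


def tally_votes(votes: Dict[str, str]) -> Optional[str]:
    """Return the option backed by a strict majority of voters, else None.

    `votes` maps a user name to the option that user chose.
    """
    if not votes:
        return None
    counts = Counter(votes.values())
    option, count = counts.most_common(1)[0]
    # Require more than half of all votes so ties never produce a decision.
    return option if count > len(votes) / 2 else None
```

Returning `None` on a tie leaves the agent paused until the collaborators resolve the disagreement, which is the conservative choice when human intervention was requested in the first place.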
|
@ -0,0 +1,73 @@
|
||||
# AgentArchive Documentation
|
||||
## Swarms Multi-Agent Framework
|
||||
|
||||
**AgentArchive is an advanced feature crafted to archive, bookmark, and harness the transcripts of agent runs. It promotes the storing and leveraging of successful agent interactions, offering a powerful means for users to derive "recipes" for future agents. Furthermore, with its public archive feature, users can contribute to and benefit from the collective wisdom of the community.**
|
||||
|
||||
---
|
||||
|
||||
## Overview:
|
||||
|
||||
AgentArchive empowers users to:
|
||||
1. Preserve complete transcripts of agent instances.
|
||||
2. Bookmark and annotate significant runs.
|
||||
3. Categorize runs using various tags.
|
||||
4. Transform successful runs into actionable "recipes".
|
||||
5. Publish and access a shared knowledge base via a public archive.
|
||||
|
||||
---
|
||||
|
||||
## Features:
|
||||
|
||||
### 1. Archiving:
|
||||
|
||||
- **Save Transcripts**: Retain the full narrative of an agent's interaction and choices.
|
||||
- **Searchable Database**: Dive into archives using specific keywords, timestamps, or tags.
|
||||
|
||||
### 2. Bookmarking:
|
||||
|
||||
- **Highlight Essential Runs**: Designate specific agent runs for future reference.
|
||||
- **Annotations**: Attach notes or remarks to bookmarked runs for clearer understanding.
|
||||
|
||||
### 3. Tagging:
|
||||
|
||||
Organize and classify agent runs via:
|
||||
- **Prompt**: The originating instruction that triggered the agent run.
|
||||
- **Tasks**: Distinct tasks or operations executed by the agent.
|
||||
- **Model**: The specific AI model or iteration used during the interaction.
|
||||
- **Temperature (Temp)**: The sampling temperature used for the run, which controls how random the agent's outputs are.
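
To make the tagging scheme concrete, here is a minimal sketch of an archived run carrying the four tags above, plus a simple filter. The `RunRecord` schema and the `search` function are assumptions for illustration only; they are not part of the Swarms codebase.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunRecord:
    """One archived agent run (hypothetical schema)."""
    prompt: str
    tasks: list
    model: str
    temperature: float
    transcript: str
    bookmarked: bool = False
    notes: list = field(default_factory=list)


def search(archive: list, *, model: Optional[str] = None,
           keyword: Optional[str] = None) -> list:
    """Return the runs that match every filter that was supplied."""
    results = []
    for run in archive:
        if model is not None and run.model != model:
            continue
        if keyword is not None and keyword.lower() not in run.transcript.lower():
            continue
        results.append(run)
    return results


archive = [
    RunRecord("Summarize Q3 sales", ["summarize"], "model-a", 0.2, "Sales rose in Q3."),
    RunRecord("Draft a launch email", ["write"], "model-b", 0.8, "Subject: We are live"),
]
matches = search(archive, model="model-a", keyword="sales")
print([run.prompt for run in matches])  # ['Summarize Q3 sales']
```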
|
||||
|
||||
### 4. Recipe Generation:
|
||||
|
||||
- **Standardization**: Convert successful run transcripts into replicable "recipes".
|
||||
- **Guidance**: Offer subsequent agents a structured approach, rooted in prior successes.
|
||||
- **Evolution**: Periodically refine recipes based on newer, enhanced runs.
|
||||
|
||||
### 5. Public Archive & Sharing:
|
||||
|
||||
- **Publish Successful Runs**: Users can choose to share their successful agent runs.
|
||||
- **Collaborative Knowledge Base**: Access a shared repository of successful agent interactions from the community.
|
||||
- **Ratings & Reviews**: Users can rate and review shared runs, highlighting particularly effective "recipes."
|
||||
- **Privacy & Redaction**: Ensure that any sensitive information is automatically redacted before publishing.
|
||||
|
||||
---
|
||||
|
||||
## Benefits:
|
||||
|
||||
1. **Efficiency**: Revisit past agent activities to inform and guide future decisions.
|
||||
2. **Consistency**: Guarantee a uniform approach to recurring challenges, leading to predictable and trustworthy outcomes.
|
||||
3. **Collaborative Learning**: Tap into a reservoir of shared experiences, fostering community-driven learning and growth.
|
||||
4. **Transparency**: By sharing successful runs, users can build trust and contribute to the broader community's success.
|
||||
|
||||
---
|
||||
|
||||
## Usage:
|
||||
|
||||
1. **Access AgentArchive**: Navigate to the dedicated section within the Swarms Multi-Agent Framework dashboard.
|
||||
2. **Search, Filter & Organize**: Utilize the search bar and tagging system for precise retrieval.
|
||||
3. **Bookmark, Annotate & Share**: Pin important runs, add notes, and consider sharing with the broader community.
|
||||
4. **Engage with Public Archive**: Explore, rate, and apply shared knowledge to enhance agent performance.
|
||||
|
||||
---
|
||||
|
||||
With AgentArchive, users not only benefit from their past interactions but can also leverage the collective expertise of the Swarms community, ensuring continuous improvement and shared success.
|
||||
|
@ -0,0 +1,67 @@
|
||||
# Swarms Multi-Agent Framework Documentation
|
||||
|
||||
## Table of Contents
|
||||
- Agent Failure Protocol
|
||||
- Swarm Failure Protocol
|
||||
|
||||
---
|
||||
|
||||
## Agent Failure Protocol
|
||||
|
||||
### 1. Overview
|
||||
Agent failures may arise from bugs, unexpected inputs, or external system changes. This protocol aims to diagnose, address, and prevent such failures.
|
||||
|
||||
### 2. Root Cause Analysis
|
||||
- **Data Collection**: Record the task, inputs, and environmental variables present during the failure.
|
||||
- **Diagnostic Tests**: Run the agent in a controlled environment replicating the failure scenario.
|
||||
- **Error Logging**: Analyze error logs to identify patterns or anomalies.
|
||||
|
||||
### 3. Solution Brainstorming
|
||||
- **Code Review**: Examine the code sections linked to the failure for bugs or inefficiencies.
|
||||
- **External Dependencies**: Check if external systems or data sources have changed.
|
||||
- **Algorithmic Analysis**: Evaluate if the agent's algorithms were overwhelmed or faced an unhandled scenario.
|
||||
|
||||
### 4. Risk Analysis & Solution Ranking
|
||||
- Assess the potential risks associated with each solution.
|
||||
- Rank solutions based on:
|
||||
- Implementation complexity
|
||||
- Potential negative side effects
|
||||
- Resource requirements
|
||||
- Assign a success probability score (0.0 to 1.0) based on the above factors.
|
||||
|
||||
### 5. Solution Implementation
|
||||
- Implement the top 3 solutions sequentially, starting with the highest success probability.
|
||||
- If all three solutions fail, trigger the "Human-in-the-Loop" protocol.
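
Steps 4 and 5 can be sketched as a short ranking-and-retry loop. This is one possible reading of the protocol, not the framework's implementation: the `Solution` class, `resolve_failure`, and the `escalate` callback are hypothetical names, and each fix is modeled as a function that returns `True` when it resolves the failure.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Solution:
    name: str
    success_probability: float   # 0.0 to 1.0, assigned during risk analysis
    apply: Callable[[], bool]    # returns True if the fix resolved the failure


def resolve_failure(solutions: list, escalate: Callable[[], None],
                    max_attempts: int = 3) -> Optional[str]:
    """Try the top-ranked solutions in order; escalate if none succeeds."""
    ranked = sorted(solutions, key=lambda s: s.success_probability, reverse=True)
    for solution in ranked[:max_attempts]:
        if solution.apply():
            return solution.name
    escalate()  # stands in for triggering the "Human-in-the-Loop" protocol
    return None


attempted = []

def make_fix(name: str, works: bool) -> Callable[[], bool]:
    def fix() -> bool:
        attempted.append(name)
        return works
    return fix

candidates = [
    Solution("restart agent", 0.4, make_fix("restart agent", False)),
    Solution("patch parser", 0.9, make_fix("patch parser", False)),
    Solution("roll back dependency", 0.7, make_fix("roll back dependency", True)),
]
winner = resolve_failure(candidates, escalate=lambda: print("Escalating to a human"))
print(winner)     # roll back dependency
print(attempted)  # ['patch parser', 'roll back dependency']
```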
|
||||
|
||||
---
|
||||
|
||||
## Swarm Failure Protocol
|
||||
|
||||
### 1. Overview
|
||||
Swarm failures are more complex, often resulting from inter-agent conflicts, systemic bugs, or large-scale environmental changes. This protocol aims to diagnose such failures, address them, and restore reliable swarm operation.
|
||||
|
||||
### 2. Root Cause Analysis
|
||||
- **Inter-Agent Analysis**: Examine if agents were in conflict or if there was a breakdown in collaboration.
|
||||
- **System Health Checks**: Ensure all system components supporting the swarm are operational.
|
||||
- **Environment Analysis**: Investigate if external factors or systems impacted the swarm's operation.
|
||||
|
||||
### 3. Solution Brainstorming
|
||||
- **Collaboration Protocols**: Review and refine how agents collaborate.
|
||||
- **Resource Allocation**: Check if the swarm had adequate computational and memory resources.
|
||||
- **Feedback Loops**: Ensure agents are effectively learning from each other.
|
||||
|
||||
### 4. Risk Analysis & Solution Ranking
|
||||
- Assess the potential systemic risks posed by each solution.
|
||||
- Rank solutions considering:
|
||||
- Scalability implications
|
||||
- Impact on individual agents
|
||||
- Overall swarm performance potential
|
||||
- Assign a success probability score (0.0 to 1.0) based on the above considerations.
|
||||
|
||||
### 5. Solution Implementation
|
||||
- Implement the top 3 solutions sequentially, prioritizing the one with the highest success probability.
|
||||
- If all three solutions are unsuccessful, invoke the "Human-in-the-Loop" protocol for expert intervention.
|
||||
|
||||
---
|
||||
|
||||
By following these protocols, the Swarms Multi-Agent Framework can systematically address and prevent failures, ensuring a high degree of reliability and efficiency.
|
@ -0,0 +1,49 @@
|
||||
# Human-in-the-Loop Task Handling Protocol
|
||||
|
||||
## Overview
|
||||
|
||||
The Swarms Multi-Agent Framework recognizes the invaluable contributions humans can make, especially in complex scenarios where nuanced judgment is required. The "Human-in-the-Loop Task Handling Protocol" ensures that when agents encounter challenges they cannot handle autonomously, the most capable human collaborator is engaged to provide guidance, based on their skills and expertise.
|
||||
|
||||
## Protocol Steps
|
||||
|
||||
### 1. Task Initiation & Analysis
|
||||
|
||||
- When a task is initiated, agents first analyze the task's requirements.
|
||||
- The system records each task's complexity, requirements, and potential challenges.
|
||||
|
||||
### 2. Automated Resolution Attempt
|
||||
|
||||
- Agents first attempt to resolve the task autonomously using their algorithms and data.
|
||||
- If the task can be completed without issues, it progresses normally.
|
||||
|
||||
### 3. Challenge Detection
|
||||
|
||||
- If agents encounter challenges or uncertainties they cannot resolve, the "Human-in-the-Loop" protocol is triggered.
|
||||
|
||||
### 4. Human Collaborator Identification
|
||||
|
||||
- The system maintains a dynamic profile of each human collaborator, cataloging their skills, expertise, and past performance on related tasks.
|
||||
- Using this profile data, the system identifies the most capable human collaborator to assist with the current challenge.
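
Collaborator identification could be sketched as a scoring function over the stored profiles. The document does not specify how "most capable" is computed, so the scoring below (skill coverage first, past success rate as the tie-breaker) is an assumption, as are the `Collaborator` class and `pick_collaborator` function.

```python
from dataclasses import dataclass


@dataclass
class Collaborator:
    name: str
    skills: set
    past_success_rate: float  # fraction of related tasks resolved, 0.0 to 1.0


def pick_collaborator(required_skills: set, collaborators: list) -> Collaborator:
    """Pick the profile with the best skill coverage, then best track record."""
    def score(person: Collaborator) -> tuple:
        if not required_skills:
            coverage = 1.0  # nothing specific is required
        else:
            coverage = len(required_skills & person.skills) / len(required_skills)
        return (coverage, person.past_success_rate)

    # max() raises ValueError on an empty list, so callers need at least one profile.
    return max(collaborators, key=score)


team = [
    Collaborator("Ana", {"sql", "finance"}, 0.80),
    Collaborator("Ben", {"sql", "finance", "legal"}, 0.70),
    Collaborator("Caz", {"design"}, 0.95),
]
print(pick_collaborator({"finance", "legal"}, team).name)  # Ben
```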
|
||||
|
||||
### 5. Real-time Collaboration
|
||||
|
||||
- The identified human collaborator is notified and provided with all the relevant information about the task and the challenge.
|
||||
- Collaborators can provide guidance, make decisions, or even take over specific portions of the task.
|
||||
|
||||
### 6. Task Completion & Feedback Loop
|
||||
|
||||
- Once the challenge is resolved, agents continue with the task until completion.
|
||||
- Feedback from human collaborators is used to update agent algorithms, ensuring continuous learning and improvement.
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Maintain Up-to-date Human Profiles**: Ensure that the skillsets, expertise, and performance metrics of human collaborators are updated regularly.
|
||||
2. **Limit Interruptions**: Implement mechanisms to limit the frequency of human interventions, ensuring collaborators are not overwhelmed with requests.
|
||||
3. **Provide Context**: When seeking human intervention, provide collaborators with comprehensive context to ensure they can make informed decisions.
|
||||
4. **Continuous Training**: Regularly update and train agents based on feedback from human collaborators.
|
||||
5. **Measure & Optimize**: Monitor the efficiency of the "Human-in-the-Loop" protocol, aiming to reduce the frequency of interventions while maximizing the value of each intervention.
|
||||
6. **Skill Enhancement**: Encourage human collaborators to continuously enhance their skills, ensuring that the collective expertise of the group grows over time.
|
||||
|
||||
## Conclusion
|
||||
|
||||
The integration of human expertise with AI capabilities is a cornerstone of the Swarms Multi-Agent Framework. This "Human-in-the-Loop Task Handling Protocol" ensures that tasks are executed efficiently, leveraging the best of both human judgment and AI automation. Through collaborative synergy, we can tackle challenges more effectively and drive innovation.
|
@ -0,0 +1,48 @@
|
||||
# Secure Communication Protocols
|
||||
|
||||
## Overview
|
||||
|
||||
The Swarms Multi-Agent Framework prioritizes the security and integrity of data, especially personal and sensitive information. Our Secure Communication Protocols ensure that all communications between agents are encrypted, authenticated, and resistant to tampering or unauthorized access.
|
||||
|
||||
## Features
|
||||
|
||||
### 1. End-to-End Encryption
|
||||
|
||||
- All inter-agent communications are encrypted using state-of-the-art cryptographic algorithms.
|
||||
- This ensures that data remains confidential and can only be read by the intended recipient agent.
|
||||
|
||||
### 2. Authentication
|
||||
|
||||
- Before initiating communication, agents authenticate each other using digital certificates.
|
||||
- This prevents impersonation attacks and ensures that agents are communicating with legitimate counterparts.
|
||||
|
||||
### 3. Forward Secrecy
|
||||
|
||||
- Key exchange mechanisms employ forward secrecy, meaning that even if a malicious actor later obtains a long-term key, they cannot decrypt previously recorded communications.
|
||||
|
||||
### 4. Data Integrity
|
||||
|
||||
- Keyed cryptographic hashes (message authentication codes) or digital signatures ensure that the data has not been altered in transit.
|
||||
- Any discrepancies in data integrity result in the communication being rejected.
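
One common way to implement this kind of integrity check is a keyed hash (HMAC), since an unkeyed hash alone would not stop an attacker who can replace both the message and its digest. The sketch below uses only Python's standard library; the `sign` and `verify` helper names and the pre-shared key are assumptions for illustration, not the framework's actual mechanism.

```python
import hashlib
import hmac
import secrets

# Assumption: both agents already share this key via a secure channel.
shared_key = secrets.token_bytes(32)


def sign(message: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes, key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(message, key), tag)


message = b"D-2.3-M-P"
tag = sign(message, shared_key)

print(verify(message, tag, shared_key))       # True
print(verify(b"D-2.3-H-P", tag, shared_key))  # False: message was altered
```

A receiver that gets `False` from `verify` would reject the communication, as described above.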
|
||||
|
||||
### 5. Zero-Knowledge Protocols
|
||||
|
||||
- When handling especially sensitive data, agents use zero-knowledge proofs to validate information without revealing the actual data.
|
||||
|
||||
### 6. Periodic Key Rotation
|
||||
|
||||
- To mitigate the risk of long-term key exposure, encryption keys are periodically rotated.
|
||||
- Old keys are securely discarded, so a compromised old key cannot be used to decrypt communications protected by newer keys.
|
||||
|
||||
## Best Practices for Handling Personal and Sensitive Information
|
||||
|
||||
1. **Data Minimization**: Agents should only request and process the minimum amount of personal data necessary for the task.
|
||||
2. **Anonymization**: Whenever possible, agents should anonymize personal data, stripping away identifying details.
|
||||
3. **Data Retention Policies**: Personal data should be retained only for the period necessary to complete the task, after which it should be securely deleted.
|
||||
4. **Access Controls**: Ensure that only authorized agents have access to personal and sensitive information. Implement strict access control mechanisms.
|
||||
5. **Regular Audits**: Conduct regular security audits to ensure compliance with privacy regulations and to detect any potential vulnerabilities.
|
||||
6. **Training**: All agents should be regularly updated and trained on the latest security protocols and best practices for handling sensitive data.
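
The first two practices can be sketched in a few lines. This is an illustration under stated assumptions: `minimize` and `pseudonymize` are hypothetical helpers, and a keyed hash gives pseudonymization (records stay linkable to each other) rather than full anonymization.

```python
import hashlib
import hmac


def minimize(record: dict, needed_fields: set) -> dict:
    """Data minimization: keep only the fields the task actually needs."""
    return {key: value for key, value in record.items() if key in needed_fields}


def pseudonymize(identifier: str, secret: bytes) -> str:
    """Replace an identifier with a keyed hash so the raw value is not exposed."""
    return hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


customer = {
    "name": "Ada",
    "email": "ada@example.com",
    "order_total": 42.5,
    "region": "EU",
}

safe_record = minimize(customer, {"order_total", "region"})
safe_record["customer_ref"] = pseudonymize(customer["email"], secret=b"demo-secret")

print(sorted(safe_record))  # ['customer_ref', 'order_total', 'region']
```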
|
||||
|
||||
## Conclusion
|
||||
|
||||
Secure communication is paramount in the Swarms Multi-Agent Framework, especially when dealing with personal and sensitive information. Adhering to these protocols and best practices ensures the safety, privacy, and trust of all stakeholders involved.
|
@ -0,0 +1,68 @@
|
||||
# Promptimizer Documentation
|
||||
## Swarms Multi-Agent Framework
|
||||
|
||||
**The Promptimizer Tool stands as a cornerstone innovation within the Swarms Multi-Agent Framework, meticulously engineered to refine and supercharge prompts across diverse categories. Capitalizing on extensive libraries of best-practice prompting techniques, this tool ensures your prompts are razor-sharp, tailored, and primed for optimal outcomes.**
|
||||
|
||||
---
|
||||
|
||||
## Overview:
|
||||
|
||||
The Promptimizer Tool is crafted to:
|
||||
1. Rigorously analyze and elevate the quality of provided prompts.
|
||||
2. Furnish best-in-class recommendations rooted in proven prompting strategies.
|
||||
3. Serve a spectrum of categories, from technical operations to expansive creative ventures.
|
||||
|
||||
---
|
||||
|
||||
## Core Features:
|
||||
|
||||
### 1. Deep Prompt Analysis:
|
||||
|
||||
- **Clarity Matrix**: A proprietary algorithm assessing prompt clarity, removing ambiguities and sharpening focus.
|
||||
- **Efficiency Gauge**: Evaluates the prompt's structure to ensure it yields the desired results quickly and precisely.
|
||||
|
||||
### 2. Adaptive Recommendations:
|
||||
|
||||
- **Technique Engine**: Suggests techniques aligned with the gold standard for the chosen category.
|
||||
- **Exemplar Database**: Offers an extensive array of high-quality prompt examples for comparison and inspiration.
|
||||
|
||||
### 3. Versatile Category Framework:
|
||||
|
||||
- **Tech Suite**: Optimizes prompts for technical tasks, ensuring actionable clarity.
|
||||
- **Narrative Craft**: Hones prompts to elicit vivid and coherent stories.
|
||||
- **Visual Visionary**: Shapes prompts for precise and dynamic visual generation.
|
||||
- **Sonic Sculptor**: Orchestrates prompts for audio creation, tuning into desired tones and moods.
|
||||
|
||||
### 4. Machine Learning Integration:
|
||||
|
||||
- **Feedback Dynamo**: Harnesses user feedback, continually refining the tool's recommendation capabilities.
|
||||
- **Live Library Updates**: Periodic syncing with the latest in prompting techniques, ensuring the tool remains at the cutting edge.
|
||||
|
||||
### 5. Collaboration & Sharing:
|
||||
|
||||
- **TeamSync**: Allows teams to collaborate on prompt optimization in real-time.
|
||||
- **ShareSpace**: Share and access a community-driven repository of optimized prompts, fostering collective growth.
|
||||
|
||||
---
|
||||
|
||||
## Benefits:
|
||||
|
||||
1. **Precision Engineering**: Harness the power of refined prompts, ensuring desired outcomes are achieved with surgical precision.
|
||||
2. **Learning Hub**: Immerse yourself in a tool that not only refines but educates, enhancing the user's prompting acumen.
|
||||
3. **Versatile Mastery**: Navigate seamlessly across categories, ensuring top-tier prompt quality regardless of the domain.
|
||||
4. **Community-driven Excellence**: Dive into a world of shared knowledge, elevating the collective expertise of the Swarms community.
|
||||
|
||||
---
|
||||
|
||||
## Usage Workflow:
|
||||
|
||||
1. **Launch the Promptimizer**: Access the tool directly from the Swarms Multi-Agent Framework dashboard.
|
||||
2. **Prompt Entry**: Input the initial prompt for refinement.
|
||||
3. **Category Selection**: Pinpoint the desired category for specialized optimization.
|
||||
4. **Receive & Review**: Engage with the tool's recommendations, comparing original and optimized prompts.
|
||||
5. **Collaborate, Implement & Share**: Work in tandem with team members, deploy the refined prompt, and consider contributing to the community repository.
|
||||
|
||||
---
|
||||
|
||||
By integrating the Promptimizer Tool into their workflow, Swarms users stand poised to redefine the boundaries of what's possible, turning each prompt into a beacon of excellence and efficiency.
|
||||
|
@ -0,0 +1,68 @@
|
||||
# Shorthand Communication System
|
||||
## Swarms Multi-Agent Framework
|
||||
|
||||
**The Enhanced Shorthand Communication System is designed to streamline agent-agent communication within the Swarms Multi-Agent Framework. This system employs concise alphanumeric notations to relay task-specific details to agents efficiently.**
|
||||
|
||||
---
|
||||
|
||||
## Format:
|
||||
|
||||
The shorthand format is structured as `[AgentType]-[TaskLayer].[TaskNumber]-[Priority]-[Status]`.
|
||||
|
||||
---
|
||||
|
||||
## Components:
|
||||
|
||||
### 1. Agent Type:
|
||||
- Denotes the specific agent role, such as:
|
||||
* `C`: Code agent
|
||||
* `D`: Data processing agent
|
||||
* `M`: Monitoring agent
|
||||
* `N`: Network agent
|
||||
* `R`: Resource management agent
|
||||
* `I`: Interface agent
|
||||
* `S`: Security agent
|
||||
|
||||
### 2. Task Layer & Number:
|
||||
- Identifies the task by its layer (category) and its number within that layer.
|
||||
* Example: `1.8` signifies Task layer 1, task number 8.
|
||||
|
||||
### 3. Priority:
|
||||
- Indicates task urgency.
|
||||
* `H`: High
|
||||
* `M`: Medium
|
||||
* `L`: Low
|
||||
|
||||
### 4. Status:
|
||||
- Gives a snapshot of the task's progress.
|
||||
* `I`: Initialized
|
||||
* `P`: In-progress
|
||||
* `C`: Completed
|
||||
* `F`: Failed
|
||||
* `W`: Waiting
|
||||
|
||||
---
|
||||
|
||||
## Extended Features:
|
||||
|
||||
### 1. Error Codes (for failures):
|
||||
- `E01`: Resource issues
|
||||
- `E02`: Data inconsistency
|
||||
- `E03`: Dependency malfunction
|
||||
... and more as needed.
|
||||
|
||||
### 2. Collaboration Flag:
|
||||
- `+`: Denotes required collaboration.
|
||||
|
||||
---
|
||||
|
||||
## Example Codes:
|
||||
|
||||
- `C-1.8-H-I`: A high-priority coding task that has been initialized.
|
||||
- `D-2.3-M-P`: A medium-priority data task currently in-progress.
|
||||
- `M-3.5-L-P+`: A low-priority monitoring task in progress needing collaboration.
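
The format and example codes above can be parsed with a single regular expression. This is a minimal sketch: `parse_code` and `TaskCode` are hypothetical names, and error codes are left out because the document does not specify how they attach to a shorthand string.

```python
import re
from dataclasses import dataclass

AGENT_TYPES = {
    "C": "Code", "D": "Data processing", "M": "Monitoring", "N": "Network",
    "R": "Resource management", "I": "Interface", "S": "Security",
}
PRIORITIES = {"H": "High", "M": "Medium", "L": "Low"}
STATUSES = {
    "I": "Initialized", "P": "In-progress", "C": "Completed",
    "F": "Failed", "W": "Waiting",
}

# [AgentType]-[TaskLayer].[TaskNumber]-[Priority]-[Status], optional "+" flag
CODE_PATTERN = re.compile(r"^([CDMNRIS])-(\d+)\.(\d+)-([HML])-([IPCFW])(\+?)$")


@dataclass
class TaskCode:
    agent_type: str
    layer: int
    number: int
    priority: str
    status: str
    needs_collaboration: bool


def parse_code(code: str) -> TaskCode:
    """Expand a shorthand code into its named components."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"Not a valid shorthand code: {code!r}")
    agent, layer, number, priority, status, flag = match.groups()
    return TaskCode(
        agent_type=AGENT_TYPES[agent],
        layer=int(layer),
        number=int(number),
        priority=PRIORITIES[priority],
        status=STATUSES[status],
        needs_collaboration=(flag == "+"),
    )


print(parse_code("M-3.5-L-P+"))
```

Rejecting anything that does not match the pattern keeps malformed codes from silently turning into tasks.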
|
||||
|
||||
---
|
||||
|
||||
By leveraging the Enhanced Shorthand Communication System, the Swarms Multi-Agent Framework can ensure swift interactions, concise communications, and effective task management.
|
||||
|
@ -0,0 +1,16 @@
|
||||
|
||||
# Swarms
|
||||
|
||||
Orchestrate enterprise-grade agents that collaborate to automate real-world tasks.
|
||||
|
||||
| Core Concepts | How-To Guides | Examples | Community |
|
||||
|--------------------------------------------------|----------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------|
|
||||
| [Agents](swarms/structs/agent) | [Installing Swarms](swarms/install/install) | [Swarm of Business Analysts for Business Reports](applications/business-analyst-agent) | [Join the Swarms Community!](https://discord.gg/3Zck7nX6) |
|
||||
| [Memory](swarms/memory/diy_memory) | [Docker Setup](swarms/install/docker_setup) | [Compliance Swarm for Customer Privacy](https://medium.com/@kyeg/building-compliance-agents-with-chroma-db-llama3-sop-prompting-0ed3e73559d2) | [Swarms Ecosystem](https://github.com/kyegomez/swarm-ecosystem) |
|
||||
| [Tools](swarms/tools/main) | [Create Custom Tools](https://medium.com/@kyeg/the-swarms-tool-system-functions-pydantic-basemodels-as-tools-and-radical-customization-c2a2e227b8ca) | [Self-Replicating Hierarchical Swarms](https://medium.com/@kyeg/announcing-neosapiens-self-replicating-swarms-0a47410aafa7) | [Support Team](https://cal.com/swarms/swarms-onboarding-session) |
|
||||
| [Tasks](swarms/structs/task) | [Multi-Agent Flows](swarms/structs/agent_rearrange) | | [Book a 1 on 1 Call With Founder: Kye](https://cal.com/swarms/swarms-onboarding-session) |
|
||||
| [Multi-Agent Orchestration](swarms/structs/agent_rearrange) | [Sequential Workflows](swarms/structs/sequential_workflow) | | |
|
||||
| | [Connecting LLMs](https://docs.swarms.world/en/latest/swarms/models/custom_model/) | | |
|
||||
| | [Customizing Agents](./how-to/Customizing-Agents) | | |
|
||||
| | [Human Input on Execution](./how-to/Human-Input-on-Execution) | | |
|
||||
| | [Agent Monitoring with AgentOps](./how-to/AgentOps-Observability) | | |
|
@ -0,0 +1,167 @@
|
||||
docs_dir: '.' # replace with the correct path if your documentation files are not in the same directory as mkdocs.yml
|
||||
site_name: Swarms Documentation
|
||||
site_url: https://swarms.apac.ai
|
||||
site_author: Swarms
|
||||
site_description: Orchestrate Swarms of Agents From Any Framework, Such as OpenAI and LangChain, for Real-World Workflow Automation.
|
||||
repo_name: kyegomez/swarms
|
||||
repo_url: https://github.com/kyegomez/swarms
|
||||
edit_uri: https://github.com/kyegomez/swarms/tree/main/docs
|
||||
copyright: TGSC Corp 2024. All rights reserved.
|
||||
|
||||
|
||||
plugins:
|
||||
# - glightbox
|
||||
- search
|
||||
- git-authors
|
||||
- mkdocs-jupyter:
|
||||
kernel_name: python3
|
||||
execute: false
|
||||
include_source: True
|
||||
include_requirejs: true
|
||||
- mkdocstrings:
|
||||
default_handler: python
|
||||
handlers:
|
||||
python:
|
||||
options:
|
||||
parameter_headings: true
|
||||
paths: [supervision]
|
||||
load_external_modules: true
|
||||
allow_inspection: true
|
||||
show_bases: true
|
||||
group_by_category: true
|
||||
docstring_style: google
|
||||
show_symbol_type_heading: true
|
||||
show_symbol_type_toc: true
|
||||
show_category_heading: true
|
||||
domains: [std, py]
|
||||
- git-committers:
|
||||
repository: kyegomez/swarms
|
||||
branch: master
|
||||
# token: !ENV ["GITHUB_TOKEN"]
|
||||
- git-revision-date-localized:
|
||||
enable_creation_date: true
|
||||
extra_css:
|
||||
- assets/css/extra.css
|
||||
extra:
|
||||
social:
|
||||
- icon: fontawesome/brands/twitter
|
||||
link: https://x.com/KyeGomezB
|
||||
- icon: fontawesome/brands/github
|
||||
link: https://github.com/kyegomez/swarms
|
||||
theme:
|
||||
name: material
|
||||
custom_dir: overrides
|
||||
logo: assets/img/SwarmsLogoIcon.png
|
||||
palette:
|
||||
# Palette toggle for light mode
|
||||
- scheme: default
|
||||
primary: black
|
||||
toggle:
|
||||
icon: material/brightness-7
|
||||
name: Switch to dark mode
|
||||
# Palette toggle for dark mode
|
||||
- scheme: slate
|
||||
primary: black
|
||||
toggle:
|
||||
icon: material/brightness-4
|
||||
name: Switch to light mode
|
||||
features:
|
||||
- content.code.copy
|
||||
- content.code.annotate
|
||||
- navigation.tabs
|
||||
- navigation.sections
|
||||
- navigation.expand
|
||||
- navigation.top
|
||||
- announce.dismiss
|
||||
markdown_extensions:
|
||||
- pymdownx.highlight:
|
||||
anchor_linenums: true
|
||||
line_spans: __span
|
||||
pygments_lang_class: true
|
||||
- admonition
|
||||
- pymdownx.inlinehilite
|
||||
- pymdownx.snippets
|
||||
- pymdownx.superfences
|
||||
- pymdownx.details
|
||||
- pymdownx.tabbed
|
||||
- tables
|
||||
- def_list
|
||||
- footnotes
|
||||
nav:
|
||||
- Home:
|
||||
- Overview: "index.md"
|
||||
- Installation Guide: "swarms/install/install.md"
|
||||
- Framework: "swarms/index.md"
|
||||
- Documentation:
|
||||
- Overview: "swarms/framework_structure.md"
|
||||
- Agents:
|
||||
- Agents Overview: "swarms/structs/index.md"
|
||||
- Agents with Tools: "swarms/tools/main.md"
|
||||
- Build Agents: "swarms/structs/diy_your_own_agent.md"
|
||||
- Custom Vector Memory Databases: "swarms/memory/diy_memory.md"
|
||||
- Short Term Memory: "swarms/memory/short_term_memory.md"
|
||||
- Models:
|
||||
- How to Create A Custom Language Model: "swarms/models/custom_model.md"
|
||||
- Models Available: "swarms/models/index.md"
|
||||
- MultiModal Models Available: "swarms/models/multimodal_models.md"
|
||||
- Deploying Azure OpenAI in Production A Comprehensive Guide: "swarms/models/azure_openai.md"
|
||||
- Language Models:
|
||||
- BaseLLM: "swarms/models/base_llm.md"
|
||||
- HuggingFaceLLM: "swarms/models/huggingface.md"
|
||||
- Anthropic: "swarms/models/anthropic.md"
|
||||
- OpenAIChat: "swarms/models/openai.md"
|
||||
- MultiModal Models:
|
||||
- BaseMultiModalModel: "swarms/models/base_multimodal_model.md"
|
||||
- Fuyu: "swarms/models/fuyu.md"
|
||||
- Vilt: "swarms/models/vilt.md"
|
||||
- Idefics: "swarms/models/idefics.md"
|
||||
- Kosmos: "swarms/models/kosmos.md"
|
||||
- Nougat: "swarms/models/nougat.md"
|
||||
- Dalle3: "swarms/models/dalle3.md"
|
||||
- GPT4VisionAPI: "swarms/models/gpt4v.md"
|
||||
- GPT4o: "swarms/models/gpt4o.md"
|
||||
- Multi-Agent Collaboration:
|
||||
- Overview: "swarms/structs/multi_agent_orchestration.md"
|
||||
- Workflow: "swarms/structs/workflow.md"
|
||||
- Multi-Agent Orchestration: "swarms/structs/multi_agent_orchestration.md"
|
||||
- Structs:
|
||||
- Foundational Structures:
|
||||
- Agent: "swarms/structs/agent.md"
|
||||
- BaseStructure: "swarms/structs/basestructure.md"
|
||||
- Task: "swarms/structs/task.md"
|
||||
- YamlModel: "swarms/structs/yaml_model.md"
|
||||
- BaseWorkflow: "swarms/structs/baseworkflow.md"
|
||||
- Workflows:
|
||||
- ConcurrentWorkflow: "swarms/structs/concurrentworkflow.md"
|
||||
- SequentialWorkflow: "swarms/structs/sequential_workflow.md"
|
||||
- Multi Agent Architectures:
|
||||
- Conversation: "swarms/structs/conversation.md"
|
||||
- SwarmNetwork: "swarms/structs/swarmnetwork.md"
|
||||
- MajorityVoting: "swarms/structs/majorityvoting.md"
|
||||
- AgentRearrange: "swarms/structs/agent_rearrange.md"
|
||||
- RoundRobin: "swarms/structs/round_robin_swarm.md"
|
||||
- Mixture of Agents: "swarms/structs/moa.md"
|
||||
- Guides:
|
||||
- Models:
|
||||
- How to Create A Custom Language Model: "swarms/models/custom_model.md"
|
||||
- Deploying Azure OpenAI in Production, A Comprehensive Guide: "swarms/models/azure_openai.md"
|
||||
- Agents:
|
||||
- Agent: "examples/flow.md"
|
||||
- DIY Build Your Own Agent: "diy_your_own_agent.md"
|
||||
- Equipping Autonomous Agents with Tools: "examples/tools_agent.md"
|
||||
- Swarms:
|
||||
- SequentialWorkflow: "examples/reliable_autonomous_agents.md"
|
||||
- Swarms Cloud API:
|
||||
- Overview: "swarms_cloud/main.md"
|
||||
- Available Models: "swarms_cloud/available_models.md"
|
||||
- Migrate from OpenAI to Swarms in 3 lines of code: "swarms_cloud/migrate_openai.md"
|
||||
- Getting Started with SOTA Vision Language Models VLM: "swarms_cloud/getting_started.md"
|
||||
- Enterprise Guide to High-Performance Multi-Agent LLM Deployments: "swarms_cloud/production_deployment.md"
|
||||
- Under The Hood The Swarm Cloud Serving Infrastructure: "swarms_cloud/architecture.md"
|
||||
- Contribute:
|
||||
- Contributing: "contributing.md"
|
||||
- Docker Setup: "swarms/install/docker_setup.md"
|
||||
- Multi-Agent Repository Template: "swarms/install/multi_agent_template.md"
|
||||
- Glossary + Further Reading:
|
||||
- Agent Glossary: "swarms/glossary.md"
|
||||
- List of The Best Multi-Agent Papers: "swarms/papers.md"
|
@ -0,0 +1,9 @@
|
||||
{% extends "base.html" %}
|
||||
|
||||
<!--https://squidfunk.github.io/mkdocs-material/customization/#overriding-blocks-->
|
||||
|
||||
{% block announce %}
|
||||
<div style="text-align:center">
|
||||
<a href="https://github.com/kyegomez/swarms">Star and contribute</a> to Swarms on GitHub!
|
||||
</div>
|
||||
{% endblock %}
|
@ -0,0 +1,55 @@
|
||||
# The Limits of Individual Agents
|
||||
|
||||

|
||||
|
||||
|
||||
Individual agents have pushed the boundaries of what machines can learn and accomplish. However, despite their impressive capabilities, these agents face inherent limitations that can hinder their effectiveness in complex, real-world applications. This post explores the critical constraints of individual agents, such as context window limits, hallucination, single-task threading, and lack of collaboration, and illustrates how multi-agent collaboration can address them. In short, individual agents are constrained by:
|
||||
|
||||
- Context Window Limits
|
||||
- Single Task Execution
|
||||
- Hallucination
|
||||
- No collaboration
|
||||
|
||||
|
||||
|
||||
#### Context Window Limits
|
||||
|
||||
One of the most significant constraints of individual agents, particularly in the domain of language models, is the context window limit. This limitation refers to the maximum amount of information an agent can consider at any given time. For instance, many language models can only process a fixed number of tokens (sub-word units of text) in a single inference, restricting their ability to understand and generate responses based on longer texts. This limitation can lead to a lack of coherence in longer compositions and an inability to maintain context in extended conversations or documents.
|
||||
|
||||
#### Hallucination
|
||||
|
||||
Hallucination in AI refers to the phenomenon where an agent generates information that is not grounded in the input data or real-world facts. This can manifest as making up facts, entities, or events that do not exist or are incorrect. Hallucinations pose a significant challenge in ensuring the reliability and trustworthiness of AI-generated content, particularly in critical applications such as news generation, academic research, and legal advice.
|
||||
|
||||
#### Single Task Threading
|
||||
|
||||
Individual agents are often designed to excel at specific tasks, leveraging their architecture and training data to optimize performance in a narrowly defined domain. However, this specialization can also be a drawback, as it limits the agent's ability to multitask or adapt to tasks that fall outside its primary domain. Single-task threading means an agent may excel in language translation but struggle with image recognition or vice versa, necessitating the deployment of multiple specialized agents for comprehensive AI solutions.
|
||||
|
||||
#### Lack of Collaboration
|
||||
|
||||
Traditional AI agents operate in isolation, processing inputs and generating outputs independently. This isolation limits their ability to leverage diverse perspectives, share knowledge, or build upon the insights of other agents. In complex problem-solving scenarios, where multiple facets of a problem need to be addressed simultaneously, this lack of collaboration can lead to suboptimal solutions or an inability to tackle multifaceted challenges effectively.
|
||||
|
||||
# The Elegant yet Simple Solution
|
||||
|
||||
## Multi-Agent Collaboration
|
||||
|
||||
Recognizing the limitations of individual agents, researchers and practitioners have explored the potential of multi-agent collaboration as a means to transcend these constraints. Multi-agent systems comprise several agents that can interact, communicate, and collaborate to achieve common goals or solve complex problems. This collaborative approach offers several advantages:
|
||||
|
||||
#### Overcoming Context Window Limits
|
||||
|
||||
By dividing a large task among multiple agents, each focusing on different segments of the problem, multi-agent systems can effectively overcome the context window limits of individual agents. For instance, in processing a long document, different agents could be responsible for understanding and analyzing different sections, pooling their insights to generate a coherent understanding of the entire text.
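The chunk-and-pool idea above can be sketched in a few lines. This is a minimal illustration, not Swarms code: `split_into_chunks`, `summarize_stub`, and `map_reduce` are hypothetical names, and `summarize_stub` stands in for a real LLM-backed agent.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i : i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize_stub(chunk: str) -> str:
    """Hypothetical stand-in for an agent: keeps the first five words."""
    return " ".join(chunk.split()[:5])


def map_reduce(text: str, agent: Callable[[str], str], max_words: int) -> str:
    """Hand each chunk to its own agent call, then join the partial results."""
    partials = [agent(chunk) for chunk in split_into_chunks(text, max_words)]
    return " | ".join(partials)


# A 25-word document split into chunks of 10, 10, and 5 words.
document = " ".join(f"word{i}" for i in range(25))
combined = map_reduce(document, summarize_stub, max_words=10)
```

Each agent call only ever sees `max_words` words, so no single call has to fit the whole document in its context window; the final join is where the pooled insights come together.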
|
||||
|
||||
#### Mitigating Hallucination
|
||||
|
||||
Through collaboration, agents can cross-verify facts and information, reducing the likelihood of hallucinations. If one agent generates a piece of information, other agents can provide checks and balances, verifying the accuracy against known data or through consensus mechanisms.
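One simple consensus mechanism is a majority vote over the answers of several agents. The sketch below assumes each agent returns a short string answer; `consensus_answer` and its `min_agreement` threshold are illustrative names, not part of any library.

```python
from collections import Counter
from typing import List, Optional


def consensus_answer(answers: List[str], min_agreement: float = 0.5) -> Optional[str]:
    """Return the most common answer if more than min_agreement of agents agree.

    Returns None when no answer reaches the threshold, signalling that the
    result should be escalated or re-checked rather than trusted.
    """
    if not answers:
        return None
    normalized = [a.strip().lower() for a in answers]
    answer, votes = Counter(normalized).most_common(1)[0]
    return answer if votes / len(normalized) > min_agreement else None


agreed = consensus_answer(["Paris", "paris ", "Lyon"])  # two of three agree
disputed = consensus_answer(["Paris", "Lyon"])          # no strict majority
```

Voting only reduces the chance of a hallucination slipping through; agents that share the same underlying model can still agree on a wrong answer.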
|
||||
|
||||
#### Enhancing Multitasking Capabilities
|
||||
|
||||
Multi-agent systems can tackle tasks that require a diverse set of skills by leveraging the specialization of individual agents. For example, in a complex project that involves both natural language processing and image analysis, one agent specialized in text can collaborate with another specialized in visual data, enabling a comprehensive approach to the task.
|
||||
|
||||
#### Facilitating Collaboration and Knowledge Sharing
|
||||
|
||||
Multi-agent collaboration inherently encourages the sharing of knowledge and insights, allowing agents to learn from each other and improve their collective performance. This can be particularly powerful in scenarios where iterative learning and adaptation are crucial, such as dynamic environments or tasks that evolve over time.
|
||||
|
||||
### Conclusion
|
||||
|
||||
While individual AI agents have made remarkable strides in various domains, their inherent limitations necessitate innovative approaches to unlock the full potential of artificial intelligence. Multi-agent collaboration emerges as a compelling solution, offering a pathway to transcend individual constraints through collective intelligence. By harnessing the power of collaborative AI, we can address more complex, multifaceted problems, paving the way for more versatile, efficient, and effective AI systems in the future.
|
@ -0,0 +1,134 @@
|
||||
# The Swarms Framework: Orchestrating Agents for Enterprise Automation
|
||||
|
||||
In the rapidly evolving landscape of artificial intelligence (AI) and automation, a new paradigm is emerging: the orchestration of multiple agents working in collaboration to tackle complex tasks. This approach, embodied by the Swarms Framework, aims to address the fundamental limitations of individual agents and unlock the true potential of AI-driven automation in enterprise operations.
|
||||
|
||||
Individual agents are plagued by the same issues: short-term memory constraints, hallucinations, single-task limitations, lack of collaboration, and cost inefficiencies.
|
||||
|
||||
[Learn more here from a list of compiled agent papers](https://github.com/kyegomez/awesome-multi-agent-papers)
|
||||
|
||||
## The Purpose of Swarms: Overcoming Agent Limitations
|
||||
|
||||
Individual agents, while remarkable in their own right, face several inherent challenges that hinder their ability to effectively automate enterprise operations at scale. These limitations include:
|
||||
|
||||
1. Short-Term Memory Constraints
|
||||
2. Hallucination and Factual Inconsistencies
|
||||
3. Single-Task Limitations
|
||||
4. Lack of Collaborative Capabilities
|
||||
5. Cost Inefficiencies
|
||||
|
||||
By orchestrating multiple agents to work in concert, the Swarms Framework directly tackles these limitations, paving the way for more efficient, reliable, and cost-effective enterprise automation.
|
||||
|
||||
### Limitation 1: Short-Term Memory Constraints
|
||||
|
||||
Many AI agents, particularly those based on large language models, suffer from short-term memory constraints. These agents can effectively process and respond to prompts, but their ability to retain and reason over information across multiple interactions or tasks is limited. This limitation can be problematic in enterprise environments, where complex workflows often involve retaining and referencing contextual information over extended periods.
|
||||
|
||||
The Swarms Framework addresses this limitation by leveraging the collective memory of multiple agents working in tandem. While individual agents may have limited short-term memory, their combined memory pool becomes significantly larger, enabling the retention and retrieval of contextual information over extended periods. This collective memory is facilitated by agents specializing in information storage and retrieval, such as those based on systems like Llama Index or Pinecone.
|
||||
|
||||
### Limitation 2: Hallucination and Factual Inconsistencies
|
||||
|
||||
Another challenge faced by many AI agents is the tendency to generate responses that may contain factual inconsistencies or hallucinations -- information that is not grounded in reality or the provided context. This issue can undermine the reliability and trustworthiness of automated systems, particularly in domains where accuracy and consistency are paramount.
|
||||
|
||||
The Swarms Framework mitigates this limitation by employing multiple agents with diverse knowledge bases and capabilities. By leveraging the collective intelligence of these agents, the framework can cross-reference and validate information, reducing the likelihood of hallucinations and factual inconsistencies. Additionally, specialized agents can be tasked with fact-checking and verification, further enhancing the overall reliability of the system.
|
||||
|
||||
### Limitation 3: Single-Task Limitations
|
||||
|
||||
Most individual AI agents are designed and optimized for specific tasks or domains, limiting their ability to handle complex, multi-faceted workflows that often characterize enterprise operations. While an agent may excel at a particular task, such as natural language processing or data analysis, it may struggle with other aspects of a larger workflow, such as task coordination or decision-making.
|
||||
|
||||
The Swarms Framework overcomes this limitation by orchestrating a diverse ensemble of agents, each specializing in different tasks or capabilities. By intelligently combining and coordinating these agents, the framework can tackle complex, multi-threaded workflows that span various domains and task types. This modular approach allows for the seamless integration of new agents as they become available, enabling the continuous expansion and enhancement of the system's capabilities.
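As a rough illustration of routing work to specialists, the sketch below dispatches each task to an agent registered for its type. The `Router` class and the two agent functions are hypothetical and are not the Swarms API.

```python
from typing import Callable, Dict

Agent = Callable[[str], str]


def text_agent(task: str) -> str:
    """Hypothetical specialist for text tasks."""
    return f"[text] {task}"


def table_agent(task: str) -> str:
    """Hypothetical specialist for tabular data tasks."""
    return f"[table] {task}"


class Router:
    """Send each task to the agent registered for its type."""

    def __init__(self, fallback: Agent):
        self._agents: Dict[str, Agent] = {}
        self._fallback = fallback

    def register(self, task_type: str, agent: Agent) -> None:
        # New specialists can be added at any time without touching callers.
        self._agents[task_type] = agent

    def dispatch(self, task_type: str, task: str) -> str:
        # Unknown task types fall back to a general-purpose agent.
        return self._agents.get(task_type, self._fallback)(task)


router = Router(fallback=text_agent)
router.register("table", table_agent)
routed = router.dispatch("table", "sum the revenue column")
defaulted = router.dispatch("unknown", "describe the report")
```

Keeping the registry separate from the agents is what makes the approach modular: adding a capability means registering one more callable.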
|
||||
|
||||
### Limitation 4: Lack of Collaborative Capabilities
|
||||
|
||||
Most AI agents are designed to operate independently, lacking the ability to effectively collaborate with other agents or coordinate their actions towards a common goal. This limitation can hinder the scalability and efficiency of automated systems, particularly in enterprise environments where tasks often require the coordination of multiple agents or systems.
|
||||
|
||||
The Swarms Framework addresses this limitation by introducing a layer of coordination and collaboration among agents. Through specialized coordination agents and communication protocols, the framework enables agents to share information, divide tasks, and synchronize their actions. This collaborative approach not only increases efficiency but also enables the emergence of collective intelligence, where the combined capabilities of multiple agents surpass the sum of their individual abilities.
|
||||
|
||||
### Limitation 5: Cost Inefficiencies
|
||||
|
||||
Running large AI models or orchestrating multiple agents can be computationally expensive, particularly in enterprise environments where scalability and cost-effectiveness are critical considerations. Inefficient resource utilization or redundant computations can quickly escalate costs, making widespread adoption of AI-driven automation financially prohibitive.
|
||||
|
||||
The Swarms Framework tackles this limitation by optimizing resource allocation and workload distribution among agents. By intelligently assigning tasks to the most appropriate agents and leveraging agent specialization, the framework minimizes redundant computations and improves overall resource utilization. Additionally, the framework can dynamically scale agent instances based on demand, ensuring that computational resources are allocated efficiently and costs are minimized.
|
||||
|
||||
## The Swarms Framework: A Holistic Approach to Enterprise Automation
|
||||
|
||||
The Swarms Framework is a comprehensive solution that addresses the limitations of individual agents by orchestrating their collective capabilities. By integrating agents from various frameworks, including LangChain, AutoGPT, Llama Index, and others, the framework leverages the strengths of each agent while mitigating their individual weaknesses.
|
||||
|
||||
At its core, the Swarms Framework operates on the principle of multi-agent collaboration. By introducing specialized coordination agents and communication protocols, the framework enables agents to share information, divide tasks, and synchronize their actions towards a common goal. This collaborative approach not only increases efficiency but also enables the emergence of collective intelligence, where the combined capabilities of multiple agents surpass the sum of their individual abilities.
|
||||
|
||||
The framework's architecture is modular and extensible, allowing for the seamless integration of new agents as they become available. This flexibility ensures that the system's capabilities can continuously expand and adapt to evolving enterprise needs and technological advancements.
|
||||
|
||||
|
||||
## Benefits of the Swarms Framework
|
||||
|
||||
The adoption of the Swarms Framework in enterprise environments offers numerous benefits:
|
||||
|
||||
1. Increased Efficiency and Scalability
|
||||
2. Improved Reliability and Accuracy
|
||||
3. Adaptability and Continuous Improvement
|
||||
4. Cost Optimization
|
||||
5. Enhanced Security and Compliance
|
||||
|
||||
### Increased Efficiency and Scalability
|
||||
|
||||
By orchestrating the collective capabilities of multiple agents, the Swarms Framework enables the efficient execution of complex, multi-threaded workflows. Tasks can be parallelized and distributed across specialized agents, reducing bottlenecks and increasing overall throughput. Additionally, the framework's modular design and ability to dynamically scale agent instances based on demand ensure that the system can adapt to changing workloads and scale seamlessly as enterprise needs evolve.
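Parallelizing independent tasks across agents can be sketched with the standard library alone. Here `run_in_parallel` and `stub_agent` are illustrative names; a real agent call would typically be an I/O-bound request to a model API, which is the case where threads help.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def stub_agent(task: str) -> str:
    """Hypothetical stand-in for an agent call."""
    return f"done: {task}"


def run_in_parallel(
    agent: Callable[[str], str], tasks: List[str], max_workers: int = 4
) -> List[str]:
    """Run one agent call per task concurrently, preserving task order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the input tasks.
        return list(pool.map(agent, tasks))


results = run_in_parallel(stub_agent, ["extract", "classify", "summarize"])
```

`max_workers` caps how many calls run at once, which is also a simple lever for staying under provider rate limits.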
|
||||
|
||||
### Improved Reliability and Accuracy
|
||||
|
||||
The collaborative nature of the Swarms Framework reduces the risk of hallucinations and factual inconsistencies that can arise from individual agents. By leveraging the collective knowledge and diverse perspectives of multiple agents, the framework can cross-reference and validate information, enhancing the overall reliability and accuracy of its outputs.
|
||||
|
||||
Additionally, the framework's ability to incorporate specialized fact-checking and verification agents further strengthens the trustworthiness of the system's outcomes, ensuring that critical decisions and actions are based on accurate and reliable information.
|
||||
|
||||
### Adaptability and Continuous Improvement
|
||||
|
||||
The modular architecture of the Swarms Framework allows for the seamless integration of new agents as they become available, enabling the continuous expansion and enhancement of the system's capabilities. As new AI models, algorithms, or data sources emerge, the framework can readily incorporate them, ensuring that enterprise operations remain at the forefront of technological advancements.
|
||||
|
||||
Furthermore, the framework's monitoring and analytics capabilities provide valuable insights into system performance, enabling the identification of areas for improvement and the optimization of agent selection, task assignments, and resource allocation strategies over time.
|
||||
|
||||
### Cost Optimization
|
||||
|
||||
By intelligently orchestrating the collaboration of multiple agents, the Swarms Framework optimizes resource utilization and minimizes redundant computations. This efficient use of computational resources translates into cost savings, making the widespread adoption of AI-driven automation more financially viable for enterprises.
|
||||
|
||||
The framework's ability to dynamically scale agent instances based on demand further contributes to cost optimization, ensuring that resources are allocated only when needed and minimizing idle or underutilized instances.
|
||||
|
||||
### Enhanced Security and Compliance
|
||||
|
||||
In enterprise environments, ensuring the security and compliance of automated systems is paramount. The Swarms Framework addresses these concerns by incorporating robust security measures and compliance controls.
|
||||
|
||||
The framework's centralized Memory Manager component enables the implementation of access control mechanisms and data encryption, protecting sensitive information from unauthorized access or breaches. Additionally, the framework's modular design allows for the integration of specialized agents focused on compliance monitoring and auditing, ensuring that enterprise operations adhere to relevant regulations and industry standards.
|
||||
|
||||
## Real-World Applications and Use Cases
|
||||
|
||||
The Swarms Framework finds applications across a wide range of enterprise domains, enabling organizations to automate complex operations and streamline their workflows. Here are some examples of real-world use cases:
|
||||
|
||||
1. Intelligent Process Automation (IPA)
|
||||
2. Customer Service and Support
|
||||
3. Fraud Detection and Risk Management
|
||||
4. Supply Chain Optimization
|
||||
5. Research and Development
|
||||
|
||||
### Intelligent Process Automation (IPA)
|
||||
|
||||
In the realm of business process automation, the Swarms Framework can orchestrate agents to automate and optimize complex workflows spanning multiple domains and task types. By combining agents specialized in areas such as natural language processing, data extraction, decision-making, and task coordination, the framework can streamline and automate processes that traditionally required manual intervention or coordination across multiple systems.
|
||||
|
||||
### Customer Service and Support
|
||||
|
||||
The framework's ability to integrate agents with diverse capabilities, such as natural language processing, knowledge retrieval, and decision-making, makes it well-suited for automating customer service and support operations. Agents can collaborate to understand customer inquiries, retrieve relevant information from knowledge bases, and provide accurate and personalized responses, improving customer satisfaction and reducing operational costs.
|
||||
|
||||
### Fraud Detection and Risk Management
|
||||
|
||||
In the financial and cybersecurity domains, the Swarms Framework can orchestrate agents specialized in data analysis, pattern recognition, and risk assessment to detect and mitigate fraudulent activities or security threats. By combining the collective intelligence of these agents, the framework can identify complex patterns and anomalies that may be difficult for individual agents to detect, enhancing the overall effectiveness of fraud detection and risk management strategies.
|
||||
|
||||
### Supply Chain Optimization
|
||||
|
||||
The complexity of modern supply chains often requires the coordination of multiple systems and stakeholders. The Swarms Framework can integrate agents specialized in areas such as demand forecasting, inventory management, logistics optimization, and supplier coordination to streamline and optimize supply chain operations. By orchestrating the collective capabilities of these agents, the framework can identify bottlenecks, optimize resource allocation, and facilitate seamless collaboration among supply chain partners.
|
||||
|
||||
### Research and Development
|
||||
|
||||
In research and development environments, the Swarms Framework can accelerate innovation by enabling the collaboration of agents specialized in areas such as literature review, data analysis, hypothesis generation, and experiment design. By orchestrating these agents, the framework can facilitate the exploration of new ideas, identify promising research directions, and streamline the iterative process of scientific inquiry.
|
||||
|
||||
## Conclusion
|
||||
|
||||
The Swarms Framework represents a paradigm shift in the field of enterprise automation, addressing the limitations of individual agents by orchestrating their collective capabilities. By integrating agents from various frameworks and enabling multi-agent collaboration, the Swarms Framework overcomes challenges such as short-term memory constraints, hallucinations, single-task limitations, lack of collaboration, and cost inefficiencies.
|
||||
|
||||
Through its modular architecture, centralized coordination, and advanced monitoring and analytics capabilities, the Swarms Framework empowers enterprises to automate complex operations with increased efficiency, reliability, and adaptability. It unlocks the true potential of AI-driven automation, enabling organizations to stay ahead of the curve and thrive in an ever-evolving technological landscape.
|
||||
|
||||
As the field of artificial intelligence continues to advance, the Swarms Framework stands as a robust and flexible solution, ready to embrace new developments and seamlessly integrate emerging agents and capabilities. By harnessing the power of collective intelligence, the framework paves the way for a future where enterprises can leverage the full potential of AI to drive innovation, optimize operations, and gain a competitive edge in their respective industries.
|
@ -0,0 +1,53 @@
|
||||
# Why Swarms?
|
||||
|
||||
The need for multiple agents to work together in artificial intelligence (AI), particularly with Large Language Models (LLMs), stems from the limitations of single-agent systems in handling complex, dynamic, and multifaceted tasks. Collaboration among multiple agents offers a pathway to enhance reliability, computational efficiency, cognitive diversity, and problem-solving capabilities. This section explains the rationale for multi-agent systems and outlines strategies for managing the associated expenses, such as API bills and hosting costs.
|
||||
|
||||
### Why Multiple Agents Are Necessary
|
||||
|
||||
#### 1. **Cognitive Diversity**
|
||||
|
||||
Different agents can bring varied perspectives, knowledge bases, and problem-solving approaches to a task. This diversity is crucial in complex problem-solving scenarios where a single approach might not be sufficient. Cognitive diversity enhances creativity, leading to innovative solutions and the ability to tackle a broader range of problems.
|
||||
|
||||
#### 2. **Specialization and Expertise**
|
||||
|
||||
In many cases, tasks are too complex for a single agent to handle efficiently. By dividing the task among multiple specialized agents, each can focus on a segment where it excels, thereby increasing the overall efficiency and effectiveness of the solution. This approach leverages the expertise of individual agents to achieve superior performance in tasks that require multifaceted knowledge and skills.
|
||||
|
||||
#### 3. **Scalability and Flexibility**
|
||||
|
||||
Multi-agent systems can more easily scale to handle large-scale or evolving tasks. Adding more agents to the system can increase its capacity or capabilities, allowing it to adapt to larger workloads or new types of tasks. This scalability is essential in dynamic environments where the demand and nature of tasks can change rapidly.
|
||||
|
||||
#### 4. **Robustness and Redundancy**
|
||||
|
||||
Collaboration among multiple agents enhances the system's robustness by introducing redundancy. If one agent fails or encounters an error, others can compensate, ensuring the system remains operational. This redundancy is critical in mission-critical applications where failure is not an option.
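Redundancy can be as simple as trying agents in order until one succeeds. The following sketch assumes agents are plain callables that raise on failure; `run_with_fallback`, `flaky`, and `backup` are hypothetical names.

```python
from typing import Callable, List


def run_with_fallback(agents: List[Callable[[str], str]], task: str) -> str:
    """Try each agent in order and return the first successful result."""
    errors: List[Exception] = []
    for agent in agents:
        try:
            return agent(task)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(exc)
    # Every agent failed; surface the last error as the cause.
    raise RuntimeError(f"all {len(agents)} agents failed") from (
        errors[-1] if errors else None
    )


def flaky(task: str) -> str:
    """Hypothetical primary agent that is currently unavailable."""
    raise TimeoutError("primary agent unavailable")


def backup(task: str) -> str:
    """Hypothetical secondary agent."""
    return f"backup handled {task}"


result = run_with_fallback([flaky, backup], "summarize report")
```

Ordering the list from cheapest or most capable to last resort makes the fallback chain double as a cost policy.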
|
||||
|
||||
### Overcoming Expenses with API Bills and Hosting
|
||||
|
||||
Deploying multiple agents, especially when relying on cloud-based services or APIs, can incur significant costs. Here are strategies to manage and reduce these expenses:
|
||||
|
||||
#### 1. **Optimize Agent Efficiency**
|
||||
|
||||
Before scaling up the number of agents, ensure each agent operates as efficiently as possible. This can involve refining algorithms, reducing unnecessary API calls, and optimizing data processing to minimize computational requirements and, consequently, the associated costs.
|
||||
|
||||
#### 2. **Use Open Source and Self-Hosted Solutions**
|
||||
|
||||
Where possible, leverage open-source models and technologies that can be self-hosted. While there is an initial investment in setting up the infrastructure, over time, self-hosting can significantly reduce costs related to API calls and reliance on third-party services.
|
||||
|
||||
#### 3. **Implement Intelligent Caching**
|
||||
|
||||
Caching results for frequently asked questions or common tasks can drastically reduce the need for repeated computations or API calls. Intelligent caching systems can determine what information to store and for how long, optimizing the balance between fresh data and computational savings.
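A minimal time-to-live cache shows the trade-off between fresh data and saved calls. This is a sketch under stated assumptions: `TTLCache` is a hypothetical helper, prompts are cached by exact string match, and the injectable `clock` exists only so the expiry behaviour can be demonstrated without waiting.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLCache:
    """Wrap an agent call and reuse results for ttl_seconds per prompt."""

    def __init__(
        self,
        fn: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fn = fn
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def __call__(self, prompt: str) -> str:
        now = self._clock()
        hit = self._store.get(prompt)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the expensive call
        result = self._fn(prompt)
        self._store[prompt] = (now, result)
        return result


calls: List[str] = []


def fake_agent(prompt: str) -> str:
    """Hypothetical expensive call; records each real invocation."""
    calls.append(prompt)
    return prompt.upper()


now = [0.0]  # fake clock so the example does not need to sleep
cached_agent = TTLCache(fake_agent, ttl_seconds=60.0, clock=lambda: now[0])
first = cached_agent("hello")   # computed
second = cached_agent("hello")  # served from the cache
now[0] = 61.0
third = cached_agent("hello")   # entry expired, recomputed
```

A shorter TTL favours fresh answers; a longer one favours savings. Exact-match keys are the simplest choice and miss paraphrased prompts.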
|
||||
|
||||
#### 4. **Dynamic Scaling and Load Balancing**
|
||||
|
||||
Use cloud services that offer dynamic scaling and load balancing to adjust the resources allocated based on the current demand. This ensures you're not paying for idle resources during low-usage periods while still being able to handle high demand when necessary.
|
||||
|
||||
#### 5. **Collaborative Cost-Sharing Models**
|
||||
|
||||
In scenarios where multiple stakeholders benefit from the multi-agent system, consider implementing a cost-sharing model. This approach distributes the financial burden among the users or beneficiaries, making it more sustainable.
|
||||
|
||||
#### 6. **Monitor and Analyze Costs**
|
||||
|
||||
Regularly monitor and analyze your usage and associated costs to identify potential savings. Many cloud providers offer tools to track and forecast expenses, helping you to adjust your usage patterns and configurations to minimize costs without sacrificing performance.
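Alongside provider dashboards, a small in-process tracker can attribute spend to individual agents. The sketch below is illustrative only: `CostTracker` is a hypothetical helper, and the per-token price is a caller-supplied placeholder, not a real provider rate.

```python
from collections import defaultdict
from typing import Dict


class CostTracker:
    """Accumulate token usage per agent and convert it to an estimated cost."""

    def __init__(self, price_per_1k_tokens: float):
        # Assumed flat price; real pricing usually differs by model and direction.
        self._price = price_per_1k_tokens
        self._tokens: Dict[str, int] = defaultdict(int)

    def record(self, agent_name: str, tokens: int) -> None:
        self._tokens[agent_name] += tokens

    def cost_by_agent(self) -> Dict[str, float]:
        return {
            name: used / 1000 * self._price
            for name, used in self._tokens.items()
        }

    def total_cost(self) -> float:
        return sum(self.cost_by_agent().values())


tracker = CostTracker(price_per_1k_tokens=0.5)  # placeholder price
tracker.record("researcher", 4000)
tracker.record("writer", 2000)
tracker.record("researcher", 2000)
```

Breaking costs down per agent makes it easier to see which role is worth optimizing first.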
|
||||
|
||||
### Conclusion
|
||||
|
||||
The collaboration of multiple agents in AI systems presents a robust solution to the complexity, specialization, scalability, and robustness challenges inherent in single-agent approaches. While the associated costs can be significant, strategic optimization, leveraging open-source technologies, intelligent caching, dynamic resource management, collaborative cost-sharing, and diligent monitoring can mitigate these expenses. By adopting these strategies, organizations can harness the power of multi-agent systems to tackle complex problems more effectively and efficiently, ensuring the sustainable deployment of these advanced technologies.
|
@ -0,0 +1,22 @@
|
||||
mkdocs
|
||||
mkdocs-material
|
||||
mkdocs-glightbox
|
||||
mkdocs-git-authors-plugin
|
||||
mkdocs-git-revision-date-plugin
|
||||
mkdocs-git-committers-plugin
|
||||
mkdocstrings
|
||||
mike
|
||||
mkdocs-jupyter
|
||||
mkdocs-git-committers-plugin-2
|
||||
mkdocs-git-revision-date-localized-plugin
|
||||
mkdocs-redirects
|
||||
mkdocs-material-extensions
|
||||
mkdocs-simple-hooks
|
||||
mkdocs-awesome-pages-plugin
|
||||
mkdocs-versioning
|
||||
mkdocs-mermaid2-plugin
|
||||
mkdocs-include-markdown-plugin
|
||||
mkdocs-enumerate-headings-plugin
|
||||
mkdocs-autolinks-plugin
|
||||
mkdocs-minify-html-plugin
|
@ -0,0 +1,123 @@
|
||||
# swarms.agents
|
||||
|
||||
## 1. Introduction
|
||||
|
||||
`AbstractAgent` is an abstract class that serves as a foundation for implementing AI agents. An agent is an entity that can communicate with other agents and perform actions. The `AbstractAgent` class allows for customization in the implementation of the `receive` method, enabling different agents to define unique actions for receiving and processing messages.
|
||||
|
||||
`AbstractAgent` provides capabilities for managing tools and accessing memory, and has methods for running, chatting, and stepping through communication with other agents.
|
||||
|
||||
## 2. Class Definition
|
||||
|
||||
```python
from typing import Dict, List


class AbstractAgent:
    """An abstract class for AI agent.

    An agent can communicate with other agents and perform actions.
    Different agents can differ in what actions they perform in the `receive` method.

    Agents are full and completed:

    Agents = llm + tools + memory
    """

    def __init__(self, name: str):
        """
        Args:
            name (str): name of the agent.
        """
        self._name = name

    @property
    def name(self):
        """Get the name of the agent."""
        return self._name

    def tools(self, tools):
        """init tools"""

    def memory(self, memory_store):
        """init memory"""

    def reset(self):
        """(Abstract method) Reset the agent."""

    def run(self, task: str):
        """Run the agent once"""

    def _arun(self, task: str):
        """Run the agent once, asynchronously"""

    def chat(self, messages: List[Dict]):
        """Chat with the agent"""

    def _achat(self, messages: List[Dict]):
        """Asynchronous Chat"""

    def step(self, message: str):
        """Step through the agent"""

    def _astep(self, message: str):
        """Asynchronous step"""
```
|
||||
|
||||
## 3. Functionality and Usage
|
||||
|
||||
The `AbstractAgent` class represents a generic AI agent and provides a set of methods to interact with it.
|
||||
|
||||
To create an instance of an agent, the `name` of the agent should be specified.
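Since the base class leaves its methods empty, a usable agent is a subclass that fills them in. The sketch below is self-contained and therefore redefines a minimal stand-in for `AbstractAgent` that mirrors the class definition in section 2; `EchoAgent` and its `history` attribute are hypothetical.

```python
from typing import List


class AbstractAgent:
    """Minimal stand-in mirroring the class definition in section 2."""

    def __init__(self, name: str):
        self._name = name

    @property
    def name(self) -> str:
        return self._name

    def reset(self):
        """(Abstract method) Reset the agent."""

    def run(self, task: str):
        """Run the agent once"""


class EchoAgent(AbstractAgent):
    """Hypothetical agent that records tasks and echoes them back."""

    def __init__(self, name: str):
        super().__init__(name)
        self.history: List[str] = []

    def reset(self) -> None:
        # Return to the initial state by forgetting earlier tasks.
        self.history.clear()

    def run(self, task: str) -> str:
        self.history.append(task)
        return f"{self.name} received: {task}"


agent = EchoAgent("echo")
reply = agent.run("some_task")
```

The `name` passed to the constructor is exposed through the read-only `name` property, which is why the subclass never touches `_name` directly.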
|
||||
|
||||
### Core Methods
|
||||
|
||||
#### 1. `reset`
|
||||
|
||||
The `reset` method allows the agent to be reset to its initial state.
|
||||
|
||||
```python
|
||||
agent.reset()
|
||||
```
|
||||
|
||||
#### 2. `run`
|
||||
|
||||
The `run` method allows the agent to perform a specific task.
|
||||
|
||||
```python
|
||||
agent.run("some_task")
|
||||
```
|
||||
|
||||
#### 3. `chat`
|
||||
|
||||
The `chat` method enables communication with the agent through a series of messages.
|
||||
|
||||
```python
|
||||
messages = [{"id": 1, "text": "Hello, agent!"}, {"id": 2, "text": "How are you?"}]
|
||||
agent.chat(messages)
|
||||
```
|
||||
|
||||
#### 4. `step`
|
||||
|
||||
The `step` method allows the agent to process a single message.
|
||||
|
||||
```python
|
||||
agent.step("Hello, agent!")
|
||||
```
|
||||
|
||||
### Asynchronous Methods
|
||||
|
||||
The class also declares asynchronous counterparts of the core methods: `_arun`, `_achat`, and `_astep`.
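The class definition declares these methods with plain `def` and no body, so how they become awaitable is left to the implementer. One possible approach, shown here purely as an assumption, is to define `_arun` as a coroutine that runs the blocking `run` in a worker thread. `SlowAgent` is a hypothetical class, and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
from typing import List


class SlowAgent:
    """Hypothetical agent with a blocking run and an async wrapper."""

    def __init__(self, name: str):
        self.name = name

    def run(self, task: str) -> str:
        # Stand-in for a blocking model or API call.
        return f"{self.name} finished {task}"

    async def _arun(self, task: str) -> str:
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self.run, task)


async def main() -> List[str]:
    agent = SlowAgent("worker")
    # gather returns results in the order the awaitables were passed in.
    return await asyncio.gather(agent._arun("task-a"), agent._arun("task-b"))


results = asyncio.run(main())
```

The same pattern would apply to `_achat` and `_astep`, each wrapping its synchronous counterpart.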
|
||||
|
||||
### Additional Functionality
|
||||
|
||||
Additional functionalities for agent initialization and management of tools and memory are also provided.
|
||||
|
||||
```python
|
||||
agent.tools(some_tools)
|
||||
agent.memory(some_memory_store)
|
||||
```
|
||||
|
||||
## 4. Additional Information and Tips
|
||||
|
||||
When implementing a new agent on top of `AbstractAgent`, define a `receive` method to specify how the agent behaves when it receives messages. The class definition above mentions `receive` only in its docstring, so the subclass must supply it.
|
||||
|
||||
## 5. References and Resources
|
||||
|
||||
For further exploration and understanding of AI agents and agent communication, refer to the relevant literature and research on this topic.
|
@ -0,0 +1,124 @@
|
||||
# `Idea2Image` Documentation
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Introduction](#introduction)
|
||||
2. [Idea2Image Class](#idea2image-class)
|
||||
- [Initialization Parameters](#initialization-parameters)
|
||||
3. [Methods and Usage](#methods-and-usage)
|
||||
- [llm_prompt Method](#llm-prompt-method)
|
||||
- [generate_image Method](#generate-image-method)
|
||||
4. [Examples](#examples)
|
||||
- [Example 1: Generating an Image](#example-1-generating-an-image)
|
||||
5. [Additional Information](#additional-information)
|
||||
6. [References and Resources](#references-and-resources)
|
||||
|
||||
---
|
||||
|
||||
## 1. Introduction <a name="introduction"></a>
|
||||
|
||||
Welcome to the documentation for the Swarms library, with a focus on the `Idea2Image` class. This comprehensive guide provides in-depth information about the Swarms library and its core components. Before we dive into the details, it's crucial to understand the purpose and significance of this library.
|
||||
|
||||
### 1.1 Purpose
|
||||
|
||||
The Swarms library aims to simplify interactions with AI models for generating images from text prompts. The `Idea2Image` class is designed to generate images from textual descriptions using the DALLE-3 model, with an OpenAI GPT language model used to refine the prompt.
|
||||
|
||||
### 1.2 Key Features
|
||||
|
||||
- **Image Generation:** Swarms allows you to generate images based on natural language prompts, providing a bridge between textual descriptions and visual content.
|
||||
|
||||
- **Integration with DALLE-3:** The `Idea2Image` class leverages the power of DALLE-3 to create images that match the given textual descriptions.
|
||||
|
||||
- **Language Model Integration:** The class integrates with OpenAI's GPT-3 for prompt refinement, enhancing the specificity of image generation.
|
||||
|
||||
---
|
||||
|
||||
## 2. Idea2Image Class <a name="idea2image-class"></a>
|
||||
|
||||
The `Idea2Image` class is a fundamental module in the Swarms library, enabling the generation of images from text prompts.
|
||||
|
||||
### 2.1 Initialization Parameters <a name="initialization-parameters"></a>
|
||||
|
||||
Here are the initialization parameters for the `Idea2Image` class:
|
||||
|
||||
- `image` (str): Text prompt for the image to generate.
|
||||
|
||||
- `openai_api_key` (str): OpenAI API key. This key is used for prompt refinement with GPT-3. If not provided, the class will attempt to use the `OPENAI_API_KEY` environment variable.
|
||||
|
||||
- `cookie` (str): Cookie value for DALLE-3. This cookie is used to interact with the DALLE-3 API. If not provided, the class will attempt to use the `BING_COOKIE` environment variable.
|
||||
|
||||
- `output_folder` (str): Folder to save the generated images. The default folder is "images/".
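The "explicit argument first, environment variable second" behaviour described for `openai_api_key` and `cookie` can be sketched as follows. This illustrates the pattern only and is not the library's implementation; `resolve_setting` is a hypothetical helper, and the key values are placeholders.

```python
import os
from typing import Optional


def resolve_setting(explicit: Optional[str], env_var: str) -> str:
    """Prefer an explicitly passed value, else fall back to the environment."""
    if explicit:
        return explicit
    value = os.environ.get(env_var)
    if not value:
        raise ValueError(f"pass a value explicitly or set {env_var}")
    return value


# Placeholder value set here only so the example runs on its own.
os.environ["OPENAI_API_KEY"] = "key-from-environment"

from_env = resolve_setting(None, "OPENAI_API_KEY")
from_arg = resolve_setting("key-from-argument", "OPENAI_API_KEY")
```

Failing early with a clear message when neither source is set tends to be easier to debug than an authentication error later in the request.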
|
||||
|
||||
### 2.2 Methods <a name="methods"></a>
|
||||
|
||||
The `Idea2Image` class provides the following methods:
|
||||
|
||||
- `llm_prompt()`: Returns a prompt for refining the image generation. This method helps improve the specificity of the image generation prompt.
|
||||
|
||||
- `generate_image()`: Generates and downloads the image based on the prompt. It refines the prompt, opens the website with the query, retrieves image URLs, and downloads the images to the specified folder.
|
||||
|
||||
---
|
||||
|
||||
## 3. Methods and Usage <a name="methods-and-usage"></a>
|
||||
|
||||
Let's explore the methods provided by the `Idea2Image` class and how to use them effectively.
|
||||
|
||||
### 3.1 `llm_prompt` Method <a name="llm-prompt-method"></a>
|
||||
|
||||
The `llm_prompt` method returns a refined prompt for generating the image. It's a critical step in improving the specificity and accuracy of the image generation process. The method provides a guide for refining the prompt, helping users describe the desired image more precisely.
|
||||
|
||||
### 3.2 `generate_image` Method <a name="generate-image-method"></a>
|
||||
|
||||
The `generate_image` method combines the previous methods to execute the whole process of generating and downloading images based on the provided prompt. It's a convenient way to automate the image generation process.
|
||||
|
||||
---
|
||||
|
||||
## 4. Examples <a name="examples"></a>
|
||||
|
||||
Let's dive into practical examples to demonstrate the usage of the `Idea2Image` class.
|
||||
|
||||
### 4.1 Example 1: Generating an Image <a name="example-1-generating-an-image"></a>
|
||||
|
||||
In this example, we create an instance of the `Idea2Image` class and use it to generate an image based on a text prompt:
|
||||
|
||||
```python
from swarms.agents import Idea2Image

# Create an instance of the Idea2Image class with your prompt and API keys
idea2image = Idea2Image(
    image="Fish hivemind swarm in light blue avatar anime in zen garden pond concept art anime art, happy fish, anime scenery",
    openai_api_key="your_openai_api_key_here",
    cookie="your_cookie_value_here",
)

# Generate and download the image
idea2image.generate_image()
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Additional Information <a name="additional-information"></a>
|
||||
|
||||
Here are some additional tips and information for using the Swarms library and the `Idea2Image` class effectively:
|
||||
|
||||
- Refining the prompt is a crucial step to influence the style, composition, and mood of the generated image. Follow the provided guide in the `llm_prompt` method to create precise prompts.
|
||||
|
||||
- Experiment with different prompts, variations, and editing techniques to create unique and interesting images.
|
||||
|
||||
- You can combine separate DALLE-3 outputs into panoramas and murals by careful positioning and editing.
|
||||
|
||||
- Consider sharing your creations and exploring resources in communities like Reddit r/dalle2 for inspiration and tools.
|
||||
|
||||
- The `output_folder` parameter allows you to specify the folder where generated images will be saved. Ensure that you have the necessary permissions to write to that folder.
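
The write-permission check mentioned in the last tip can be done before generating any images. The following is a minimal sketch using only the standard library; the helper name `ensure_writable_folder` is hypothetical and not part of the Swarms library.

```python
import os
import tempfile


def ensure_writable_folder(path: str) -> str:
    """Create the folder if needed and confirm the current user can write to it.

    Hypothetical helper for illustration; not part of the Swarms library.
    """
    os.makedirs(path, exist_ok=True)
    if not os.access(path, os.W_OK):
        raise PermissionError(f"Cannot write to output folder: {path}")
    return os.path.abspath(path)


# Example: check a folder inside a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    folder = ensure_writable_folder(os.path.join(tmp, "images"))
    print(os.path.isdir(folder))
```

The returned absolute path could then be passed as `output_folder` when constructing `Idea2Image`.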
|
||||
|
||||
---
|
||||
|
||||
## 6. References and Resources <a name="references-and-resources"></a>
|
||||
|
||||
For further information and resources related to the Swarms library and DALLE-3:
|
||||
|
||||
- [DALLE-3 Unofficial API Documentation](https://www.bing.com/images/create): The Bing Image Creator page used by the unofficial DALLE-3 API, where you can explore additional features and capabilities.
|
||||
|
||||
- [OpenAI GPT-3 Documentation](https://beta.openai.com/docs/): The documentation for OpenAI's GPT-3, which is used for prompt refinement.
|
||||
|
||||
This concludes the documentation for the Swarms library and the `Idea2Image` class. You now have a comprehensive guide on how to generate images from text prompts using DALLE-3 and GPT-3 with Swarms.
|
@ -0,0 +1,112 @@
|
||||
# The Module/Class Name: Message
|
||||
|
||||
In the swarms.agents framework, the class `Message` is used to represent a message with timestamp and optional metadata.
|
||||
|
||||
## Overview and Introduction
|
||||
|
||||
The `Message` class is a fundamental component that enables the representation of messages within an agent system. Messages contain essential information such as the sender, content, timestamp, and optional metadata.
|
||||
|
||||
## Class Definition
|
||||
|
||||
### Constructor: `__init__`
|
||||
|
||||
The constructor of the `Message` class takes three parameters:
|
||||
|
||||
1. `sender` (str): The sender of the message.
|
||||
2. `content` (str): The content of the message.
|
||||
3. `metadata` (dict or None): Optional metadata associated with the message.
|
||||
|
||||
### Methods
|
||||
|
||||
1. `__repr__(self)`: Returns a string representation of the `Message` object, including the timestamp, sender, and content.
|
||||
|
||||
```python
import datetime


class Message:
    """
    Represents a message with timestamp and optional metadata.

    Usage
    --------------
    mes = Message(
        sender = "Kye",
        content = "message"
    )

    print(mes)
    """

    def __init__(self, sender, content, metadata=None):
        self.timestamp = datetime.datetime.now()
        self.sender = sender
        self.content = content
        self.metadata = metadata or {}

    def __repr__(self):
        """
        __repr__ represents the string representation of the Message object.

        Returns:
        (str) A string containing the timestamp, sender, and content of the message.
        """
        return f"{self.timestamp} - {self.sender}: {self.content}"
```
|
||||
|
||||
## Functionality and Usage
|
||||
|
||||
The `Message` class represents a message in the agent system. Upon initialization, the `timestamp` is set to the current date and time, and the `metadata` is set to an empty dictionary if no metadata is provided.
|
||||
|
||||
### Usage Example 1
|
||||
|
||||
Creating a `Message` object and displaying its string representation.
|
||||
|
||||
```python
|
||||
mes = Message(sender="Kye", content="Hello! How are you?")
|
||||
|
||||
print(mes)
|
||||
```
|
||||
|
||||
Output:
|
||||
```
|
||||
2023-09-20 13:45:00 - Kye: Hello! How are you?
|
||||
```
|
||||
|
||||
### Usage Example 2
|
||||
|
||||
Creating a `Message` object with metadata.
|
||||
|
||||
```python
|
||||
metadata = {"priority": "high", "category": "urgent"}
|
||||
mes_with_metadata = Message(
|
||||
sender="Alice", content="Important update", metadata=metadata
|
||||
)
|
||||
|
||||
print(mes_with_metadata)
|
||||
```
|
||||
|
||||
Output:
|
||||
```
|
||||
2023-09-20 13:46:00 - Alice: Important update
|
||||
```
|
||||
|
||||
### Usage Example 3
|
||||
|
||||
Creating a `Message` object without providing metadata.
|
||||
|
||||
```python
|
||||
mes_no_metadata = Message(sender="Bob", content="Reminder: Meeting at 2PM")
|
||||
|
||||
print(mes_no_metadata)
|
||||
```
|
||||
|
||||
Output:
|
||||
```
|
||||
2023-09-20 13:47:00 - Bob: Reminder: Meeting at 2PM
|
||||
```
|
||||
|
||||
## Additional Information and Tips
|
||||
|
||||
When creating a new `Message` object, ensure that the required parameters `sender` and `content` are provided. The `timestamp` will automatically be assigned the current date and time. Optional `metadata` can be included to provide additional context or information associated with the message.
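
A short sketch of these defaults follows. It uses a local copy of the `Message` class from the definition above so the snippet runs on its own. Note that `__repr__` does not include `metadata`, so it has to be read from the attribute directly.

```python
import datetime


class Message:
    """Local copy of the class documented above, so this sketch is self-contained."""

    def __init__(self, sender, content, metadata=None):
        self.timestamp = datetime.datetime.now()
        self.sender = sender
        self.content = content
        self.metadata = metadata or {}

    def __repr__(self):
        return f"{self.timestamp} - {self.sender}: {self.content}"


plain = Message(sender="Bob", content="Reminder: Meeting at 2PM")
tagged = Message(sender="Alice", content="Important update", metadata={"priority": "high"})

print(plain.metadata)               # empty dict when no metadata is passed
print(tagged.metadata["priority"])  # metadata is read from the attribute, not from repr
print(isinstance(plain.timestamp, datetime.datetime))
```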
|
||||
|
||||
## References and Resources
|
||||
|
||||
For further information on the `Message` class and its usage, refer to the official swarms.agents documentation and relevant tutorials related to message handling and communication within the agent system.
|
@ -0,0 +1,94 @@
|
||||
# `OmniModalAgent` Documentation
|
||||
|
||||
## Overview & Architectural Analysis
|
||||
The `OmniModalAgent` class is at the core of an architecture designed to facilitate dynamic interactions using various tools, through a seamless integration of planning, task execution, and response generation mechanisms. It encompasses multiple modalities including natural language processing, image processing, and more, aiming to provide comprehensive and intelligent responses.
|
||||
|
||||
### Architectural Components:
|
||||
1. **LLM (Language Model)**: It acts as the foundation, underpinning the understanding and generation of language-based interactions.
|
||||
2. **Chat Planner**: This component drafts a blueprint for the steps necessary based on the user's input.
|
||||
3. **Task Executor**: As the name suggests, it's responsible for executing the formulated tasks.
|
||||
4. **Tools**: A collection of tools and utilities used to process different types of tasks. They span across areas like image captioning, translation, and more.
|
||||
|
||||
|
||||
## Structure & Organization
|
||||
|
||||
### Table of Contents:
|
||||
1. Class Introduction and Architecture
|
||||
2. Constructor (`__init__`)
|
||||
3. Core Methods
|
||||
    - `run`
    - `chat`
    - `_stream_response`
|
||||
4. Example Usage
|
||||
5. Error Messages & Exception Handling
|
||||
6. Summary & Further Reading
|
||||
|
||||
### Constructor (`__init__`):
|
||||
The agent is initialized with a language model (`llm`). During initialization, the agent loads a myriad of tools to facilitate a broad spectrum of tasks, from document querying to image transformations.
|
||||
|
||||
### Core Methods:
|
||||
#### 1. `run(self, input: str) -> str`:
|
||||
Executes the `OmniModalAgent`. The agent plans its actions based on the user's input, executes those actions, and then uses a response generator to construct its reply.
|
||||
|
||||
#### 2. `chat(self, msg: str, streaming: bool) -> str`:
|
||||
Facilitates an interactive chat with the agent. It processes user messages, handles exceptions, and returns a response, either in streaming format or as a whole string.
|
||||
|
||||
#### 3. `_stream_response(self, response: str)`:
|
||||
For streaming mode, this function yields the response token by token, ensuring a smooth output flow.
|
||||
|
||||
## Examples & Use Cases
|
||||
Initialize the `OmniModalAgent` and communicate with it:
|
||||
```python
import os

from dotenv import load_dotenv

from swarms.agents.omni_modal_agent import OmniModalAgent
from swarms.models import OpenAIChat

# Load the environment variables
load_dotenv()

# Get the API key from the environment
api_key = os.environ.get("OPENAI_API_KEY")

# Initialize the language model
llm = OpenAIChat(
    temperature=0.5,
    model_name="gpt-4",
    openai_api_key=api_key,
)

agent = OmniModalAgent(llm)
response = agent.run("Translate 'Hello' to French.")
print(response)
```
|
||||
|
||||
For a chat-based interaction:
|
||||
```python
|
||||
# `llm` is the language model initialized in the previous example
agent = OmniModalAgent(llm)
|
||||
print(agent.chat("How are you doing today?"))
|
||||
```
|
||||
|
||||
## Error Messages & Exception Handling
|
||||
The `chat` method in `OmniModalAgent` incorporates exception handling. When an error arises during message processing, it returns a formatted error message detailing the exception. This approach ensures that users receive informative feedback in case of unexpected situations.
|
||||
|
||||
For example, if there's an internal processing error, the chat function would return:
|
||||
```
|
||||
Error processing message: [Specific error details]
|
||||
```
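
The pattern described in this section can be sketched as follows. This illustrates the catch-and-format behaviour only and is not the actual `OmniModalAgent.chat` implementation; `safe_chat` and `failing_handler` are hypothetical names.

```python
from typing import Callable


def safe_chat(handler: Callable[[str], str], msg: str) -> str:
    """Run a message handler and turn any exception into a formatted error string."""
    try:
        return handler(msg)
    except Exception as error:  # broad on purpose: every failure becomes a message
        return f"Error processing message: {error}"


def failing_handler(msg: str) -> str:
    # Stand-in for an agent call that fails internally
    raise ValueError("tool unavailable")


print(safe_chat(str.upper, "hello"))
print(safe_chat(failing_handler, "hello"))
```

Returning a string instead of raising keeps a chat loop alive, at the cost of the caller having to inspect the text to detect failures.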
|
||||
|
||||
## Summary
|
||||
`OmniModalAgent` epitomizes the fusion of various AI tools, planners, and executors into one cohesive unit, providing a comprehensive interface for diverse tasks and modalities. The versatility and robustness of this agent make it indispensable for applications desiring to bridge multiple AI functionalities in a unified manner.
|
||||
|
||||
For more extensive documentation, API references, and advanced use-cases, users are advised to refer to the primary documentation repository associated with the parent project. Regular updates, community feedback, and patches can also be found there.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
@ -0,0 +1,79 @@
|
||||
# Module/Class Name: OmniModalAgent
|
||||
|
||||
The `OmniModalAgent` class combines a Large Language Model (LLM), a chat planner, a task executor, and a set of tools. It is designed to be a multi-modal chatbot that uses various AI-based capabilities to fulfill user requests.
|
||||
|
||||
It has the following architecture:
|
||||
1. Language Model (LLM).
|
||||
2. Chat Planner - Plans
|
||||
3. Task Executor - Tasks
|
||||
4. Tools - Tools
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
### Usage
|
||||
```python
from swarms import OmniModalAgent, OpenAIChat

llm = OpenAIChat()
agent = OmniModalAgent(llm)
response = agent.run("Hello, how are you? Create an image of how you are doing!")
```
|
||||
|
||||
---
|
||||
|
||||
### Initialization
|
||||
|
||||
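The constructor of the `OmniModalAgent` class is documented with two parameters. Note that `tools` is commented out in the signature below, so only `llm` is passed explicitly: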
|
||||
- `llm`: A `BaseLanguageModel` that represents the language model
|
||||
- `tools`: A List of `BaseTool` instances that are used by the agent for fulfilling different requests.
|
||||
|
||||
```python
def __init__(
    self,
    llm: BaseLanguageModel,
    # tools: List[BaseTool]
):
    ...  # body omitted in this signature excerpt
```
|
||||
|
||||
---
|
||||
|
||||
### Methods
|
||||
|
||||
The class has two main methods:
|
||||
1. `run`: This method takes an input string and executes various plans and tasks using the provided tools. Ultimately, it generates a response based on the user's input and returns it.
    - Parameters:
        - `input`: A string representing the user's input text.
    - Returns:
        - A string representing the response.

    Usage:

    ```python
    response = agent.run("Hello, how are you? Create an image of how you are doing!")
    ```

2. `chat`: This method is used to simulate a chat dialog with the agent. It takes the user's message and returns the response (or streams the response word by word if required).
    - Parameters:
        - `msg` (optional): A string representing the message to send to the agent.
        - `streaming` (optional): A boolean specifying whether to stream the response.
    - Returns:
        - A string representing the response from the agent.

    Usage:

    ```python
    response = agent.chat("Hello")
    ```
|
||||
|
||||
---
|
||||
|
||||
### Streaming Response
|
||||
|
||||
The class provides a method `_stream_response` that can be used to get the response token by token (i.e. word by word). It yields individual tokens from the response.
|
||||
|
||||
Usage:
|
||||
```python
|
||||
for token in agent._stream_response(response):
    print(token)
|
||||
```
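
For context, a token-by-token generator can be as simple as the sketch below. It assumes whitespace-separated tokens, which may differ from how `_stream_response` actually splits its input; `stream_tokens` is a hypothetical stand-in, not the library method.

```python
from typing import Iterator


def stream_tokens(response: str) -> Iterator[str]:
    """Yield the response one whitespace-separated token at a time."""
    for token in response.split():
        yield token


for token in stream_tokens("Bonjour tout le monde"):
    print(token)
```

Because the function is a generator, the caller can start displaying output before the whole response has been consumed.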
|
||||
|
@ -0,0 +1,111 @@
|
||||
# ToolAgent Documentation
|
||||
|
||||
|
||||
### Overview and Introduction
|
||||
|
||||
The `ToolAgent` class represents an intelligent agent capable of performing a specific task using a pre-trained model and tokenizer. It leverages the Transformer models of the Hugging Face `transformers` library to generate outputs that adhere to a specific JSON schema. This provides developers with a flexible tool for creating bots, text generators, and conversational AI agents. The `ToolAgent` operates based on a JSON schema provided by you, the user. Using the schema, the agent applies the provided model and tokenizer to generate structured text data that matches the specified format.
|
||||
|
||||
The primary objective of the `ToolAgent` class is to make developers and AI practitioners more efficient by simplifying the generation of structured outputs and hiding much of the complexity of working with the model and tokenizer directly.
|
||||
|
||||
### Class Definition
|
||||
|
||||
The `ToolAgent` class has the following definition:
|
||||
|
||||
```python
class ToolAgent(BaseLLM):
    def __init__(
        self,
        name: str,
        description: str,
        model: Any,
        tokenizer: Any,
        json_schema: Any,
        *args,
        **kwargs,
    ): ...

    def run(self, task: str, *args, **kwargs): ...

    def __call__(self, task: str, *args, **kwargs): ...
```
|
||||
|
||||
### Arguments
|
||||
|
||||
The `ToolAgent` class takes the following arguments:
|
||||
|
||||
| Argument | Type | Description |
| --- | --- | --- |
| `name` | str | The name of the tool agent. |
| `description` | str | A description of the tool agent. |
| `model` | Any | The model used by the tool agent (e.g., `transformers.AutoModelForCausalLM`). |
| `tokenizer` | Any | The tokenizer used by the tool agent (e.g., `transformers.AutoTokenizer`). |
| `json_schema` | Any | The JSON schema used by the tool agent. |
| `*args` | - | Variable-length arguments. |
| `**kwargs` | - | Keyword arguments. |
|
||||
|
||||
### Methods
|
||||
|
||||
`ToolAgent` exposes the following methods:
|
||||
|
||||
#### `run(self, task: str, *args, **kwargs) -> Any`
|
||||
|
||||
- Description: Runs the tool agent for a specific task.
- Parameters:
    - `task` (str): The task to be performed by the tool agent.
    - `*args`: Variable-length argument list.
    - `**kwargs`: Arbitrary keyword arguments.
- Returns: The output of the tool agent.
- Raises: Exception if an error occurs during the execution of the tool agent.
|
||||
|
||||
|
||||
#### `__call__(self, task: str, *args, **kwargs) -> Any`
|
||||
|
||||
- Description: Calls the tool agent to perform a specific task.
- Parameters:
    - `task` (str): The task to be performed by the tool agent.
    - `*args`: Variable-length argument list.
    - `**kwargs`: Arbitrary keyword arguments.
- Returns: The output of the tool agent.
|
||||
|
||||
### Usage Example
|
||||
|
||||
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

from swarms import ToolAgent

# Creating a model and tokenizer
model = AutoModelForCausalLM.from_pretrained("databricks/dolly-v2-12b")
tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-12b")

# Defining a JSON schema
json_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
        "is_student": {"type": "boolean"},
        "courses": {"type": "array", "items": {"type": "string"}},
    },
}

# Defining a task
task = "Generate a person's information based on the following schema:"

# Creating the ToolAgent instance
agent = ToolAgent(model=model, tokenizer=tokenizer, json_schema=json_schema)

# Running the tool agent
generated_data = agent.run(task)

# Accessing and printing the generated data
print(generated_data)
```
|
||||
|
||||
### Additional Information and Tips
|
||||
|
||||
When using the `ToolAgent`, it is important to ensure compatibility between the provided model, tokenizer, and the JSON schema. Additionally, any errors encountered during the execution of the tool agent are propagated as exceptions. Handling such exceptions appropriately can improve the robustness of the tool agent usage.
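
One way to handle these exceptions is to wrap the call and fall back to a default value. The sketch below uses a stub object in place of a real `ToolAgent` so that it runs without loading a model; `run_with_fallback` and `StubAgent` are hypothetical names and not part of `swarms`.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


class StubAgent:
    """Stand-in exposing a `run(task)` method like the one documented above."""

    def __init__(self, fail: bool = False):
        self.fail = fail

    def run(self, task: str) -> dict:
        if self.fail:
            raise RuntimeError("generation failed")
        return {"name": "Ada", "age": 36, "is_student": False, "courses": []}


def run_with_fallback(agent: Any, task: str, default: Any = None) -> Any:
    """Call `agent.run(task)`; log the error and return `default` if it raises."""
    try:
        return agent.run(task)
    except Exception:
        logger.exception("Agent failed on task: %s", task)
        return default


print(run_with_fallback(StubAgent(), "Generate a person"))
print(run_with_fallback(StubAgent(fail=True), "Generate a person", default={}))
```

Logging the traceback before returning the default keeps the failure visible while letting the calling code continue.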
|
||||
|
||||
### References and Resources
|
||||
|
||||
For further exploration and understanding of the underlying Transformer-based models and tokenizers, refer to the Hugging Face `transformers` library documentation and examples. Additionally, for JSON schema modeling, you can refer to the official JSON Schema specification and examples.
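
As a lightweight sanity check on generated output, the top-level property types of a schema like the one in the usage example can be verified with the standard library alone. This is a sketch under that narrow assumption and far from a full JSON Schema validator; `type_errors` is a hypothetical helper.

```python
# Maps the JSON Schema type names used in the usage example to Python checks.
_TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it explicitly
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}


def type_errors(data: dict, schema: dict) -> list:
    """Return the property names whose values do not match the schema's `type`.

    Only top-level `properties` and their `type` keyword are checked.
    """
    errors = []
    for name, spec in schema.get("properties", {}).items():
        check = _TYPE_CHECKS.get(spec.get("type"))
        if name in data and check is not None and not check(data[name]):
            errors.append(name)
    return errors


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
        "is_student": {"type": "boolean"},
    },
}

print(type_errors({"name": "Ada", "age": "36", "is_student": False}, schema))
```

For anything beyond a quick check, a dedicated JSON Schema validation library is the better choice.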
|
||||
|
||||
This documentation provides a comprehensive guide on using the `ToolAgent` class from `swarms` library, and it is recommended to refer back to this document when utilizing the `ToolAgent` for developing your custom conversational agents or text generation tools.
|
@ -0,0 +1,78 @@
|
||||
# Worker Class Documentation
|
||||
|
||||
## Overview
|
||||
|
||||
The Worker class represents an autonomous agent that can perform tasks through function calls or by running a chat. It can be used to create applications that demand effective user interactions like search engines, human-like conversational bots, or digital assistants.
|
||||
|
||||
The `Worker` class is part of the `swarms.agents` codebase. This module is largely used in Natural Language Processing (NLP) projects where the agent undertakes conversations and other language-specific operations.
|
||||
|
||||
## Class Definition
|
||||
|
||||
The class `Worker` has the following arguments:
|
||||
|
||||
| Argument | Type | Default Value | Description |
|
||||
|-----------------------|---------------|----------------------------------|----------------------------------------------------|
|
||||
| name | str | "Worker" | Name of the agent. |
|
||||
| role | str | "Worker in a swarm" | Role of the agent. |
|
||||
| external_tools | list | None | List of external tools available to the agent. |
|
||||
| human_in_the_loop | bool | False | Determines whether human interaction is required. |
|
||||
| temperature | float | 0.5 | Temperature for the autonomous agent. |
|
||||
| llm | None | None | Language model. |
|
||||
| openai_api_key | str | None | OpenAI API key. |
|
||||
| tools | List[Any] | None | List of tools available to the agent. |
|
||||
| embedding_size | int | 1536 | Size of the word embeddings. |
|
||||
| search_kwargs | dict | {"k": 8} | Search parameters. |
|
||||
| args | Multiple | | Additional arguments that can be passed. |
|
||||
| kwargs | Multiple | | Additional keyword arguments that can be passed. |
|
||||
## Usage
|
||||
|
||||
#### Example 1: Creating and Running an Agent
|
||||
|
||||
```python
from swarms import Worker

# MyTool1, MyTool2, and some_language_model are placeholders for your own
# tool classes and language model instance.
worker = Worker(
    name="My Worker",
    role="Worker",
    external_tools=[MyTool1(), MyTool2()],
    human_in_the_loop=False,
    temperature=0.5,
    llm=some_language_model,
    openai_api_key="my_key",
)
worker.run("What's the weather in Miami?")
```
|
||||
|
||||
#### Example 2: Receiving and Sending Messages
|
||||
|
||||
```python
|
||||
worker.receieve("User", "Hello there!")
|
||||
worker.receieve("User", "Can you tell me something about history?")
|
||||
worker.send()
|
||||
```
|
||||
|
||||
#### Example 3: Setting up Tools
|
||||
|
||||
```python
# MyTool1 and MyTool2 are placeholders for your own tool classes.
external_tools = [MyTool1(), MyTool2()]
worker = Worker(
    name="My Worker",
    role="Worker",
    external_tools=external_tools,
    human_in_the_loop=False,
    temperature=0.5,
)
```
|
||||
|
||||
## Additional Information and Tips
|
||||
|
||||
- The class allows the setting up of tools for the worker to operate effectively. It provides setup facilities for essential computing infrastructure, such as the agent's memory and language model.
|
||||
- Setting the `human_in_the_loop` parameter to `True` makes human interaction part of the worker's run, as described in the argument table above.
|
||||
- The `openai_api_key` argument can be provided for leveraging the OpenAI infrastructure and services.
|
||||
- A language model instance can be passed through the `llm` argument, which is useful when integrating with state-of-the-art text generation engines.
|
||||
|
||||
## References and Resources
|
||||
|
||||
- [OpenAI APIs](https://openai.com)
|
||||
- [Models and Languages at HuggingFace](https://huggingface.co/models)
|
||||
- [Deep Learning and Language Modeling at the Allen Institute for AI](https://allenai.org)
|
@ -0,0 +1,127 @@
|
||||
## Swarms Framework Conceptual Breakdown
|
||||
|
||||
The `swarms` framework is a sophisticated structure designed to orchestrate the collaborative work of multiple agents in a hierarchical manner. This breakdown provides a conceptual and visual representation of the framework, highlighting the interactions between models, tools, memory, agents, and swarms.
|
||||
|
||||
### Hierarchical Structure
|
||||
|
||||
The framework can be visualized as a multi-layered hierarchy:
|
||||
|
||||
1. **Models, Tools, Memory**: These form the foundational components that agents utilize to perform tasks.
|
||||
2. **Agents**: Individual entities that encapsulate specific functionalities, utilizing models, tools, and memory.
|
||||
3. **Swarm**: A collection of multiple agents working together in a coordinated manner.
|
||||
4. **Structs**: High-level structures that organize and manage swarms, enabling complex workflows and interactions.
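
The layering above can be pictured with plain data classes. This is a conceptual sketch only: the class and field names are illustrative and do not mirror the real `swarms` classes, and running agents in sequence is just one possible coordination scheme.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Layer 2: wraps a model plus optional tools and memory (layer 1)."""
    name: str
    model: Callable[[str], str]
    tools: Dict[str, Callable[..., str]] = field(default_factory=dict)
    memory: List[str] = field(default_factory=list)

    def run(self, task: str) -> str:
        output = self.model(task)
        self.memory.append(output)  # remember what was produced
        return output


@dataclass
class Swarm:
    """Layer 3: several agents coordinated together (here: run in sequence)."""
    agents: List[Agent]

    def run(self, task: str) -> str:
        for agent in self.agents:
            task = agent.run(task)  # each agent's output feeds the next
        return task


@dataclass
class Struct:
    """Layer 4: organizes one or more swarms."""
    swarms: List[Swarm]

    def run(self, task: str) -> List[str]:
        return [swarm.run(task) for swarm in self.swarms]


upper = Agent("upper", model=str.upper)
exclaim = Agent("exclaim", model=lambda text: text + "!")
struct = Struct(swarms=[Swarm([upper, exclaim]), Swarm([exclaim])])
print(struct.run("hello"))
```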
|
||||
|
||||
### Visual Representation
|
||||
|
||||
Below are visual graphs illustrating the hierarchical and tree structure of the `swarms` framework.
|
||||
|
||||
#### 1. Foundational Components: Models, Tools, Memory
|
||||
|
||||
```mermaid
|
||||
graph TD;
|
||||
Models --> Agents
|
||||
Tools --> Agents
|
||||
Memory --> Agents
|
||||
subgraph Foundational_Components
|
||||
Models
|
||||
Tools
|
||||
Memory
|
||||
end
|
||||
```
|
||||
|
||||
#### 2. Agents and Their Interactions
|
||||
|
||||
```mermaid
|
||||
graph TD;
|
||||
Agents --> Swarm
|
||||
subgraph Agents_Collection
|
||||
Agent1
|
||||
Agent2
|
||||
Agent3
|
||||
end
|
||||
subgraph Individual_Agents
|
||||
Agent1 --> Models
|
||||
Agent1 --> Tools
|
||||
Agent1 --> Memory
|
||||
Agent2 --> Models
|
||||
Agent2 --> Tools
|
||||
Agent2 --> Memory
|
||||
Agent3 --> Models
|
||||
Agent3 --> Tools
|
||||
Agent3 --> Memory
|
||||
end
|
||||
```
|
||||
|
||||
#### 3. Multiple Agents Form a Swarm
|
||||
|
||||
```mermaid
|
||||
graph TD;
|
||||
Swarm1 --> Struct
|
||||
Swarm2 --> Struct
|
||||
Swarm3 --> Struct
|
||||
subgraph Swarms_Collection
|
||||
Swarm1
|
||||
Swarm2
|
||||
Swarm3
|
||||
end
|
||||
subgraph Individual_Swarms
|
||||
Swarm1 --> Agent1
|
||||
Swarm1 --> Agent2
|
||||
Swarm1 --> Agent3
|
||||
Swarm2 --> Agent4
|
||||
Swarm2 --> Agent5
|
||||
Swarm2 --> Agent6
|
||||
Swarm3 --> Agent7
|
||||
Swarm3 --> Agent8
|
||||
Swarm3 --> Agent9
|
||||
end
|
||||
```
|
||||
|
||||
#### 4. Structs Organizing Multiple Swarms
|
||||
|
||||
```mermaid
|
||||
graph TD;
|
||||
Struct --> Swarms_Collection
|
||||
subgraph High_Level_Structs
|
||||
Struct1
|
||||
Struct2
|
||||
Struct3
|
||||
end
|
||||
subgraph Struct1
|
||||
Swarm1
|
||||
Swarm2
|
||||
end
|
||||
subgraph Struct2
|
||||
Swarm3
|
||||
end
|
||||
subgraph Struct3
|
||||
Swarm4
|
||||
Swarm5
|
||||
end
|
||||
```
|
||||
|
||||
### Directory Breakdown
|
||||
|
||||
The directory structure of the `swarms` framework is organized to support its hierarchical architecture:
|
||||
|
||||
```sh
|
||||
swarms/
|
||||
├── agents/
|
||||
├── artifacts/
|
||||
├── marketplace/
|
||||
├── memory/
|
||||
├── models/
|
||||
├── prompts/
|
||||
├── schemas/
|
||||
├── structs/
|
||||
├── telemetry/
|
||||
├── tools/
|
||||
├── utils/
|
||||
└── __init__.py
|
||||
```
|
||||
|
||||
### Summary
|
||||
|
||||
The `swarms` framework is designed to facilitate complex multi-agent interactions through a structured and layered approach. By leveraging foundational components like models, tools, and memory, individual agents are empowered to perform specialized tasks. These agents are then coordinated within swarms to achieve collective goals, and swarms are managed within high-level structs to orchestrate sophisticated workflows.
|
||||
|
||||
This hierarchical design ensures scalability, flexibility, and robustness, making the `swarms` framework a powerful tool for various applications in AI, data analysis, optimization, and beyond.
|
@ -0,0 +1,48 @@
|
||||
# Glossary of Terms
|
||||
|
||||
**Agent**:
|
||||
An LLM (Large Language Model) equipped with tools and memory, operating with a specific objective in a loop. An agent can perform tasks, interact with other agents, and utilize external tools and memory systems to achieve its goals.
|
||||
|
||||
**Swarms**:
|
||||
A group of more than two agents working together and communicating to accomplish a shared objective. Swarms enable complex, collaborative tasks that leverage the strengths of multiple agents.
|
||||
|
||||
**Tool**:
|
||||
A Python function that is converted into a function call, allowing agents to perform specific actions or access external resources. Tools enhance the capabilities of agents by providing specialized functionalities.
|
||||
|
||||
**Memory System**:
|
||||
A system for managing information retrieval and storage, often implemented as a Retrieval-Augmented Generation (RAG) system or a memory vector database. Memory systems enable agents to recall previous interactions, store new information, and improve decision-making based on historical data.
|
||||
|
||||
**LLM (Large Language Model)**:
|
||||
A type of AI model designed to understand and generate human-like text. LLMs, such as GPT-3 or GPT-4, are used as the core computational engine for agents.
|
||||
|
||||
**System Prompt**:
|
||||
A predefined prompt that sets the context and instructions for an agent's task. The system prompt guides the agent's behavior and response generation.
|
||||
|
||||
**Max Loops**:
|
||||
The maximum number of iterations an agent will perform to complete its task. This parameter helps control the extent of an agent's processing and ensures tasks are completed efficiently.
|
||||
|
||||
**Dashboard**:
|
||||
A user interface that provides real-time monitoring and control over the agents and their activities. Dashboards can display agent status, logs, and performance metrics.
|
||||
|
||||
**Streaming On**:
|
||||
A setting that enables agents to stream their output incrementally, providing real-time feedback as they process tasks. This feature is useful for monitoring progress and making adjustments on the fly.
|
||||
|
||||
**Verbose**:
|
||||
A setting that controls the level of detail in an agent's output and logging. When verbose mode is enabled, the agent provides more detailed information about its operations and decisions.
|
||||
|
||||
**Multi-modal**:
|
||||
The capability of an agent to process and integrate multiple types of data, such as text, images, and audio. Multi-modal agents can handle more complex tasks that require diverse inputs.
|
||||
|
||||
**Autosave**:
|
||||
A feature that automatically saves the agent's state and progress at regular intervals. Autosave helps prevent data loss and allows for recovery in case of interruptions.
|
||||
|
||||
**Flow**:
|
||||
The predefined sequence in which agents in a swarm interact and process tasks. The flow ensures that each agent's output is appropriately passed to the next agent, facilitating coordinated efforts.
|
||||
|
||||
**Long Term Memory**:
|
||||
A component of the memory system that retains information over extended periods, enabling agents to recall and utilize past interactions and experiences.
|
||||
|
||||
**Output Schema**:
|
||||
A structured format for the output generated by agents, often defined using data models like Pydantic's BaseModel. Output schemas ensure consistency and clarity in the information produced by agents.
|
||||
|
||||
By understanding these terms, you can effectively build and orchestrate agents and swarms, leveraging their capabilities to perform complex, collaborative tasks.
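
To make a few of these terms concrete, here is a toy loop showing how a system prompt, a tool, and `max_loops` fit together. It uses a fake model function in place of an LLM and is not the Swarms agent implementation; all names, and the `TOOL:<name>:<argument>` convention, are illustrative assumptions.

```python
from typing import Callable, Dict, List


def run_agent(
    model: Callable[[str], str],
    system_prompt: str,
    task: str,
    tools: Dict[str, Callable[[str], str]],
    max_loops: int = 3,
) -> List[str]:
    """Call the model up to `max_loops` times, running a tool when it asks for one."""
    history: List[str] = []
    context = f"{system_prompt}\n{task}"
    for _ in range(max_loops):
        reply = model(context)
        history.append(reply)
        if reply.startswith("TOOL:"):
            # Convention for this sketch only: "TOOL:<name>:<argument>"
            _, name, argument = reply.split(":", 2)
            context = tools[name](argument)
        else:
            break  # a plain answer ends the loop early
    return history


def fake_model(context: str) -> str:
    # Pretends to be an LLM: asks for the tool once, then answers with its result
    if context.isdigit():
        return f"The answer is {context}"
    return "TOOL:double:21"


tools = {"double": lambda text: str(int(text) * 2)}
print(run_agent(fake_model, "You are a calculator.", "Double 21", tools))
```

Lowering `max_loops` to 1 stops the loop after the tool request, which shows how the parameter bounds an agent's processing.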
|
@ -0,0 +1,89 @@
|
||||
<div align="center">
|
||||
<p>
|
||||
<a align="center" href="" target="_blank">
|
||||
<img
|
||||
width="850"
|
||||
src="https://github.com/kyegomez/swarms/raw/master/images/swarmslogobanner.png"
|
||||
>
|
||||
</a>
|
||||
</p>
|
||||
</div>
|
||||
|
||||
# Installation Guide
|
||||
|
||||
You can install `swarms` with pip in a
|
||||
[**Python>=3.10**](https://www.python.org/) environment.
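
A quick way to confirm that the interpreter meets this requirement before installing is sketched below; `meets_minimum` is a hypothetical helper, not something shipped with `swarms`.

```python
import sys


def meets_minimum(version=sys.version_info, minimum=(3, 10)) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version[:2]) >= minimum


print(meets_minimum())
```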
|
||||
|
||||
!!! example "pip install (recommended)"

    === "headless"
        The headless installation of `swarms` is designed for environments where graphical user interfaces (GUI) are not needed, making it more lightweight and suitable for server-side applications.

        ```bash
        pip install swarms
        ```
|
||||
|
||||
|
||||
|
||||
!!! example "git clone (for development)"

    === "virtualenv"

        ```bash
        # clone repository and navigate to root directory
        git clone https://github.com/kyegomez/swarms.git
        cd swarms

        # setup python environment and activate it
        python3 -m venv venv
        source venv/bin/activate
        pip install --upgrade pip

        # headless install
        pip install -e "."

        # desktop install
        pip install -e ".[desktop]"
        ```

    === "poetry"

        ```bash
        # clone repository and navigate to root directory
        git clone https://github.com/kyegomez/swarms.git
        cd swarms

        # setup python environment and activate it
        poetry env use python3.10
        poetry shell

        # headless install
        poetry install

        # desktop install
        poetry install --extras "desktop"
        ```
|
||||
|
||||
|
||||
# JavaScript
|
||||
|
||||
!!! example "NPM install |WIP|"

    === "headless"
        Get started with the NPM implementation of Swarms with this command:

        ```bash
        npm install swarms-js
        ```
|
||||
|
||||
|
||||
## Documentation
|
||||
|
||||
[Learn more about swarms →](swarms/)
|
||||
|
||||
|
||||
## Examples
|
||||
|
||||
Check out Swarms examples for building agents, data retrieval, and more.
|
||||
|
||||
[Check out Swarms examples →](examples/)
|
|
||||
# Getting Started with Multi-Agent Collaboration Using the Multi-Agent Github Template
|
||||
|
||||
|
||||
The Multi-Agent GitHub Template is a simple, reliable, high-performance framework designed to help developers and prompt engineers get the most out of multi-agent collaboration. [Read the full post on Medium](https://medium.com/@kyeg/getting-started-with-multi-agent-collaboration-using-the-multi-agent-github-template-0f0a6cba0dc0)
|
||||
|
||||
[GITHUB](https://github.com/kyegomez/Multi-Agent-Template-App)
|
|
||||
# Deploying Azure OpenAI in Production: A Comprehensive Guide
|
||||
|
||||
In today's fast-paced digital landscape, leveraging cutting-edge technologies has become essential for businesses to stay competitive and provide exceptional services to their customers. One such technology that has gained significant traction is Azure OpenAI, a powerful platform that allows developers to integrate advanced natural language processing (NLP) capabilities into their applications. Whether you're building a chatbot, a content generation system, or any other AI-powered solution, Azure OpenAI offers a robust and scalable solution for production-grade deployment.
|
||||
|
||||
In this comprehensive guide, we'll walk through the process of setting up and deploying Azure OpenAI in a production environment. We'll dive deep into the code, provide clear explanations, and share best practices to ensure a smooth and successful implementation.
|
||||
|
||||
## Prerequisites:
|
||||
Before we begin, it's essential to have the following prerequisites in place:
|
||||
|
||||
1. **Python**: You'll need to have Python installed on your system. This guide assumes Python 3.10 or later, the minimum version the `swarms` installation guide lists.
|
||||
2. **Azure Subscription**: You'll need an active Azure subscription to access Azure OpenAI services.
|
||||
3. **Azure OpenAI Resource**: Create an Azure OpenAI resource in your Azure subscription.
|
||||
4. **Python Packages**: Install the required Python packages, including `python-dotenv` and `swarms`.
|
||||
|
||||
## Setting up the Environment:
|
||||
To kick things off, we'll set up our development environment and install the necessary dependencies.
|
||||
|
||||
1. **Create a Virtual Environment**: It's a best practice to create a virtual environment to isolate your project dependencies from the rest of your system. You can create a virtual environment using `venv` or any other virtual environment management tool of your choice.
|
||||
|
||||
```
|
||||
python -m venv myenv
|
||||
```
|
||||
|
||||
2. **Activate the Virtual Environment**: Activate the virtual environment to ensure that any packages you install are isolated within the environment.
|
||||
|
||||
```
|
||||
source myenv/bin/activate # On Windows, use `myenv\Scripts\activate`
|
||||
```
|
||||
|
||||
3. **Install Required Packages**: Install the `python-dotenv` and `swarms` packages using pip.
|
||||
|
||||
```
|
||||
pip install python-dotenv swarms
|
||||
```
|
||||
|
||||
4. **Create a `.env` File**: In the root directory of your project, create a new file called `.env`. This file will store your Azure OpenAI credentials and configuration settings.
|
||||
|
||||
```
|
||||
AZURE_OPENAI_ENDPOINT=<your_azure_openai_endpoint>
|
||||
AZURE_OPENAI_DEPLOYMENT=<your_azure_openai_deployment_name>
|
||||
OPENAI_API_VERSION=<your_openai_api_version>
|
||||
AZURE_OPENAI_API_KEY=<your_azure_openai_api_key>
|
||||
AZURE_OPENAI_AD_TOKEN=<your_azure_openai_ad_token>
|
||||
```
|
||||
|
||||
Replace the placeholders with your actual Azure OpenAI credentials and configuration settings.
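Before connecting, it can help to fail fast when a setting is missing or still holds a placeholder. The following is a minimal sketch, not part of `swarms`: the helper `missing_env_vars` is hypothetical, and treating `AZURE_OPENAI_AD_TOKEN` as optional is an assumption of this example.

```python
import os

# Variable names taken from the .env file above. AZURE_OPENAI_AD_TOKEN is
# left out because this sketch assumes it is optional.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT",
    "OPENAI_API_VERSION",
    "AZURE_OPENAI_API_KEY",
)


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset, empty, or still hold a <placeholder>."""
    env = os.environ if environ is None else environ
    missing = []
    for name in names:
        value = env.get(name, "").strip()
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(name)
    return missing


# Demonstration with an explicit mapping instead of the real environment.
example = {
    "AZURE_OPENAI_ENDPOINT": "https://example.openai.azure.com/",
    "AZURE_OPENAI_API_KEY": "<your_azure_openai_api_key>",
}
print(missing_env_vars(environ=example))
```

In a real script you would call `missing_env_vars()` right after `load_dotenv()` and stop with a clear message if the returned list is not empty.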
|
||||
|
||||
## Connecting to Azure OpenAI:
|
||||
Now that we've set up our environment, let's dive into the code that connects to Azure OpenAI and interacts with the language model.
|
||||
|
||||
```python
|
||||
import os
|
||||
from dotenv import load_dotenv
|
||||
from swarms import AzureOpenAI
|
||||
|
||||
# Load the environment variables
|
||||
load_dotenv()
|
||||
|
||||
# Create an instance of the AzureOpenAI class
|
||||
model = AzureOpenAI(
|
||||
azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
|
||||
deployment_name=os.getenv("AZURE_OPENAI_DEPLOYMENT"),
|
||||
openai_api_version=os.getenv("OPENAI_API_VERSION"),
|
||||
openai_api_key=os.getenv("AZURE_OPENAI_API_KEY"),
|
||||
azure_ad_token=os.getenv("AZURE_OPENAI_AD_TOKEN")
|
||||
)
|
||||
```
|
||||
|
||||
## Let's break down this code:
|
||||
|
||||
1. **Import Statements**: We import the necessary modules, including `os` for interacting with the operating system, `load_dotenv` from `python-dotenv` to load environment variables, and `AzureOpenAI` from `swarms` to interact with the Azure OpenAI service.
|
||||
|
||||
2. **Load Environment Variables**: We use `load_dotenv()` to load the environment variables stored in the `.env` file we created earlier.
|
||||
|
||||
3. **Create AzureOpenAI Instance**: We create an instance of the `AzureOpenAI` class by passing in the required configuration parameters:
|
||||
- `azure_endpoint`: The endpoint URL for your Azure OpenAI resource.
|
||||
- `deployment_name`: The name of the deployment you want to use.
|
||||
- `openai_api_version`: The version of the OpenAI API you want to use.
|
||||
- `openai_api_key`: Your Azure OpenAI API key, which authenticates your requests.
|
||||
- `azure_ad_token`: An optional Azure Active Directory (AAD) token for additional security.
|
||||
|
||||
## Querying the Language Model:
|
||||
With our connection to Azure OpenAI established, we can now query the language model and receive responses.
|
||||
|
||||
```python
|
||||
# Define the prompt
|
||||
prompt = "Analyze this loan document and assess it for any risks and create a table in markdown format."
|
||||
|
||||
# Generate a response
|
||||
response = model(prompt)
|
||||
print(response)
|
||||
```
|
||||
|
||||
## Here's what's happening:
|
||||
|
||||
1. **Define the Prompt**: We define a prompt, which is the input text or question we want to feed into the language model.
|
||||
|
||||
2. **Generate a Response**: We call the `model` instance with the `prompt` as an argument. This triggers the Azure OpenAI service to process the prompt and generate a response.
|
||||
|
||||
3. **Print the Response**: Finally, we print the response received from the language model.
|
||||
|
||||
## Running the Code:
|
||||
To run the code, save it in a Python file (e.g., `main.py`) and execute it from the command line:
|
||||
|
||||
```
|
||||
python main.py
|
||||
```
|
||||
|
||||
## Best Practices for Production Deployment:
|
||||
While the provided code serves as a basic example, there are several best practices to consider when deploying Azure OpenAI in a production environment:
|
||||
|
||||
1. **Secure Credentials Management**: Instead of storing sensitive credentials like API keys in your codebase, consider using secure storage solutions like Azure Key Vault or environment variables managed by your cloud provider.
|
||||
|
||||
2. **Error Handling and Retries**: Implement robust error handling and retry mechanisms to handle potential failures or rate-limiting scenarios.
|
||||
|
||||
3. **Logging and Monitoring**: Implement comprehensive logging and monitoring strategies to track application performance, identify issues, and gather insights for optimization.
|
||||
|
||||
4. **Scalability and Load Testing**: Conduct load testing to ensure your application can handle anticipated traffic volumes and scale appropriately based on demand.
|
||||
|
||||
5. **Caching and Optimization**: Explore caching strategies and performance optimizations to improve response times and reduce the load on the Azure OpenAI service.
|
||||
|
||||
6. **Integration with Other Services**: Depending on your use case, you may need to integrate Azure OpenAI with other Azure services or third-party tools for tasks like data processing, storage, or analysis.
|
||||
|
||||
7. **Compliance and Security**: Ensure your application adheres to relevant compliance standards and security best practices, especially when handling sensitive data.
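As one illustration of the error-handling and retry practice listed above, here is a minimal sketch of exponential backoff with jitter around any callable. The helper `call_with_retries` and its parameters are hypothetical, not a `swarms` or Azure API. In practice you would narrow `retry_on` to the transient errors your client actually raises.

```python
import random
import time


def call_with_retries(fn, *args, max_attempts=4, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args, **kwargs)
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # The delay doubles each attempt (1x, 2x, 4x, ...) plus up to 10% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
```

With the `model` and `prompt` objects from the earlier example, a call would look like `response = call_with_retries(model, prompt)`. The `sleep` parameter exists so tests can replace the real delay.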
|
||||
|
||||
## Conclusion:
|
||||
Azure OpenAI is a powerful platform that enables developers to integrate advanced natural language processing capabilities into their applications. By following the steps outlined in this guide, you can set up a production-ready environment for deploying Azure OpenAI and start leveraging its capabilities in your projects.
|
||||
|
||||
Remember, this guide serves as a starting point, and there are numerous additional features and capabilities within Azure OpenAI that you can explore to enhance your applications further. As with any production deployment, it's crucial to follow best practices, conduct thorough testing, and implement robust monitoring and security measures.
|
||||
|
||||
With the right approach and careful planning, you can successfully deploy Azure OpenAI in a production environment and unlock the power of cutting-edge language models to drive innovation and provide exceptional experiences for your users.
|
|
||||
# `PgVectorVectorStore` Documentation
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Introduction](#introduction)
|
||||
2. [Overview](#overview)
|
||||
3. [Class Definition](#class-definition)
|
||||
4. [Functionality and Usage](#functionality-and-usage)
|
||||
- [Setting Up the Database](#setting-up-the-database)
|
||||
- [Upserting Vectors](#upserting-vectors)
|
||||
- [Loading Vector Entries](#loading-vector-entries)
|
||||
- [Querying Vectors](#querying-vectors)
|
||||
5. [Additional Information](#additional-information)
|
||||
6. [References and Resources](#references-and-resources)
|
||||
|
||||
---
|
||||
|
||||
## 1. Introduction <a name="introduction"></a>
|
||||
|
||||
Welcome to the documentation for the Swarms `PgVectorVectorStore` class! Swarms is a library that provides various memory and storage options for high-dimensional vectors. In this documentation, we will focus on the `PgVectorVectorStore` class, which is a vector storage driver that uses PostgreSQL with the PGVector extension as the underlying storage engine.
|
||||
|
||||
### 1.1 Purpose
|
||||
|
||||
The `PgVectorVectorStore` class allows you to interact with a PostgreSQL database and store high-dimensional vectors efficiently. By using Swarms with PostgreSQL and PGVector, you can manage and work with vector data in your applications with ease.
|
||||
|
||||
### 1.2 Key Features
|
||||
|
||||
- Integration with PostgreSQL and PGVector for vector storage.
|
||||
- Simple and convenient API for upserting vectors, querying, and loading entries.
|
||||
- Support for creating and managing vector collections in PostgreSQL.
|
||||
|
||||
---
|
||||
|
||||
## 2. Overview <a name="overview"></a>
|
||||
|
||||
Before diving into the details of the `PgVectorVectorStore` class, let's provide an overview of its purpose and functionality.
|
||||
|
||||
The `PgVectorVectorStore` class is designed to:
|
||||
|
||||
- Store high-dimensional vectors in a PostgreSQL database with the PGVector extension.
|
||||
- Offer a seamless and efficient way to upsert vectors into the database.
|
||||
- Provide methods for loading individual vector entries or all vector entries in a collection.
|
||||
- Support vector queries, allowing you to find vectors similar to a given query vector.
|
||||
|
||||
In the following sections, we will explore the class definition, its parameters, and how to use it effectively.
|
||||
|
||||
---
|
||||
|
||||
## 3. Class Definition <a name="class-definition"></a>
|
||||
|
||||
Let's start by examining the class definition of `PgVectorVectorStore`, including its attributes and parameters.
|
||||
|
||||
```python
|
||||
class PgVectorVectorStore(BaseVectorStore):
|
||||
"""
|
||||
A vector store driver to Postgres using the PGVector extension.
|
||||
|
||||
Attributes:
|
||||
connection_string: An optional string describing the target Postgres database instance.
|
||||
create_engine_params: Additional configuration params passed when creating the database connection.
|
||||
engine: An optional sqlalchemy Postgres engine to use.
|
||||
        table_name: Optionally specify the name of the table to be used to store vectors.
|
||||
...
|
||||
"""
|
||||
```
|
||||
|
||||
Attributes:
|
||||
|
||||
- `connection_string` (Optional[str]): An optional string describing the target Postgres database instance.
|
||||
- `create_engine_params` (dict): Additional configuration parameters passed when creating the database connection.
|
||||
- `engine` (Optional[Engine]): An optional SQLAlchemy Postgres engine to use.
|
||||
- `table_name` (str): Optionally specify the name of the table to be used to store vectors.
|
||||
|
||||
### 3.1 Attribute Validators
|
||||
|
||||
The class includes validators for the `connection_string` and `engine` attributes to ensure their proper usage. These validators help maintain consistency in attribute values.
|
||||
|
||||
### 3.2 Initialization
|
||||
|
||||
During initialization, the class checks if an engine is provided. If an engine is not provided, it creates a new database connection using the `connection_string` and `create_engine_params`.
|
||||
|
||||
---
|
||||
|
||||
## 4. Functionality and Usage <a name="functionality-and-usage"></a>
|
||||
|
||||
In this section, we will explore the functionality of the `PgVectorVectorStore` class and provide detailed instructions on how to use it effectively.
|
||||
|
||||
### 4.1 Setting Up the Database <a name="setting-up-the-database"></a>
|
||||
|
||||
Before using the `PgVectorVectorStore` to store and query vectors, you need to set up the database. This includes creating the necessary extensions and database schema. You can do this using the `setup` method.
|
||||
|
||||
```python
|
||||
def setup(
|
||||
self,
|
||||
create_schema: bool = True,
|
||||
install_uuid_extension: bool = True,
|
||||
install_vector_extension: bool = True,
|
||||
) -> None:
|
||||
"""
|
||||
Provides a mechanism to initialize the database schema and extensions.
|
||||
|
||||
Parameters:
|
||||
- create_schema (bool): If True, creates the necessary database schema for vector storage. Default: True.
|
||||
- install_uuid_extension (bool): If True, installs the UUID extension in the database. Default: True.
|
||||
- install_vector_extension (bool): If True, installs the PGVector extension in the database. Default: True.
|
||||
"""
|
||||
```
|
||||
|
||||
#### Example 1: Setting Up the Database
|
||||
|
||||
```python
|
||||
# Initialize the PgVectorVectorStore instance
|
||||
vector_store = PgVectorVectorStore(
|
||||
connection_string="your-db-connection-string", table_name="your-table-name"
|
||||
)
|
||||
|
||||
# Set up the database with default settings
|
||||
vector_store.setup()
|
||||
```
|
||||
|
||||
#### Example 2: Customized Database Setup
|
||||
|
||||
```python
|
||||
# Initialize the PgVectorVectorStore instance
|
||||
vector_store = PgVectorVectorStore(
|
||||
connection_string="your-db-connection-string", table_name="your-table-name"
|
||||
)
|
||||
|
||||
# Set up the database with customized settings
|
||||
vector_store.setup(
|
||||
create_schema=False, install_uuid_extension=True, install_vector_extension=True
|
||||
)
|
||||
```
|
||||
|
||||
### 4.2 Upserting Vectors <a name="upserting-vectors"></a>
|
||||
|
||||
The `upsert_vector` method allows you to insert or update a vector in the collection. You can specify the vector, an optional vector ID, namespace, and metadata.
|
||||
|
||||
```python
|
||||
def upsert_vector(
|
||||
self,
|
||||
vector: list[float],
|
||||
vector_id: Optional[str] = None,
|
||||
namespace: Optional[str] = None,
|
||||
meta: Optional[dict] = None,
|
||||
**kwargs,
|
||||
) -> str:
|
||||
"""
|
||||
Inserts or updates a vector in the collection.
|
||||
|
||||
Parameters:
|
||||
- vector (list[float]): The vector to upsert.
|
||||
- vector_id (Optional[str]): An optional ID for the vector. If not provided, a unique ID will be generated.
|
||||
- namespace (Optional[str]): An optional namespace for the vector.
|
||||
- meta (Optional[dict]): An optional metadata dictionary associated with the vector.
|
||||
- **kwargs: Additional keyword arguments.
|
||||
|
||||
Returns:
|
||||
- str: The ID of the upserted vector.
|
||||
"""
|
||||
```
|
||||
|
||||
#### Example: Upserting a Vector
|
||||
|
||||
```python
|
||||
# Initialize the PgVectorVectorStore instance
|
||||
vector_store = PgVectorVectorStore(
|
||||
connection_string="your-db-connection-string", table_name="your-table-name"
|
||||
)
|
||||
|
||||
# Define a vector and upsert it
|
||||
vector = [0.1, 0.2, 0.3, 0.4]
|
||||
vector_id = "unique-vector-id"
|
||||
namespace = "your-namespace"
|
||||
meta = {"key1": "value1", "key2": "value2"}
|
||||
|
||||
vector_store.upsert_vector(
|
||||
vector=vector, vector_id=vector_id, namespace=namespace, meta=meta
|
||||
)
|
||||
```
|
||||
|
||||
### 4.3 Loading Vector Entries <a name="loading-vector-entries"></a>
|
||||
|
||||
You can load vector entries from the collection using the `load_entry` and `load_entries` methods.
|
||||
|
||||
#### 4.3.1 Loading a Single Entry
|
||||
|
||||
The `load_entry` method allows you to load a specific vector entry based on its identifier and optional namespace.
|
||||
|
||||
```python
|
||||
def load_entry(
|
||||
self, vector_id: str, namespace: Optional[str] = None
|
||||
) -> BaseVectorStore.Entry:
|
||||
"""
|
||||
Retrieves a specific vector entry from the collection based on its identifier and optional namespace.
|
||||
|
||||
Parameters:
|
||||
- vector_id (str): The ID of the vector to retrieve.
|
||||
- namespace (Optional[str]): An optional namespace for filtering. Default: None.
|
||||
|
||||
Returns:
|
||||
- BaseVectorStore.Entry: The loaded vector entry.
|
||||
"""
|
||||
```
|
||||
|
||||
#### Example: Loading a Single Entry
|
||||
|
||||
```python
|
||||
# Initialize the PgVectorVectorStore instance
|
||||
vector_store = PgVectorVectorStore(connection_string="your-db-connection-string", table_name="your-table-name")
|
||||
|
||||
# Load a specific vector entry
|
||||
loaded_entry = vector_store.load_entry(vector_id="unique-vector-id", namespace="your-namespace")
|
||||
|
||||
if loaded_entry is not None:
|
||||
loaded_vector = loaded_entry.vector
|
||||
loaded_meta = loaded_entry.meta
|
||||
# Use the loaded vector and metadata as needed
|
||||
else:
    # Vector not found
    print("Vector not found")
|
||||
```
|
||||
|
||||
#### 4.3.2 Loading Multiple Entries
|
||||
|
||||
The `load_entries` method allows you to load all vector entries from the collection, optionally filtering by namespace.
|
||||
|
||||
```python
|
||||
def load_entries(self, namespace: Optional[str] = None) -> list[BaseVectorStore.Entry]:
|
||||
"""
|
||||
Retrieves all vector entries from the collection, optionally filtering to only those that match the provided namespace.
|
||||
|
||||
Parameters:
|
||||
- namespace (Optional[str]): An optional namespace for filtering. Default: None.
|
||||
|
||||
Returns:
|
||||
- list[BaseVectorStore.Entry]: A list of loaded vector entries.
|
||||
"""
|
||||
```
|
||||
|
||||
#### Example: Loading Multiple Entries
|
||||
|
||||
```python
|
||||
# Initialize the PgVectorVectorStore instance
|
||||
vector_store = PgVectorVectorStore(
|
||||
connection_string="your-db-connection-string", table_name="your-table-name"
|
||||
)
|
||||
|
||||
# Load all vector entries in the specified namespace
|
||||
entries = vector_store.load_entries(namespace="your-namespace")
|
||||
|
||||
# Process the loaded entries
|
||||
for entry in entries:
|
||||
vector_id = entry.id
|
||||
vector = entry.vector
|
||||
meta = entry.meta
|
||||
|
||||
# Handle the loaded entries as needed
|
||||
```
|
||||
|
||||
### 4.4 Querying Vectors <a name="querying-vectors"></a>
|
||||
|
||||
You can search for vectors similar to a given query using the `query` method. You can specify the query string, the maximum number of results to return, the namespace, whether to include vectors in the results, and the distance metric.
|
||||
|
||||
```python
|
||||
def query(
|
||||
self,
|
||||
query: str,
|
||||
count: Optional[int] = BaseVectorStore.DEFAULT_QUERY_COUNT,
|
||||
namespace: Optional[str] = None,
|
||||
include_vectors: bool = False,
|
||||
distance_metric: str = "cosine_distance",
|
||||
**kwargs,
|
||||
) -> list[BaseVectorStore.QueryResult]:
|
||||
"""
|
||||
Performs a search on the collection to find vectors similar to the provided input vector,
|
||||
optionally filtering to only those that match the provided namespace.
|
||||
|
||||
Parameters:
|
||||
- query (str): The query string to find similar vectors.
|
||||
- count (Optional[int]): Maximum number of results to return. Default: BaseVectorStore.DEFAULT_QUERY_COUNT.
|
||||
- namespace (Optional[str]): An optional namespace for filtering. Default: None.
|
||||
- include_vectors (bool): If True, includes vectors in the query results. Default: False.
|
||||
- distance_metric (str): The distance metric to use for similarity measurement.
|
||||
Options: "cosine_distance", "l2_distance", "inner_product". Default: "cosine_distance".
|
||||
- **kwargs: Additional keyword arguments.
|
||||
|
||||
Returns:
|
||||
- list[BaseVectorStore.QueryResult]: A list of query results, each containing vector ID, vector (if included), score, and metadata.
|
||||
"""
|
||||
```
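To make the three `distance_metric` options concrete, the sketch below computes the standard mathematical definitions in plain Python. It is illustrative only: the function names are hypothetical, and how `PgVectorVectorStore` or PGVector computes, signs, or orders scores internally is not shown in this documentation.

```python
import math


def inner_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus cosine similarity; ignores vector magnitude."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - inner_product(a, b) / norm


a, b = [1.0, 0.0], [0.0, 1.0]
print(cosine_distance(a, b), l2_distance(a, b), inner_product(a, b))
```

Cosine distance is a common choice when only direction matters, while L2 distance and inner product also depend on vector length.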
|
||||
|
||||
#### Example: Querying Vectors
|
||||
|
||||
```python
|
||||
# Initialize the PgVectorVectorStore instance
|
||||
vector_store = PgVectorVectorStore(
|
||||
connection_string="your-db-connection-string", table_name="your-table-name"
|
||||
)
|
||||
|
||||
# Perform a vector query
|
||||
query_string = "your-query-string"
|
||||
count = 10 # Maximum number of results to return
|
||||
namespace = "your-namespace"
|
||||
include_vectors = False # Set to True to include vectors in results
|
||||
distance_metric = "cosine_distance"
|
||||
|
||||
results = vector_store.query(
|
||||
query=query_string,
|
||||
count=count,
|
||||
namespace=namespace,
|
||||
include_vectors=include_vectors,
|
||||
distance_metric=distance_metric,
|
||||
)
|
||||
|
||||
# Process the query results
|
||||
for result in results:
|
||||
vector_id = result.id
|
||||
vector = result.vector
|
||||
score = result.score
|
||||
meta = result.meta
|
||||
|
||||
# Handle the results as needed
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Additional Information <a name="additional-information"></a>
|
||||
|
||||
Here are some additional tips and information for using the `PgVectorVectorStore` class effectively:
|
||||
|
||||
- When upserting vectors, you can derive the vector ID from a hash of the vector's content, so that identical vectors always map to the same ID.
|
||||
- Consider using namespaces to organize and categorize vectors within your PostgreSQL database.
|
||||
- You can choose from different distance metrics (cosine distance, L2 distance, inner product) for vector querying based on your application's requirements.
|
||||
- Keep your database connection string secure and follow best practices for database access control.
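The content-hash tip above can be sketched as follows. This is one possible approach under stated assumptions: the helper `vector_content_id` is hypothetical and not part of Swarms, and SHA-256 over packed 64-bit floats is a choice made for this example.

```python
import hashlib
import struct


def vector_content_id(vector, namespace=None):
    """Return a deterministic hex ID derived from the vector's values."""
    # Pack each component as a big-endian 64-bit float so the byte layout
    # is identical across platforms.
    payload = struct.pack(f">{len(vector)}d", *vector)
    hasher = hashlib.sha256()
    if namespace is not None:
        # Including the namespace keeps equal vectors in different
        # namespaces from sharing an ID.
        hasher.update(namespace.encode("utf-8") + b"\x00")
    hasher.update(payload)
    return hasher.hexdigest()


print(vector_content_id([0.1, 0.2, 0.3, 0.4]))
```

The result can be passed as `vector_id` to `upsert_vector`, so upserting the same vector twice updates one row instead of creating a duplicate.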
|
||||
|
||||
---
|
||||
|
||||
## 6. References and Resources <a name="references-and-resources"></a>
|
||||
|
||||
Here are some references and resources for further information on Swarms and PostgreSQL with PGVector:
|
||||
|
||||
- [Swarms GitHub Repository](https://github.com/kyegomez/swarms): Swarms library on GitHub for updates and contributions.
|
||||
- [PostgreSQL Official Website](https://www.postgresql.org/): Official PostgreSQL website for documentation and resources.
|
||||
- [PGVector GitHub Repository](https://github.com/ankane/pgvector): PGVector extension on GitHub for detailed information.
|
||||
|
||||
---
|
||||
|
||||
This concludes the documentation for the Swarms `PgVectorVectorStore` class. You now have a comprehensive understanding of how to use Swarms with PostgreSQL and PGVector for vector storage. If you have any further questions or need assistance, please refer to the provided references and resources. Happy coding!
|
|
||||
# `PineconeDB` Documentation
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Introduction](#introduction)
|
||||
2. [PineconeVector Class](#pineconevector-class)
|
||||
3. [Installation](#installation)
|
||||
4. [Usage](#usage)
|
||||
- [Creating a PineconeVector Instance](#creating-a-pineconevector-instance)
|
||||
- [Creating an Index](#creating-an-index)
|
||||
- [Upserting Vectors](#upserting-vectors)
|
||||
- [Querying the Index](#querying-the-index)
|
||||
- [Loading an Entry](#loading-an-entry)
|
||||
- [Loading Entries](#loading-entries)
|
||||
5. [Additional Information](#additional-information)
|
||||
6. [References and Resources](#references-and-resources)
|
||||
|
||||
---
|
||||
|
||||
## 1. Introduction <a name="introduction"></a>
|
||||
|
||||
Welcome to the Swarms documentation! Swarms is a library that provides various memory and storage options for high-dimensional vectors. In this documentation, we will focus on the `PineconeVector` class, which is a vector storage driver that uses Pinecone as the underlying storage engine.
|
||||
|
||||
### 1.1 Purpose
|
||||
|
||||
The `PineconeVector` class allows you to interact with Pinecone, a vector database that enables the storage, search, and retrieval of high-dimensional vectors with speed and low latency. By using Swarms with Pinecone, you can easily manage and work with vector data in your applications without the need to manage infrastructure.
|
||||
|
||||
### 1.2 Key Features
|
||||
|
||||
- Seamless integration with Pinecone for vector storage.
|
||||
- Simple and convenient API for upserting vectors, querying, and loading entries.
|
||||
- Support for creating and managing indexes.
|
||||
|
||||
---
|
||||
|
||||
## 2. PineconeVector Class <a name="pineconevector-class"></a>
|
||||
|
||||
The `PineconeVector` class is the core component of Swarms that interacts with Pinecone for vector storage. Below, we will provide an in-depth overview of this class, including its purpose, parameters, and methods.
|
||||
|
||||
### 2.1 Class Definition
|
||||
|
||||
```python
|
||||
class PineconeVector(BaseVector):
|
||||
```
|
||||
|
||||
### 2.2 Parameters
|
||||
|
||||
The `PineconeVector` class accepts the following parameters during initialization:
|
||||
|
||||
- `api_key` (str): The API key for your Pinecone account.
|
||||
- `index_name` (str): The name of the index to use.
|
||||
- `environment` (str): The environment to use. Either "us-west1-gcp" or "us-east1-gcp".
|
||||
- `project_name` (str, optional): The name of the project to use. Defaults to `None`.
|
||||
- `index` (pinecone.Index, optional): The Pinecone index to use. Defaults to `None`.
|
||||
|
||||
### 2.3 Methods
|
||||
|
||||
The `PineconeVector` class provides several methods for interacting with Pinecone:
|
||||
|
||||
#### 2.3.1 `upsert_vector`
|
||||
|
||||
```python
|
||||
def upsert_vector(
|
||||
self,
|
||||
vector: list[float],
|
||||
vector_id: Optional[str] = None,
|
||||
namespace: Optional[str] = None,
|
||||
meta: Optional[dict] = None,
|
||||
**kwargs
|
||||
) -> str:
|
||||
```
|
||||
|
||||
Upserts a vector into the index.
|
||||
|
||||
- `vector` (list[float]): The vector to upsert.
|
||||
- `vector_id` (Optional[str]): An optional ID for the vector. If not provided, a unique ID will be generated.
|
||||
- `namespace` (Optional[str]): An optional namespace for the vector.
|
||||
- `meta` (Optional[dict]): An optional metadata dictionary associated with the vector.
|
||||
- `**kwargs`: Additional keyword arguments.
|
||||
|
||||
#### 2.3.2 `load_entry`
|
||||
|
||||
```python
|
||||
def load_entry(
|
||||
self, vector_id: str, namespace: Optional[str] = None
|
||||
) -> Optional[BaseVector.Entry]:
|
||||
```
|
||||
|
||||
Loads a single vector from the index.
|
||||
|
||||
- `vector_id` (str): The ID of the vector to load.
|
||||
- `namespace` (Optional[str]): An optional namespace for the vector.
|
||||
|
||||
#### 2.3.3 `load_entries`
|
||||
|
||||
```python
|
||||
def load_entries(self, namespace: Optional[str] = None) -> list[BaseVector.Entry]:
|
||||
```
|
||||
|
||||
Loads all vectors from the index.
|
||||
|
||||
- `namespace` (Optional[str]): An optional namespace for the vectors.
|
||||
|
||||
#### 2.3.4 `query`
|
||||
|
||||
```python
|
||||
def query(
|
||||
self,
|
||||
query: str,
|
||||
count: Optional[int] = None,
|
||||
namespace: Optional[str] = None,
|
||||
include_vectors: bool = False,
|
||||
include_metadata=True,
|
||||
**kwargs
|
||||
) -> list[BaseVector.QueryResult]:
|
||||
```
|
||||
|
||||
Queries the index for vectors similar to the given query string.
|
||||
|
||||
- `query` (str): The query string.
|
||||
- `count` (Optional[int]): The maximum number of results to return. If not provided, a default value is used.
|
||||
- `namespace` (Optional[str]): An optional namespace for the query.
|
||||
- `include_vectors` (bool): Whether to include vectors in the query results.
|
||||
- `include_metadata` (bool): Whether to include metadata in the query results.
|
||||
- `**kwargs`: Additional keyword arguments.
|
||||
|
||||
#### 2.3.5 `create_index`
|
||||
|
||||
```python
|
||||
def create_index(self, name: str, **kwargs) -> None:
|
||||
```
|
||||
|
||||
Creates a new index.
|
||||
|
||||
- `name` (str): The name of the index to create.
|
||||
- `**kwargs`: Additional keyword arguments.
|
||||
|
||||
---
|
||||
|
||||
## 3. Installation <a name="installation"></a>
|
||||
|
||||
To use the Swarms library and the `PineconeVector` class, you will need to install the library and its dependencies. Follow these steps to get started:
|
||||
|
||||
1. Install Swarms:
|
||||
|
||||
```bash
|
||||
pip install swarms
|
||||
```
|
||||
|
||||
2. Set up Pinecone:
|
||||
|
||||
You will also need a Pinecone account and API key. Follow the instructions on the Pinecone website to create an account and obtain an API key.
|
||||
|
||||
3. Import the necessary modules in your Python code:
|
||||
|
||||
```python
|
||||
from swarms.memory.vector_stores.pinecone import PineconeVector
|
||||
```
|
||||
|
||||
Now you're ready to use the `PineconeVector` class to work with Pinecone for vector storage.
|
||||
|
||||
---
|
||||
|
||||
## 4. Usage <a name="usage"></a>
|
||||
|
||||
In this section, we will provide detailed examples of how to use the `PineconeVector` class for vector storage with Pinecone.
|
||||
|
||||
### 4.1 Creating a PineconeVector Instance <a name="creating-a-pineconevector-instance"></a>
|
||||
|
||||
To get started, you need to create an instance of the `PineconeVector` class. You will need your Pinecone API key, the name of the index you want to use, and the environment. You can also specify an optional project name if you have one.
|
||||
|
||||
```python
|
||||
pv = PineconeVector(
|
||||
api_key="your-api-key",
|
||||
index_name="your-index-name",
|
||||
environment="us-west1-gcp",
|
||||
project_name="your-project-name",
|
||||
)
|
||||
```
|
||||
|
||||
### 4.2 Creating an Index <a name="creating-an-index"></a>
|
||||
|
||||
Before you can upsert vectors, you need to create an index in Pinecone. You can use the `create_index` method for this purpose.
|
||||
|
||||
```python
|
||||
pv.create_index("your-index-name")
|
||||
```
|
||||
|
||||
### 4.3 Upserting Vectors <a name="upserting-vectors"></a>
|
||||
|
||||
You can upsert vectors into the Pinecone index using the `upsert_vector` method. This method allows you to specify the vector, an optional vector ID, namespace, and metadata.
|
||||
|
||||
```python
|
||||
vector = [0.1, 0.2, 0.3, 0.4]
|
||||
vector_id = "unique-vector-id"
|
||||
namespace = "your-namespace"
|
||||
meta = {"key1": "value1", "key2": "value2"}
|
||||
|
||||
pv.upsert_vector(vector=vector, vector_id=vector_id, namespace=namespace, meta=meta)
|
||||
```
|
||||
|
||||
### 4.4 Querying the Index <a name="querying-the-index"></a>
|
||||
|
||||
You can query the Pinecone index to find vectors similar to a given query string using the `query` method. You can specify the query string, the maximum number of results to return, and other options.
|
||||
|
||||
```python
query_string = "your-query-string"
count = 10  # Maximum number of results to return
namespace = "your-namespace"
include_vectors = False  # Set to True to include vectors in results
include_metadata = True

results = pv.query(
    query=query_string,
    count=count,
    namespace=namespace,
    include_vectors=include_vectors,
    include_metadata=include_metadata,
)

# Process the query results
for result in results:
    vector_id = result.id
    vector = result.vector
    score = result.score
    meta = result.meta

    # Handle the results as needed
```
|
||||
|
||||
### 4.5 Loading an Entry <a name="loading-an-entry"></a>
|
||||
|
||||
You can load a single vector entry from the Pinecone index using the `load_entry` method. Provide the vector ID and an optional namespace.
|
||||
|
||||
```python
vector_id = "your-vector-id"
namespace = "your-namespace"

entry = pv.load_entry(vector_id=vector_id, namespace=namespace)

if entry is not None:
    loaded_vector = entry.vector
    loaded_meta = entry.meta

    # Use the loaded vector and metadata
else:
    # Vector not found
    pass
```
|
||||
|
||||
### 4.6 Loading Entries <a name="loading-entries"></a>
|
||||
|
||||
To load all vectors from the Pinecone index, you can use the `load_entries` method. You can also specify an optional namespace.
|
||||
|
||||
```python
namespace = "your-namespace"

entries = pv.load_entries(namespace=namespace)

# Process the loaded entries
for entry in entries:
    vector_id = entry.id
    vector = entry.vector
    meta = entry.meta

    # Handle the loaded entries as needed
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Additional Information <a name="additional-information"></a>
|
||||
|
||||
In this section, we provide additional information and tips for using the `PineconeVector` class effectively.
|
||||
|
||||
- When upserting vectors, you can generate a unique vector ID using a hash of the vector's content to ensure uniqueness.
|
||||
- Consider using namespaces to organize and categorize vectors within your Pinecone index.
|
||||
- Pinecone provides powerful querying capabilities, so be sure to explore and leverage its features to retrieve relevant vectors efficiently.
|
||||
- Keep your Pinecone API key secure and follow Pinecone's best practices for API key management.
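
The first tip above can be sketched as follows. This is a minimal illustration, assuming that a SHA-256 digest over the packed float values is an acceptable way to derive an ID; the helper name `vector_id_from_content` is hypothetical and is not part of Swarms or Pinecone.

```python
import hashlib
import struct


def vector_id_from_content(vector, namespace=""):
    """Derive a deterministic ID from a vector's values (hypothetical helper)."""
    # Pack each component as a big-endian 64-bit float so the bytes are
    # stable across platforms.
    payload = struct.pack(f">{len(vector)}d", *vector)
    digest = hashlib.sha256(namespace.encode("utf-8") + b"\x00" + payload)
    return digest.hexdigest()


vector = [0.1, 0.2, 0.3, 0.4]
vector_id = vector_id_from_content(vector, namespace="your-namespace")
print(vector_id)  # A string of 64 hexadecimal characters
```

The resulting string could then be passed as `vector_id` to `upsert_vector`. Because identical vectors map to the same ID, upserting the same content twice overwrites the existing record instead of creating a duplicate.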
|
||||
|
||||
---
|
||||
|
||||
## 6. References and Resources <a name="references-and-resources"></a>
|
||||
|
||||
Here are some references and resources for further information on Pinecone and Swarms:
|
||||
|
||||
- [Pinecone Website](https://www.pinecone.io/): Official Pinecone website for documentation and resources.
|
||||
- [Pinecone Documentation](https://docs.pinecone.io/): Detailed documentation for Pinecone.
|
||||
- [Swarms GitHub Repository](https://github.com/kyegomez/swarms): Swarms library on GitHub for updates and contributions.
|
||||
|
||||
---
|
||||
|
||||
This concludes the documentation for the Swarms library and the `PineconeVector` class. You now have a deep understanding of how to use Swarms with Pinecone for vector storage. If you have any further questions or need assistance, please refer to the provided references and resources. Happy coding!
|
|
||||
# Qdrant Client Library
|
||||
|
||||
## Overview
|
||||
|
||||
The Qdrant Client Library is designed for interacting with the Qdrant vector database, allowing efficient storage and retrieval of high-dimensional vector data. It integrates with machine learning models for embedding and is particularly suited for search and recommendation systems.
|
||||
|
||||
## Installation
|
||||
|
||||
```bash
pip install qdrant-client sentence-transformers httpx
```
|
||||
|
||||
## Class Definition: Qdrant
|
||||
|
||||
```python
class Qdrant:
    def __init__(
        self,
        api_key: str,
        host: str,
        port: int = 6333,
        collection_name: str = "qdrant",
        model_name: str = "BAAI/bge-small-en-v1.5",
        https: bool = True,
    ):
        ...
```
|
||||
|
||||
### Constructor Parameters
|
||||
|
||||
| Parameter         | Type | Description                                   | Default Value              |
|-------------------|------|-----------------------------------------------|----------------------------|
| `api_key`         | str  | API key for authentication.                   | -                          |
| `host`            | str  | Host address of the Qdrant server.            | -                          |
| `port`            | int  | Port number for the Qdrant server.            | 6333                       |
| `collection_name` | str  | Name of the collection to be used or created. | "qdrant"                   |
| `model_name`      | str  | Name of the sentence transformer model.       | "BAAI/bge-small-en-v1.5"   |
| `https`           | bool | Flag to use HTTPS for connection.             | True                       |
|
||||
|
||||
### Methods
|
||||
|
||||
#### `_load_embedding_model(model_name: str)`
|
||||
|
||||
Loads the sentence embedding model.
|
||||
|
||||
#### `_setup_collection()`
|
||||
|
||||
Checks if the specified collection exists in Qdrant; if not, creates it.
|
||||
|
||||
#### `add_vectors(docs: List[dict]) -> OperationResponse`
|
||||
|
||||
Adds vectors to the Qdrant collection.
|
||||
|
||||
#### `search_vectors(query: str, limit: int = 3) -> SearchResult`
|
||||
|
||||
Searches the Qdrant collection for the vectors most similar to the query text and returns up to `limit` results.
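
To make "similar to the query" concrete, here is a small brute-force sketch of similarity search in plain Python. It assumes cosine similarity as the metric and uses hand-written 3-dimensional vectors in place of real embeddings. The function names are illustrative and are not part of the `Qdrant` class, which delegates this work to the Qdrant server.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_search(query_vector, stored, limit=3):
    """Return the `limit` best (score, payload) pairs, highest score first."""
    scored = [
        (cosine_similarity(query_vector, vec), payload) for vec, payload in stored
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Toy "embeddings"; a real setup would obtain these from the embedding model.
stored = [
    ([1.0, 0.0, 0.0], "Sample text 1"),
    ([0.0, 1.0, 0.0], "Sample text 2"),
    ([0.9, 0.1, 0.0], "Sample text 3"),
]
hits = brute_force_search([1.0, 0.0, 0.0], stored, limit=2)
for score, payload in hits:
    print(f"{score:.3f} {payload}")
```

A vector database replaces the linear scan with an approximate nearest-neighbour index, but the ranking idea is the same: score every candidate against the query and keep the best `limit` matches.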
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### Example 1: Setting Up the Qdrant Client
|
||||
|
||||
```python
|
||||
from qdrant_client import Qdrant
|
||||
|
||||
qdrant_client = Qdrant(api_key="your_api_key", host="localhost", port=6333)
|
||||
```
|
||||
|
||||
### Example 2: Adding Vectors to a Collection
|
||||
|
||||
```python
|
||||
documents = [{"page_content": "Sample text 1"}, {"page_content": "Sample text 2"}]
|
||||
|
||||
operation_info = qdrant_client.add_vectors(documents)
|
||||
print(operation_info)
|
||||
```
|
||||
|
||||
### Example 3: Searching for Vectors
|
||||
|
||||
```python
|
||||
search_result = qdrant_client.search_vectors("Sample search query")
|
||||
print(search_result)
|
||||
```
|
||||
|
||||
## Further Information
|
||||
|
||||
Refer to the [Qdrant Documentation](https://qdrant.tech/docs) for more details on the Qdrant vector database.
|
|
||||
# Short-Term Memory Module Documentation
|
||||
|
||||
## Introduction
|
||||
The Short-Term Memory module is a component of the SWARMS framework designed for managing short-term and medium-term memory in a multi-agent system. This documentation provides a detailed explanation of the Short-Term Memory module, its purpose, functions, and usage.
|
||||
|
||||
### Purpose
|
||||
The Short-Term Memory module serves the following purposes:
|
||||
1. To store and manage messages in short-term memory.
|
||||
2. To provide functions for retrieving, updating, and clearing memory.
|
||||
3. To facilitate searching for specific terms within the memory.
|
||||
4. To enable saving and loading memory data to/from a file.
|
||||
|
||||
### Class Definition
|
||||
```python
class ShortTermMemory(BaseStructure):
    def __init__(
        self,
        return_str: bool = True,
        autosave: bool = True,
        *args,
        **kwargs,
    ):
        ...
```
|
||||
|
||||
#### Parameters
|
||||
| Parameter           | Type | Default Value | Description                                                                                           |
|---------------------|------|---------------|-------------------------------------------------------------------------------------------------------|
| `return_str`        | bool | True          | If True, returns memory as a string.                                                                  |
| `autosave`          | bool | True          | If True, enables automatic saving of memory data to a file.                                           |
| `*args`, `**kwargs` |      |               | Additional arguments and keyword arguments (not used in the constructor but allowed for flexibility). |
|
||||
|
||||
### Functions
|
||||
|
||||
#### 1. `add`
|
||||
```python
|
||||
def add(self, role: str = None, message: str = None, *args, **kwargs):
|
||||
```
|
||||
|
||||
- Adds a message to the short-term memory.
|
||||
- Parameters:
|
||||
- `role` (str, optional): Role associated with the message.
|
||||
- `message` (str, optional): The message to be added.
|
||||
- Returns: The added memory.
|
||||
|
||||
##### Example 1: Adding a Message to Short-Term Memory
|
||||
```python
|
||||
memory.add(role="Agent 1", message="Received task assignment.")
|
||||
```
|
||||
|
||||
##### Example 2: Adding Multiple Messages to Short-Term Memory
|
||||
```python
messages = [("Agent 1", "Received task assignment."), ("Agent 2", "Task completed.")]
for role, message in messages:
    memory.add(role=role, message=message)
```
|
||||
|
||||
#### 2. `get_short_term`
|
||||
```python
|
||||
def get_short_term(self):
|
||||
```
|
||||
|
||||
- Retrieves the short-term memory.
|
||||
- Returns: The contents of the short-term memory.
|
||||
|
||||
##### Example: Retrieving Short-Term Memory
|
||||
```python
short_term_memory = memory.get_short_term()
for entry in short_term_memory:
    print(entry["role"], ":", entry["message"])
```
|
||||
|
||||
#### 3. `get_medium_term`
|
||||
```python
|
||||
def get_medium_term(self):
|
||||
```
|
||||
|
||||
- Retrieves the medium-term memory.
|
||||
- Returns: The contents of the medium-term memory.
|
||||
|
||||
##### Example: Retrieving Medium-Term Memory
|
||||
```python
medium_term_memory = memory.get_medium_term()
for entry in medium_term_memory:
    print(entry["role"], ":", entry["message"])
```
|
||||
|
||||
#### 4. `clear_medium_term`
|
||||
```python
|
||||
def clear_medium_term(self):
|
||||
```
|
||||
|
||||
- Clears the medium-term memory.
|
||||
|
||||
##### Example: Clearing Medium-Term Memory
|
||||
```python
|
||||
memory.clear_medium_term()
|
||||
```
|
||||
|
||||
#### 5. `get_short_term_memory_str`
|
||||
```python
|
||||
def get_short_term_memory_str(self, *args, **kwargs):
|
||||
```
|
||||
|
||||
- Retrieves the short-term memory as a string.
|
||||
- Returns: A string representation of the short-term memory.
|
||||
|
||||
##### Example: Getting Short-Term Memory as a String
|
||||
```python
|
||||
short_term_memory_str = memory.get_short_term_memory_str()
|
||||
print(short_term_memory_str)
|
||||
```
|
||||
|
||||
#### 6. `update_short_term`
|
||||
```python
|
||||
def update_short_term(self, index, role: str, message: str, *args, **kwargs):
|
||||
```
|
||||
|
||||
- Updates a message in the short-term memory.
|
||||
- Parameters:
|
||||
- `index` (int): The index of the message to update.
|
||||
- `role` (str): New role for the message.
|
||||
- `message` (str): New message content.
|
||||
- Returns: None.
|
||||
|
||||
##### Example: Updating a Message in Short-Term Memory
|
||||
```python
|
||||
memory.update_short_term(
|
||||
index=0, role="Updated Role", message="Updated message content."
|
||||
)
|
||||
```
|
||||
|
||||
#### 7. `clear`
|
||||
```python
|
||||
def clear(self):
|
||||
```
|
||||
|
||||
- Clears the short-term memory.
|
||||
|
||||
##### Example: Clearing Short-Term Memory
|
||||
```python
|
||||
memory.clear()
|
||||
```
|
||||
|
||||
#### 8. `search_memory`
|
||||
```python
|
||||
def search_memory(self, term):
|
||||
```
|
||||
|
||||
- Searches the memory for a specific term.
|
||||
- Parameters:
|
||||
- `term` (str): The term to search for.
|
||||
- Returns: A dictionary containing search results for short-term and medium-term memory.
|
||||
|
||||
##### Example: Searching Memory for a Term
|
||||
```python
|
||||
search_results = memory.search_memory("task")
|
||||
print("Short-Term Memory Results:", search_results["short_term"])
|
||||
print("Medium-Term Memory Results:", search_results["medium_term"])
|
||||
```
|
||||
|
||||
#### 9. `return_shortmemory_as_str`
|
||||
```python
|
||||
def return_shortmemory_as_str(self):
|
||||
```
|
||||
|
||||
- Returns the short-term memory as a string.
|
||||
|
||||
##### Example: Returning Short-Term Memory as a String
|
||||
```python
|
||||
short_term_memory_str = memory.return_shortmemory_as_str()
|
||||
print(short_term_memory_str)
|
||||
```
|
||||
|
||||
#### 10. `move_to_medium_term`
|
||||
```python
|
||||
def move_to_medium_term(self, index):
|
||||
```
|
||||
|
||||
- Moves a message from the short-term memory to the medium-term memory.
|
||||
- Parameters:
|
||||
- `index` (int): The index of the message to move.
|
||||
|
||||
##### Example: Moving a Message to Medium-Term Memory
|
||||
```python
|
||||
memory.move_to_medium_term(index=0)
|
||||
```
|
||||
|
||||
#### 11. `return_medium_memory_as_str`
|
||||
```python
|
||||
def return_medium_memory_as_str(self):
|
||||
```
|
||||
|
||||
- Returns the medium-term memory as a string.
|
||||
|
||||
##### Example: Returning Medium-Term Memory as a String
|
||||
```python
|
||||
medium_term_memory_str = memory.return_medium_memory_as_str()
|
||||
print(medium_term_memory_str)
|
||||
```
|
||||
|
||||
#### 12. `save_to_file`
|
||||
```python
|
||||
def save_to_file(self, filename: str):
|
||||
```
|
||||
|
||||
- Saves the memory data to a file.
|
||||
- Parameters:
|
||||
- `filename` (str): The name of the file to save the data to.
|
||||
|
||||
##### Example: Saving Memory Data to a File
|
||||
```python
|
||||
memory.save_to_file("memory_data.json")
|
||||
```
|
||||
|
||||
#### 13. `load_from_file`
|
||||
```python
|
||||
def load_from_file(self, filename: str, *args, **kwargs):
|
||||
```
|
||||
|
||||
- Loads memory data from a file.
|
||||
- Parameters:
|
||||
- `filename` (str): The name of the file to load data from.
|
||||
|
||||
##### Example: Loading Memory Data from a File
|
||||
```python
|
||||
memory.load_from_file("memory_data.json")
|
||||
```
|
||||
|
||||
### Additional Information and Tips
|
||||
|
||||
- To use the Short-Term Memory module effectively, consider the following tips:
  - Use the `add` function to store messages in short-term memory.
  - Retrieve memory contents using the `get_short_term` and `get_medium_term` functions.
  - Clear memory as needed using the `clear` and `clear_medium_term` functions.
  - Search for specific terms within the memory using the `search_memory` function.
  - Save and load memory data to/from files using the `save_to_file` and `load_from_file` functions.
|
||||
|
||||
- Ensure proper exception handling when using memory functions to handle potential errors gracefully.
|
||||
|
||||
- When using the `search_memory` function, iterate through the results dictionary to access search results for short-term and medium-term memory.
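
The behaviour described in this section can be sketched with a small stand-in class. This is not the Swarms implementation: `ToyShortTermMemory` is a hypothetical, simplified model. It assumes messages are stored as `{"role": ..., "message": ...}` dictionaries (as the retrieval examples above suggest) and that `search_memory` performs a plain, case-sensitive substring match.

```python
class ToyShortTermMemory:
    """Hypothetical stand-in that mirrors a few of the documented method names."""

    def __init__(self):
        self.short_term = []
        self.medium_term = []

    def add(self, role=None, message=None):
        entry = {"role": role, "message": message}
        self.short_term.append(entry)
        return entry

    def move_to_medium_term(self, index):
        # Remove the entry from short-term memory and append it to medium-term.
        self.medium_term.append(self.short_term.pop(index))

    def search_memory(self, term):
        # Assumption: a case-sensitive substring match on the message text.
        def matches(entries):
            return [e for e in entries if term in e["message"]]

        return {
            "short_term": matches(self.short_term),
            "medium_term": matches(self.medium_term),
        }


memory = ToyShortTermMemory()
memory.add(role="Agent 1", message="Received task assignment.")
memory.add(role="Agent 2", message="Task completed.")
memory.move_to_medium_term(index=0)

# Guard index-based calls, since an out-of-range index raises IndexError here.
try:
    memory.move_to_medium_term(index=5)
except IndexError:
    print("No message at that index")

# Iterate the results dictionary to reach both memory stores.
results = memory.search_memory("task")
for store, entries in results.items():
    print(store, "->", [e["message"] for e in entries])
```

In this toy version the search for `"task"` only matches the message that was moved to medium-term memory, because `"Task completed."` differs in case. Whether the real `search_memory` is case-sensitive is not stated in this documentation.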
|
||||
|
||||
### References and Resources
|
||||
|
||||
- For more information on multi-agent systems and memory management, refer to the SWARMS framework documentation: [SWARMS Documentation](https://swarms.apac.ai/).
|
||||
|
||||
- For advanced memory management and customization, explore the SWARMS framework source code.
|
||||
|
|
||||
# Weaviate API Client Documentation
|
||||
|
||||
## Overview
|
||||
|
||||
The Weaviate API Client is an interface to Weaviate, a vector database with a GraphQL API. This client allows you to interact with Weaviate programmatically, making it easier to create collections, add objects, query data, update objects, and delete objects within your Weaviate instance.
|
||||
|
||||
This documentation provides a comprehensive guide on how to use the Weaviate API Client, including its initialization, methods, and usage examples.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Installation](#installation)
- [Initialization](#initialization)
- [Methods](#methods)
  - [create_collection](#create-collection)
  - [add](#add)
  - [query](#query)
  - [update](#update)
  - [delete](#delete)
- [Examples](#examples)
|
||||
|
||||
## Installation
|
||||
|
||||
Before using the Weaviate API Client, make sure to install the `swarms` library. You can install it using pip:
|
||||
|
||||
```bash
|
||||
pip install swarms
|
||||
```
|
||||
|
||||
## Initialization
|
||||
|
||||
To use the Weaviate API Client, you need to initialize an instance of the `WeaviateDB` class. Here are the parameters you can pass to the constructor:
|
||||
|
||||
| Parameter            | Type                                | Description                                                                                                      |
|----------------------|-------------------------------------|------------------------------------------------------------------------------------------------------------------|
| `http_host`          | str                                 | The HTTP host of the Weaviate server.                                                                            |
| `http_port`          | str                                 | The HTTP port of the Weaviate server.                                                                            |
| `http_secure`        | bool                                | Whether to use HTTPS.                                                                                            |
| `grpc_host`          | Optional[str]                       | The gRPC host of the Weaviate server. (Optional)                                                                 |
| `grpc_port`          | Optional[str]                       | The gRPC port of the Weaviate server. (Optional)                                                                 |
| `grpc_secure`        | Optional[bool]                      | Whether to use gRPC over TLS. (Optional)                                                                         |
| `auth_client_secret` | Optional[Any]                       | The authentication client secret. (Optional)                                                                     |
| `additional_headers` | Optional[Dict[str, str]]            | Additional headers to send with requests. (Optional)                                                             |
| `additional_config`  | Optional[weaviate.AdditionalConfig] | Additional configuration for the client. (Optional)                                                              |
| `connection_params`  | Dict[str, Any]                      | Dictionary containing connection parameters. This parameter is used internally and can be ignored in most cases. |
|
||||
|
||||
Here's an example of how to initialize a WeaviateDB:
|
||||
|
||||
```python
|
||||
from swarms.memory import WeaviateDB
|
||||
|
||||
weaviate_client = WeaviateDB(
|
||||
http_host="YOUR_HTTP_HOST",
|
||||
http_port="YOUR_HTTP_PORT",
|
||||
http_secure=True,
|
||||
grpc_host="YOUR_gRPC_HOST",
|
||||
grpc_port="YOUR_gRPC_PORT",
|
||||
grpc_secure=True,
|
||||
auth_client_secret="YOUR_APIKEY",
|
||||
additional_headers={"X-OpenAI-Api-Key": "YOUR_OPENAI_APIKEY"},
|
||||
additional_config=None, # You can pass additional configuration here
|
||||
)
|
||||
```
|
||||
|
||||
## Methods
|
||||
|
||||
### `create_collection`
|
||||
|
||||
The `create_collection` method allows you to create a new collection in Weaviate. A collection is a container for storing objects with specific properties.
|
||||
|
||||
#### Parameters
|
||||
|
||||
- `name` (str): The name of the collection.
|
||||
- `properties` (List[Dict[str, Any]]): A list of dictionaries specifying the properties of objects to be stored in the collection.
|
||||
- `vectorizer_config` (Any, optional): Additional vectorizer configuration for the collection. (Optional)
|
||||
|
||||
#### Usage
|
||||
|
||||
```python
|
||||
weaviate_client.create_collection(
|
||||
name="my_collection",
|
||||
properties=[
|
||||
{"name": "property1", "dataType": ["string"]},
|
||||
{"name": "property2", "dataType": ["int"]},
|
||||
],
|
||||
vectorizer_config=None, # Optional vectorizer configuration
|
||||
)
|
||||
```
|
||||
|
||||
### `add`
|
||||
|
||||
The `add` method allows you to add an object to a specified collection in Weaviate.
|
||||
|
||||
#### Parameters
|
||||
|
||||
- `collection_name` (str): The name of the collection where the object will be added.
|
||||
- `properties` (Dict[str, Any]): A dictionary specifying the properties of the object to be added.
|
||||
|
||||
#### Usage
|
||||
|
||||
```python
|
||||
weaviate_client.add(
|
||||
collection_name="my_collection", properties={"property1": "value1", "property2": 42}
|
||||
)
|
||||
```
|
||||
|
||||
### `query`
|
||||
|
||||
The `query` method allows you to query objects from a specified collection in Weaviate.
|
||||
|
||||
#### Parameters
|
||||
|
||||
- `collection_name` (str): The name of the collection to query.
|
||||
- `query` (str): The query string specifying the search criteria.
|
||||
- `limit` (int, optional): The maximum number of results to return. (Default: 10)
|
||||
|
||||
#### Usage
|
||||
|
||||
```python
results = weaviate_client.query(
    collection_name="my_collection",
    query="property1:value1",
    limit=20,  # Optional, specify the limit if needed
)
```
|
||||
|
||||
### `update`
|
||||
|
||||
The `update` method allows you to update an object in a specified collection in Weaviate.
|
||||
|
||||
#### Parameters
|
||||
|
||||
- `collection_name` (str): The name of the collection where the object exists.
|
||||
- `object_id` (str): The ID of the object to be updated.
|
||||
- `properties` (Dict[str, Any]): A dictionary specifying the properties to update.
|
||||
|
||||
#### Usage
|
||||
|
||||
```python
|
||||
weaviate_client.update(
|
||||
collection_name="my_collection",
|
||||
object_id="object123",
|
||||
properties={"property1": "new_value", "property2": 99},
|
||||
)
|
||||
```
|
||||
|
||||
### `delete`
|
||||
|
||||
The `delete` method allows you to delete an object from a specified collection in Weaviate.
|
||||
|
||||
#### Parameters
|
||||
|
||||
- `collection_name` (str): The name of the collection from which to delete the object.
|
||||
- `object_id` (str): The ID of the object to delete.
|
||||
|
||||
#### Usage
|
||||
|
||||
```python
|
||||
weaviate_client.delete(collection_name="my_collection", object_id="object123")
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
Here are three examples demonstrating how to use the Weaviate API Client for common tasks:
|
||||
|
||||
### Example 1: Creating a Collection
|
||||
|
||||
```python
|
||||
weaviate_client.create_collection(
|
||||
name="people",
|
||||
properties=[
|
||||
{"name": "name", "dataType": ["string"]},
|
||||
{"name": "age", "dataType": ["int"]},
|
||||
],
|
||||
)
|
||||
```
|
||||
|
||||
### Example 2: Adding an Object
|
||||
|
||||
```python
|
||||
weaviate_client.add(collection_name="people", properties={"name": "John", "age": 30})
|
||||
```
|
||||
|
||||
### Example 3: Querying Objects
|
||||
|
||||
```python
|
||||
results = weaviate_client.query(collection_name="people", query="name:John", limit=5)
|
||||
```
|
||||
|
||||
These examples cover the basic operations of creating collections, adding objects, and querying objects using the Weaviate API Client.
|
||||
|
||||
## Additional Information and Tips
|
||||
|
||||
- If you encounter any errors during the operations, the client will raise exceptions with informative error messages.
|
||||
- You can explore more advanced features and configurations in the Weaviate documentation.
|
||||
- Make sure to handle authentication and security appropriately when using the client in production environments.
|
||||
|
||||
## References and Resources
|
||||
|
||||
- [Weaviate Documentation](https://weaviate.readthedocs.io/en/latest/): Official documentation for Weaviate.
|
||||
- [Weaviate GitHub Repository](https://github.com/semi-technologies/weaviate): The source code and issue tracker for Weaviate.
|
||||
|
||||
This documentation provides a comprehensive guide on using the Weaviate API Client to interact with Weaviate, making it easier to manage and query your data.
|
|
||||
# **Documentation for the `Anthropic` Class**
|
||||
|
||||
## **Overview and Introduction**
|
||||
|
||||
The `Anthropic` class provides an interface to interact with the Anthropic large language models. This class encapsulates the necessary functionality to request completions from the Anthropic API based on a provided prompt and other configurable parameters.
|
||||
|
||||
### **Key Concepts and Terminology**
|
||||
|
||||
- **Anthropic**: The AI company that develops the Claude family of large language models, which this class wraps.
|
||||
- **Prompt**: A piece of text that serves as the starting point for model completions.
|
||||
- **Stop Sequences**: Specific tokens or sequences to indicate when the model should stop generating.
|
||||
- **Tokens**: Discrete pieces of information in a text. For example, in English, a token can be as short as one character or as long as one word.
|
||||
|
||||
## **Class Definition**
|
||||
|
||||
### `Anthropic`
|
||||
```python
class Anthropic:
    """Anthropic large language models."""
```
|
||||
|
||||
### Parameters:
|
||||
|
||||
- `model (str)`: The name of the model to use for completions. Default is "claude-2".
|
||||
|
||||
- `max_tokens_to_sample (int)`: Maximum number of tokens to generate in the output. Default is 256.
|
||||
|
||||
- `temperature (float, optional)`: Sampling temperature. A higher value will make the output more random, while a lower value will make it more deterministic.
|
||||
|
||||
- `top_k (int, optional)`: Sample from the top-k most probable next tokens. Setting this parameter can reduce randomness in the output.
|
||||
|
||||
- `top_p (float, optional)`: Sample from the smallest set of tokens such that their cumulative probability exceeds the specified value. Used in nucleus sampling to provide a balance between randomness and determinism.
|
||||
|
||||
- `streaming (bool)`: Whether to stream the output or not. Default is False.
|
||||
|
||||
- `default_request_timeout (int, optional)`: Default timeout in seconds for API requests. Default is 600.
|
||||
|
||||
### **Methods and their Functionality**
|
||||
|
||||
#### `_default_params(self) -> dict`
|
||||
|
||||
- Provides the default parameters for calling the Anthropic API.
|
||||
|
||||
- **Returns**: A dictionary containing the default parameters.
|
||||
|
||||
#### `generate(self, prompt: str, stop: list[str] = None) -> str`
|
||||
|
||||
- Calls out to Anthropic's completion endpoint to generate text based on the given prompt.
|
||||
|
||||
- **Parameters**:
|
||||
- `prompt (str)`: The input text to provide context for the generated text.
|
||||
|
||||
- `stop (list[str], optional)`: Sequences to indicate when the model should stop generating.
|
||||
|
||||
- **Returns**: A string containing the model's generated completion based on the prompt.
|
||||
|
||||
#### `__call__(self, prompt: str, stop: list[str] = None) -> str`
|
||||
|
||||
- An alternative to the `generate` method that allows calling the class instance directly.
|
||||
|
||||
- **Parameters**:
|
||||
- `prompt (str)`: The input text to provide context for the generated text.
|
||||
|
||||
- `stop (list[str], optional)`: Sequences to indicate when the model should stop generating.
|
||||
|
||||
- **Returns**: A string containing the model's generated completion based on the prompt.
|
||||
|
||||
## **Usage Examples**
|
||||
|
||||
```python
|
||||
# Import necessary modules and classes
|
||||
from swarms.models import Anthropic
|
||||
|
||||
# Initialize an instance of the Anthropic class
|
||||
model = Anthropic(anthropic_api_key="")
|
||||
|
||||
# Using the run method
|
||||
completion_1 = model.run("What is the capital of France?")
|
||||
print(completion_1)
|
||||
|
||||
# Using the __call__ method
|
||||
completion_2 = model("How far is the moon from the earth?", stop=["miles", "km"])
|
||||
print(completion_2)
|
||||
```
|
||||
|
||||
## **Mathematical Formula**
|
||||
|
||||
The underlying operations of the `Anthropic` class involve probabilistic sampling based on token logits from the Anthropic model. Mathematically, the process of generating a token \( t \) from the given logits \( l \) can be described by the softmax function:
|
||||
|
||||
\[ P(t) = \frac{e^{l_t}}{\sum_{i} e^{l_i}} \]
|
||||
|
||||
Where:
|
||||
- \( P(t) \) is the probability of token \( t \).
|
||||
- \( l_t \) is the logit corresponding to token \( t \).
|
||||
- The summation runs over all possible tokens.
|
||||
|
||||
The temperature, top-k, and top-p parameters are further used to modulate the probabilities.
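
As a rough illustration of how these parameters can reshape the distribution, the sketch below applies temperature scaling, then top-k and top-p filtering, to a list of logits in plain Python. It is a toy under stated assumptions: the order of the filters and the renormalisation step follow common convention and do not describe Anthropic's internals, and the function names are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """P(t) = exp(l_t / T) / sum_i exp(l_i / T)."""
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # Subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(probs, top_k=None, top_p=None):
    """Zero out tokens outside the top-k / nucleus set, then renormalise."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        order = kept
    total = sum(probs[i] for i in order)
    return [probs[i] / total if i in order else 0.0 for i in range(len(probs))]


logits = [2.0, 1.0, 0.0, -1.0]
probs = softmax(logits)
sharper = softmax(logits, temperature=0.5)  # Lower temperature concentrates mass
filtered = filter_top_k_top_p(probs, top_k=3, top_p=0.9)
print([round(p, 3) for p in probs])
print([round(p, 3) for p in sharper])
print([round(p, 3) for p in filtered])
```

Dividing the logits by a temperature below 1 widens the gaps between them, so the most likely token takes a larger share. Top-k and top-p then discard the low-probability tail before a token is sampled from what remains.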
|
||||
|
||||
## **Additional Information and Tips**
|
||||
|
||||
- Ensure you have a valid `ANTHROPIC_API_KEY` set as an environment variable or passed during class instantiation.
|
||||
|
||||
- Always handle exceptions that may arise from API timeouts or invalid prompts.
|
||||
|
||||
## **References and Resources**
|
||||
|
||||
- [Anthropic's official documentation](https://www.anthropic.com/docs)
|
||||
|
||||
- [The Curious Case of Neural Text Degeneration](https://arxiv.org/abs/1904.09751), the paper that introduced nucleus (top-p) sampling, for a deeper understanding of token sampling.
|
|
||||
# Language Model Interface Documentation
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Introduction](#introduction)
2. [Abstract Language Model](#abstract-language-model)
    - [Initialization](#initialization)
    - [Attributes](#attributes)
    - [Methods](#methods)
3. [Implementation](#implementation)
4. [Usage Examples](#usage-examples)
5. [Additional Features](#additional-features)
6. [Performance Metrics](#performance-metrics)
7. [Logging and Checkpoints](#logging-and-checkpoints)
8. [Resource Utilization Tracking](#resource-utilization-tracking)
9. [Conclusion](#conclusion)
|
||||
|
||||
---
|
||||
|
||||
## 1. Introduction <a name="introduction"></a>
|
||||
|
||||
The Language Model Interface (`BaseLLM`) is a flexible and extensible framework for working with various language models. This documentation provides a comprehensive guide to the interface, its attributes, methods, and usage examples. Whether you're using a pre-trained language model or building your own, this interface can help streamline the process of text generation, chatbots, summarization, and more.
|
||||
|
||||
## 2. Abstract Language Model <a name="abstract-language-model"></a>
|
||||
|
||||
### Initialization <a name="initialization"></a>
|
||||
|
||||
The `BaseLLM` class provides a common interface for language models. It can be initialized with various parameters to customize model behavior. Here are the initialization parameters:
|
||||
|
||||
| Parameter              | Description                                                              | Default Value |
|------------------------|--------------------------------------------------------------------------|---------------|
| `model_name`           | The name of the language model to use.                                   | None          |
| `max_tokens`           | The maximum number of tokens in the generated text.                      | None          |
| `temperature`          | The temperature parameter for controlling randomness in text generation. | None          |
| `top_k`                | The top-k parameter for filtering words in text generation.              | None          |
| `top_p`                | The top-p parameter for filtering words in text generation.              | None          |
| `system_prompt`        | A system-level prompt to set context for generation.                     | None          |
| `beam_width`           | The beam width for beam search.                                          | None          |
| `num_return_sequences` | The number of sequences to return in the output.                         | None          |
| `seed`                 | The random seed for reproducibility.                                     | None          |
| `frequency_penalty`    | The frequency penalty parameter for promoting word diversity.            | None          |
| `presence_penalty`     | The presence penalty parameter for discouraging repetitions.             | None          |
| `stop_token`           | A stop token to indicate the end of generated text.                      | None          |
| `length_penalty`       | The length penalty parameter for controlling the output length.          | None          |
| `role`                 | The role of the language model (e.g., assistant, user, etc.).            | None          |
| `max_length`           | The maximum length of generated sequences.                               | None          |
| `do_sample`            | Whether to use sampling during text generation.                          | None          |
| `early_stopping`       | Whether to use early stopping during text generation.                    | None          |
| `num_beams`            | The number of beams to use in beam search.                               | None          |
| `repition_penalty`     | The repetition penalty parameter for discouraging repeated tokens.       | None          |
| `pad_token_id`         | The token ID for padding.                                                | None          |
| `eos_token_id`         | The token ID for the end of a sequence.                                  | None          |
| `bos_token_id`         | The token ID for the beginning of a sequence.                            | None          |
| `device`               | The device to run the model on (e.g., 'cpu' or 'cuda').                  | None          |
|
||||
|
||||
### Attributes <a name="attributes"></a>
|
||||
|
||||
- `model_name`: The name of the language model being used.
|
||||
- `max_tokens`: The maximum number of tokens in generated text.
|
||||
- `temperature`: The temperature parameter controlling randomness.
|
||||
- `top_k`: The top-k parameter for word filtering.
|
||||
- `top_p`: The top-p parameter for word filtering.
|
||||
- `system_prompt`: A system-level prompt for context.
|
||||
- `beam_width`: The beam width for beam search.
|
||||
- `num_return_sequences`: The number of output sequences.
|
||||
- `seed`: The random seed for reproducibility.
|
||||
- `frequency_penalty`: The frequency penalty parameter.
|
||||
- `presence_penalty`: The presence penalty parameter.
|
||||
- `stop_token`: The stop token to indicate text end.
|
||||
- `length_penalty`: The length penalty parameter.
|
||||
- `role`: The role of the language model.
|
||||
- `max_length`: The maximum length of generated sequences.
|
||||
- `do_sample`: Whether to use sampling during generation.
|
||||
- `early_stopping`: Whether to use early stopping.
|
||||
- `num_beams`: The number of beams in beam search.
|
||||
- `repition_penalty`: The repetition penalty parameter.
|
||||
- `pad_token_id`: The token ID for padding.
|
||||
- `eos_token_id`: The token ID for the end of a sequence.
|
||||
- `bos_token_id`: The token ID for the beginning of a sequence.
|
||||
- `device`: The device used for model execution.
|
||||
- `history`: A list of conversation history.
|
||||
|
||||
### Methods <a name="methods"></a>
|
||||
|
||||
The `BaseLLM` class defines several methods for working with language models:
|
||||
|
||||
- `run(task: Optional[str] = None, *args, **kwargs) -> str`: Generate text using the language model. This method is abstract and must be implemented by subclasses.
|
||||
|
||||
- `arun(task: Optional[str] = None, *args, **kwargs)`: An asynchronous version of `run` for concurrent text generation.
|
||||
|
||||
- `batch_run(tasks: List[str], *args, **kwargs)`: Generate text for a batch of tasks.
|
||||
|
||||
- `abatch_run(tasks: List[str], *args, **kwargs)`: An asynchronous version of `batch_run` for concurrent batch generation.
|
||||
|
||||
- `chat(task: str, history: str = "") -> str`: Conduct a chat with the model, providing a conversation history.
|
||||
|
||||
- `__call__(task: str) -> str`: Call the model to generate text.
|
||||
|
||||
- `_tokens_per_second() -> float`: Calculate tokens generated per second.
|
||||
|
||||
- `_num_tokens(text: str) -> int`: Calculate the number of tokens in a text.
|
||||
|
||||
- `_time_for_generation(task: str) -> float`: Measure the time taken for text generation.
|
||||
|
||||
- `generate_summary(text: str) -> str`: Generate a summary of the provided text.
|
||||
|
||||
- `set_temperature(value: float)`: Set the temperature parameter.
|
||||
|
||||
- `set_max_tokens(value: int)`: Set the maximum number of tokens.
|
||||
|
||||
- `clear_history()`: Clear the conversation history.
|
||||
|
||||
- `enable_logging(log_file: str = "model.log")`: Initialize logging for the model.
|
||||
|
||||
- `log_event(message: str)`: Log an event.
|
||||
|
||||
- `save_checkpoint(checkpoint_dir: str = "checkpoints")`: Save the model state as a checkpoint.
|
||||
|
||||
- `load_checkpoint(checkpoint_path: str)`: Load the model state from a checkpoint.
|
||||
|
||||
- `toggle_creative_mode(enable: bool)`: Toggle creative mode for the model.
|
||||
|
||||
- `track_resource_utilization()`: Track and report resource utilization.
|
||||
|
||||
- `get_generation_time() -> float`: Get the time taken for text generation.
|
||||
|
||||
- `set_max_length(max_length: int)`: Set the maximum length of generated sequences.
|
||||
|
||||
- `set_model_name(model_name: str)`: Set the model name.
|
||||
|
||||
- `set_frequency_penalty(frequency_penalty: float)`: Set the frequency penalty parameter.
|
||||
|
||||
- `set_presence_penalty(presence_penalty: float)`: Set the presence penalty parameter.
|
||||
|
||||
- `set_stop_token(stop_token: str)`: Set the stop token.
|
||||
|
||||
- `set_length_penalty(length_penalty: float)`: Set the length penalty parameter.
|
||||
|
||||
- `set_role(role: str)`: Set the role of the model.
|
||||
|
||||
- `set_top_k(top_k: int)`: Set the top-k parameter.
|
||||
|
||||
- `set_top_p(top_p: float)`: Set the top-p parameter.
|
||||
|
||||
- `set_num_beams(num_beams: int)`: Set the number of beams.
|
||||
|
||||
- `set_do_sample(do_sample: bool)`: Set whether to use sampling.
|
||||
|
||||
- `set_early_stopping(early_stopping: bool)`: Set whether to use early stopping.
|
||||
|
||||
- `set_seed(seed: int)`: Set the random seed.
|
||||
|
||||
- `set_device(device: str)`: Set the device for model execution.
|
||||
|
||||
## 3. Implementation <a name="implementation"></a>
|
||||
|
||||
The `BaseLLM` class serves as the base for implementing specific language models. Subclasses of `BaseLLM` should implement the `run` method to define how text is generated for a given task. This design allows flexibility in integrating different language models while maintaining a common interface.
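The subclassing pattern described above can be sketched as follows. To keep the sketch runnable without Swarms installed, it uses a small stand-in abstract class; `SketchBaseLLM` and `EchoLLM` are hypothetical names, and the upper-casing "generation" is only a placeholder for a real model call.

```python
from abc import ABC, abstractmethod
from typing import Optional


class SketchBaseLLM(ABC):
    """Hypothetical stand-in for BaseLLM; holds only what the sketch needs."""

    def __init__(self, model_name: Optional[str] = None, max_tokens: Optional[int] = None):
        self.model_name = model_name
        self.max_tokens = max_tokens

    @abstractmethod
    def run(self, task: Optional[str] = None, *args, **kwargs) -> str:
        """Subclasses define how text is generated for a task."""

    def __call__(self, task: str) -> str:
        # Calling the model directly delegates to run().
        return self.run(task)


class EchoLLM(SketchBaseLLM):
    """Toy subclass: 'generates' text by upper-casing the task."""

    def run(self, task: Optional[str] = None, *args, **kwargs) -> str:
        text = (task or "").upper()
        # Keep at most max_tokens whitespace-separated words, if a limit is set.
        if self.max_tokens is not None:
            text = " ".join(text.split()[: self.max_tokens])
        return text


model = EchoLLM(model_name="echo", max_tokens=3)
print(model("hello there general kenobi"))
```

Only `run` has to be supplied by the subclass; shared behavior such as `__call__` lives in the base class, which is the point of the common interface.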
|
||||
|
||||
## 4. Usage Examples <a name="usage-examples"></a>
|
||||
|
||||
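To demonstrate how to use the `BaseLLM` interface, let's create an example using a hypothetical language model. Because `run` is abstract, in practice you would instantiate a concrete subclass that implements it; `BaseLLM` appears below only to show the shared constructor parameters and the call pattern.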
|
||||
|
||||
```python
|
||||
# Import the BaseLLM class
|
||||
from swarms.models import BaseLLM
|
||||
|
||||
# Create an instance of the language model
|
||||
language_model = BaseLLM(
|
||||
model_name="my_language_model",
|
||||
max_tokens=50,
|
||||
temperature=0.7,
|
||||
top_k=50,
|
||||
top_p=0.9,
|
||||
device="cuda",
|
||||
)
|
||||
|
||||
# Generate text for a task
|
||||
task = "Translate the following English text to French: 'Hello, world.'"
|
||||
generated_text = language_model.run(task)
|
||||
|
||||
# Print the generated text
|
||||
print(generated_text)
|
||||
```
|
||||
|
||||
In this example, we've created an instance of our hypothetical language model, configured its parameters, and used the `run` method to generate text for a translation task.
|
||||
|
||||
## 5. Additional Features <a name="additional-features"></a>
|
||||
|
||||
The `BaseLLM` interface provides additional features for customization and control:
|
||||
|
||||
- `batch_run`: Generate text for a batch of tasks efficiently.
|
||||
- `arun` and `abatch_run`: Asynchronous versions of `run` and `batch_run` for concurrent text generation.
|
||||
- `chat`: Conduct a conversation with the model by providing a history of the conversation.
|
||||
- `__call__`: Allow the model to be called directly to generate text.
|
||||
|
||||
These features enhance the flexibility and utility of the interface in various applications, including chatbots, language translation, and content generation.
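One way such batch and asynchronous helpers can be layered on top of a blocking `run` is sketched below. This is an illustration under stated assumptions, not the library's implementation: the module-level `run` is a hypothetical stand-in that reverses its input, and the thread pool and `asyncio.to_thread` (Python 3.9+) are choices made for the sketch.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import List


def run(task: str) -> str:
    """Hypothetical stand-in for a model's run(); reverses the task text."""
    return task[::-1]


def batch_run(tasks: List[str], max_workers: int = 4) -> List[str]:
    # executor.map preserves input order in its results.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(run, tasks))


async def arun(task: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(run, task)


async def abatch_run(tasks: List[str]) -> List[str]:
    # gather returns results in the order the awaitables were passed.
    return await asyncio.gather(*(arun(t) for t in tasks))


batch_results = batch_run(["abc", "swarm"])
async_results = asyncio.run(abatch_run(["abc", "swarm"]))
print(batch_results, async_results)
```

Both helpers return results in input order, so callers can pair each response with its task by index.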
|
||||
|
||||
## 6. Performance Metrics <a name="performance-metrics"></a>
|
||||
|
||||
The `BaseLLM` class offers methods for tracking performance metrics:
|
||||
|
||||
- `_tokens_per_second`: Calculate tokens generated per second.
|
||||
- `_num_tokens`: Calculate the number of tokens in a text.
|
||||
- `_time_for_generation`: Measure the time taken for text generation.
|
||||
|
||||
These metrics help assess the efficiency and speed of text generation, enabling optimizations as needed.
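As a rough sketch of how these metrics could be computed, the block below times a generation call and derives a throughput figure. The function names mirror the methods listed above but are free functions written for this example; counting whitespace-separated words as tokens is an assumption, and a real model would use its own tokenizer.

```python
import time
from typing import Callable, Tuple


def num_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def time_for_generation(generate: Callable[[str], str], task: str) -> Tuple[str, float]:
    # Return the output together with the elapsed wall-clock time in seconds.
    start = time.perf_counter()
    output = generate(task)
    return output, time.perf_counter() - start


def tokens_per_second(token_count: int, elapsed: float) -> float:
    # Guard against a zero elapsed time on very fast calls.
    return token_count / elapsed if elapsed > 0 else float("inf")


def fake_generate(task: str) -> str:
    time.sleep(0.01)  # simulate model latency
    return "one two three four"


output, elapsed = time_for_generation(fake_generate, "count to four")
print(num_tokens(output), round(tokens_per_second(num_tokens(output), elapsed), 1))
```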
|
||||
|
||||
## 7. Logging and Checkpoints <a name="logging-and-checkpoints"></a>
|
||||
|
||||
Logging and checkpointing are crucial for tracking model behavior and ensuring reproducibility:
|
||||
|
||||
- `enable_logging`: Initialize logging for the model.
|
||||
- `log_event`: Log events and activities.
|
||||
- `save_checkpoint`: Save the model state as a checkpoint.
|
||||
- `load_checkpoint`: Load the model state from a checkpoint.
|
||||
|
||||
These capabilities aid in debugging, monitoring, and resuming model experiments.
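A minimal sketch of what logging and checkpointing might look like is shown below. It assumes the checkpoint is just the model's configuration serialized as JSON; the file name `checkpoint.json`, the logger name, and the free functions are illustrative choices and do not describe how `BaseLLM` stores its state.

```python
import json
import logging
import os
import tempfile


def enable_logging(log_file: str) -> logging.Logger:
    # Attach a file handler to a named logger.
    logger = logging.getLogger("sketch_model")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_file)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def save_checkpoint(state: dict, checkpoint_dir: str) -> str:
    # Write the state as JSON and return the path of the checkpoint file.
    os.makedirs(checkpoint_dir, exist_ok=True)
    path = os.path.join(checkpoint_dir, "checkpoint.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(state, f)
    return path


def load_checkpoint(checkpoint_path: str) -> dict:
    with open(checkpoint_path, "r", encoding="utf-8") as f:
        return json.load(f)


workdir = tempfile.mkdtemp()
logger = enable_logging(os.path.join(workdir, "model.log"))
state = {"model_name": "my_language_model", "temperature": 0.7, "max_tokens": 50}
path = save_checkpoint(state, os.path.join(workdir, "checkpoints"))
logger.info("Saved checkpoint to %s", path)
restored = load_checkpoint(path)
print(restored == state)
```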
|
||||
|
||||
## 8. Resource Utilization Tracking <a name="resource-utilization-tracking"></a>
|
||||
|
||||
The `track_resource_utilization` method is a placeholder for tracking and reporting resource utilization, such as CPU and memory usage. It can be customized to suit specific monitoring needs.
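Since the method is a placeholder, one possible customization is sketched here using only the standard library. The free function `track_resource_utilization` below is hypothetical: it wraps a callable and reports wall-clock time, CPU time, and peak Python memory allocation via `tracemalloc`. GPU usage and system-wide metrics are out of scope for this sketch.

```python
import time
import tracemalloc
from typing import Callable, Dict


def track_resource_utilization(fn: Callable[[], object]) -> Dict[str, float]:
    # Measure wall time, CPU time, and peak traced memory while fn runs.
    tracemalloc.start()
    cpu_start = time.process_time()
    wall_start = time.perf_counter()
    fn()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"wall_seconds": wall, "cpu_seconds": cpu, "peak_bytes": float(peak)}


# Stand-in workload; a real call would wrap model.run(task).
report = track_resource_utilization(lambda: [i * i for i in range(100_000)])
print(report)
```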
|
||||
|
||||
## 9. Conclusion <a name="conclusion"></a>
|
||||
|
||||
The Language Model Interface (`BaseLLM`) is a versatile framework for working with language models. Whether you're using pre-trained models or developing your own, this interface provides a consistent and extensible foundation. By following the provided guidelines and examples, you can integrate and customize language models for various natural language processing tasks.
|
@ -0,0 +1,299 @@
|
||||
# `BaseMultiModalModel` Documentation
|
||||
|
||||
Swarms is a Python library that provides a framework for running multimodal AI models. It allows you to combine text and image inputs and generate coherent and context-aware responses. This library is designed to be extensible, allowing you to integrate various multimodal models.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Introduction](#introduction)
|
||||
2. [Installation](#installation)
|
||||
3. [Getting Started](#getting-started)
|
||||
4. [BaseMultiModalModel Class](#basemultimodalmodel-class)
|
||||
   - [Initialization](#initialization)
   - [Methods](#methods)
|
||||
5. [Usage Examples](#usage-examples)
|
||||
6. [Additional Tips](#additional-tips)
|
||||
7. [References and Resources](#references-and-resources)
|
||||
|
||||
## 1. Introduction <a name="introduction"></a>
|
||||
|
||||
Swarms is designed to simplify the process of working with multimodal AI models. These models are capable of understanding and generating content based on both textual and image inputs. With this library, you can run such models and receive context-aware responses.
|
||||
|
||||
## 2. Installation <a name="installation"></a>
|
||||
|
||||
To install swarms, you can use pip:
|
||||
|
||||
```bash
|
||||
pip install swarms
|
||||
```
|
||||
|
||||
## 3. Getting Started <a name="getting-started"></a>
|
||||
|
||||
To get started with Swarms, you'll need to import the library and create an instance of the `BaseMultiModalModel` class. This class serves as the foundation for running multimodal models.
|
||||
|
||||
```python
|
||||
from swarms.models import BaseMultiModalModel
|
||||
|
||||
model = BaseMultiModalModel(
|
||||
model_name="your_model_name",
|
||||
temperature=0.5,
|
||||
max_tokens=500,
|
||||
max_workers=10,
|
||||
top_p=1,
|
||||
top_k=50,
|
||||
beautify=False,
|
||||
device="cuda",
|
||||
max_new_tokens=500,
|
||||
retries=3,
|
||||
)
|
||||
```
|
||||
|
||||
You can customize the initialization parameters based on your model's requirements.
|
||||
|
||||
## 4. BaseMultiModalModel Class <a name="basemultimodalmodel-class"></a>
|
||||
|
||||
### Initialization <a name="initialization"></a>
|
||||
|
||||
The `BaseMultiModalModel` class is initialized with several parameters that control its behavior. Here's a breakdown of the initialization parameters:
|
||||
|
||||
| Parameter        | Description                                                                | Default Value |
|------------------|----------------------------------------------------------------------------|---------------|
| `model_name`     | The name of the multimodal model to use.                                   | None          |
| `temperature`    | The temperature parameter for controlling randomness in text generation.   | 0.5           |
| `max_tokens`     | The maximum number of tokens in the generated text.                        | 500           |
| `max_workers`    | The maximum number of concurrent workers for running tasks.                | 10            |
| `top_p`          | The top-p parameter for filtering words in text generation.                | 1             |
| `top_k`          | The top-k parameter for filtering words in text generation.                | 50            |
| `beautify`       | Whether to beautify the output text.                                       | False         |
| `device`         | The device to run the model on (e.g., 'cuda' or 'cpu').                    | 'cuda'        |
| `max_new_tokens` | The maximum number of new tokens allowed in generated responses.           | 500           |
| `retries`        | The number of retries in case of an error during text generation.          | 3             |
| `system_prompt`  | A system-level prompt to set context for generation.                       | None          |
| `meta_prompt`    | A meta prompt to provide guidance for including image labels in responses. | None          |
|
||||
|
||||
### Methods <a name="methods"></a>
|
||||
|
||||
The `BaseMultiModalModel` class defines various methods for running multimodal models and managing interactions:
|
||||
|
||||
- `run(task: str, img: str) -> str`: Run the multimodal model with a text task and an image URL to generate a response.
|
||||
|
||||
- `arun(task: str, img: str) -> str`: Run the multimodal model asynchronously with a text task and an image URL to generate a response.
|
||||
|
||||
- `get_img_from_web(img: str) -> Image`: Fetch an image from a URL and return it as a PIL Image.
|
||||
|
||||
- `encode_img(img: str) -> str`: Encode an image to base64 format.
|
||||
|
||||
- `get_img(img: str) -> Image`: Load an image from the local file system and return it as a PIL Image.
|
||||
|
||||
- `clear_chat_history()`: Clear the chat history maintained by the model.
|
||||
|
||||
- `run_many(tasks: List[str], imgs: List[str]) -> List[str]`: Run the model on multiple text tasks and image URLs concurrently and return a list of responses.
|
||||
|
||||
- `run_batch(tasks_images: List[Tuple[str, str]]) -> List[str]`: Process a batch of text tasks and image URLs and return a list of responses.
|
||||
|
||||
- `run_batch_async(tasks_images: List[Tuple[str, str]]) -> List[str]`: Process a batch of text tasks and image URLs asynchronously and return a list of responses.
|
||||
|
||||
- `run_batch_async_with_retries(tasks_images: List[Tuple[str, str]]) -> List[str]`: Process a batch of text tasks and image URLs asynchronously with retries in case of errors and return a list of responses.
|
||||
|
||||
- `unique_chat_history() -> List[str]`: Get the unique chat history stored by the model.
|
||||
|
||||
- `run_with_retries(task: str, img: str) -> str`: Run the model with retries in case of an error.
|
||||
|
||||
- `run_batch_with_retries(tasks_images: List[Tuple[str, str]]) -> List[str]`: Run a batch of tasks with retries in case of errors and return a list of responses.
|
||||
|
||||
- `_tokens_per_second() -> float`: Calculate the tokens generated per second during text generation.
|
||||
|
||||
- `_time_for_generation(task: str) -> float`: Measure the time taken for text generation for a specific task.
|
||||
|
||||
- `generate_summary(text: str) -> str`: Generate a summary of the provided text.
|
||||
|
||||
- `set_temperature(value: float)`: Set the temperature parameter for controlling randomness in text generation.
|
||||
|
||||
- `set_max_tokens(value: int)`: Set the maximum number of tokens allowed in generated responses.
|
||||
|
||||
- `get_generation_time() -> float`: Get the time taken for text generation for the last task.
|
||||
|
||||
- `get_chat_history() -> List[str]`: Get the chat history, including all interactions.
|
||||
|
||||
- `get_unique_chat_history() -> List[str]`: Get the unique chat history, removing duplicate interactions.
|
||||
|
||||
- `get_chat_history_length() -> int`: Get the length of the chat history.
|
||||
|
||||
- `get_unique_chat_history_length() -> int`: Get the length of the unique chat history.
|
||||
|
||||
- `get_chat_history_tokens() -> int`: Get the total number of tokens in the chat history.
|
||||
|
||||
- `print_beautiful(content: str, color: str = 'cyan')`: Print content beautifully using colored text.
|
||||
|
||||
- `stream(content: str)`: Stream the content, printing it character by character.
|
||||
|
||||
- `meta_prompt() -> str`: Get the meta prompt that provides guidance for including image labels in responses.
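To make one of these helpers concrete, the sketch below shows how an image file could be base64-encoded with the standard library, which is the kind of work `encode_img` is described as doing. This is an assumption-laden illustration: the free function here reads from a local path, and a few placeholder bytes in a temporary file stand in for a real image.

```python
import base64
import os
import tempfile


def encode_img(img_path: str) -> str:
    # Read the raw bytes and return them as a base64-encoded ASCII string.
    with open(img_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


# Write a few bytes to a temporary file to stand in for an image.
fd, tmp_path = tempfile.mkstemp(suffix=".bin")
with os.fdopen(fd, "wb") as f:
    f.write(b"fake-image-bytes")

encoded = encode_img(tmp_path)
print(encoded)
os.remove(tmp_path)
```

Base64 text can be embedded in JSON request bodies, which is why image inputs are often encoded this way before being sent to a model API.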
|
||||
|
||||
## 5. Usage Examples <a name="usage-examples"></a>
|
||||
|
||||
Let's explore some usage examples of the `BaseMultiModalModel` class:
|
||||
|
||||
### Example 1: Running the Model
|
||||
|
||||
```python
|
||||
# Import the library
|
||||
from swarms.models import BaseMultiModalModel
|
||||
|
||||
# Create an instance of the model
|
||||
model = BaseMultiModalModel(
|
||||
model_name="your_model_name",
|
||||
temperature=0.5,
|
||||
max_tokens=500,
|
||||
device="cuda",
|
||||
)
|
||||
|
||||
# Run the model with a text task and an image URL
|
||||
response = model.run(
|
||||
"Generate a summary of this text", "https://www.example.com/image.jpg"
|
||||
)
|
||||
print(response)
|
||||
```
|
||||
|
||||
### Example 2: Running Multiple Tasks Concurrently
|
||||
|
||||
```python
|
||||
# Import the library
|
||||
from swarms.models import BaseMultiModalModel
|
||||
|
||||
# Create an instance of the model
|
||||
model = BaseMultiModalModel(
|
||||
model_name="your_model_name",
|
||||
temperature=0.5,
|
||||
max_tokens=500,
|
||||
max_workers=4,
|
||||
device="cuda",
|
||||
)
|
||||
|
||||
# Define a list of tasks and image URLs
|
||||
tasks = ["Task 1", "Task 2", "Task 3"]
|
||||
images = ["https://image1.jpg", "https://image2.jpg", "https://image3.jpg"]
|
||||
|
||||
# Run the model on multiple tasks concurrently
|
||||
responses = model.run_many(tasks, images)
|
||||
for response in responses:
    print(response)
|
||||
```
|
||||
|
||||
### Example 3: Running the Model Asynchronously
|
||||
|
||||
```python
|
||||
# Import the library
|
||||
from swarms.models import BaseMultiModalModel
|
||||
|
||||
# Create an instance of the model
|
||||
model = BaseMultiModalModel(
|
||||
model_name="your_model_name",
|
||||
temperature=0.5,
|
||||
max_tokens=500,
|
||||
device="cuda",
|
||||
)
|
||||
|
||||
# Define a list of tasks and image URLs
|
||||
tasks_images = [
|
||||
("Task 1", "https://image1.jpg"),
|
||||
("Task 2", "https://image2.jpg"),
|
||||
("Task 3", "https://image3.jpg"),
|
||||
]
|
||||
|
||||
# Run the model on multiple tasks asynchronously
|
||||
responses = model.run_batch_async(tasks_images)
|
||||
for response in responses:
    print(response)
|
||||
```
|
||||
|
||||
### Example 4: Inheriting from `BaseMultiModalModel` to reuse its prebuilt features
|
||||
```python
|
||||
from swarms.models import BaseMultiModalModel
|
||||
|
||||
|
||||
class CustomMultiModalModel(BaseMultiModalModel):
    def __init__(self, model_name, custom_parameter, *args, **kwargs):
        # Call the parent class constructor
        super().__init__(model_name=model_name, *args, **kwargs)
        # Initialize custom parameters specific to your model
        self.custom_parameter = custom_parameter

    def __call__(self, text, img):
        # Implement the multimodal model logic here
        # You can use self.custom_parameter and other inherited attributes
        pass

    def generate_summary(self, text):
        # Implement the summary generation logic using your model
        # You can use self.custom_parameter and other inherited attributes
        pass
|
||||
|
||||
|
||||
# Create an instance of your custom multimodal model
|
||||
custom_model = CustomMultiModalModel(
|
||||
model_name="your_custom_model_name",
|
||||
custom_parameter="your_custom_value",
|
||||
temperature=0.5,
|
||||
max_tokens=500,
|
||||
device="cuda",
|
||||
)
|
||||
|
||||
# Run your custom model
|
||||
response = custom_model(
|
||||
"Generate a summary of this text", "https://www.example.com/image.jpg"
|
||||
)
|
||||
print(response)
|
||||
|
||||
# Generate a summary using your custom model
|
||||
summary = custom_model.generate_summary("This is a sample text to summarize.")
|
||||
print(summary)
|
||||
```
|
||||
|
||||
In the code above:
|
||||
|
||||
1. We define a `CustomMultiModalModel` class that inherits from `BaseMultiModalModel`.
|
||||
|
||||
2. In the constructor of our custom class, we call the parent class constructor using `super()` and initialize any custom parameters specific to our model. In this example, we introduced a `custom_parameter`.
|
||||
|
||||
3. We override the `__call__` method, which is responsible for running the multimodal model logic. Here, you can implement the specific behavior of your model, considering both text and image inputs.
|
||||
|
||||
4. We override the `generate_summary` method, which is used to generate a summary of text input. You can implement your custom summarization logic here.
|
||||
|
||||
5. We create an instance of our custom model, passing the required parameters, including the custom parameter.
|
||||
|
||||
6. We demonstrate how to run the custom model and generate a summary using it.
|
||||
|
||||
By inheriting from `BaseMultiModalModel`, you can leverage the prebuilt features and methods provided by the library while customizing the behavior of your multimodal model. This allows you to create powerful and specialized models for various multimodal tasks.
|
||||
|
||||
These examples demonstrate how to use `BaseMultiModalModel` to run multimodal models with text and image inputs. You can adjust the parameters and methods to suit your specific use cases.
|
||||
|
||||
## 6. Additional Tips <a name="additional-tips"></a>
|
||||
|
||||
Here are some additional tips and considerations for using `BaseMultiModalModel` effectively:
|
||||
|
||||
- **Custom Models**: You can create your own multimodal models and inherit from the `BaseMultiModalModel` class to integrate them with this library.
|
||||
|
||||
- **Retries**: In cases where text generation might fail due to various reasons (e.g., server issues), using methods with retries can be helpful.
|
||||
|
||||
- **Monitoring**: You can monitor the performance of your model using methods like `_tokens_per_second()` and `_time_for_generation()`.
|
||||
|
||||
- **Chat History**: The library maintains a chat history, allowing you to keep track of interactions.
|
||||
|
||||
- **Streaming**: The `stream()` method can be useful for displaying output character by character, which can be helpful for certain applications.
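The retry tip above can be sketched as a small wrapper. This is a generic illustration rather than the library's `run_with_retries`: the free function, the `flaky` callable, and the fixed delay between attempts are all assumptions made for the example.

```python
import time
from typing import Callable


def run_with_retries(fn: Callable[[], str], retries: int = 3, delay: float = 0.0) -> str:
    # Call fn up to `retries` times, re-raising only after the last failure.
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fn()
        except Exception as error:  # broad catch, acceptable for a sketch
            last_error = error
            if attempt < retries:
                time.sleep(delay)
    raise RuntimeError(f"failed after {retries} attempts") from last_error


calls = {"count": 0}


def flaky() -> str:
    # Hypothetical task that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = run_with_retries(flaky, retries=3)
print(result, calls["count"])
```

In production code you would usually catch only the error types known to be transient and increase the delay between attempts.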
|
||||
|
||||
## 7. References and Resources <a name="references-and-resources"></a>
|
||||
|
||||
Here are some references and resources that you may find useful for working with multimodal models:
|
||||
|
||||
- [Hugging Face Transformers Library](https://huggingface.co/transformers/): A library for working with various transformer-based models.
|
||||
|
||||
- [PIL (Python Imaging Library)](https://pillow.readthedocs.io/en/stable/): Documentation for working with images in Python using the Pillow library.
|
||||
|
||||
- [Concurrent Programming in Python](https://docs.python.org/3/library/concurrent.futures.html): Official Python documentation for concurrent programming.
|
||||
|
||||
- [Requests Library Documentation](https://docs.python-requests.org/en/latest/): Documentation for the Requests library, which is used for making HTTP requests.
|
||||
|
||||
- [Base64 Encoding in Python](https://docs.python.org/3/library/base64.html): Official Python documentation for base64 encoding and decoding.
|
||||
|
||||
This concludes the documentation for `BaseMultiModalModel`. You can now explore the class further and integrate it with your multimodal AI projects.
|
@ -0,0 +1,107 @@
|
||||
# How to Create A Custom Language Model
|
||||
|
||||
When working with advanced language models, you may eventually need a custom solution tailored to your specific needs. Inheriting from the `BaseLLM` class lets developers create custom language model classes with little boilerplate. This developer guide takes you through the process step by step.
|
||||
|
||||
### Prerequisites
|
||||
|
||||
Before you begin, ensure that you have:
|
||||
|
||||
- A working knowledge of Python programming.
|
||||
- Basic understanding of object-oriented programming (OOP) in Python.
|
||||
- Familiarity with language models and natural language processing (NLP).
|
||||
- The appropriate Python framework installed, with access to `BaseLLM`.
|
||||
|
||||
### Step-by-Step Guide
|
||||
|
||||
#### Step 1: Understand `BaseLLM`
|
||||
|
||||
`BaseLLM` is an abstract base class that defines a set of methods and properties which your custom language model (LLM) should implement. Abstract classes in Python are not designed to be instantiated directly; they are meant to be subclassed.
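The behavior of abstract classes can be seen in a short, self-contained sketch built on Python's `abc` module. `AbstractModel` and `ConcreteModel` are hypothetical names used only here; whether `BaseLLM` itself raises on direct instantiation depends on how it is declared in the installed version of the framework.

```python
from abc import ABC, abstractmethod


class AbstractModel(ABC):
    """Hypothetical abstract base with one required method."""

    @abstractmethod
    def run(self, task: str) -> str:
        ...


class ConcreteModel(AbstractModel):
    def run(self, task: str) -> str:
        return f"handled: {task}"


# Instantiating a class with unimplemented abstract methods raises TypeError.
try:
    AbstractModel()
    instantiated = True
except TypeError:
    instantiated = False

print(instantiated)
print(ConcreteModel().run("ping"))
```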
|
||||
|
||||
#### Step 2: Create a New Class
|
||||
|
||||
Start by defining a new class that inherits from `BaseLLM`. This class will implement the required methods defined in the abstract base class.
|
||||
|
||||
```python
|
||||
from swarms import BaseLLM
|
||||
|
||||
class vLLMLM(BaseLLM):
    pass
|
||||
```
|
||||
|
||||
#### Step 3: Initialize Your Class
|
||||
|
||||
Implement the `__init__` method to initialize your custom LLM. You'll want to initialize the base class as well and define any additional parameters for your model.
|
||||
|
||||
```python
|
||||
class vLLMLM(BaseLLM):
    def __init__(self, model_name='default_model', tensor_parallel_size=1, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.model_name = model_name
        self.tensor_parallel_size = tensor_parallel_size
        # Add any additional initialization here
|
||||
```
|
||||
|
||||
#### Step 4: Implement Required Methods
|
||||
|
||||
Implement the `run` method or any other abstract methods required by `BaseLLM`. This is where you define how your model processes input and returns output.
|
||||
|
||||
```python
|
||||
class vLLMLM(BaseLLM):
    # ... existing code ...

    def run(self, task, *args, **kwargs):
        # Logic for running your model goes here
        return "Processed output"
|
||||
```
|
||||
|
||||
#### Step 5: Test Your Model
|
||||
|
||||
Instantiate your custom LLM and test it to ensure that it works as expected.
|
||||
|
||||
```python
|
||||
model = vLLMLM(model_name='my_custom_model', tensor_parallel_size=2)
|
||||
output = model.run("What are the symptoms of COVID-19?")
|
||||
print(output) # Outputs: "Processed output"
|
||||
```
|
||||
|
||||
#### Step 6: Integrate Additional Components
|
||||
|
||||
Depending on the requirements, you might need to integrate additional components such as database connections, parallel computing resources, or custom processing pipelines.
|
||||
|
||||
#### Step 7: Documentation
|
||||
|
||||
Write comprehensive docstrings for your class and its methods. Good documentation is crucial for maintaining the code and for other developers who might use your model.
|
||||
|
||||
```python
|
||||
class vLLMLM(BaseLLM):
    """
    A custom language model class that extends BaseLLM.

    ... more detailed docstring ...
    """
    # ... existing code ...
|
||||
```
|
||||
|
||||
#### Step 8: Best Practices
|
||||
|
||||
Follow best practices such as error handling, input validation, and resource management to ensure your model is robust and reliable.
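As a small illustration of these practices, the sketch below validates input before generation and wraps backend failures in a single error type. `ValidatedModel`, its `max_chars` limit, and the echoing `_generate` method are hypothetical and stand in for a real `BaseLLM` subclass.

```python
class ValidatedModel:
    """Hypothetical model wrapper; not part of Swarms."""

    def __init__(self, max_chars: int = 1000):
        self.max_chars = max_chars

    def _generate(self, task: str) -> str:
        # Placeholder for the real model call.
        return f"echo: {task}"

    def run(self, task: str) -> str:
        # Input validation: fail early with a clear message.
        if not isinstance(task, str):
            raise TypeError("task must be a string")
        task = task.strip()
        if not task:
            raise ValueError("task must not be empty")
        if len(task) > self.max_chars:
            raise ValueError(f"task exceeds {self.max_chars} characters")
        # Error handling: surface backend failures as one error type.
        try:
            return self._generate(task)
        except Exception as error:
            raise RuntimeError("generation failed") from error


model = ValidatedModel(max_chars=20)
print(model.run("  hello  "))
```

Validating before calling the model avoids spending compute on requests that cannot succeed and gives callers a predictable set of exceptions to handle.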
|
||||
|
||||
#### Step 9: Packaging Your Model
|
||||
|
||||
Package your custom LLM class into a module or package that can be easily distributed and imported into other projects.
|
||||
|
||||
#### Step 10: Version Control and Collaboration
|
||||
|
||||
Use a version control system like Git to track changes to your model. This makes collaboration easier and helps you keep a history of your work.
|
||||
|
||||
### Conclusion
|
||||
|
||||
By following this guide, you should now have a custom model that extends the `BaseLLM`. Remember that the key to a successful custom LLM is understanding the base functionalities, implementing necessary changes, and testing thoroughly. Keep iterating and improving based on feedback and performance metrics.
|
||||
|
||||
### Further Reading
|
||||
|
||||
- Official Python documentation on abstract base classes.
|
||||
- In-depth tutorials on object-oriented programming in Python.
|
||||
- Advanced NLP techniques and optimization strategies for language models.
|
||||
|
||||
This guide provides the fundamental steps to create custom models using `BaseLLM`. For detailed implementation and advanced customization, it's essential to dive deeper into the specific functionalities and capabilities of the language model framework you are using.
|
@ -0,0 +1,261 @@
|
||||
# `Dalle3` Documentation
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Introduction](#introduction)
|
||||
2. [Installation](#installation)
|
||||
3. [Quick Start](#quick-start)
|
||||
4. [Dalle3 Class](#dalle3-class)
|
||||
   - [Attributes](#attributes)
   - [Methods](#methods)
|
||||
5. [Usage Examples](#usage-examples)
|
||||
6. [Error Handling](#error-handling)
|
||||
7. [Advanced Usage](#advanced-usage)
|
||||
8. [References](#references)
|
||||
|
||||
---
|
||||
|
||||
## Introduction<a name="introduction"></a>
|
||||
|
||||
The Dalle3 library is a Python module that provides an easy-to-use interface for generating images from text descriptions using OpenAI's DALL·E 3 model. DALL·E 3 is a text-to-image model that converts textual prompts into images. This documentation guides you through the installation, setup, and usage of the Dalle3 library.
|
||||
|
||||
---
|
||||
|
||||
## Installation<a name="installation"></a>
|
||||
|
||||
To use the Dalle3 model, you must first install swarms:
|
||||
|
||||
```bash
|
||||
pip install swarms
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Quick Start<a name="quick-start"></a>
|
||||
|
||||
Let's get started with a quick example of using the Dalle3 library to generate an image from a text prompt:
|
||||
|
||||
```python
|
||||
from swarms.models.dalle3 import Dalle3
|
||||
|
||||
# Create an instance of the Dalle3 class
|
||||
dalle3 = Dalle3()
|
||||
|
||||
# Define a text prompt
|
||||
task = "A painting of a dog"
|
||||
|
||||
# Generate an image from the text prompt
|
||||
image_url = dalle3(task)
|
||||
|
||||
# Print the generated image URL
|
||||
print(image_url)
|
||||
```
|
||||
|
||||
This example demonstrates the basic usage of the Dalle3 library to convert a text prompt into an image. The generated image URL will be printed to the console.
|
||||
|
||||
---
|
||||
|
||||
## Dalle3 Class<a name="dalle3-class"></a>
|
||||
|
||||
The Dalle3 library provides a `Dalle3` class that allows you to interact with the DALL·E 3 model. This class has several attributes and methods for generating images from text prompts.
|
||||
|
||||
### Attributes<a name="attributes"></a>
|
||||
|
||||
- `model` (str): The name of the DALL·E 3 model. Default: "dall-e-3".
|
||||
- `img` (str): The image URL generated by the Dalle3 API.
|
||||
- `size` (str): The size of the generated image. Default: "1024x1024".
|
||||
- `max_retries` (int): The maximum number of API request retries. Default: 3.
|
||||
- `quality` (str): The quality of the generated image. Default: "standard".
|
||||
- `n` (int): The number of variations to create. Default: 4.
|
||||
|
||||
### Methods<a name="methods"></a>
|
||||
|
||||
#### `__call__(self, task: str) -> Dalle3`
|
||||
|
||||
This method makes a call to the Dalle3 API and returns the image URL generated from the provided text prompt.
|
||||
|
||||
Parameters:
|
||||
- `task` (str): The text prompt to be converted to an image.
|
||||
|
||||
Returns:
|
||||
- `Dalle3`: An instance of the Dalle3 class with the image URL generated by the Dalle3 API.
|
||||
|
||||
#### `create_variations(self, img: str)`
|
||||
|
||||
This method creates variations of an image using the Dalle3 API.
|
||||
|
||||
Parameters:
|
||||
- `img` (str): The image to be used for the API request.
|
||||
|
||||
Returns:
|
||||
- `img` (str): The image URL of the generated variations.
|
||||
|
||||
---
|
||||
|
||||
## Usage Examples<a name="usage-examples"></a>
|
||||
|
||||
### Example 1: Basic Image Generation
|
||||
|
||||
```python
|
||||
from swarms.models.dalle3 import Dalle3
|
||||
|
||||
# Create an instance of the Dalle3 class
|
||||
dalle3 = Dalle3()
|
||||
|
||||
# Define a text prompt
|
||||
task = "A painting of a dog"
|
||||
|
||||
# Generate an image from the text prompt
|
||||
image_url = dalle3(task)
|
||||
|
||||
# Print the generated image URL
|
||||
print(image_url)
|
||||
```
|
||||
|
||||
### Example 2: Creating Image Variations
|
||||
|
||||
```python
|
||||
from swarms.models.dalle3 import Dalle3
|
||||
|
||||
# Create an instance of the Dalle3 class
|
||||
dalle3 = Dalle3()
|
||||
|
||||
# Define the URL of an existing image
|
||||
img_url = "https://images.unsplash.com/photo-1694734479898-6ac4633158ac?q=80&w=1287&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D"
|
||||
|
||||
# Create variations of the image
|
||||
variations_url = dalle3.create_variations(img_url)
|
||||
|
||||
# Print the URLs of the generated variations
|
||||
print(variations_url)
|
||||
```
|
||||
|
||||
The following examples cover additional parameters and methods of the `Dalle3` class:
|
||||
|
||||
### Example 3: Customizing Image Size
|
||||
|
||||
You can customize the size of the generated image by specifying the `size` parameter when creating an instance of the `Dalle3` class. The accepted sizes depend on the model: OpenAI's image API lists `1024x1024`, `1792x1024`, and `1024x1792` for DALL·E 3, so smaller sizes such as `512x512` are rejected. Here's how to generate a landscape image:

```python
from swarms.models.dalle3 import Dalle3

# Create an instance of the Dalle3 class with a custom image size
dalle3 = Dalle3(size="1792x1024")

# Define a text prompt
task = "A wide landscape painting of a cat"

# Generate a landscape image from the text prompt
image_url = dalle3(task)

# Print the generated image URL
print(image_url)
```
|
||||
|
||||
### Example 4: Adjusting Retry Limit
|
||||
|
||||
You can adjust the maximum number of API request retries using the `max_retries` parameter. Here's how to increase the retry limit:
|
||||
|
||||
```python
|
||||
from swarms.models.dalle3 import Dalle3
|
||||
|
||||
# Create an instance of the Dalle3 class with a higher retry limit
|
||||
dalle3 = Dalle3(max_retries=5)
|
||||
|
||||
# Define a text prompt
|
||||
task = "An image of a landscape"
|
||||
|
||||
# Generate an image with a higher retry limit
|
||||
image_url = dalle3(task)
|
||||
|
||||
# Print the generated image URL
|
||||
print(image_url)
|
||||
```
|
||||
|
||||
### Example 5: Generating Image Variations
|
||||
|
||||
To create variations of an existing image, you can use the `create_variations` method. Here's an example:
|
||||
|
||||
```python
|
||||
from swarms.models.dalle3 import Dalle3
|
||||
|
||||
# Create an instance of the Dalle3 class
|
||||
dalle3 = Dalle3()
|
||||
|
||||
# Define the URL of an existing image
|
||||
img_url = "https://images.unsplash.com/photo-1677290043066-12eccd944004?q=80&w=1287&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D"
|
||||
|
||||
# Create variations of the image
|
||||
variations_url = dalle3.create_variations(img_url)
|
||||
|
||||
# Print the URLs of the generated variations
|
||||
print(variations_url)
|
||||
```
|
||||
|
||||
### Example 6: Handling API Errors
|
||||
|
||||
The Dalle3 library provides error handling for API-related issues. Here's how to handle and display API errors:
|
||||
|
||||
```python
|
||||
from swarms.models.dalle3 import Dalle3
|
||||
|
||||
# Create an instance of the Dalle3 class
|
||||
dalle3 = Dalle3()
|
||||
|
||||
# Define a text prompt
|
||||
task = "Invalid prompt that may cause an API error"
|
||||
|
||||
try:
    # Attempt to generate an image with an invalid prompt
    image_url = dalle3(task)
    print(image_url)
except Exception as e:
    print(f"Error occurred: {str(e)}")
|
||||
```
|
||||
|
||||
### Example 7: Customizing Image Quality
|
||||
|
||||
You can customize the quality of the generated image by specifying the `quality` parameter. Here's how to generate a high-quality image:
|
||||
|
||||
```python
|
||||
from swarms.models.dalle3 import Dalle3
|
||||
|
||||
# Create an instance of the Dalle3 class with high quality
|
||||
dalle3 = Dalle3(quality="hd")  # the DALL·E 3 API accepts "standard" or "hd"
|
||||
|
||||
# Define a text prompt
|
||||
task = "A high-quality image of a sunset"
|
||||
|
||||
# Generate a high-quality image from the text prompt
|
||||
image_url = dalle3(task)
|
||||
|
||||
# Print the generated image URL
|
||||
print(image_url)
|
||||
```
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Error Handling<a name="error-handling"></a>
|
||||
|
||||
The Dalle3 library provides error handling for API-related issues. If an error occurs during API communication, the library will handle it and provide detailed error messages. Make sure to handle exceptions appropriately in your code.
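
If you want retry behaviour in your own code on top of `max_retries`, one option is a small wrapper like the sketch below. It is an illustration under stated assumptions, not the library's internal retry logic: `call_with_retries` and `flaky_generate` are hypothetical names, and the stand-in function only simulates a request that fails twice before succeeding.

```python
import time


def call_with_retries(fn, *args, max_retries=3, delay_seconds=0.0, **kwargs):
    """Call fn until it succeeds or max_retries attempts have failed."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return fn(*args, **kwargs)
        except Exception as err:  # narrow this to the API's error types in real code
            last_error = err
            if attempt < max_retries:
                time.sleep(delay_seconds * attempt)  # simple linear backoff
    raise RuntimeError(f"gave up after {max_retries} attempts") from last_error


# Stand-in for a flaky image request: fails twice, then returns a placeholder URL.
attempts = {"count": 0}


def flaky_generate(task):
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "https://example.invalid/" + task.replace(" ", "-") + ".png"


print(call_with_retries(flaky_generate, "a painting of a dog"))
```

In real use you would pass the `Dalle3` instance as `fn` and catch only the exception types that are worth retrying.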
|
||||
|
||||
---
|
||||
|
||||
## Advanced Usage<a name="advanced-usage"></a>
|
||||
|
||||
For advanced usage and customization of the Dalle3 library, you can explore the attributes and methods of the `Dalle3` class. Adjusting parameters such as `size`, `max_retries`, and `quality` allows you to fine-tune the image generation process to your specific needs.
|
||||
|
||||
---
|
||||
|
||||
## References<a name="references"></a>
|
||||
|
||||
For more information about the DALL·E 3 model and the Dalle3 library, you can refer to the official OpenAI documentation and resources.
|
||||
|
||||
- [OpenAI API Documentation](https://beta.openai.com/docs/)
|
||||
- [DALL·E 3 Model Information](https://openai.com/research/dall-e-3)
|
||||
- [Dalle3 GitHub Repository](https://github.com/openai/dall-e-3)
|
||||
|
||||
---
|
||||
|
||||
This concludes the documentation for the Dalle3 library. You can now use the library to generate images from text prompts and explore its advanced features for various applications.
|
@ -0,0 +1,123 @@
|
||||
# DistilWhisperModel Documentation
|
||||
|
||||
## Overview
|
||||
|
||||
The `DistilWhisperModel` is a Python class for English speech recognition. It wraps Distil-Whisper, a distilled variant of OpenAI's Whisper speech-to-text model, and supports both synchronous and asynchronous transcription of audio inputs, which suits real-time applications as well as batch processing.
|
||||
|
||||
## Installation
|
||||
|
||||
Before you can use `DistilWhisperModel`, ensure you have the required libraries installed:
|
||||
|
||||
```sh
|
||||
pip3 install --upgrade swarms
|
||||
```
|
||||
|
||||
## Initialization
|
||||
|
||||
The `DistilWhisperModel` class is initialized with the following parameters:
|
||||
|
||||
| Parameter | Type | Description | Default |
|
||||
|-----------|------|-------------|---------|
|
||||
| `model_id` | `str` | The identifier for the pre-trained Whisper model | `"distil-whisper/distil-large-v2"` |
|
||||
|
||||
Example of initialization:
|
||||
|
||||
```python
|
||||
from swarms.models import DistilWhisperModel
|
||||
|
||||
# Initialize with default model
|
||||
model_wrapper = DistilWhisperModel()
|
||||
|
||||
# Initialize with a specific model ID
|
||||
model_wrapper = DistilWhisperModel(model_id="distil-whisper/distil-large-v2")
|
||||
```
|
||||
|
||||
## Attributes
|
||||
|
||||
After initialization, the `DistilWhisperModel` has several attributes:
|
||||
|
||||
| Attribute | Type | Description |
|
||||
|-----------|------|-------------|
|
||||
| `device` | `str` | The device used for computation (`"cuda:0"` for GPU or `"cpu"`). |
|
||||
| `torch_dtype` | `torch.dtype` | The data type used for the Torch tensors. |
|
||||
| `model_id` | `str` | The model identifier string. |
|
||||
| `model` | `torch.nn.Module` | The actual Whisper model loaded from the identifier. |
|
||||
| `processor` | `transformers.AutoProcessor` | The processor for handling input data. |
|
||||
|
||||
## Methods
|
||||
|
||||
### `transcribe`
|
||||
|
||||
Transcribes audio input synchronously.
|
||||
|
||||
**Arguments**:
|
||||
|
||||
| Argument | Type | Description |
|
||||
|----------|------|-------------|
|
||||
| `inputs` | `Union[str, dict]` | File path or audio data dictionary. |
|
||||
|
||||
**Returns**: `str` - The transcribed text.
|
||||
|
||||
**Usage Example**:
|
||||
|
||||
```python
|
||||
# Synchronous transcription
|
||||
transcription = model_wrapper.transcribe("path/to/audio.mp3")
|
||||
print(transcription)
|
||||
```
|
||||
|
||||
### `async_transcribe`
|
||||
|
||||
Transcribes audio input asynchronously.
|
||||
|
||||
**Arguments**:
|
||||
|
||||
| Argument | Type | Description |
|
||||
|----------|------|-------------|
|
||||
| `inputs` | `Union[str, dict]` | File path or audio data dictionary. |
|
||||
|
||||
**Returns**: `Coroutine` - A coroutine that when awaited, returns the transcribed text.
|
||||
|
||||
**Usage Example**:
|
||||
|
||||
```python
|
||||
import asyncio
|
||||
|
||||
# Asynchronous transcription
|
||||
transcription = asyncio.run(model_wrapper.async_transcribe("path/to/audio.mp3"))
|
||||
print(transcription)
|
||||
```
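
The benefit of an asynchronous entry point shows up when several files are transcribed at once. The sketch below illustrates the general pattern with `asyncio.to_thread` and `asyncio.gather`; `blocking_transcribe` is a hypothetical stand-in for a slow synchronous call, and nothing here is claimed to be how `async_transcribe` is implemented.

```python
import asyncio
import time


def blocking_transcribe(path: str) -> str:
    """Stand-in for a slow, synchronous transcription call."""
    time.sleep(0.1)
    return f"transcript of {path}"


async def transcribe_many(paths):
    # asyncio.to_thread runs each blocking call in a worker thread,
    # so the event loop stays free while the calls overlap.
    tasks = [asyncio.to_thread(blocking_transcribe, p) for p in paths]
    # gather returns results in the same order as the input tasks.
    return await asyncio.gather(*tasks)


results = asyncio.run(transcribe_many(["a.mp3", "b.mp3", "c.mp3"]))
print(results)
```

With the real model you would await `model_wrapper.async_transcribe(path)` for each path instead of wrapping a blocking function.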
|
||||
|
||||
### `real_time_transcribe`
|
||||
|
||||
Simulates real-time transcription of an audio file.
|
||||
|
||||
**Arguments**:
|
||||
|
||||
| Argument | Type | Description |
|
||||
|----------|------|-------------|
|
||||
| `audio_file_path` | `str` | Path to the audio file. |
|
||||
| `chunk_duration` | `int` | Duration of audio chunks in seconds. |
|
||||
|
||||
**Usage Example**:
|
||||
|
||||
```python
|
||||
# Real-time transcription simulation
|
||||
model_wrapper.real_time_transcribe("path/to/audio.mp3", chunk_duration=5)
|
||||
```
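
To make `chunk_duration` concrete, the sketch below computes the time windows a clip would be split into. It assumes fixed-length, non-overlapping chunks with a shorter final chunk; `chunk_bounds` is a hypothetical helper, and the library's actual chunking may differ.

```python
def chunk_bounds(total_seconds, chunk_duration=5):
    """Return (start, end) offsets in seconds that cover the whole clip."""
    if chunk_duration <= 0:
        raise ValueError("chunk_duration must be positive")
    total = float(total_seconds)
    bounds = []
    start = 0.0
    while start < total:
        end = min(start + chunk_duration, total)  # last chunk may be shorter
        bounds.append((start, end))
        start = end
    return bounds


print(chunk_bounds(12, chunk_duration=5))
```

A 12-second clip with 5-second chunks yields three windows, the last one 2 seconds long. Smaller chunks lower latency but increase the number of model calls.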
|
||||
|
||||
## Error Handling
|
||||
|
||||
The `DistilWhisperModel` class incorporates error handling for file not found errors and generic exceptions during the transcription process. If a non-recoverable exception is raised, it is printed to the console in red to indicate failure.
|
||||
|
||||
## Conclusion
|
||||
|
||||
The `DistilWhisperModel` offers a convenient interface to the powerful Whisper model for speech recognition. Its design supports both batch and real-time transcription, catering to different application needs. The class's error handling and retry logic make it robust for real-world applications.
|
||||
|
||||
## Additional Notes
|
||||
|
||||
- Ensure you have appropriate permissions to read audio files when using file paths.
|
||||
- Transcription quality depends on the audio quality and the Whisper model's performance on your dataset.
|
||||
- Adjust `chunk_duration` according to the processing power of your system for real-time transcription.
|
||||
|
||||
For a full list of models supported by `transformers.AutoModelForSpeechSeq2Seq`, visit the [Hugging Face Model Hub](https://huggingface.co/models).
|
@ -0,0 +1,89 @@
|
||||
# Fuyu Documentation
|
||||
|
||||
## Introduction
|
||||
|
||||
Welcome to the documentation for Fuyu, a versatile model for generating text conditioned on both textual prompts and images. Fuyu is based on Adept's Fuyu model and offers a convenient way to create text that is influenced by the content of an image. In this documentation, you will find comprehensive information on the Fuyu class, its architecture, usage, and examples.
|
||||
|
||||
## Overview
|
||||
|
||||
Fuyu is a text generation model that leverages both text and images to generate coherent and contextually relevant text. It combines state-of-the-art language modeling techniques with image processing capabilities to produce text that is semantically connected to the content of an image. Whether you need to create captions for images or generate text that describes visual content, Fuyu can assist you.
|
||||
|
||||
## Class Definition
|
||||
|
||||
```python
|
||||
class Fuyu:
    def __init__(
        self,
        pretrained_path: str = "adept/fuyu-8b",
        device_map: str = "cuda:0",
        max_new_tokens: int = 7,
    ):
        ...  # body omitted
|
||||
```
|
||||
|
||||
## Purpose
|
||||
|
||||
The Fuyu class serves as a convenient interface for using Adept's Fuyu model. It allows you to generate text based on a textual prompt and an image. The primary purpose of Fuyu is to provide a user-friendly way to create text that is influenced by visual content, making it suitable for various applications, including image captioning, storytelling, and creative text generation.
|
||||
|
||||
## Parameters
|
||||
|
||||
- `pretrained_path` (str): The path to the pretrained Fuyu model. By default, it uses the "adept/fuyu-8b" model.
|
||||
- `device_map` (str): The device to use for model inference (e.g., "cuda:0" for GPU or "cpu" for CPU). Default: "cuda:0".
|
||||
- `max_new_tokens` (int): The maximum number of tokens to generate in the output text. Default: 7.
|
||||
|
||||
## Usage
|
||||
|
||||
To use Fuyu, follow these steps:
|
||||
|
||||
1. Initialize the Fuyu instance:
|
||||
|
||||
```python
|
||||
from swarms.models.fuyu import Fuyu
|
||||
|
||||
fuyu = Fuyu()
|
||||
```
|
||||
|
||||
|
||||
2. Generate Text with Fuyu:
|
||||
|
||||
```python
|
||||
text = "Hello, my name is"
|
||||
img_path = "path/to/image.png"
|
||||
output_text = fuyu(text, img_path)
|
||||
```
|
||||
|
||||
### Example 2 - Text Generation
|
||||
|
||||
```python
|
||||
from swarms.models.fuyu import Fuyu
|
||||
|
||||
fuyu = Fuyu()
|
||||
|
||||
text = "Hello, my name is"
|
||||
|
||||
img_path = "path/to/image.png"
|
||||
|
||||
output_text = fuyu(text, img_path)
|
||||
print(output_text)
|
||||
```
|
||||
|
||||
## How Fuyu Works
|
||||
|
||||
Fuyu combines text and image processing to generate meaningful text outputs. Here's how it works:
|
||||
|
||||
1. **Initialization**: When you create a Fuyu instance, you specify the pretrained model path, the device for inference, and the maximum number of tokens to generate.
|
||||
|
||||
2. **Processing Text and Images**: Fuyu can process both textual prompts and images. You provide a text prompt and the path to an image as input.
|
||||
|
||||
3. **Tokenization**: Fuyu tokenizes the input text and encodes the image using its tokenizer.
|
||||
|
||||
4. **Model Inference**: The model takes the tokenized inputs and generates text that is conditioned on both the text and the image.
|
||||
|
||||
5. **Output Text**: Fuyu returns the generated text as the output.
|
||||
|
||||
## Additional Information
|
||||
|
||||
- Fuyu uses Adept's Fuyu model, which is pretrained on a large corpus of text and images, making it capable of generating coherent and contextually relevant text.
|
||||
- You can specify the device for inference to utilize GPU acceleration if available.
|
||||
- The `max_new_tokens` parameter allows you to control the length of the generated text.
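
The effect of `max_new_tokens` is easiest to see in a toy generation loop. The sketch below is not Fuyu's decoding code: `generate` and `counting_model` are hypothetical, and the "model" simply emits the index of the next token so that the cap is visible in the output.

```python
def generate(prompt_tokens, next_token, max_new_tokens=7, stop_token="<eos>"):
    """Append tokens until max_new_tokens is reached or stop_token appears."""
    output = []
    for _ in range(max_new_tokens):
        token = next_token(prompt_tokens + output)
        if token == stop_token:
            break
        output.append(token)
    return output


# Toy "model": always emits the position of the next token as a string.
def counting_model(tokens):
    return str(len(tokens))


print(generate(["Hello", ",", "my"], counting_model, max_new_tokens=4))
```

With the default of 7, responses are cut off after a handful of tokens, so raise `max_new_tokens` when you expect full sentences or captions.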
|
||||
|
||||
That concludes the documentation for Fuyu. We hope you find this model useful for your text generation tasks that involve images. If you have questions about the underlying model, consult the `adept/fuyu-8b` model card on the Hugging Face Hub.
|
@ -0,0 +1,178 @@
|
||||
## `Gemini` Documentation
|
||||
|
||||
### Introduction
|
||||
|
||||
The Gemini module is a versatile tool for leveraging the power of multimodal AI models to generate content. It allows users to combine textual and image inputs to generate creative and informative outputs. In this documentation, we will explore the Gemini module in detail, covering its purpose, architecture, methods, and usage examples.
|
||||
|
||||
#### Purpose
|
||||
|
||||
The Gemini module is designed to bridge the gap between text and image data, enabling users to harness the capabilities of multimodal AI models effectively. By providing both a textual task and an image as input, Gemini generates content that aligns with the specified task and incorporates the visual information from the image.
|
||||
|
||||
### Installation
|
||||
|
||||
Before using Gemini, ensure that you have the required dependencies installed. You can install them using the following commands:
|
||||
|
||||
```bash
|
||||
pip install swarms
|
||||
pip install google-generativeai
|
||||
pip install python-dotenv
|
||||
```
|
||||
|
||||
### Class: Gemini
|
||||
|
||||
#### Overview
|
||||
|
||||
The `Gemini` class is the central component of the Gemini module. It inherits from the `BaseMultiModalModel` class and provides methods to interact with the Gemini AI model. Let's dive into its architecture and functionality.
|
||||
|
||||
##### Class Constructor
|
||||
|
||||
```python
|
||||
class Gemini(BaseMultiModalModel):
    def __init__(
        self,
        model_name: str = "gemini-pro",
        gemini_api_key: str = get_gemini_api_key_env,
        *args,
        **kwargs,
    ):
        ...  # body omitted
|
||||
```
|
||||
|
||||
| Parameter | Type | Description | Default Value |
|
||||
|---------------------|---------|------------------------------------------------------------------|--------------------|
|
||||
| `model_name` | str | The name of the Gemini model. | "gemini-pro" |
|
||||
| `gemini_api_key` | str | The Gemini API key. If not provided, it is fetched from the environment. | (None) |
|
||||
|
||||
- `model_name`: Specifies the name of the Gemini model to use. By default, it is set to "gemini-pro," but you can specify a different model if needed.
|
||||
|
||||
- `gemini_api_key`: This parameter allows you to provide your Gemini API key directly. If not provided, the constructor attempts to fetch it from the environment using the `get_gemini_api_key_env` helper function.
|
||||
|
||||
##### Methods
|
||||
|
||||
1. **run()**
|
||||
|
||||
```python
|
||||
def run(
    self,
    task: str = None,
    img: str = None,
    *args,
    **kwargs,
) -> str:
    ...  # body omitted
|
||||
```
|
||||
|
||||
| Parameter | Type | Description |
|
||||
|---------------|----------|--------------------------------------------|
|
||||
| `task` | str | The textual task for content generation. |
|
||||
| `img` | str | The path to the image to be processed. |
|
||||
| `*args` | Variable | Additional positional arguments. |
|
||||
| `**kwargs` | Variable | Additional keyword arguments. |
|
||||
|
||||
- `task`: Specifies the textual task for content generation. It can be a sentence or a phrase that describes the desired content.
|
||||
|
||||
- `img`: Provides the path to the image that will be processed along with the textual task. Gemini combines the visual information from the image with the textual task to generate content.
|
||||
|
||||
- `*args` and `**kwargs`: Allow for additional, flexible arguments that can be passed to the underlying Gemini model. These arguments can vary based on the specific Gemini model being used.
|
||||
|
||||
**Returns**: A string containing the generated content.
|
||||
|
||||
**Examples**:
|
||||
|
||||
```python
|
||||
from swarms.models import Gemini
|
||||
|
||||
# Initialize the Gemini model
|
||||
gemini = Gemini()
|
||||
|
||||
# Generate content for a textual task with an image
|
||||
generated_content = gemini.run(
    task="Describe this image",
    img="image.jpg",
)
|
||||
|
||||
# Print the generated content
|
||||
print(generated_content)
|
||||
```
|
||||
|
||||
In this example, we initialize the Gemini model, provide a textual task, and specify an image for processing. The `run()` method generates content based on the input and returns the result.
|
||||
|
||||
2. **process_img()**
|
||||
|
||||
```python
|
||||
def process_img(
    self,
    img: str = None,
    type: str = "image/png",
    *args,
    **kwargs,
):
    ...  # body omitted
|
||||
```
|
||||
|
||||
| Parameter | Type | Description | Default Value |
|
||||
|---------------|----------|------------------------------------------------------|----------------|
|
||||
| `img` | str | The path to the image to be processed. | (None) |
|
||||
| `type` | str | The MIME type of the image (e.g., "image/png"). | "image/png" |
|
||||
| `*args`       | Variable | Additional positional arguments.                     |                |
|
||||
| `**kwargs`    | Variable | Additional keyword arguments.                        |                |
|
||||
|
||||
- `img`: Specifies the path to the image that will be processed. It's essential to provide a valid image path for image-based content generation.
|
||||
|
||||
- `type`: Indicates the MIME type of the image. By default, it is set to "image/png," but you can change it based on the image format you're using.
|
||||
|
||||
- `*args` and `**kwargs`: Allow for additional, flexible arguments that can be passed to the underlying Gemini model. These arguments can vary based on the specific Gemini model being used.
|
||||
|
||||
**Raises**: ValueError if any of the following conditions are met:
|
||||
- No image is provided.
|
||||
- The image type is not specified.
|
||||
- The Gemini API key is missing.
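
The three conditions above can be checked before any request is made, and the MIME type can often be derived from the file name instead of being typed by hand. The sketch below shows one way to do both with the standard library. `check_image_request` and `guess_mime_type` are hypothetical helpers, and the error messages are illustrative rather than the ones the Gemini class raises.

```python
import mimetypes


def guess_mime_type(path):
    """Guess a MIME type such as "image/jpeg" from the file extension."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed


def check_image_request(img, mime_type, api_key):
    """Raise ValueError for the three documented failure conditions."""
    if not img:
        raise ValueError("Please provide an image to process")
    if not mime_type:
        raise ValueError("Please provide the image type")
    if not api_key:
        raise ValueError("Please provide a Gemini API key")
    return {"path": img, "mime_type": mime_type}


request = check_image_request(
    "image.jpg",
    guess_mime_type("image.jpg"),
    api_key="placeholder-key",  # placeholder only; load real keys from the environment
)
print(request)
```

Deriving the type this way avoids mismatches such as passing a `.jpg` file with the default `"image/png"`.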
|
||||
|
||||
**Examples**:
|
||||
|
||||
```python
|
||||
from swarms.models.gemini import Gemini
|
||||
|
||||
# Initialize the Gemini model
|
||||
gemini = Gemini()
|
||||
|
||||
# Process an image
|
||||
processed_image = gemini.process_img(
    img="image.jpg",
    type="image/jpeg",
)
|
||||
|
||||
# Further use the processed image in content generation
|
||||
generated_content = gemini.run(
    task="Describe this image",
    img=processed_image,
)
|
||||
|
||||
# Print the generated content
|
||||
print(generated_content)
|
||||
```
|
||||
|
||||
In this example, we demonstrate how to process an image using the `process_img()` method and then use the processed image in content generation.
|
||||
|
||||
#### Additional Information
|
||||
|
||||
- Gemini is designed to work seamlessly with various multimodal AI models, making it a powerful tool for content generation tasks.
|
||||
|
||||
- The module uses the `google.generativeai` package to access the underlying AI models. Ensure that you have this package installed to leverage the full capabilities of Gemini.
|
||||
|
||||
- It's essential to provide a valid Gemini API key for authentication. You can either pass it directly during initialization or store it in the environment variable "GEMINI_API_KEY."
|
||||
|
||||
- Gemini's flexibility allows you to experiment with different Gemini models and tailor the content generation process to your specific needs.
|
||||
|
||||
- Keep in mind that Gemini is designed to handle both textual and image inputs, making it a valuable asset for various applications, including natural language processing and computer vision tasks.
|
||||
|
||||
- If you encounter any issues or have specific requirements, refer to the Gemini documentation for more details and advanced usage.
|
||||
|
||||
### References and Resources
|
||||
|
||||
- [Gemini GitHub Repository](https://github.com/swarms/gemini): Explore the Gemini repository for additional information, updates, and examples.
|
||||
|
||||
- [Google GenerativeAI Documentation](https://docs.google.com/document/d/1WZSBw6GsOhOCYm0ArydD_9uy6nPPA1KFIbKPhjj43hA): Dive deeper into the capabilities of the Google GenerativeAI package used by Gemini.
|
||||
|
||||
- [Gemini API Documentation](https://gemini-api-docs.example.com): Access the official documentation for the Gemini API to explore advanced features and integrations.
|
||||
|
||||
## Conclusion
|
||||
|
||||
In this comprehensive documentation, we've explored the Gemini module, its purpose, architecture, methods, and usage examples. Gemini empowers developers to generate content by combining textual tasks and images, making it a valuable asset for multimodal AI applications. Whether you're working on natural language processing or computer vision projects, Gemini can help you achieve impressive results.
|
@ -0,0 +1,150 @@
|
||||
# Documentation for GPT4o Module
|
||||
|
||||
## Overview and Introduction
|
||||
|
||||
The `GPT4o` module is a multi-modal conversational model based on OpenAI's GPT-4 architecture. It extends the functionality of the `BaseMultiModalModel` class, enabling it to handle both text and image inputs for generating diverse and contextually rich responses. This module leverages the power of the GPT-4 model to enhance interactions by integrating visual information with textual prompts, making it highly relevant for applications requiring multi-modal understanding and response generation.
|
||||
|
||||
### Key Concepts
|
||||
- **Multi-Modal Model**: A model that can process and generate responses based on multiple types of inputs, such as text and images.
|
||||
- **System Prompt**: A predefined prompt to guide the conversation flow.
|
||||
- **Temperature**: A parameter that controls the randomness of the response generation.
|
||||
- **Max Tokens**: The maximum number of tokens (words or word pieces) in the generated response.
|
||||
|
||||
## Class Definition
|
||||
|
||||
### `GPT4o` Class
|
||||
|
||||
|
||||
### Parameters
|
||||
|
||||
| Parameter | Type | Description |
|
||||
|-----------------|--------|--------------------------------------------------------------------------------------|
|
||||
| `system_prompt` | `str` | The system prompt to be used in the conversation. |
|
||||
| `temperature` | `float`| The temperature parameter for generating diverse responses. Default is `0.1`. |
|
||||
| `max_tokens` | `int` | The maximum number of tokens in the generated response. Default is `300`. |
|
||||
| `openai_api_key`| `str` | The API key for accessing the OpenAI GPT-4 API. |
|
||||
| `*args` | | Additional positional arguments. |
|
||||
| `**kwargs` | | Additional keyword arguments. |
|
||||
|
||||
## Functionality and Usage
|
||||
|
||||
### `encode_image` Function
|
||||
|
||||
The `encode_image` function is used to encode an image file into a base64 string format, which can then be included in the request to the GPT-4 API.
|
||||
|
||||
#### Parameters
|
||||
|
||||
| Parameter | Type | Description |
|
||||
|---------------|--------|----------------------------------------------|
|
||||
| `image_path` | `str` | The local path to the image file to be encoded. |
|
||||
|
||||
#### Returns
|
||||
|
||||
| Return Type | Description |
|
||||
|-------------|---------------------------------|
|
||||
| `str` | The base64 encoded string of the image. |
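
A minimal sketch of this kind of encoding, using only the standard library, is shown below. `encode_image_sketch` is a hypothetical name chosen to avoid implying it matches the module's own `encode_image` line for line, and the demo writes a four-byte temporary file rather than a real image.

```python
import base64
import tempfile
from pathlib import Path


def encode_image_sketch(image_path: str) -> str:
    """Read a file and return its bytes as a base64-encoded ASCII string."""
    return base64.b64encode(Path(image_path).read_bytes()).decode("utf-8")


# Demonstrate with a tiny temporary file instead of a real image.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"\x89PNG")  # the first four bytes of a PNG header
    encoded = encode_image_sketch(str(sample))

print(encoded)
```

The result is plain text, so it can be embedded in a JSON request body or a `data:` URL.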
|
||||
|
||||
### `GPT4o.__init__` Method
|
||||
|
||||
The constructor for the `GPT4o` class initializes the model with the specified parameters and sets up the OpenAI client.
|
||||
|
||||
### `GPT4o.run` Method
|
||||
|
||||
The `run` method executes the GPT-4o model to generate a response based on the provided task and optional image.
|
||||
|
||||
#### Parameters
|
||||
|
||||
| Parameter | Type | Description |
|
||||
|---------------|--------|----------------------------------------------------|
|
||||
| `task` | `str` | The task or user prompt for the conversation. |
|
||||
| `local_img` | `str` | The local path to the image file. |
|
||||
| `img` | `str` | The URL of the image. |
|
||||
| `*args` | | Additional positional arguments. |
|
||||
| `**kwargs` | | Additional keyword arguments. |
|
||||
|
||||
#### Returns
|
||||
|
||||
| Return Type | Description |
|
||||
|-------------|--------------------------------------------------|
|
||||
| `str` | The generated response from the GPT-4o model. |
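
To show how a task and an image can travel in one request, the sketch below builds a user message in the content-list format that OpenAI's chat API documents for image inputs. Whether `GPT4o.run` assembles exactly this payload is an assumption; `build_user_message` is a hypothetical helper written for illustration.

```python
import base64


def build_user_message(task, image_bytes=None, image_url=None, mime_type="image/jpeg"):
    """Build one chat message that carries text plus an optional image."""
    content = [{"type": "text", "text": task}]
    if image_bytes is not None:
        # Local images are sent inline as a base64 data URL.
        encoded = base64.b64encode(image_bytes).decode("utf-8")
        image_url = f"data:{mime_type};base64,{encoded}"
    if image_url is not None:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {"role": "user", "content": content}


text_only = build_user_message("What is the capital of France?")
with_url = build_user_message("Describe this image.", image_url="http://example.com/image.jpg")
with_file = build_user_message("Describe this image.", image_bytes=b"\x89PNG", mime_type="image/png")

print(len(text_only["content"]), len(with_url["content"]), len(with_file["content"]))
```

This mirrors the split between `local_img` (encoded and sent inline) and `img` (passed as a URL) in the parameter table above.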
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### Example 1: Basic Text Prompt
|
||||
|
||||
```python
|
||||
from swarms import GPT4o
|
||||
|
||||
# Initialize the model
|
||||
model = GPT4o(
    system_prompt="You are a helpful assistant.",
    temperature=0.7,
    max_tokens=150,
    openai_api_key="your_openai_api_key",
)
|
||||
|
||||
# Define the task
|
||||
task = "What is the capital of France?"
|
||||
|
||||
# Generate response
|
||||
response = model.run(task)
|
||||
print(response)
|
||||
```
|
||||
|
||||
### Example 2: Text Prompt with Local Image
|
||||
|
||||
```python
|
||||
from swarms import GPT4o
|
||||
|
||||
# Initialize the model
|
||||
model = GPT4o(
    system_prompt="Describe the image content.",
    temperature=0.5,
    max_tokens=200,
    openai_api_key="your_openai_api_key",
)
|
||||
|
||||
# Define the task and image path
|
||||
task = "Describe the content of this image."
|
||||
local_img = "path/to/your/image.jpg"
|
||||
|
||||
# Generate response
|
||||
response = model.run(task, local_img=local_img)
|
||||
print(response)
|
||||
```
|
||||
|
||||
### Example 3: Text Prompt with Image URL
|
||||
|
||||
```python
|
||||
from swarms import GPT4o
|
||||
|
||||
# Initialize the model
|
||||
model = GPT4o(
    system_prompt="You are a visual assistant.",
    temperature=0.6,
    max_tokens=250,
    openai_api_key="your_openai_api_key",
)
|
||||
|
||||
# Define the task and image URL
|
||||
task = "What can you tell about the scenery in this image?"
|
||||
img_url = "http://example.com/image.jpg"
|
||||
|
||||
# Generate response
|
||||
response = model.run(task, img=img_url)
|
||||
print(response)
|
||||
```
|
||||
|
||||
## Additional Information and Tips
|
||||
|
||||
- **API Key Management**: Ensure that your OpenAI API key is securely stored and managed. Do not hard-code it in your scripts. Use environment variables or secure storage solutions.
|
||||
- **Image Encoding**: The `encode_image` function is crucial for converting images to a base64 format suitable for API requests. Ensure that the images are accessible and properly formatted.
|
||||
- **Temperature Parameter**: Adjust the `temperature` parameter to control the creativity of the model's responses. Lower values make the output more deterministic, while higher values increase randomness.
|
||||
- **Token Limit**: Be mindful of the `max_tokens` parameter to avoid exceeding the API's token limits. This parameter controls the length of the generated responses.
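
As a sketch of the API key advice above, the helper below reads the key from an environment variable and fails early with a clear message when it is missing. `load_api_key` is a hypothetical name; the variable name `OPENAI_API_KEY` follows the convention used elsewhere in these docs.

```python
import os


def load_api_key(name="OPENAI_API_KEY"):
    """Return the key stored in the named environment variable."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it or add it to your .env file")
    return key


try:
    api_key = load_api_key()
    print("API key loaded")  # never print the key itself
except RuntimeError as err:
    print(err)
```

The returned value can then be passed as `openai_api_key` in place of the literal placeholder strings used in the examples.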
|
||||
|
||||
## References and Resources
|
||||
|
||||
- [OpenAI API Documentation](https://beta.openai.com/docs/)
|
||||
- [Python Base64 Encoding](https://docs.python.org/3/library/base64.html)
|
||||
- [dotenv Documentation](https://saurabh-kumar.com/python-dotenv/)
|
||||
- [BaseMultiModalModel Documentation](https://swarms.apac.ai)
|
@ -0,0 +1,201 @@
|
||||
# `GPT4VisionAPI` Documentation
|
||||
|
||||
**Table of Contents**
- [Introduction](#introduction)
- [Installation](#installation)
- [Module Overview](#module-overview)
- [Class: GPT4VisionAPI](#class-gpt4visionapi)
  - [Initialization](#initialization)
  - [Methods](#methods)
    - [encode_image](#encode_image)
    - [run](#run)
    - [__call__](#__call__)
- [Examples](#examples)
  - [Example 1: Basic Usage](#example-1-basic-usage)
  - [Example 2: Custom API Key](#example-2-custom-api-key)
  - [Example 3: Adjusting Maximum Tokens](#example-3-adjusting-maximum-tokens)
- [Additional Information](#additional-information)
- [References](#references)
|
||||
|
||||
## Introduction<a name="introduction"></a>
|
||||
|
||||
Welcome to the documentation for the `GPT4VisionAPI` module! This module is a powerful wrapper for the OpenAI GPT-4 Vision model. It allows you to interact with the model to generate descriptions or answers related to images. This documentation will provide you with comprehensive information on how to use this module effectively.
|
||||
|
||||
## Installation<a name="installation"></a>
|
||||
|
||||
Before you start using the `GPT4VisionAPI` module, make sure you have the required dependencies installed. You can install them with the following command:
|
||||
|
||||
```bash
|
||||
pip3 install --upgrade swarms
|
||||
```
|
||||
|
||||
## Module Overview<a name="module-overview"></a>
|
||||
|
||||
The `GPT4VisionAPI` module serves as a bridge between your application and the OpenAI GPT-4 Vision model. It allows you to send requests to the model and retrieve responses related to images. Here are some key features and functionality provided by this module:
|
||||
|
||||
- Encoding images to base64 format.
|
||||
- Running the GPT-4 Vision model with specified tasks and images.
|
||||
- Customization options such as setting the OpenAI API key and maximum token limit.
|
||||
|
||||
## Class: GPT4VisionAPI<a name="class-gpt4visionapi"></a>
|
||||
|
||||
The `GPT4VisionAPI` class is the core component of this module. It encapsulates the functionality required to interact with the GPT-4 Vision model. Below, we'll dive into the class in detail.
|
||||
|
||||
### Initialization<a name="initialization"></a>
|
||||
|
||||
When initializing the `GPT4VisionAPI` class, you have the option to provide the OpenAI API key and set the maximum token limit. Here are the parameters and their descriptions:
|
||||
|
||||
| Parameter | Type | Default Value | Description |
|
||||
|---------------------|----------|-------------------------------|----------------------------------------------------------------------------------------------------------|
|
||||
| openai_api_key | str | `OPENAI_API_KEY` environment variable (if available) | The OpenAI API key. If not provided, it defaults to the `OPENAI_API_KEY` environment variable. |
|
||||
| max_tokens | int | 300 | The maximum number of tokens to generate in the model's response. |
|
||||
|
||||
Here's how you can initialize the `GPT4VisionAPI` class:
|
||||
|
||||
```python
|
||||
from swarms.models import GPT4VisionAPI
|
||||
|
||||
# Initialize with default API key and max_tokens
|
||||
api = GPT4VisionAPI()
|
||||
|
||||
# Initialize with custom API key and max_tokens
|
||||
custom_api_key = "your_custom_api_key"
|
||||
api = GPT4VisionAPI(openai_api_key=custom_api_key, max_tokens=500)
|
||||
```
|
||||
|
||||
### Methods<a name="methods"></a>
|
||||
|
||||
#### encode_image<a name="encode_image"></a>
|
||||
|
||||
This method allows you to encode an image from a URL to base64 format. It's a utility function used internally by the module.
|
||||
|
||||
```python
|
||||
def encode_image(img: str) -> str:
|
||||
"""
|
||||
Encode image to base64.
|
||||
|
||||
Parameters:
|
||||
- img (str): URL of the image to encode.
|
||||
|
||||
Returns:
|
||||
str: Base64 encoded image.
|
||||
"""
|
||||
```
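As a rough illustration of what base64 encoding an image involves, here is a minimal sketch that works on raw bytes that have already been fetched. The helper names `encode_image_bytes` and `to_data_url` are hypothetical and are not part of `GPT4VisionAPI`; the module's own `encode_image` takes a URL and may differ in detail.

```python
import base64


def encode_image_bytes(raw: bytes) -> str:
    """Return the base64 text representation of raw image bytes."""
    return base64.b64encode(raw).decode("utf-8")


def to_data_url(raw: bytes, mime: str = "image/jpeg") -> str:
    """Wrap the base64 string in a data URL (hypothetical helper; the MIME type is an assumption)."""
    return f"data:{mime};base64,{encode_image_bytes(raw)}"


# Any bytes work for demonstration; real use would pass downloaded image content.
sample = b"hello"
print(encode_image_bytes(sample))  # aGVsbG8=
print(to_data_url(sample))
```

Working on bytes rather than a URL keeps the sketch free of network access; fetching the image is a separate step.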
|
||||
|
||||
#### run<a name="run"></a>
|
||||
|
||||
The `run` method is the primary way to interact with the GPT-4 Vision model. It sends a request to the model with a task and an image URL, and it returns the model's response.
|
||||
|
||||
```python
|
||||
def run(task: str, img: str) -> str:
|
||||
"""
|
||||
Run the GPT-4 Vision model.
|
||||
|
||||
Parameters:
|
||||
- task (str): The task or question related to the image.
|
||||
- img (str): URL of the image to analyze.
|
||||
|
||||
Returns:
|
||||
str: The model's response.
|
||||
"""
|
||||
```
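To make the request more concrete, the sketch below builds the kind of request body that OpenAI's Chat Completions endpoint accepts for image inputs: one text part and one image part in a single user message. It is an assumption that `run` assembles something similar internally; `build_vision_payload`, its default model name, and the JPEG data URL are illustrative choices, not the module's actual code.

```python
from typing import Any, Dict


def build_vision_payload(
    task: str,
    image_b64: str,
    max_tokens: int = 300,
    model: str = "gpt-4-vision-preview",  # assumed model name, not confirmed by this document
) -> Dict[str, Any]:
    """Assemble a request body with one text part and one base64 image part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": task},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                    },
                ],
            }
        ],
        "max_tokens": max_tokens,
    }


payload = build_vision_payload("What is the color of the object?", "aGVsbG8=")
print(payload["messages"][0]["content"][0]["text"])
```

The sketch only constructs the dictionary; sending it requires an HTTP client and a valid API key.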
|
||||
|
||||
#### __call__<a name="__call__"></a>
|
||||
|
||||
The `__call__` method makes an instance callable, so `api(task, img)` is a shorthand for `api.run(task, img)`. It has the same functionality as the `run` method.
|
||||
|
||||
```python
|
||||
def __call__(task: str, img: str) -> str:
|
||||
"""
|
||||
Run the GPT-4 Vision model (callable).
|
||||
|
||||
Parameters:
|
||||
- task (str): The task or question related to the image.
|
||||
- img (str): URL of the image to analyze.
|
||||
|
||||
Returns:
|
||||
str: The model's response.
|
||||
"""
|
||||
```
|
||||
|
||||
## Examples<a name="examples"></a>
|
||||
|
||||
Let's explore some usage examples of the `GPT4VisionAPI` module to better understand how to use it effectively.
|
||||
|
||||
### Example 1: Basic Usage<a name="example-1-basic-usage"></a>
|
||||
|
||||
In this example, we'll use the module with the default API key and maximum tokens to analyze an image.
|
||||
|
||||
```python
|
||||
from swarms.models import GPT4VisionAPI
|
||||
|
||||
# Initialize with default API key and max_tokens
|
||||
api = GPT4VisionAPI()
|
||||
|
||||
# Define the task and image URL
|
||||
task = "What is the color of the object?"
|
||||
img = "https://i.imgur.com/2M2ZGwC.jpeg"
|
||||
|
||||
# Run the GPT-4 Vision model
|
||||
response = api.run(task, img)
|
||||
|
||||
# Print the model's response
|
||||
print(response)
|
||||
```
|
||||
|
||||
### Example 2: Custom API Key<a name="example-2-custom-api-key"></a>
|
||||
|
||||
If you want to pass an API key explicitly instead of relying on the `OPENAI_API_KEY` environment variable, you can initialize the module with it as shown in this example.
|
||||
|
||||
```python
|
||||
from swarms.models import GPT4VisionAPI
|
||||
|
||||
# Initialize with custom API key and max_tokens
|
||||
custom_api_key = "your_custom_api_key"
|
||||
api = GPT4VisionAPI(openai_api_key=custom_api_key, max_tokens=500)
|
||||
|
||||
# Define the task and image URL
|
||||
task = "What is the object in the image?"
|
||||
img = "https://i.imgur.com/3T3ZHwD.jpeg"
|
||||
|
||||
# Run the GPT-4 Vision model
|
||||
response = api.run(task, img)
|
||||
|
||||
# Print the model's response
|
||||
print(response)
|
||||
```
|
||||
|
||||
### Example 3: Adjusting Maximum Tokens<a name="example-3-adjusting-maximum-tokens"></a>
|
||||
|
||||
You can also customize the maximum token limit when initializing the module. In this example, we set it to 1000 tokens.
|
||||
|
||||
```python
|
||||
from swarms.models import GPT4VisionAPI
|
||||
|
||||
# Initialize with default API key and custom max_tokens
|
||||
api = GPT4VisionAPI(max_tokens=1000)
|
||||
|
||||
# Define the task and image URL
|
||||
task = "Describe the scene in the image."
|
||||
img = "https://i.imgur.com/4P4ZRxU.jpeg"
|
||||
|
||||
# Run the GPT-4 Vision model
|
||||
response = api.run(task, img)
|
||||
|
||||
# Print the model's response
|
||||
print(response)
|
||||
```
|
||||
|
||||
## Additional Information<a name="additional-information"></a>
|
||||
|
||||
- If you encounter any errors or issues with the module, make sure to check your API key and internet connectivity.
|
||||
- Wrap calls to the module in exception handling so that network or API errors are handled gracefully instead of crashing your application.
|
||||
- You can further customize the module to fit your specific use case by modifying the code as needed.
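One way to follow the exception-handling advice above is a small retry wrapper. This is a minimal sketch: `run_with_retries` is a hypothetical helper, and it catches the broad `Exception` class because this document does not say which exception types the module raises.

```python
import time
from typing import Callable, Optional


def run_with_retries(
    call: Callable[[], str], retries: int = 3, delay: float = 0.0
) -> Optional[str]:
    """Invoke `call` up to `retries` times; return its result, or None if every attempt fails."""
    for attempt in range(1, retries + 1):
        try:
            return call()
        except Exception as error:  # broad on purpose: the real exception types are unknown here
            print(f"Attempt {attempt} failed: {error}")
            if attempt < retries:
                time.sleep(delay)  # pause before the next attempt
    return None


# Stand-in for `lambda: api.run(task, img)`: fails twice, then succeeds.
state = {"calls": 0}


def flaky_call() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("temporary failure")
    return "a red apple"


print(run_with_retries(flaky_call))
```

With a real instance, the callable would be something like `lambda: api.run(task, img)`.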
|
||||
|
||||
## References<a name="references"></a>
|
||||
|
||||
- [OpenAI API Documentation](https://beta.openai.com/docs/)
|
||||
|
||||
This documentation provides a comprehensive guide on how to use the `GPT4VisionAPI` module effectively. It covers initialization, methods, usage examples, and additional information to ensure a smooth experience when working with the GPT-4 Vision model.
|
@ -0,0 +1,91 @@
|
||||
# HuggingFaceLLM
|
||||
|
||||
## Overview & Introduction
|
||||
|
||||
The `HuggingFaceLLM` class in the Swarms library provides a simple interface to Hugging Face's transformer-based language models, specifically for causal language modeling. It lets developers generate coherent, contextually relevant text from a prompt without handling the details of the underlying model or the tokenization process.
|
||||
|
||||
Causal Language Modeling (CLM) is a task where given a series of tokens (or words), the model predicts the next token in the sequence. This functionality is central to many natural language processing tasks, including chatbots, story generation, and code autocompletion.
|
||||
|
||||
---
|
||||
|
||||
## Class Definition
|
||||
|
||||
```python
|
||||
class HuggingFaceLLM:
|
||||
```
|
||||
|
||||
### Parameters:
|
||||
|
||||
- `model_id (str)`: Identifier for the pre-trained model on the Hugging Face model hub. Examples include "gpt2-medium", "openai-gpt", etc.
|
||||
|
||||
- `device (str, optional)`: The device on which to load and run the model. Defaults to 'cuda' if GPU is available, else 'cpu'.
|
||||
|
||||
- `max_length (int, optional)`: Maximum length of the generated sequence. Defaults to 20.
|
||||
|
||||
- `quantization_config (dict, optional)`: Configuration dictionary for model quantization (if applicable). Default is `None`.
|
||||
|
||||
---
|
||||
|
||||
## Functionality & Usage
|
||||
|
||||
### Initialization:
|
||||
|
||||
```python
|
||||
llm = HuggingFaceLLM(model_id="gpt2-medium")
|
||||
```
|
||||
|
||||
Upon initialization, the specified pre-trained model and tokenizer are loaded from Hugging Face's model hub. The model is then moved to the designated device. If there's an issue loading either the model or the tokenizer, an error will be logged.
|
||||
|
||||
### Generation:
|
||||
|
||||
The main functionality of this class is text generation. The class provides two methods for this: `__call__` and `generate`. Both methods take in a prompt text and an optional `max_length` parameter and return the generated text.
|
||||
|
||||
Usage:
|
||||
```python
|
||||
from swarms import HuggingFaceLLM
|
||||
|
||||
# Initialize
|
||||
llm = HuggingFaceLLM(model_id="gpt2-medium")
|
||||
|
||||
# Generate text using __call__ method
|
||||
result = llm("Once upon a time,")
|
||||
print(result)
|
||||
|
||||
# Alternatively, using the generate method
|
||||
result = llm.generate("The future of AI is")
|
||||
print(result)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Mathematical Explanation:
|
||||
|
||||
Given a sequence of tokens \( x_1, x_2, ..., x_n \), a causal language model estimates how likely each vocabulary token is to come next. During training, it is optimized to assign high probability to the token \( x_{n+1} \) that actually follows, that is, to maximize:
|
||||
|
||||
\[ P(x_{n+1} | x_1, x_2, ..., x_n) \]
|
||||
|
||||
Where \( P \) is the probability distribution over all possible tokens in the vocabulary.
|
||||
|
||||
The model takes the tokenized input sequence, feeds it through several transformer blocks, and finally through a linear layer to produce logits for each token in the vocabulary. Under greedy decoding, the token with the highest logit is chosen as the next token; sampling-based strategies instead draw the next token from the probability distribution derived from the logits.
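The final step, turning logits into a next token, can be sketched in plain Python. The three-token vocabulary and its logit values are made up for illustration; a real model produces one logit per vocabulary entry, and `HuggingFaceLLM` delegates this step to the underlying Hugging Face model.

```python
import math
from typing import Dict


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    peak = max(logits.values())
    exps = {token: math.exp(value - peak) for token, value in logits.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}


def greedy_next_token(logits: Dict[str, float]) -> str:
    """Pick the token with the highest logit (greedy decoding)."""
    return max(logits, key=logits.get)


# Hypothetical logits for the token that follows "Once upon a".
logits = {"time": 2.0, "dream": 1.0, "hill": 0.5}
probabilities = softmax(logits)
print(greedy_next_token(logits))
print(round(sum(probabilities.values()), 6))
```

Because softmax preserves ordering, the highest logit is also the highest probability, so greedy decoding does not need to compute the probabilities at all.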
|
||||
|
||||
---
|
||||
|
||||
## Additional Information & Tips:
|
||||
|
||||
- Ensure you have an active internet connection when initializing the class for the first time, as the models and tokenizers are fetched from Hugging Face's servers.
|
||||
|
||||
- Although the default `max_length` is set to 20, it's advisable to adjust this parameter based on the context of the problem.
|
||||
|
||||
- Keep an eye on GPU memory when using large models or generating long sequences.
|
||||
|
||||
---
|
||||
|
||||
## References & Resources:
|
||||
|
||||
- Hugging Face Model Hub: [https://huggingface.co/models](https://huggingface.co/models)
|
||||
|
||||
- Introduction to Transformers: [https://huggingface.co/transformers/introduction.html](https://huggingface.co/transformers/introduction.html)
|
||||
|
||||
- Causal Language Modeling: Vaswani, A., et al. (2017). Attention is All You Need. [arXiv:1706.03762](https://arxiv.org/abs/1706.03762)
|
||||
|
||||
Note: This documentation template provides a comprehensive overview of the `HuggingFaceLLM` class. Developers can follow similar structures when documenting other classes or functionalities.
|
@ -0,0 +1,155 @@
|
||||
## `HuggingfaceLLM` Documentation
|
||||
|
||||
### Introduction
|
||||
|
||||
The `HuggingfaceLLM` class is designed for running inference using models from the Hugging Face Transformers library. This documentation provides an in-depth understanding of the class, its purpose, attributes, methods, and usage examples.
|
||||
|
||||
#### Purpose
|
||||
|
||||
The `HuggingfaceLLM` class serves the following purposes:
|
||||
|
||||
1. Load pre-trained Hugging Face models and tokenizers.
|
||||
2. Generate text-based responses from the loaded model using a given prompt.
|
||||
3. Provide flexibility in device selection, quantization, and other configuration options.
|
||||
|
||||
### Class Definition
|
||||
|
||||
The `HuggingfaceLLM` class is defined as follows:
|
||||
|
||||
```python
|
||||
class HuggingfaceLLM:
|
||||
def __init__(
|
||||
self,
|
||||
model_id: str,
|
||||
device: str = None,
|
||||
max_length: int = 20,
|
||||
quantize: bool = False,
|
||||
quantization_config: dict = None,
|
||||
verbose=False,
|
||||
distributed=False,
|
||||
decoding=False,
|
||||
):
|
||||
# Attributes and initialization logic explained below
|
||||
pass
|
||||
|
||||
def load_model(self):
|
||||
# Method to load the pre-trained model and tokenizer
|
||||
pass
|
||||
|
||||
def run(self, prompt_text: str, max_length: int = None):
|
||||
# Method to generate text-based responses
|
||||
pass
|
||||
|
||||
def __call__(self, prompt_text: str, max_length: int = None):
|
||||
# Alternate method for generating text-based responses
|
||||
pass
|
||||
```
|
||||
|
||||
### Attributes
|
||||
|
||||
| Attribute | Description |
|
||||
|----------------------|---------------------------------------------------------------------------------------------------------------------------|
|
||||
| `model_id` | The ID of the pre-trained model to be used. |
|
||||
| `device` | The device on which the model runs (`'cuda'` for GPU or `'cpu'` for CPU). |
|
||||
| `max_length` | The maximum length of the generated text. |
|
||||
| `quantize` | A boolean indicating whether quantization should be used. |
|
||||
| `quantization_config`| A dictionary with configuration options for quantization. |
|
||||
| `verbose` | A boolean indicating whether verbose logs should be printed. |
|
||||
| `logger` | An optional logger for logging messages (defaults to a basic logger). |
|
||||
| `distributed` | A boolean indicating whether distributed processing should be used. |
|
||||
| `decoding` | A boolean indicating whether to perform decoding during text generation. |
|
||||
|
||||
### Class Methods
|
||||
|
||||
#### `__init__` Method
|
||||
|
||||
The `__init__` method initializes an instance of the `HuggingfaceLLM` class with the specified parameters. It also loads the pre-trained model and tokenizer.
|
||||
|
||||
- `model_id` (str): The ID of the pre-trained model to use.
|
||||
- `device` (str, optional): The device to run the model on ('cuda' or 'cpu').
|
||||
- `max_length` (int, optional): The maximum length of the generated text.
|
||||
- `quantize` (bool, optional): Whether to use quantization.
|
||||
- `quantization_config` (dict, optional): Configuration for quantization.
|
||||
- `verbose` (bool, optional): Whether to print verbose logs.
|
||||
- `logger` (logging.Logger, optional): The logger to use.
|
||||
- `distributed` (bool, optional): Whether to use distributed processing.
|
||||
- `decoding` (bool, optional): Whether to perform decoding during text generation.
|
||||
|
||||
#### `load_model` Method
|
||||
|
||||
The `load_model` method loads the pre-trained model and tokenizer specified by `model_id`.
|
||||
|
||||
#### `run` and `__call__` Methods
|
||||
|
||||
Both `run` and `__call__` methods generate text-based responses based on a given prompt. They accept the following parameters:
|
||||
|
||||
- `prompt_text` (str): The text prompt to initiate text generation.
|
||||
- `max_length` (int, optional): The maximum length of the generated text.
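The relationship between `run`, `__call__`, and the optional `max_length` argument can be sketched with a stand-in class. `TinyLLM` is hypothetical: it truncates the prompt to a word limit instead of running a model, and the fallback to the instance's `max_length` when the argument is `None` is an assumption about how `HuggingfaceLLM` behaves, not something this document confirms.

```python
from typing import Optional


class TinyLLM:
    """Stand-in that mimics the call pattern only; it does not generate text."""

    def __init__(self, max_length: int = 20):
        self.max_length = max_length

    def run(self, prompt_text: str, max_length: Optional[int] = None) -> str:
        # Assumed behaviour: a per-call max_length overrides the instance default.
        limit = max_length if max_length is not None else self.max_length
        return " ".join(prompt_text.split()[:limit])

    def __call__(self, prompt_text: str, max_length: Optional[int] = None) -> str:
        # Calling the instance simply delegates to run().
        return self.run(prompt_text, max_length)


llm = TinyLLM(max_length=2)
print(llm("Once upon a time"))                # uses the instance default
print(llm("Once upon a time", max_length=3))  # per-call override
```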
|
||||
|
||||
### Usage Examples
|
||||
|
||||
Here are three ways to use the `HuggingfaceLLM` class:
|
||||
|
||||
#### Example 1: Basic Usage
|
||||
|
||||
```python
|
||||
from swarms.models import HuggingfaceLLM
|
||||
|
||||
# Initialize the HuggingfaceLLM instance with a model ID
|
||||
model_id = "NousResearch/Nous-Hermes-2-Vision-Alpha"
|
||||
inference = HuggingfaceLLM(model_id=model_id)
|
||||
|
||||
# Generate text based on a prompt
|
||||
prompt_text = "Once upon a time"
|
||||
generated_text = inference(prompt_text)
|
||||
print(generated_text)
|
||||
```
|
||||
|
||||
#### Example 2: Custom Configuration
|
||||
|
||||
```python
|
||||
from swarms.models import HuggingfaceLLM
|
||||
|
||||
# Initialize with custom configuration
|
||||
custom_config = {
|
||||
"quantize": True,
|
||||
"quantization_config": {"load_in_4bit": True},
|
||||
"verbose": True,
|
||||
}
|
||||
inference = HuggingfaceLLM(
|
||||
model_id="NousResearch/Nous-Hermes-2-Vision-Alpha", **custom_config
|
||||
)
|
||||
|
||||
# Generate text based on a prompt
|
||||
prompt_text = "Tell me a joke"
|
||||
generated_text = inference(prompt_text)
|
||||
print(generated_text)
|
||||
```
|
||||
|
||||
#### Example 3: Distributed Processing
|
||||
|
||||
```python
|
||||
from swarms.models import HuggingfaceLLM
|
||||
|
||||
# Initialize for distributed processing
|
||||
inference = HuggingfaceLLM(model_id="gpt2-medium", distributed=True)
|
||||
|
||||
# Generate text based on a prompt
|
||||
prompt_text = "Translate the following sentence to French"
|
||||
generated_text = inference(prompt_text)
|
||||
print(generated_text)
|
||||
```
|
||||
|
||||
### Additional Information
|
||||
|
||||
- The `HuggingfaceLLM` class provides the flexibility to load and use pre-trained models from the Hugging Face Transformers library.
|
||||
- Quantization can be enabled to reduce the model's memory footprint; its effect on inference time depends on the hardware and the quantization scheme.
|
||||
- Distributed processing can be used for parallelized inference.
|
||||
- Verbose logging can help in debugging and understanding the text generation process.
|
||||
|
||||
### References
|
||||
|
||||
- [Hugging Face Transformers Documentation](https://huggingface.co/transformers/)
|
||||
- [PyTorch Documentation](https://pytorch.org/docs/stable/index.html)
|
||||
|
||||
This documentation provides a comprehensive understanding of the `HuggingfaceLLM` class, its attributes, methods, and usage examples. Developers can use this class to perform text generation tasks efficiently using pre-trained models from the Hugging Face Transformers library.
|
@ -0,0 +1,107 @@
|
||||
# `Idefics` Documentation
|
||||
|
||||
## Introduction
|
||||
|
||||
Welcome to the documentation for Idefics, a versatile multimodal inference tool using pre-trained models from the Hugging Face Hub. Idefics is designed to facilitate the generation of text from various prompts, including text and images. This documentation provides a comprehensive understanding of Idefics, its architecture, usage, and how it can be integrated into your projects.
|
||||
|
||||
## Overview
|
||||
|
||||
Idefics leverages the power of pre-trained models to generate textual responses based on a wide range of prompts. It is capable of handling both text and images, making it suitable for various multimodal tasks, including text generation from images.
|
||||
|
||||
## Class Definition
|
||||
|
||||
```python
|
||||
class Idefics:
|
||||
def __init__(
|
||||
self,
|
||||
checkpoint="HuggingFaceM4/idefics-9b-instruct",
|
||||
device=None,
|
||||
torch_dtype=torch.bfloat16,
|
||||
max_length=100,
|
||||
):
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To use Idefics, follow these steps:
|
||||
|
||||
1. Initialize the Idefics instance:
|
||||
|
||||
```python
|
||||
from swarms.models import Idefics
|
||||
|
||||
model = Idefics()
|
||||
```
|
||||
|
||||
2. Generate text based on prompts:
|
||||
|
||||
```python
|
||||
prompts = [
|
||||
"User: What is in this image? https://upload.wikimedia.org/wikipedia/commons/8/86/Id%C3%A9fix.JPG"
|
||||
]
|
||||
response = model(prompts)
|
||||
print(response)
|
||||
```
|
||||
|
||||
### Example 1 - Image Questioning
|
||||
|
||||
```python
|
||||
from swarms.models import Idefics
|
||||
|
||||
model = Idefics()
|
||||
prompts = [
|
||||
"User: What is in this image? https://upload.wikimedia.org/wikipedia/commons/8/86/Id%C3%A9fix.JPG"
|
||||
]
|
||||
response = model(prompts)
|
||||
print(response)
|
||||
```
|
||||
|
||||
### Example 2 - Bidirectional Conversation
|
||||
|
||||
```python
|
||||
from swarms.models import Idefics
|
||||
|
||||
model = Idefics()
|
||||
user_input = "User: What is in this image? https://upload.wikimedia.org/wikipedia/commons/8/86/Id%C3%A9fix.JPG"
|
||||
response = model.chat(user_input)
|
||||
print(response)
|
||||
|
||||
user_input = "User: Who is that? https://static.wikia.nocookie.net/asterix/images/2/25/R22b.gif/revision/latest?cb=20110815073052"
|
||||
response = model.chat(user_input)
|
||||
print(response)
|
||||
```
|
||||
|
||||
### Example 3 - Configuration Changes
|
||||
|
||||
```python
|
||||
model.set_checkpoint("new_checkpoint")
|
||||
model.set_device("cpu")
|
||||
model.set_max_length(200)
|
||||
model.clear_chat_history()
|
||||
```
|
||||
|
||||
## How Idefics Works
|
||||
|
||||
Idefics operates by leveraging pre-trained models from the Hugging Face Hub. Here's how it works:
|
||||
|
||||
1. **Initialization**: When you create an Idefics instance, it initializes the model using a specified checkpoint, sets the device for inference, and configures other parameters like data type and maximum text length.
|
||||
|
||||
2. **Prompt-Based Inference**: You can use the `infer` method to generate text based on prompts. It processes prompts in batched or non-batched mode, depending on your preference. It uses a pre-trained processor to handle text and images.
|
||||
|
||||
3. **Bidirectional Conversation**: The `chat` method enables bidirectional conversations. You provide user input, and the model responds accordingly. The chat history is maintained for context.
|
||||
|
||||
4. **Configuration Changes**: You can change the model checkpoint, device, maximum text length, or clear the chat history as needed during runtime.
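The chat-history behaviour described in step 3 can be sketched with a small stand-in class. `ChatSession` and the `infer` callable it wraps are hypothetical; the real `Idefics.chat` may store and format its history differently.

```python
from typing import Callable, List


class ChatSession:
    """Keeps a running history and passes the whole context to an inference function."""

    def __init__(self, infer: Callable[[str], str]):
        self.infer = infer
        self.history: List[str] = []

    def chat(self, user_input: str) -> str:
        self.history.append(user_input)
        # The model sees every earlier turn, not just the latest message.
        response = self.infer("\n".join(self.history))
        self.history.append(response)
        return response

    def clear_chat_history(self) -> None:
        self.history = []


# Fake inference function that only reports how much context it received.
session = ChatSession(lambda context: f"Assistant: seen {len(context.splitlines())} line(s)")
print(session.chat("User: What is in this image?"))
print(session.chat("User: Who is that?"))
session.clear_chat_history()
print(len(session.history))
```

Passing the inference function in as an argument keeps the sketch runnable without downloading a model.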
|
||||
|
||||
## Parameters
|
||||
|
||||
- `checkpoint`: The name of the pre-trained model checkpoint (default is "HuggingFaceM4/idefics-9b-instruct").
|
||||
- `device`: The device to use for inference. By default, it uses CUDA if available; otherwise, it uses CPU.
|
||||
- `torch_dtype`: The data type to use for inference. By default, it uses torch.bfloat16.
|
||||
- `max_length`: The maximum length of the generated text (default is 100).
|
||||
|
||||
## Additional Information
|
||||
|
||||
- Idefics provides a convenient way to engage in bidirectional conversations with pre-trained models.
|
||||
- You can easily change the model checkpoint, device, and other settings to adapt to your specific use case.
|
||||
|
||||
That concludes the documentation for Idefics. We hope you find this tool valuable for your multimodal text generation tasks. If you have any questions or encounter any issues, please refer to the Hugging Face Transformers documentation for further assistance. Enjoy working with Idefics!
|
@ -0,0 +1,178 @@
|
||||
## LLMs in Swarms Documentation
|
||||
|
||||
Welcome to the documentation for the LLM section of the swarms package, which provides a common way to work with AI language models and APIs. It lets developers, end users, and system administrators interact with models from different providers, such as OpenAI, Hugging Face, Google PaLM, and Anthropic.
|
||||
|
||||
### Table of Contents
|
||||
1. [OpenAI](#openai)
|
||||
2. [HuggingFace](#huggingface)
|
||||
3. [Google PaLM](#google-palm)
|
||||
4. [Anthropic](#anthropic)
|
||||
|
||||
### 1. OpenAI (swarms.agents.models.OpenAI)
|
||||
|
||||
The OpenAI class provides an interface to interact with OpenAI's language models. It allows both synchronous and asynchronous interactions.
|
||||
|
||||
**Constructor:**
|
||||
```python
|
||||
OpenAI(api_key: str, system: str = None, console: bool = True, model: str = None, params: dict = None, save_messages: bool = True)
|
||||
```
|
||||
|
||||
**Attributes:**
|
||||
- `api_key` (str): Your OpenAI API key.
|
||||
|
||||
- `system` (str, optional): A system message to be used in conversations.
|
||||
|
||||
- `console` (bool, default=True): Display console logs.
|
||||
|
||||
- `model` (str, optional): Name of the language model to use.
|
||||
|
||||
- `params` (dict, optional): Additional parameters for model interactions.
|
||||
|
||||
- `save_messages` (bool, default=True): Save conversation messages.
|
||||
|
||||
**Methods:**
|
||||
|
||||
- `generate(message: str, **kwargs) -> str`: Generate a response using the OpenAI model.
|
||||
|
||||
- `generate_async(message: str, **kwargs) -> str`: Generate a response asynchronously.
|
||||
|
||||
- `ask_multiple(ids: List[str], question_template: str) -> List[str]`: Query multiple IDs simultaneously.
|
||||
|
||||
- `stream_multiple(ids: List[str], question_template: str) -> List[str]`: Stream multiple responses.
|
||||
|
||||
**Usage Example:**
|
||||
```python
|
||||
import asyncio
|
||||
|
||||
from swarms import OpenAI
|
||||
|
||||
chat = OpenAI(api_key="YOUR_OPENAI_API_KEY")
|
||||
|
||||
response = chat.generate("Hello, how can I assist you?")
|
||||
print(response)
|
||||
|
||||
ids = ["id1", "id2", "id3"]
|
||||
async_responses = asyncio.run(chat.ask_multiple(ids, "How is {id}?"))
|
||||
print(async_responses)
|
||||
```
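The fan-out pattern behind a call like `ask_multiple` can be sketched with `asyncio.gather`. This is an illustration under assumptions: `fake_generate` stands in for a real API request, and the standalone `ask_multiple` function below is hypothetical rather than the library's implementation.

```python
import asyncio
from typing import List


async def fake_generate(message: str) -> str:
    """Stand-in for an asynchronous model call."""
    await asyncio.sleep(0)  # yield control, as a real network request would
    return f"echo: {message}"


async def ask_multiple(ids: List[str], question_template: str) -> List[str]:
    """Fill the template once per id and run all requests concurrently."""
    questions = [question_template.format(id=item) for item in ids]
    return await asyncio.gather(*(fake_generate(q) for q in questions))


responses = asyncio.run(ask_multiple(["id1", "id2", "id3"], "How is {id}?"))
print(responses)
```

`asyncio.gather` returns results in the same order as the inputs, so each response lines up with its id.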
|
||||
|
||||
### 2. HuggingFace (swarms.agents.models.HuggingFaceLLM)
|
||||
|
||||
The HuggingFaceLLM class allows interaction with language models from Hugging Face.
|
||||
|
||||
**Constructor:**
|
||||
```python
|
||||
HuggingFaceLLM(model_id: str, device: str = None, max_length: int = 20, quantize: bool = False, quantization_config: dict = None)
|
||||
```
|
||||
|
||||
**Attributes:**
|
||||
|
||||
- `model_id` (str): ID or name of the Hugging Face model.
|
||||
|
||||
- `device` (str, optional): Device to run the model on (e.g., 'cuda', 'cpu').
|
||||
|
||||
- `max_length` (int, default=20): Maximum length of generated text.
|
||||
|
||||
- `quantize` (bool, default=False): Apply model quantization.
|
||||
|
||||
- `quantization_config` (dict, optional): Configuration for quantization.
|
||||
|
||||
**Methods:**
|
||||
|
||||
- `generate(prompt_text: str, max_length: int = None) -> str`: Generate text based on a prompt.
|
||||
|
||||
**Usage Example:**
|
||||
```python
|
||||
from swarms import HuggingFaceLLM
|
||||
|
||||
model_id = "gpt2"
|
||||
hugging_face_model = HuggingFaceLLM(model_id=model_id)
|
||||
|
||||
prompt = "Once upon a time"
|
||||
generated_text = hugging_face_model.generate(prompt)
|
||||
print(generated_text)
|
||||
```
|
||||
|
||||
### 3. Google PaLM (swarms.agents.models.GooglePalm)
|
||||
|
||||
The GooglePalm class provides an interface for Google's PaLM Chat API.
|
||||
|
||||
**Constructor:**
|
||||
```python
|
||||
GooglePalm(model_name: str = "models/chat-bison-001", google_api_key: str = None, temperature: float = None, top_p: float = None, top_k: int = None, n: int = 1)
|
||||
```
|
||||
|
||||
**Attributes:**
|
||||
|
||||
- `model_name` (str): Name of the Google PaLM model.
|
||||
|
||||
- `google_api_key` (str, optional): Google API key.
|
||||
|
||||
- `temperature` (float, optional): Temperature for text generation.
|
||||
|
||||
- `top_p` (float, optional): Top-p sampling value.
|
||||
|
||||
- `top_k` (int, optional): Top-k sampling value.
|
||||
|
||||
- `n` (int, default=1): Number of candidate completions.
|
||||
|
||||
**Methods:**
|
||||
|
||||
- `generate(messages: List[Dict[str, Any]], stop: List[str] = None, **kwargs) -> Dict[str, Any]`: Generate text based on a list of messages.
|
||||
|
||||
- `__call__(messages: List[Dict[str, Any]], stop: List[str] = None, **kwargs) -> Dict[str, Any]`: Generate text using the call syntax.
|
||||
|
||||
**Usage Example:**
|
||||
```python
|
||||
from swarms import GooglePalm
|
||||
|
||||
google_palm = GooglePalm()
|
||||
messages = [
|
||||
{"role": "system", "content": "You are a helpful assistant"},
|
||||
{"role": "user", "content": "Tell me a joke"},
|
||||
]
|
||||
|
||||
response = google_palm.generate(messages)
|
||||
print(response["choices"][0]["text"])
|
||||
```
|
||||
|
||||
### 4. Anthropic (swarms.agents.models.Anthropic)
|
||||
|
||||
The Anthropic class enables interaction with Anthropic's large language models.
|
||||
|
||||
**Constructor:**
|
||||
```python
|
||||
Anthropic(model: str = "claude-2", max_tokens_to_sample: int = 256, temperature: float = None, top_k: int = None, top_p: float = None, streaming: bool = False, default_request_timeout: int = None)
|
||||
```
|
||||
|
||||
**Attributes:**
|
||||
|
||||
- `model` (str): Name of the Anthropic model.
|
||||
|
||||
- `max_tokens_to_sample` (int, default=256): Maximum tokens to sample.
|
||||
|
||||
- `temperature` (float, optional): Temperature for text generation.
|
||||
|
||||
- `top_k` (int, optional): Top-k sampling value.
|
||||
|
||||
- `top_p` (float, optional): Top-p sampling value.
|
||||
|
||||
- `streaming` (bool, default=False): Enable streaming mode.
|
||||
|
||||
- `default_request_timeout` (int, optional): Default request timeout.
|
||||
|
||||
**Methods:**
|
||||
|
||||
- `generate(prompt: str, stop: List[str] = None) -> str`: Generate text based on a prompt.
|
||||
|
||||
**Usage Example:**
|
||||
```python
|
||||
from swarms import Anthropic
|
||||
|
||||
anthropic = Anthropic()
|
||||
prompt = "Once upon a time"
|
||||
generated_text = anthropic.generate(prompt)
|
||||
print(generated_text)
|
||||
```
|
||||
|
||||
This concludes the documentation for the "models" folder, providing you with tools to seamlessly integrate with various language models and APIs. Happy coding!
|
@ -0,0 +1,217 @@
|
||||
# `Kosmos` Documentation
|
||||
|
||||
## Introduction
|
||||
|
||||
Welcome to the documentation for Kosmos, a powerful multimodal AI model that can perform various tasks, including multimodal grounding, referring expression comprehension, referring expression generation, grounded visual question answering (VQA), and grounded image captioning. Kosmos is based on the ydshieh/kosmos-2-patch14-224 model and is designed to process both text and images to provide meaningful outputs. In this documentation, you will find a detailed explanation of the Kosmos class, its functions, parameters, and usage examples.
|
||||
|
||||
## Overview
|
||||
|
||||
Kosmos is a state-of-the-art multimodal AI model that combines the power of natural language understanding with image analysis. It can perform several tasks that involve processing both textual prompts and images to provide informative responses. Whether you need to find objects in an image, understand referring expressions, generate descriptions, answer questions, or create captions, Kosmos has you covered.
|
||||
|
||||
## Class Definition
|
||||
|
||||
```python
|
||||
class Kosmos:
|
||||
def __init__(self, model_name="ydshieh/kosmos-2-patch14-224"):
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To use Kosmos, follow these steps:
|
||||
|
||||
1. Initialize the Kosmos instance:
|
||||
|
||||
```python
|
||||
from swarms.models.kosmos_two import Kosmos
|
||||
|
||||
kosmos = Kosmos()
|
||||
```
|
||||
|
||||
2. Perform Multimodal Grounding:
|
||||
|
||||
```python
|
||||
kosmos.multimodal_grounding(
|
||||
"Find the red apple in the image.", "https://example.com/apple.jpg"
|
||||
)
|
||||
```
|
||||
|
||||
### Example 1 - Multimodal Grounding
|
||||
|
||||
```python
|
||||
from swarms.models.kosmos_two import Kosmos
|
||||
|
||||
kosmos = Kosmos()
|
||||
|
||||
kosmos.multimodal_grounding(
|
||||
"Find the red apple in the image.", "https://example.com/apple.jpg"
|
||||
)
|
||||
```
|
||||
|
||||
3. Perform Referring Expression Comprehension:
|
||||
|
||||
```python
|
||||
kosmos.referring_expression_comprehension(
|
||||
"Show me the green bottle.", "https://example.com/bottle.jpg"
|
||||
)
|
||||
```
|
||||
|
||||
### Example 2 - Referring Expression Comprehension
|
||||
|
||||
```python
|
||||
from swarms.models.kosmos_two import Kosmos
|
||||
|
||||
kosmos = Kosmos()
|
||||
|
||||
kosmos.referring_expression_comprehension(
|
||||
"Show me the green bottle.", "https://example.com/bottle.jpg"
|
||||
)
|
||||
```
|
||||
|
||||
4. Generate Referring Expressions:
|
||||
|
||||
```python
|
||||
kosmos.referring_expression_generation(
|
||||
"It is on the table.", "https://example.com/table.jpg"
|
||||
)
|
||||
```
|
||||
|
||||
### Example 3 - Referring Expression Generation
|
||||
|
||||
```python
|
||||
from swarms.models.kosmos_two import Kosmos
|
||||
|
||||
kosmos = Kosmos()
|
||||
|
||||
kosmos.referring_expression_generation(
|
||||
"It is on the table.", "https://example.com/table.jpg"
|
||||
)
|
||||
```
|
||||
|
||||
5. Perform Grounded Visual Question Answering (VQA):
|
||||
|
||||
```python
|
||||
kosmos.grounded_vqa("What is the color of the car?", "https://example.com/car.jpg")
|
||||
```
|
||||
|
||||
### Example 4 - Grounded Visual Question Answering
|
||||
|
||||
```python
|
||||
from swarms.models.kosmos_two import Kosmos
|
||||
|
||||
kosmos = Kosmos()
|
||||
|
||||
kosmos.grounded_vqa("What is the color of the car?", "https://example.com/car.jpg")
|
||||
```
|
||||
|
||||
6. Generate Grounded Image Captions:
|
||||
|
||||
```python
|
||||
kosmos.grounded_image_captioning("https://example.com/beach.jpg")
|
||||
```
|
||||
|
||||
### Example 5 - Grounded Image Captioning
|
||||
|
||||
```python
|
||||
from swarms.models.kosmos_two import Kosmos
|
||||
|
||||
kosmos = Kosmos()
|
||||
|
||||
kosmos.grounded_image_captioning("https://example.com/beach.jpg")
|
||||
```
|
||||
|
||||
7. Generate Detailed Grounded Image Captions:
|
||||
|
||||
```python
|
||||
kosmos.grounded_image_captioning_detailed("https://example.com/beach.jpg")
|
||||
```
|
||||
|
||||
### Example 6 - Detailed Grounded Image Captioning
|
||||
|
||||
```python
|
||||
from swarms.models.kosmos_two import Kosmos
|
||||
|
||||
kosmos = Kosmos()
|
||||
|
||||
kosmos.grounded_image_captioning_detailed("https://example.com/beach.jpg")
|
||||
```
|
||||
|
||||
8. Draw Entity Boxes on Image:
|
||||
|
||||
```python
|
||||
image = kosmos.get_image("https://example.com/image.jpg")
|
||||
entities = [
|
||||
("apple", (0, 3), [(0.2, 0.3, 0.4, 0.5)]),
|
||||
("banana", (4, 9), [(0.6, 0.2, 0.8, 0.4)]),
|
||||
]
|
||||
kosmos.draw_entity_boxes_on_image(image, entities, show=True)
|
||||
```
|
||||
|
||||
### Example 7 - Drawing Entity Boxes on Image
|
||||
|
||||
```python
|
||||
from swarms.models.kosmos_two import Kosmos
|
||||
|
||||
kosmos = Kosmos()
|
||||
|
||||
image = kosmos.get_image("https://example.com/image.jpg")
|
||||
entities = [
|
||||
("apple", (0, 3), [(0.2, 0.3, 0.4, 0.5)]),
|
||||
("banana", (4, 9), [(0.6, 0.2, 0.8, 0.4)]),
|
||||
]
|
||||
kosmos.draw_entity_boxes_on_image(image, entities, show=True)
|
||||
```
|
||||
|
||||
9. Generate Boxes for Entities:
|
||||
|
||||
```python
|
||||
entities = [
|
||||
("apple", (0, 3), [(0.2, 0.3, 0.4, 0.5)]),
|
||||
("banana", (4, 9), [(0.6, 0.2, 0.8, 0.4)]),
|
||||
]
|
||||
image = kosmos.generate_boxes(
|
||||
"Find the apple and the banana in the image.", "https://example.com/image.jpg"
|
||||
)
|
||||
```
|
||||
|
||||
### Example 8 - Generating Boxes for Entities
|
||||
|
||||
```python
|
||||
from swarms.models.kosmos_two import Kosmos
|
||||
|
||||
kosmos = Kosmos()
|
||||
entities = [
|
||||
("apple", (0, 3), [(0.2, 0.3, 0.4, 0.5)]),
|
||||
("banana", (4, 9), [(0.6, 0.2, 0.8, 0.4)]),
|
||||
]
|
||||
image = kosmos.generate_boxes(
|
||||
"Find the apple and the banana in the image.", "https://example.com/image.jpg"
|
||||
)
|
||||
```
|
||||
|
||||
## How Kosmos Works
|
||||
|
||||
Kosmos is a multimodal AI model that combines text and image processing. It uses the ydshieh/kosmos-2-patch14-224 model for understanding and generating responses. Here's how it works:
|
||||
|
||||
1. **Initialization**: When you create a Kosmos instance, it loads the ydshieh/kosmos-2-patch14-224 model for multimodal tasks.
|
||||
|
||||
2. **Processing Text and Images**: Kosmos can process both text prompts and images. It takes a textual prompt and an image URL as input.
|
||||
|
||||
3. **Task Execution**: Based on the task you specify, Kosmos generates informative responses by combining natural language understanding with image analysis.
|
||||
|
||||
4. **Drawing Entity Boxes**: You can use the `draw_entity_boxes_on_image` method to draw bounding boxes around entities in an image.
|
||||
|
||||
5. **Generating Boxes for Entities**: The `generate_boxes` method allows you to generate bounding boxes for entities mentioned in a prompt.
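
The entity tuples used in the examples above carry bounding boxes as fractions of the image size. As a minimal sketch, assuming each box is a normalized `(x1, y1, x2, y2)` tuple (an assumption based on the sample values, not confirmed by the Kosmos class itself), the helper below converts them to pixel coordinates. The name `to_pixel_boxes` is hypothetical and not part of the swarms API.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]
Entity = Tuple[str, Tuple[int, int], List[Box]]


def to_pixel_boxes(
    entity: Entity, width: int, height: int
) -> List[Tuple[int, int, int, int]]:
    """Scale normalized (x1, y1, x2, y2) boxes to integer pixel coordinates."""
    _name, _text_span, boxes = entity
    return [
        (round(x1 * width), round(y1 * height), round(x2 * width), round(y2 * height))
        for x1, y1, x2, y2 in boxes
    ]


# Same entity format as in the usage examples above.
entities = [
    ("apple", (0, 3), [(0.2, 0.3, 0.4, 0.5)]),
    ("banana", (4, 9), [(0.6, 0.2, 0.8, 0.4)]),
]

for entity in entities:
    # Assumed image size of 640x480 pixels, chosen only for illustration.
    print(entity[0], to_pixel_boxes(entity, width=640, height=480))
```

A conversion like this is what a drawing routine would typically need before rendering rectangles on the image.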
|
||||
|
||||
## Parameters
|
||||
|
||||
- `model_name`: The name or path of the Kosmos model to be used. By default, it uses the ydshieh/kosmos-2-patch14-224 model.
|
||||
|
||||
## Additional Information
|
||||
|
||||
- Kosmos can handle various multimodal tasks, making it a versatile tool for understanding and generating content.
|
||||
- You can provide image URLs for image-based tasks, and Kosmos will automatically retrieve and process the images.
|
||||
- The `draw_entity_boxes_on_image` method is useful for visualizing the results of multimodal grounding tasks.
|
||||
- The `generate_boxes` method is handy for generating bounding boxes around entities mentioned in a textual prompt.
|
||||
|
||||
That concludes the documentation for Kosmos. We hope you find this multimodal AI model valuable for your projects. If you have any questions or encounter any issues, please refer to the Kosmos documentation for further assistance. Enjoy working with Kosmos!
|
@ -0,0 +1,88 @@
|
||||
# `LayoutLMDocumentQA` Documentation
|
||||
|
||||
## Introduction
|
||||
|
||||
Welcome to the documentation for LayoutLMDocumentQA, a multimodal model designed for visual question answering (QA) on real-world documents, such as invoices, PDFs, and more. This comprehensive documentation will provide you with a deep understanding of the LayoutLMDocumentQA class, its architecture, usage, and examples.
|
||||
|
||||
## Overview
|
||||
|
||||
LayoutLMDocumentQA is a versatile model that combines layout-based understanding of documents with natural language processing to answer questions about the content of documents. It is particularly useful for automating tasks like invoice processing, extracting information from PDFs, and handling various document-based QA scenarios.
|
||||
|
||||
## Class Definition
|
||||
|
||||
```python
|
||||
class LayoutLMDocumentQA(AbstractModel):
    def __init__(
        self,
        model_name: str = "impira/layoutlm-document-qa",
        task: str = "document-question-answering",
    ):
        ...
|
||||
```
|
||||
|
||||
## Purpose
|
||||
|
||||
The LayoutLMDocumentQA class serves the following primary purposes:
|
||||
|
||||
1. **Document QA**: LayoutLMDocumentQA is specifically designed for document-based question answering. It can process both the textual content and the layout of a document to answer questions.
|
||||
|
||||
2. **Multimodal Understanding**: It combines natural language understanding with document layout analysis, making it suitable for documents with complex structures.
|
||||
|
||||
## Parameters
|
||||
|
||||
- `model_name` (str): The name or path of the pretrained LayoutLMDocumentQA model. Default: "impira/layoutlm-document-qa".
|
||||
- `task` (str): The specific task for which the model will be used. Default: "document-question-answering".
|
||||
|
||||
## Usage
|
||||
|
||||
To use LayoutLMDocumentQA, follow these steps:
|
||||
|
||||
1. Initialize the LayoutLMDocumentQA instance:
|
||||
|
||||
```python
|
||||
from swarms.models import LayoutLMDocumentQA
|
||||
|
||||
layout_lm_doc_qa = LayoutLMDocumentQA()
|
||||
```
|
||||
|
||||
### Example 1 - Initialization
|
||||
|
||||
```python
|
||||
layout_lm_doc_qa = LayoutLMDocumentQA()
|
||||
```
|
||||
|
||||
2. Ask a question about a document and provide the document's image path:
|
||||
|
||||
```python
|
||||
question = "What is the total amount?"
|
||||
image_path = "path/to/document_image.png"
|
||||
answer = layout_lm_doc_qa(question, image_path)
|
||||
```
|
||||
|
||||
### Example 2 - Document QA
|
||||
|
||||
```python
|
||||
layout_lm_doc_qa = LayoutLMDocumentQA()
|
||||
question = "What is the total amount?"
|
||||
image_path = "path/to/document_image.png"
|
||||
answer = layout_lm_doc_qa(question, image_path)
|
||||
```
|
||||
|
||||
## How LayoutLMDocumentQA Works
|
||||
|
||||
LayoutLMDocumentQA employs a multimodal approach to document QA. Here's how it works:
|
||||
|
||||
1. **Initialization**: When you create a LayoutLMDocumentQA instance, you can specify the model to use and the task, which is "document-question-answering" by default.
|
||||
|
||||
2. **Question and Document**: You provide a question about the document and the image path of the document to the LayoutLMDocumentQA instance.
|
||||
|
||||
3. **Multimodal Processing**: LayoutLMDocumentQA processes both the question and the document image. It combines layout-based analysis with natural language understanding.
|
||||
|
||||
4. **Answer Generation**: The model generates an answer to the question based on its analysis of the document layout and content.
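
The answer-generation step can be sketched as picking the highest-scoring candidate. This assumes the underlying pipeline returns a list of dictionaries with `score` and `answer` keys, which is the usual shape for document question answering pipelines but is not confirmed for this class. The helper `best_answer` and the sample data are hypothetical.

```python
from typing import Dict, List, Optional


def best_answer(results: List[Dict], min_score: float = 0.0) -> Optional[str]:
    """Return the answer with the highest score, or None if none is confident enough."""
    candidates = [r for r in results if r.get("score", 0.0) >= min_score]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["score"])["answer"]


# Made-up results in the assumed output shape, for illustration only.
sample_results = [
    {"score": 0.12, "answer": "$120.00", "start": 40, "end": 40},
    {"score": 0.87, "answer": "$154.06", "start": 75, "end": 75},
]

print(best_answer(sample_results))
print(best_answer(sample_results, min_score=0.9))
```

Filtering on a minimum score is one way to avoid acting on low-confidence answers when automating tasks such as invoice processing.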
|
||||
|
||||
## Additional Information
|
||||
|
||||
- LayoutLMDocumentQA uses the "impira/layoutlm-document-qa" pretrained model, which is specifically designed for document-based question answering.
|
||||
- You can adapt this model to various document QA scenarios by changing the task and providing relevant questions and documents.
|
||||
- This model is particularly useful for automating document-based tasks and extracting valuable information from structured documents.
|
||||
|
||||
That concludes the documentation for LayoutLMDocumentQA. We hope you find this tool valuable for your document-based question answering needs. If you have any questions or encounter any issues, please refer to the LayoutLMDocumentQA documentation for further assistance. Enjoy using LayoutLMDocumentQA!
|
@ -0,0 +1,96 @@
|
||||
## Llama3
|
||||
|
||||
|
||||
```python
|
||||
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
from swarms.models.base_llm import BaseLLM


class Llama3(BaseLLM):
    """
    Llama3 class represents a Llama model for natural language generation.

    Args:
        model_id (str): The ID of the Llama model to use.
        system_prompt (str): The system prompt to use for generating responses.
        temperature (float): The temperature value for controlling the randomness of the generated responses.
        top_p (float): The top-p value for controlling the diversity of the generated responses.
        max_tokens (int): The maximum number of tokens to generate in the response.
        **kwargs: Additional keyword arguments.

    Attributes:
        model_id (str): The ID of the Llama model being used.
        system_prompt (str): The system prompt for generating responses.
        temperature (float): The temperature value for generating responses.
        top_p (float): The top-p value for generating responses.
        max_tokens (int): The maximum number of tokens to generate in the response.
        tokenizer (AutoTokenizer): The tokenizer for the Llama model.
        model (AutoModelForCausalLM): The Llama model for generating responses.

    Methods:
        run(task, *args, **kwargs): Generates a response for the given task.
    """

    def __init__(
        self,
        model_id="meta-llama/Meta-Llama-3-8B-Instruct",
        system_prompt: str = None,
        temperature: float = 0.6,
        top_p: float = 0.9,
        max_tokens: int = 4000,
        **kwargs,
    ):
        self.model_id = model_id
        self.system_prompt = system_prompt
        self.temperature = temperature
        self.top_p = top_p
        self.max_tokens = max_tokens
        self.tokenizer = AutoTokenizer.from_pretrained(model_id)
        self.model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.bfloat16,
            device_map="auto",
        )

    def run(self, task: str, *args, **kwargs):
        """
        Generates a response for the given task.

        Args:
            task (str): The user's task or input.

        Returns:
            str: The generated response.
        """
        # Include the system message only when a system prompt was provided,
        # so the chat template never receives a message with None content.
        messages = []
        if self.system_prompt:
            messages.append({"role": "system", "content": self.system_prompt})
        messages.append({"role": "user", "content": task})

        input_ids = self.tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to(self.model.device)

        # Stop on either the regular EOS token or Llama 3's end-of-turn token.
        terminators = [
            self.tokenizer.eos_token_id,
            self.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
        ]

        outputs = self.model.generate(
            input_ids,
            max_new_tokens=self.max_tokens,
            eos_token_id=terminators,
            do_sample=True,
            temperature=self.temperature,
            top_p=self.top_p,
            *args,
            **kwargs,
        )
        # Drop the prompt tokens and decode only the newly generated ones.
        response = outputs[0][input_ids.shape[-1] :]
        return self.tokenizer.decode(
            response, skip_special_tokens=True
        )
|
||||
```
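
The `temperature` and `top_p` arguments passed to `generate` above control how the next token is sampled. The following is a rough, dependency-free illustration of temperature scaling followed by top-p (nucleus) filtering. It is a sketch of the general technique, not the transformers implementation, and the function name `nucleus_filter` and the toy logits are hypothetical.

```python
import math
from typing import Dict


def nucleus_filter(
    logits: Dict[str, float], temperature: float = 0.6, top_p: float = 0.9
) -> Dict[str, float]:
    """Return renormalized probabilities of the tokens kept by top-p filtering."""
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = {tok: value / temperature for tok, value in logits.items()}

    # Softmax, subtracting the maximum for numerical stability.
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    kept_total = sum(kept.values())
    return {tok: p / kept_total for tok, p in kept.items()}


toy_logits = {"cat": 2.0, "dog": 1.0, "fish": 0.1, "rock": -1.0}
print(nucleus_filter(toy_logits))
```

Lowering `temperature` or `top_p` narrows the set of candidate tokens, which makes the output more deterministic.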
|
@ -0,0 +1,118 @@
|
||||
# `Nougat` Documentation
|
||||
|
||||
## Introduction
|
||||
|
||||
Welcome to the documentation for Nougat, a versatile model designed by Meta for transcribing scientific PDFs into user-friendly Markdown format, extracting information from PDFs, and extracting metadata from PDF documents. This documentation will provide you with a deep understanding of the Nougat class, its architecture, usage, and examples.
|
||||
|
||||
## Overview
|
||||
|
||||
Nougat is a powerful tool that combines language modeling and image processing capabilities to convert scientific PDF documents into Markdown format. It is particularly useful for researchers, students, and professionals who need to extract valuable information from PDFs quickly. With Nougat, you can simplify complex PDFs, making their content more accessible and easy to work with.
|
||||
|
||||
## Class Definition
|
||||
|
||||
```python
|
||||
class Nougat:
    def __init__(
        self,
        model_name_or_path="facebook/nougat-base",
        min_length: int = 1,
        max_new_tokens: int = 30,
    ):
        ...
|
||||
```
|
||||
|
||||
## Purpose
|
||||
|
||||
The Nougat class serves the following primary purposes:
|
||||
|
||||
1. **PDF Transcription**: Nougat is designed to transcribe scientific PDFs into Markdown format. It helps convert complex PDF documents into a more readable and structured format, making it easier to extract information.
|
||||
|
||||
2. **Information Extraction**: It allows users to extract valuable information and content from PDFs efficiently. This can be particularly useful for researchers and professionals who need to extract data, figures, or text from scientific papers.
|
||||
|
||||
3. **Metadata Extraction**: Nougat can also extract metadata from PDF documents, providing essential details about the document, such as title, author, and publication date.
|
||||
|
||||
## Parameters
|
||||
|
||||
- `model_name_or_path` (str): The name or path of the pretrained Nougat model. Default: "facebook/nougat-base".
|
||||
- `min_length` (int): The minimum length of the generated transcription. Default: 1.
|
||||
- `max_new_tokens` (int): The maximum number of new tokens to generate in the Markdown transcription. Default: 30.
|
||||
|
||||
## Usage
|
||||
|
||||
To use Nougat, follow these steps:
|
||||
|
||||
1. Initialize the Nougat instance:
|
||||
|
||||
```python
|
||||
from swarms.models import Nougat
|
||||
|
||||
nougat = Nougat()
|
||||
```
|
||||
|
||||
### Example 1 - Initialization
|
||||
|
||||
```python
|
||||
nougat = Nougat()
|
||||
```
|
||||
|
||||
2. Transcribe a PDF image using Nougat:
|
||||
|
||||
```python
|
||||
markdown_transcription = nougat("path/to/pdf_file.png")
|
||||
```
|
||||
|
||||
### Example 2 - PDF Transcription
|
||||
|
||||
```python
|
||||
nougat = Nougat()
|
||||
markdown_transcription = nougat("path/to/pdf_file.png")
|
||||
```
|
||||
|
||||
3. Extract information from a PDF:
|
||||
|
||||
```python
|
||||
information = nougat.extract_information("path/to/pdf_file.png")
|
||||
```
|
||||
|
||||
### Example 3 - Information Extraction
|
||||
|
||||
```python
|
||||
nougat = Nougat()
|
||||
information = nougat.extract_information("path/to/pdf_file.png")
|
||||
```
|
||||
|
||||
4. Extract metadata from a PDF:
|
||||
|
||||
```python
|
||||
metadata = nougat.extract_metadata("path/to/pdf_file.png")
|
||||
```
|
||||
|
||||
### Example 4 - Metadata Extraction
|
||||
|
||||
```python
|
||||
nougat = Nougat()
|
||||
metadata = nougat.extract_metadata("path/to/pdf_file.png")
|
||||
```
|
||||
|
||||
## How Nougat Works
|
||||
|
||||
Nougat employs a vision encoder-decoder model, along with a dedicated processor, to transcribe PDFs into Markdown format and perform information and metadata extraction. Here's how it works:
|
||||
|
||||
1. **Initialization**: When you create a Nougat instance, you can specify the model to use, the minimum transcription length, and the maximum number of new tokens to generate.
|
||||
|
||||
2. **Processing PDFs**: Nougat can process PDFs as input. You can provide the path to a PDF document.
|
||||
|
||||
3. **Image Processing**: The processor converts PDF pages into images, which are then encoded by the model.
|
||||
|
||||
4. **Transcription**: Nougat generates Markdown transcriptions of PDF content, ensuring a minimum length and respecting the token limit.
|
||||
|
||||
5. **Information Extraction**: Information extraction involves parsing the Markdown transcription to identify key details or content of interest.
|
||||
|
||||
6. **Metadata Extraction**: Metadata extraction involves identifying and extracting document metadata, such as title, author, and publication date.
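
Steps 5 and 6 amount to parsing the Markdown transcription. As a minimal sketch, assuming the transcription uses standard `#` headings, the snippet below pulls out a title and section names. The helper `summarize_markdown` is hypothetical and does not reflect how the Nougat class implements `extract_information` or `extract_metadata`.

```python
import re
from typing import Dict, List


def summarize_markdown(markdown: str) -> Dict[str, object]:
    """Treat the first heading as the title and the remaining headings as sections."""
    headings: List[str] = re.findall(
        r"^#{1,6}[ \t]+(.+?)[ \t]*$", markdown, flags=re.MULTILINE
    )
    return {
        "title": headings[0] if headings else None,
        "sections": headings[1:],
    }


# Made-up transcription, for illustration only.
sample = """# A Study of Widgets

## Introduction

Widgets are useful.

## Results

They work.
"""

print(summarize_markdown(sample))
```

Real transcriptions may not expose authors or dates as headings, so a production extractor would likely need additional rules.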
|
||||
|
||||
## Additional Information
|
||||
|
||||
- Nougat leverages the "facebook/nougat-base" pretrained model, which is specifically designed for document transcription and extraction tasks.
|
||||
- You can adjust the minimum transcription length and the maximum number of new tokens to control the output's length and quality.
|
||||
- Nougat can be run on both CPU and GPU devices.
|
||||
|
||||
That concludes the documentation for Nougat. We hope you find this tool valuable for your PDF transcription, information extraction, and metadata extraction needs. If you have any questions or encounter any issues, please refer to the Nougat documentation for further assistance. Enjoy using Nougat!
|
@ -0,0 +1,200 @@
|
||||
# `BaseOpenAI` and `OpenAI` Documentation
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Overview](#overview)
|
||||
2. [Class Architecture](#class-architecture)
|
||||
3. [Purpose](#purpose)
|
||||
4. [Class Attributes](#class-attributes)
|
||||
5. [Methods](#methods)
|
||||
- [Construction](#construction)
|
||||
- [Configuration](#configuration)
|
||||
- [Tokenization](#tokenization)
|
||||
- [Generation](#generation)
|
||||
- [Asynchronous Generation](#asynchronous-generation)
|
||||
6. [Usage Examples](#usage-examples)
|
||||
- [Creating an OpenAI Object](#creating-an-openai-object)
|
||||
- [Generating Text](#generating-text)
|
||||
- [Advanced Configuration](#advanced-configuration)
|
||||
|
||||
---
|
||||
|
||||
## 1. Overview <a name="overview"></a>
|
||||
|
||||
The `BaseOpenAI` and `OpenAI` classes are part of the LangChain library, designed to interact with OpenAI's large language models (LLMs). These classes provide a seamless interface for utilizing OpenAI's API to generate natural language text.
|
||||
|
||||
## 2. Class Architecture <a name="class-architecture"></a>
|
||||
|
||||
Both `BaseOpenAI` and `OpenAI` classes inherit from `BaseLLM`, demonstrating an inheritance-based architecture. This architecture allows for easy extensibility and customization while adhering to the principles of object-oriented programming.
|
||||
|
||||
## 3. Purpose <a name="purpose"></a>
|
||||
|
||||
The purpose of these classes is to simplify the interaction with OpenAI's LLMs. They encapsulate API calls, handle tokenization, and provide a high-level interface for generating text. By instantiating an object of the `OpenAI` class, developers can quickly leverage the power of OpenAI's models to generate text for various applications, such as chatbots, content generation, and more.
|
||||
|
||||
## 4. Class Attributes <a name="class-attributes"></a>
|
||||
|
||||
Here are the key attributes and their descriptions for the `BaseOpenAI` and `OpenAI` classes:
|
||||
|
||||
| Attribute | Description |
|
||||
|---------------------------|-------------|
|
||||
| `lc_secrets` | A dictionary of secrets required for LangChain, including the OpenAI API key. |
|
||||
| `lc_attributes` | A dictionary of attributes relevant to LangChain. |
|
||||
| `is_lc_serializable()` | A method indicating if the class is serializable for LangChain. |
|
||||
| `model_name` | The name of the language model to use. |
|
||||
| `temperature` | The sampling temperature for text generation. |
|
||||
| `max_tokens` | The maximum number of tokens to generate in a completion. |
|
||||
| `top_p` | The total probability mass of tokens to consider at each step. |
|
||||
| `frequency_penalty` | Penalizes repeated tokens according to frequency. |
|
||||
| `presence_penalty`        | Penalizes tokens that have already appeared in the text, regardless of how often. |
|
||||
| `n` | How many completions to generate for each prompt. |
|
||||
| `best_of` | Generates `best_of` completions server-side and returns the "best." |
|
||||
| `model_kwargs` | Holds any model parameters valid for `create` calls not explicitly specified. |
|
||||
| `openai_api_key` | The OpenAI API key used for authentication. |
|
||||
| `openai_api_base` | The base URL for the OpenAI API. |
|
||||
| `openai_organization` | The OpenAI organization name, if applicable. |
|
||||
| `openai_proxy` | An explicit proxy URL for OpenAI requests. |
|
||||
| `batch_size` | The batch size to use when passing multiple documents for generation. |
|
||||
| `request_timeout` | The timeout for requests to the OpenAI completion API. |
|
||||
| `logit_bias` | Adjustment to the probability of specific tokens being generated. |
|
||||
| `max_retries` | The maximum number of retries to make when generating. |
|
||||
| `streaming` | Whether to stream the results or not. |
|
||||
| `allowed_special` | A set of special tokens that are allowed. |
|
||||
| `disallowed_special` | A collection of special tokens that are not allowed. |
|
||||
| `tiktoken_model_name` | The model name to pass to `tiktoken` for token counting. |
|
||||
|
||||
## 5. Methods <a name="methods"></a>
|
||||
|
||||
### 5.1 Construction <a name="construction"></a>
|
||||
|
||||
#### 5.1.1 `__new__(cls, **data: Any) -> Union[OpenAIChat, BaseOpenAI]`
|
||||
- Description: Initializes the OpenAI object.
|
||||
- Arguments:
|
||||
- `cls` (class): The class being instantiated.
|
||||
- `data` (dict): Additional data for initialization.
|
||||
- Returns:
|
||||
- Union[OpenAIChat, BaseOpenAI]: An instance of the OpenAI class.
|
||||
|
||||
### 5.2 Configuration <a name="configuration"></a>
|
||||
|
||||
#### 5.2.1 `build_extra(cls, values: Dict[str, Any]) -> Dict[str, Any]`
|
||||
- Description: Builds extra kwargs from additional params passed in.
|
||||
- Arguments:
|
||||
- `cls` (class): The class whose values are being validated.
|
||||
- `values` (dict): Values and parameters to build extra kwargs.
|
||||
- Returns:
|
||||
- Dict[str, Any]: A dictionary of built extra kwargs.
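
The idea behind `build_extra` can be sketched as follows: any parameter that is not a declared field is moved into `model_kwargs`. This is a simplified, standalone illustration under that assumption, not the library's code, and the `KNOWN_FIELDS` set is a hypothetical stand-in for the class's declared attributes.

```python
from typing import Any, Dict, Set

# Hypothetical subset of declared fields, for illustration only.
KNOWN_FIELDS: Set[str] = {"model_name", "temperature", "max_tokens", "model_kwargs"}


def build_extra(
    values: Dict[str, Any], known_fields: Set[str] = KNOWN_FIELDS
) -> Dict[str, Any]:
    """Move parameters that are not declared fields into `model_kwargs`."""
    values = dict(values)  # Work on a copy so the caller's dict is untouched.
    extra = dict(values.get("model_kwargs", {}))
    for name in list(values):
        if name not in known_fields:
            if name in extra:
                raise ValueError(f"Parameter {name!r} was supplied twice.")
            extra[name] = values.pop(name)
    values["model_kwargs"] = extra
    return values


print(build_extra({"model_name": "m", "temperature": 0.7, "seed": 42}))
```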
|
||||
|
||||
#### 5.2.2 `validate_environment(cls, values: Dict) -> Dict`
|
||||
- Description: Validates that the API key and python package exist in the environment.
|
||||
- Arguments:
|
||||
- `values` (dict): The class values and parameters.
|
||||
- Returns:
|
||||
- Dict: A dictionary of validated values.
|
||||
|
||||
### 5.3 Tokenization <a name="tokenization"></a>
|
||||
|
||||
#### 5.3.1 `get_sub_prompts(self, params: Dict[str, Any], prompts: List[str], stop: Optional[List[str]] = None) -> List[List[str]]`
|
||||
- Description: Gets sub-prompts for LLM call.
|
||||
- Arguments:
|
||||
- `params` (dict): Parameters for LLM call.
|
||||
- `prompts` (list): List of prompts.
|
||||
- `stop` (list, optional): List of stop words.
|
||||
- Returns:
|
||||
- List[List[str]]: List of sub-prompts.
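
Given the `batch_size` attribute and the `List[List[str]]` return type, sub-prompts are most naturally read as fixed-size batches of the input prompts. The sketch below shows that batching under this assumption. The function name `split_into_batches` is hypothetical.

```python
from typing import List


def split_into_batches(prompts: List[str], batch_size: int) -> List[List[str]]:
    """Split prompts into consecutive batches of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [prompts[i : i + batch_size] for i in range(0, len(prompts), batch_size)]


print(split_into_batches(["a", "b", "c", "d", "e"], batch_size=2))
```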
|
||||
|
||||
#### 5.3.2 `get_token_ids(self, text: str) -> List[int]`
|
||||
- Description: Gets token IDs using the `tiktoken` package.
|
||||
- Arguments:
|
||||
- `text` (str): The text for which to calculate token IDs.
|
||||
- Returns:
|
||||
- List[int]: A list of token IDs.
|
||||
|
||||
#### 5.3.3 `modelname_to_contextsize(modelname: str) -> int`
|
||||
- Description: Calculates the maximum number of tokens possible to generate for a model.
|
||||
- Arguments:
|
||||
- `modelname` (str): The model name to determine the context size for.
|
||||
- Returns:
|
||||
- int: The maximum context size.
|
||||
|
||||
#### 5.3.4 `max_tokens_for_prompt(self, prompt: str) -> int`
|
||||
- Description: Calculates the maximum number of tokens possible to generate for a prompt.
|
||||
- Arguments:
|
||||
- `prompt` (str): The prompt for which to determine the maximum token limit.
|
||||
- Returns:
|
||||
- int: The maximum token limit.
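
The calculation behind `max_tokens_for_prompt` is the model's context size minus the tokens already used by the prompt. The sketch below illustrates the arithmetic with a whitespace-based token counter as a stand-in for `tiktoken`, so the counts are only approximate. The context size of 4096 is an illustrative assumption, and clamping at zero is a choice made here, not necessarily the library's behavior.

```python
def count_tokens(text: str) -> int:
    """Stand-in tokenizer: counts whitespace-separated words, not real tokens."""
    return len(text.split())


def max_tokens_for_prompt(prompt: str, context_size: int) -> int:
    """Return how many tokens remain for generation after the prompt."""
    return max(context_size - count_tokens(prompt), 0)


print(max_tokens_for_prompt("Tell me a joke", context_size=4096))
```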
|
||||
|
||||
### 5.4 Generation <a name="generation"></a>
|
||||
|
||||
#### 5.4.1 `generate(self, text: Union[str, List[str]], **kwargs) -> Union[str, List[str]]`
|
||||
- Description: Generates text using the OpenAI API.
|
||||
- Arguments:
|
||||
- `text` (str or list): The input text or list of inputs.
|
||||
- `**kwargs` (dict): Additional parameters for the generation process.
|
||||
- Returns:
|
||||
- Union[str, List[str]]: The generated text or list of generated texts.
|
||||
|
||||
### 5.5 Asynchronous Generation <a name="asynchronous-generation"></a>
|
||||
|
||||
#### 5.5.1 `generate_async(self, text: Union[str, List[str]], **kwargs) -> Union[str, List[str]]`
|
||||
- Description: Generates text asynchronously using the OpenAI API.
|
||||
- Arguments:
|
||||
- `text` (str or list): The input text or list of inputs.
|
||||
- `**kwargs` (dict): Additional parameters for the asynchronous generation process.
|
||||
- Returns:
|
||||
- Union[str, List[str]]: The generated text or list of generated texts.
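
Asynchronous generation is useful when several prompts can be processed concurrently. The following sketch shows the pattern with `asyncio.gather`, using a hypothetical `fake_generate` coroutine in place of a real API call so that it runs without network access or an API key.

```python
import asyncio
from typing import List


async def fake_generate(prompt: str) -> str:
    """Stand-in for an API call: waits briefly, then echoes the prompt."""
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


async def generate_many(prompts: List[str]) -> List[str]:
    """Run all generations concurrently and return results in input order."""
    return await asyncio.gather(*(fake_generate(p) for p in prompts))


results = asyncio.run(generate_many(["Hello", "Thank you"]))
print(results)
```

`asyncio.gather` preserves the order of its inputs, so each result lines up with the prompt that produced it.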
|
||||
|
||||
## 6. Usage Examples <a name="usage-examples"></a>
|
||||
|
||||
### 6.1 Creating an OpenAI Object <a name="creating-an-openai-object"></a>
|
||||
|
||||
```python
|
||||
# Import the OpenAI class
|
||||
from swarms.models import OpenAI
|
||||
|
||||
# Set your OpenAI API key
|
||||
api_key = "YOUR_API_KEY"
|
||||
|
||||
# Create an OpenAI object
|
||||
openai = OpenAI(openai_api_key=api_key)
|
||||
```
|
||||
|
||||
### 6.2 Generating Text <a name="generating-text"></a>
|
||||
|
||||
```python
|
||||
# Generate text from a single prompt
|
||||
prompt = "Translate the following English text to French: 'Hello, how are you?'"
|
||||
generated_text = openai.generate(prompt, max_tokens=50)
|
||||
|
||||
# Generate text from multiple prompts
article_text = "Paste the article text here."  # placeholder input
prompts = [
    "Translate this: 'Good morning' to Spanish.",
    f"Summarize the following article:\n{article_text}",
]
|
||||
generated_texts = openai.generate(prompts, max_tokens=100)
|
||||
|
||||
# Generate text asynchronously
|
||||
async_prompt = "Translate 'Thank you' into German."
|
||||
async_result = openai.generate_async(async_prompt, max_tokens=30)
|
||||
|
||||
# Access the result of an asynchronous generation
|
||||
async_result_text = async_result.get()
|
||||
```
|
||||
|
||||
### 6.3 Advanced Configuration <a name="advanced-configuration"></a>
|
||||
|
||||
```python
|
||||
# Configure generation with advanced options
|
||||
custom_options = {
|
||||
"temperature": 0.7,
|
||||
"max_tokens": 100,
|
||||
"top_p": 0.9,
|
||||
"frequency_penalty": 0.2,
|
||||
"presence_penalty": 0.4,
|
||||
}
|
||||
generated_text = openai.generate(prompt, **custom_options)
|
||||
```
|
||||
|
||||
This documentation provides a comprehensive understanding of the `BaseOpenAI` and `OpenAI` classes, their attributes, methods, and usage examples. Developers can utilize these classes to interact with OpenAI's language models efficiently, enabling various natural language generation tasks.
|
@ -0,0 +1,185 @@
|
||||
# `OpenAIChat` Documentation
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Introduction](#introduction)
|
||||
2. [Class Overview](#class-overview)
|
||||
3. [Class Architecture](#class-architecture)
|
||||
4. [Class Attributes](#class-attributes)
|
||||
5. [Methods](#methods)
|
||||
- [Construction](#construction)
|
||||
- [Configuration](#configuration)
|
||||
- [Message Handling](#message-handling)
|
||||
- [Generation](#generation)
|
||||
- [Tokenization](#tokenization)
|
||||
6. [Usage Examples](#usage-examples)
|
||||
7. [Additional Information](#additional-information)
|
||||
|
||||
---
|
||||
|
||||
## 1. Introduction <a name="introduction"></a>
|
||||
|
||||
The `OpenAIChat` class is part of the LangChain library and serves as an interface to interact with OpenAI's Chat large language models. This documentation provides an in-depth understanding of the class, its attributes, methods, and usage examples.
|
||||
|
||||
## 2. Class Overview <a name="class-overview"></a>
|
||||
|
||||
The `OpenAIChat` class is designed for conducting chat-like conversations with OpenAI's language models, such as GPT-3.5 Turbo. It allows you to create interactive conversations by sending messages and receiving model-generated responses. This class simplifies the process of integrating OpenAI's models into chatbot applications and other natural language processing tasks.
|
||||
|
||||
## 3. Class Architecture <a name="class-architecture"></a>
|
||||
|
||||
The `OpenAIChat` class is built on top of the `BaseLLM` class, which provides a foundation for working with large language models. This inheritance-based architecture allows for customization and extension while adhering to object-oriented programming principles.
|
||||
|
||||
## 4. Class Attributes <a name="class-attributes"></a>
|
||||
|
||||
Here are the key attributes and their descriptions for the `OpenAIChat` class:
|
||||
|
||||
| Attribute | Description |
|
||||
|-----------------------------|-------------------------------------------------------------------------------|
|
||||
| `client` | An internal client for making API calls to OpenAI. |
|
||||
| `model_name` | The name of the language model to use (default: "gpt-3.5-turbo"). |
|
||||
| `model_kwargs` | Additional model parameters valid for `create` calls not explicitly specified.|
|
||||
| `openai_api_key` | The OpenAI API key used for authentication. |
|
||||
| `openai_api_base` | The base URL for the OpenAI API. |
|
||||
| `openai_proxy` | An explicit proxy URL for OpenAI requests. |
|
||||
| `max_retries` | The maximum number of retries to make when generating (default: 6). |
|
||||
| `prefix_messages` | A list of messages to set the initial conversation state (default: []). |
|
||||
| `streaming` | Whether to stream the results or not (default: False). |
|
||||
| `allowed_special` | A set of special tokens that are allowed (default: an empty set). |
|
||||
| `disallowed_special` | A collection of special tokens that are not allowed (default: "all"). |
|
||||
|
||||
## 5. Methods <a name="methods"></a>
|
||||
|
||||
### 5.1 Construction <a name="construction"></a>
|
||||
|
||||
#### 5.1.1 `__init__(self, model_name: str = "gpt-3.5-turbo", openai_api_key: Optional[str] = None, openai_api_base: Optional[str] = None, openai_proxy: Optional[str] = None, max_retries: int = 6, prefix_messages: List = [])`
|
||||
- Description: Initializes an OpenAIChat object.
|
||||
- Arguments:
|
||||
- `model_name` (str): The name of the language model to use (default: "gpt-3.5-turbo").
|
||||
- `openai_api_key` (str, optional): The OpenAI API key used for authentication.
|
||||
- `openai_api_base` (str, optional): The base URL for the OpenAI API.
|
||||
- `openai_proxy` (str, optional): An explicit proxy URL for OpenAI requests.
|
||||
- `max_retries` (int): The maximum number of retries to make when generating (default: 6).
|
||||
- `prefix_messages` (List): A list of messages to set the initial conversation state (default: []).
|
||||
|
||||
### 5.2 Configuration <a name="configuration"></a>
|
||||
|
||||
#### 5.2.1 `build_extra(self, values: Dict[str, Any]) -> Dict[str, Any]`
|
||||
- Description: Builds extra kwargs from additional parameters passed in.
|
||||
- Arguments:
|
||||
- `values` (dict): Values and parameters to build extra kwargs.
|
||||
- Returns:
|
||||
- Dict[str, Any]: A dictionary of built extra kwargs.
|
||||
|
||||
#### 5.2.2 `validate_environment(self, values: Dict) -> Dict`
|
||||
- Description: Validates that the API key and Python package exist in the environment.
|
||||
- Arguments:
|
||||
- `values` (dict): The class values and parameters.
|
||||
- Returns:
|
||||
- Dict: A dictionary of validated values.
|
||||
|
||||
### 5.3 Message Handling <a name="message-handling"></a>
|
||||
|
||||
#### 5.3.1 `_get_chat_params(self, prompts: List[str], stop: Optional[List[str]] = None) -> Tuple`
|
||||
- Description: Gets chat-related parameters for generating responses.
|
||||
- Arguments:
|
||||
- `prompts` (list): List of user messages.
|
||||
- `stop` (list, optional): List of stop words.
|
||||
- Returns:
|
||||
- Tuple: Messages and parameters.
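The following is a minimal sketch of how prefix messages, a single prompt, and stop words could be combined into the returned tuple. It is a standalone stand-in, not the library's code: the function name, the `default_params` argument, and the message dictionary shape (`role`/`content`) are assumptions for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple


def get_chat_params(
    prompts: List[str],
    prefix_messages: List[Dict[str, str]],
    default_params: Dict[str, Any],
    stop: Optional[List[str]] = None,
) -> Tuple[List[Dict[str, str]], Dict[str, Any]]:
    """Hypothetical stand-in for `_get_chat_params`."""
    if len(prompts) != 1:
        raise ValueError("This sketch handles exactly one prompt per call.")

    # Prefix messages come first, followed by the new user prompt
    messages = list(prefix_messages) + [{"role": "user", "content": prompts[0]}]

    # Copy the defaults so the caller's dictionary is never mutated
    params = dict(default_params)
    if stop is not None:
        params["stop"] = stop
    return messages, params


messages, params = get_chat_params(
    ["Tell me another joke."],
    prefix_messages=[{"role": "system", "content": "You are a comedian."}],
    default_params={"model": "gpt-3.5-turbo"},
    stop=["\n\n"],
)
print(messages)
print(params)
```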
|
||||
|
||||
### 5.4 Generation <a name="generation"></a>
|
||||
|
||||
#### 5.4.1 `_stream(self, prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[CallbackManagerForLLMRun] = None, **kwargs: Any) -> Iterator[GenerationChunk]`
|
||||
- Description: Streams generated text from the OpenAI API, yielding chunks as they arrive.
|
||||
- Arguments:
|
||||
- `prompt` (str): The user's message.
|
||||
- `stop` (list, optional): List of stop words.
|
||||
- `run_manager` (optional): Callback manager for asynchronous generation.
|
||||
- `**kwargs` (dict): Additional parameters for asynchronous generation.
|
||||
- Returns:
|
||||
- Iterator[GenerationChunk]: An iterator of generated text chunks.
|
||||
|
||||
#### 5.4.2 `_agenerate(self, prompts: List[str], stop: Optional[List[str]] = None, run_manager: Optional[AsyncCallbackManagerForLLMRun] = None, **kwargs: Any) -> LLMResult`
|
||||
- Description: Generates text asynchronously using the OpenAI API (async version).
|
||||
- Arguments:
|
||||
- `prompts` (list): List of user messages.
|
||||
- `stop` (list, optional): List of stop words.
|
||||
- `run_manager` (optional): Callback manager for asynchronous generation.
|
||||
- `**kwargs` (dict): Additional parameters for asynchronous generation.
|
||||
- Returns:
|
||||
- LLMResult: A result object containing the generated text.
|
||||
|
||||
### 5.5 Tokenization <a name="tokenization"></a>
|
||||
|
||||
#### 5.5.1 `get_token_ids(self, text: str) -> List[int]`
|
||||
- Description: Gets token IDs using the tiktoken package.
|
||||
- Arguments:
|
||||
- `text` (str): The text for which to calculate token IDs.
|
||||
- Returns:
|
||||
- List[int]: A list of token IDs.
|
||||
|
||||
## 6. Usage Examples <a name="usage-examples"></a>
|
||||
|
||||
### Example 1: Initializing `OpenAIChat`
|
||||
|
||||
```python
|
||||
from swarms.models import OpenAIChat
|
||||
|
||||
# Initialize OpenAIChat with model name and API key
|
||||
openai_chat = OpenAIChat(model_name="gpt-3.5-turbo", openai_api_key="YOUR_API_KEY")
|
||||
```
|
||||
|
||||
### Example 2: Sending Messages and Generating Responses
|
||||
|
||||
```python
|
||||
# Define a conversation
|
||||
conversation = [
|
||||
"User: Tell me a joke.",
|
||||
"Assistant: Why did the chicken cross the road?",
|
||||
"User: I don't know. Why?",
|
||||
"Assistant: To get to the other side!",
|
||||
]
|
||||
|
||||
# Set the conversation as the prefix messages
|
||||
openai_chat.prefix_messages = conversation
|
||||
|
||||
# Generate a response
|
||||
user_message = "User: Tell me another joke."
|
||||
response = openai_chat.generate([user_message])
|
||||
|
||||
# Print the generated response
|
||||
# `generate` returns a result object; the generated text sits under `generations`
print(response.generations[0][0].text)
# Example output (will vary): "Why don't scientists trust atoms? Because they make up everything!"
|
||||
```
|
||||
|
||||
### Example 3: Asynchronous Generation
|
||||
|
||||
```python
|
||||
import asyncio
|
||||
|
||||
|
||||
# Define an asynchronous function for generating responses
|
||||
async def generate_responses():
    user_message = "User: Tell me a fun fact."
    async for chunk in openai_chat.stream([user_message]):
        print(chunk.text)
|
||||
|
||||
|
||||
# Run the asynchronous generation function
|
||||
asyncio.run(generate_responses())
|
||||
```
|
||||
|
||||
## 7. Additional Information <a name="additional-information"></a>
|
||||
|
||||
- To use the `OpenAIChat` class, you should have the `openai` Python package installed, and the environment variable `OPENAI_API_KEY` set with your API key.
|
||||
- Any parameters that are valid to be passed to the `openai.create` call can be passed to the `OpenAIChat` constructor.
|
||||
- You can customize the behavior of the class by setting various attributes, such as `model_name`, `openai_api_key`, `prefix_messages`, and more.
|
||||
- For streaming, `_stream` yields generated text chunks as they arrive; for asynchronous generation, `_agenerate` returns an `LLMResult`. Both are internal (underscore-prefixed) methods.
|
||||
- To calculate token IDs, you can use the `get_token_ids` method, which utilizes the `tiktoken` package. Make sure to install the `tiktoken` package with `pip install tiktoken` if needed.
|
||||
|
||||
---
|
||||
|
||||
This documentation provides a comprehensive overview of the `OpenAIChat` class, its attributes, methods, and usage examples. You can use this class to create chatbot applications, conduct conversations with language models, and explore the capabilities of OpenAI's GPT-3.5 Turbo model.
|
@ -0,0 +1,135 @@
|
||||
# `OpenAITTS` Documentation
|
||||
|
||||
## Table of Contents
|
||||
1. [Overview](#overview)
|
||||
2. [Installation](#installation)
|
||||
3. [Usage](#usage)
|
||||
- [Initialization](#initialization)
|
||||
- [Running TTS](#running-tts)
|
||||
- [Running TTS and Saving](#running-tts-and-saving)
|
||||
4. [Examples](#examples)
|
||||
- [Basic Usage](#basic-usage)
|
||||
- [Saving the Output](#saving-the-output)
|
||||
5. [Advanced Options](#advanced-options)
|
||||
6. [Troubleshooting](#troubleshooting)
|
||||
7. [References](#references)
|
||||
|
||||
## 1. Overview <a name="overview"></a>
|
||||
|
||||
The `OpenAITTS` module is a Python library that provides an interface for converting text to speech (TTS) using the OpenAI TTS API. It allows you to generate high-quality speech from text input, making it suitable for various applications such as voice assistants, speech synthesis, and more.
|
||||
|
||||
### Features:
|
||||
- Convert text to speech using OpenAI's TTS model.
|
||||
- Supports specifying the model name, voice, and other parameters.
|
||||
- Option to save the generated speech to a WAV file.
|
||||
|
||||
## 2. Installation <a name="installation"></a>
|
||||
|
||||
To use the `OpenAITTS` model, you need to install the necessary dependencies. You can do this using `pip`:
|
||||
|
||||
```bash
|
||||
pip install swarms requests
|
||||
```
|
||||
|
||||
## 3. Usage <a name="usage"></a>
|
||||
|
||||
### Initialization <a name="initialization"></a>
|
||||
|
||||
To use the `OpenAITTS` module, you need to initialize an instance of the `OpenAITTS` class. Here's how you can do it:
|
||||
|
||||
```python
|
||||
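import os

from swarms.models.openai_tts import OpenAITTS

# Read the API key from the environment (assumes OPENAI_API_KEY is set)
openai_api_key_env = os.getenv("OPENAI_API_KEY")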
|
||||
|
||||
# Initialize the OpenAITTS instance
|
||||
tts = OpenAITTS(
|
||||
model_name="tts-1-1106",
|
||||
proxy_url="https://api.openai.com/v1/audio/speech",
|
||||
openai_api_key=openai_api_key_env,
|
||||
voice="onyx",
|
||||
)
|
||||
```
|
||||
|
||||
#### Parameters:
|
||||
- `model_name` (str): The name of the TTS model to use (default is "tts-1-1106").
|
||||
- `proxy_url` (str): The URL for the OpenAI TTS API (default is "https://api.openai.com/v1/audio/speech").
|
||||
- `openai_api_key` (str): Your OpenAI API key. It can be obtained from the OpenAI website.
|
||||
- `voice` (str): The voice to use for generating speech (default is "onyx").
|
||||
- `chunk_size` (int): The size of data chunks when fetching audio (default is 1024 * 1024 bytes).
|
||||
- `autosave` (bool): Whether to automatically save the generated speech to a file (default is False).
|
||||
- `saved_filepath` (str): The path to the file where the speech will be saved (default is "runs/tts_speech.wav").
|
||||
|
||||
### Running TTS <a name="running-tts"></a>
|
||||
|
||||
Once the `OpenAITTS` instance is initialized, you can use it to convert text to speech using the `run` method:
|
||||
|
||||
```python
|
||||
# Generate speech from text
|
||||
speech_data = tts.run("Hello, world!")
|
||||
```
|
||||
|
||||
#### Parameters:
|
||||
- `task` (str): The text you want to convert to speech.
|
||||
|
||||
#### Returns:
|
||||
- `speech_data` (bytes): The generated speech data.
|
||||
|
||||
### Running TTS and Saving <a name="running-tts-and-saving"></a>
|
||||
|
||||
You can also use the `run_and_save` method to generate speech from text and save it to a file:
|
||||
|
||||
```python
|
||||
# Generate speech from text and save it to a file
|
||||
speech_data = tts.run_and_save("Hello, world!")
|
||||
```
|
||||
|
||||
#### Parameters:
|
||||
- `task` (str): The text you want to convert to speech.
|
||||
|
||||
#### Returns:
|
||||
- `speech_data` (bytes): The generated speech data.
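To show what saving speech data to a WAV file involves, here is a minimal sketch using Python's standard `wave` module. The helper name `save_wav` and the audio parameters (mono, 16-bit samples, 22050 Hz) are assumptions for illustration, and the sketch assumes the bytes are raw PCM; it is not the implementation of `run_and_save`.

```python
import os
import tempfile
import wave


def save_wav(
    pcm_bytes: bytes,
    path: str,
    channels: int = 1,
    sample_width: int = 2,
    frame_rate: int = 22050,
) -> None:
    """Write raw PCM bytes to a WAV file (hypothetical helper)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(frame_rate)
        wav_file.writeframes(pcm_bytes)


# Demo with 100 frames of silence instead of real speech data
demo_path = os.path.join(tempfile.mkdtemp(), "tts_speech.wav")
save_wav(b"\x00\x00" * 100, demo_path)

# Read the file back to confirm the header and frame count
with wave.open(demo_path, "rb") as wav_file:
    frame_count = wav_file.getnframes()
    stored_rate = wav_file.getframerate()
print(frame_count, stored_rate)
```

If the API returns a compressed format rather than raw PCM, the bytes would need to be decoded before a step like this applies.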
|
||||
|
||||
## 4. Examples <a name="examples"></a>
|
||||
|
||||
### Basic Usage <a name="basic-usage"></a>
|
||||
|
||||
Here's a basic example of how to use the `OpenAITTS` module to generate speech from text:
|
||||
|
||||
```python
|
||||
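import os

from swarms.models.openai_tts import OpenAITTS

# Read the API key from the environment (assumes OPENAI_API_KEY is set)
openai_api_key_env = os.getenv("OPENAI_API_KEY")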
|
||||
|
||||
# Initialize the OpenAITTS instance
|
||||
tts = OpenAITTS(
|
||||
model_name="tts-1-1106",
|
||||
proxy_url="https://api.openai.com/v1/audio/speech",
|
||||
openai_api_key=openai_api_key_env,
|
||||
voice="onyx",
|
||||
)
|
||||
|
||||
# Generate speech from text
|
||||
speech_data = tts.run("Hello, world!")
|
||||
```
|
||||
|
||||
### Saving the Output <a name="saving-the-output"></a>
|
||||
|
||||
You can save the generated speech to a WAV file using the `run_and_save` method:
|
||||
|
||||
```python
|
||||
# Generate speech from text and save it to a file
|
||||
speech_data = tts.run_and_save("Hello, world!")
|
||||
```
|
||||
|
||||
## 5. Advanced Options <a name="advanced-options"></a>
|
||||
|
||||
The `OpenAITTS` module supports various advanced options for customizing the TTS generation process. You can specify the model name, voice, and other parameters during initialization. Additionally, you can configure the chunk size for audio data fetching and choose whether to automatically save the generated speech to a file.
|
||||
|
||||
## 6. Troubleshooting <a name="troubleshooting"></a>
|
||||
|
||||
If you encounter any issues while using the `OpenAITTS` module, please make sure you have installed all the required dependencies and that your OpenAI API key is correctly configured. If you still face problems, refer to the OpenAI documentation or contact their support for assistance.
|
||||
|
||||
## 7. References <a name="references"></a>
|
||||
|
||||
- [OpenAI API Documentation](https://beta.openai.com/docs/)
|
||||
- [Python Requests Library](https://docs.python-requests.org/en/latest/)
|
||||
- [Python Wave Library](https://docs.python.org/3/library/wave.html)
|
||||
|
||||
This documentation provides a comprehensive guide on how to use the `OpenAITTS` module to convert text to speech using OpenAI's TTS model. It covers initialization, basic usage, advanced options, troubleshooting, and references for further exploration.
|
@ -0,0 +1,95 @@
|
||||
# `Vilt` Documentation
|
||||
|
||||
## Introduction
|
||||
|
||||
Welcome to the documentation for Vilt, a Vision-and-Language Transformer (ViLT) model fine-tuned on the VQAv2 dataset. Vilt is a powerful model capable of answering questions about images. This documentation will provide a comprehensive understanding of Vilt, its architecture, usage, and how it can be integrated into your projects.
|
||||
|
||||
## Overview
|
||||
|
||||
Vilt is based on the Vision-and-Language Transformer (ViLT) architecture, designed for tasks that involve understanding both text and images. It has been fine-tuned on the VQAv2 dataset, making it adept at answering questions about images. This model is particularly useful for tasks where textual and visual information needs to be combined to provide meaningful answers.
|
||||
|
||||
## Class Definition
|
||||
|
||||
```python
|
||||
class Vilt:
    def __init__(self):
        """
        Initialize the Vilt model.
        """
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To use the Vilt model, follow these steps:
|
||||
|
||||
1. Initialize the Vilt model:
|
||||
|
||||
```python
|
||||
from swarms.models import Vilt
|
||||
|
||||
model = Vilt()
|
||||
```
|
||||
|
||||
2. Call the model with a text question and an image URL:
|
||||
|
||||
```python
|
||||
output = model(
|
||||
"What is this image?", "http://images.cocodataset.org/val2017/000000039769.jpg"
|
||||
)
|
||||
```
|
||||
|
||||
### Example 1 - Image Questioning
|
||||
|
||||
```python
|
||||
model = Vilt()
|
||||
output = model(
|
||||
"What are the objects in this image?",
|
||||
"http://images.cocodataset.org/val2017/000000039769.jpg",
|
||||
)
|
||||
print(output)
|
||||
```
|
||||
|
||||
### Example 2 - Image Analysis
|
||||
|
||||
```python
|
||||
model = Vilt()
|
||||
output = model(
|
||||
"Describe the scene in this image.",
|
||||
"http://images.cocodataset.org/val2017/000000039769.jpg",
|
||||
)
|
||||
print(output)
|
||||
```
|
||||
|
||||
### Example 3 - Visual Knowledge Retrieval
|
||||
|
||||
```python
|
||||
model = Vilt()
|
||||
output = model(
|
||||
"Tell me more about the landmark in this image.",
|
||||
"http://images.cocodataset.org/val2017/000000039769.jpg",
|
||||
)
|
||||
print(output)
|
||||
```
|
||||
|
||||
## How Vilt Works
|
||||
|
||||
Vilt operates by combining text and image information to generate meaningful answers to questions about the provided image. Here's how it works:
|
||||
|
||||
1. **Initialization**: When you create a Vilt instance, it initializes the processor and the model. The processor is responsible for handling the image and text input, while the model is the fine-tuned ViLT model.
|
||||
|
||||
2. **Processing Input**: When you call the Vilt model with a text question and an image URL, it downloads the image and processes it along with the text question. This processing step involves tokenization and encoding of the input.
|
||||
|
||||
3. **Forward Pass**: The encoded input is then passed through the ViLT model. It calculates the logits, and the answer with the highest probability is selected.
|
||||
|
||||
4. **Output**: The predicted answer is returned as the output of the model.
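The selection in the forward-pass step can be sketched in plain Python: take the index of the largest logit and map it to a label. The logits, the label mapping, and the helper names below are made up for illustration; in practice these values come from the model and its configuration.

```python
import math
from typing import Dict, List


def pick_answer(logits: List[float], id2label: Dict[int, str]) -> str:
    """Return the label whose logit is largest."""
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best_index]


def softmax(logits: List[float]) -> List[float]:
    """Convert logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits and labels, for illustration only
id2label = {0: "dog", 1: "cat", 2: "remote"}
logits = [0.3, 2.1, 1.4]

answer = pick_answer(logits, id2label)
probabilities = softmax(logits)
print(answer)
```

Because softmax preserves ordering, the label with the highest logit is also the one with the highest probability, so the argmax can be taken directly on the logits.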
|
||||
|
||||
## Parameters
|
||||
|
||||
Vilt does not require any specific parameters during initialization. It is pre-configured to work with the "dandelin/vilt-b32-finetuned-vqa" model.
|
||||
|
||||
## Additional Information
|
||||
|
||||
- Vilt is fine-tuned on the VQAv2 dataset, making it proficient at answering questions about a wide range of images.
|
||||
- You can use Vilt for various applications, including image question-answering, image analysis, and visual knowledge retrieval.
|
||||
|
||||
That concludes the documentation for Vilt. We hope you find this model useful for your vision-and-language tasks. If you have any questions or encounter any issues, please refer to the Hugging Face Transformers documentation for further assistance. Enjoy working with Vilt!
|
@ -0,0 +1,3 @@
|
||||
# awesome-multi-agent-papers
|
||||
|
||||
An awesome list of multi-agent papers that show you various swarm architectures and much more. [Get started](https://github.com/kyegomez/awesome-multi-agent-papers)
|
@ -0,0 +1,516 @@
|
||||
# `BaseSwarm` Documentation
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Introduction](#introduction)
|
||||
2. [Class Definition](#class-definition)
|
||||
3. [Methods](#methods)
|
||||
- [communicate()](#communicate)
|
||||
- [run()](#run)
|
||||
- [arun()](#arun)
|
||||
- [add_worker(worker)](#add_worker)
|
||||
- [remove_worker(worker)](#remove_worker)
|
||||
- [broadcast(message, sender)](#broadcast)
|
||||
- [reset()](#reset)
|
||||
- [plan(task)](#plan)
|
||||
- [direct_message(message, sender, recipient)](#direct_message)
|
||||
- [autoscaler(num_workers, worker)](#autoscaler)
|
||||
- [get_worker_by_id(id)](#get_worker_by_id)
|
||||
- [get_worker_by_name(name)](#get_worker_by_name)
|
||||
- [assign_task(worker, task)](#assign_task)
|
||||
- [get_all_tasks(worker, task)](#get_all_tasks)
|
||||
- [get_finished_tasks()](#get_finished_tasks)
|
||||
- [get_pending_tasks()](#get_pending_tasks)
|
||||
- [pause_worker(worker, worker_id)](#pause_worker)
|
||||
- [resume_worker(worker, worker_id)](#resume_worker)
|
||||
- [stop_worker(worker, worker_id)](#stop_worker)
|
||||
- [restart_worker(worker)](#restart_worker)
|
||||
- [scale_up(num_worker)](#scale_up)
|
||||
- [scale_down(num_worker)](#scale_down)
|
||||
- [scale_to(num_worker)](#scale_to)
|
||||
- [get_all_workers()](#get_all_workers)
|
||||
- [get_swarm_size()](#get_swarm_size)
|
||||
- [get_swarm_status()](#get_swarm_status)
|
||||
- [save_swarm_state()](#save_swarm_state)
|
||||
|
||||
---
|
||||
|
||||
## 1. Introduction <a name="introduction"></a>
|
||||
|
||||
The Swarms library is designed to provide a framework for swarm simulation architectures. Swarms are collections of autonomous agents or workers that collaborate to perform tasks and achieve common goals. This documentation will guide you through the functionality and usage of the Swarms library, explaining the purpose and implementation details of the provided classes and methods.
|
||||
|
||||
## 2. Class Definition <a name="class-definition"></a>
|
||||
|
||||
### `BaseSwarm` Class
|
||||
|
||||
The `BaseSwarm` class is an abstract base class that serves as the foundation for swarm simulation architectures. It defines the core functionality and methods required to manage and interact with a swarm of workers.
|
||||
|
||||
```python
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import List
|
||||
|
||||
from swarms.swarms.base import AbstractWorker
|
||||
|
||||
|
||||
class BaseSwarm(ABC):
    """
    Abstract class for swarm simulation architectures

    Methods:
    ---------
    ...
    """

    # The class definition and constructor are provided here.

    @abstractmethod
    def __init__(self, workers: List["AbstractWorker"]):
        """Initialize the swarm with workers"""

    # Other abstract methods are listed here.
|
||||
```
|
||||
|
||||
## 3. Methods <a name="methods"></a>
|
||||
|
||||
### `communicate()` <a name="communicate"></a>
|
||||
|
||||
The `communicate()` method allows the swarm to exchange information through the orchestrator, protocols, and the universal communication layer.
|
||||
|
||||
**Usage Example 1:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm.communicate()
|
||||
```
|
||||
|
||||
**Usage Example 2:**
|
||||
|
||||
```python
|
||||
# Another example of using the communicate method
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm.communicate()
|
||||
```
|
||||
|
||||
### `run()` <a name="run"></a>
|
||||
|
||||
The `run()` method executes the swarm, initiating its activities.
|
||||
|
||||
**Usage Example 1:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm.run()
|
||||
```
|
||||
|
||||
**Usage Example 2:**
|
||||
|
||||
```python
|
||||
# Another example of running the swarm
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm.run()
|
||||
```
|
||||
|
||||
### `arun()` <a name="arun"></a>
|
||||
|
||||
The `arun()` method runs the swarm asynchronously, allowing for parallel execution of tasks.
|
||||
|
||||
**Usage Example 1:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm.arun()
|
||||
```
|
||||
|
||||
**Usage Example 2:**
|
||||
|
||||
```python
|
||||
# Another example of running the swarm asynchronously
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm.arun()
|
||||
```
|
||||
|
||||
### `add_worker(worker: "AbstractWorker")` <a name="add_worker"></a>
|
||||
|
||||
The `add_worker()` method adds a worker to the swarm.
|
||||
|
||||
**Parameters:**
|
||||
- `worker` (AbstractWorker): The worker to be added to the swarm.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass([])
|
||||
worker = YourWorkerClass()
|
||||
swarm.add_worker(worker)
|
||||
```
|
||||
|
||||
### `remove_worker(worker: "AbstractWorker")` <a name="remove_worker"></a>
|
||||
|
||||
The `remove_worker()` method removes a worker from the swarm.
|
||||
|
||||
**Parameters:**
|
||||
- `worker` (AbstractWorker): The worker to be removed from the swarm.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
worker = swarm.get_worker_by_id("worker_id")
|
||||
swarm.remove_worker(worker)
|
||||
```
|
||||
|
||||
### `broadcast(message: str, sender: Optional["AbstractWorker"] = None)` <a name="broadcast"></a>
|
||||
|
||||
The `broadcast()` method sends a message to all workers in the swarm.
|
||||
|
||||
**Parameters:**
|
||||
- `message` (str): The message to be broadcasted.
|
||||
- `sender` (Optional[AbstractWorker]): The sender of the message (optional).
|
||||
|
||||
**Usage Example 1:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
message = "Hello, everyone!"
|
||||
swarm.broadcast(message)
|
||||
```
|
||||
|
||||
**Usage Example 2:**
|
||||
|
||||
```python
|
||||
# Another example of broadcasting a message
|
||||
swarm = YourSwarmClass(workers)
|
||||
message = "Important announcement!"
|
||||
sender = swarm.get_worker_by_name("Supervisor")
|
||||
swarm.broadcast(message, sender)
|
||||
```
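Since `BaseSwarm` is abstract, the behavior of `broadcast()` depends on the subclass. The toy sketch below shows one possible shape: `ToyWorker`, `ToySwarm`, the `receive` method, and the choice to skip the sender are all assumptions made for illustration, not part of the Swarms library.

```python
from typing import List, Optional


class ToyWorker:
    """Hypothetical worker that just records the messages it receives."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: List[str] = []

    def receive(self, message: str, sender: Optional["ToyWorker"] = None) -> None:
        origin = sender.name if sender is not None else "swarm"
        self.inbox.append(f"{origin}: {message}")


class ToySwarm:
    """Hypothetical in-memory swarm; not a `BaseSwarm` subclass."""

    def __init__(self, workers: List[ToyWorker]) -> None:
        self.workers = list(workers)

    def broadcast(self, message: str, sender: Optional[ToyWorker] = None) -> None:
        # Deliver to every worker except the sender, if one is given
        for worker in self.workers:
            if worker is not sender:
                worker.receive(message, sender)


supervisor = ToyWorker("Supervisor")
alice = ToyWorker("Alice")
toy_swarm = ToySwarm([supervisor, alice])

toy_swarm.broadcast("Important announcement!", supervisor)
toy_swarm.broadcast("Hello, everyone!")
print(alice.inbox)
```

Skipping the sender avoids a worker receiving its own message; a real implementation might make a different choice.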
|
||||
|
||||
### `reset()` <a name="reset"></a>
|
||||
|
||||
The `reset()` method resets the swarm to its initial state.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm.reset()
|
||||
```
|
||||
|
||||
### `plan(task: str)` <a name="plan"></a>
|
||||
|
||||
The `plan()` method instructs workers to individually plan using a workflow or pipeline for a specified task.
|
||||
|
||||
**Parameters:**
|
||||
- `task` (str): The task for which workers should plan.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
task = "Perform data analysis"
|
||||
swarm.plan(task)
|
||||
```
|
||||
|
||||
### `direct_message(message: str, sender: "AbstractWorker", recipient: "AbstractWorker")` <a name="direct_message"></a>
|
||||
|
||||
The `direct_message()` method sends a direct message from one worker to another.
|
||||
|
||||
**Parameters:**
|
||||
- `message` (str): The message to be sent.
|
||||
- `sender` (AbstractWorker): The sender of the message.
|
||||
- `recipient` (AbstractWorker): The recipient of the message.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
sender = swarm.get_worker_by_name("Worker1")
|
||||
recipient = swarm.get_worker_by_name("Worker2")
|
||||
message = "Hello, Worker2!"
|
||||
swarm.direct_message(message, sender, recipient)
|
||||
```
|
||||
|
||||
### `autoscaler(num_workers: int, worker: List["AbstractWorker"])` <a name="autoscaler"></a>
|
||||
|
||||
The `autoscaler()` method acts as an autoscaler, dynamically adjusting the number of workers based on system load or other criteria.
|
||||
|
||||
**Parameters:**
|
||||
- `num_workers` (int): The desired number of workers.
|
||||
- `worker` (List[AbstractWorker]): A list of workers to be managed by the autoscaler.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass([])
|
||||
workers = [YourWorkerClass() for _ in range(10)]
|
||||
swarm.autoscaler(5, workers)
|
||||
```
|
||||
|
||||
### `get_worker_by_id(id: str) -> "AbstractWorker"` <a name="get_worker_by_id"></a>
|
||||
|
||||
The `get_worker_by_id()` method locates a worker in the swarm by their ID.
|
||||
|
||||
**Parameters:**
|
||||
- `id` (str): The ID of the worker to locate.
|
||||
|
||||
**Returns:**
|
||||
- `AbstractWorker`: The worker with the specified ID.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
worker_id = "worker_123"
|
||||
worker = swarm.get_worker_by_id(worker_id)
|
||||
```
|
||||
|
||||
### `get_worker_by_name(name: str) -> "AbstractWorker"` <a name="get_worker_by_name"></a>
|
||||
|
||||
The `get_worker_by_name()` method locates a worker in the swarm by their name.
|
||||
|
||||
**Parameters:**
|
||||
- `name` (str): The name of the worker to locate.
|
||||
|
||||
**Returns:**
|
||||
- `AbstractWorker`: The worker with the specified name.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
worker_name = "Alice"
|
||||
worker = swarm.get_worker_by_name(worker_name)
|
||||
```
|
||||
|
||||
### `assign_task(worker: "AbstractWorker", task: Any) -> Dict` <a name="assign_task"></a>
|
||||
|
||||
The `assign_task()` method assigns a task to a specific worker.
|
||||
|
||||
**Parameters:**
|
||||
- `worker` (AbstractWorker): The worker to whom the task should be assigned.
|
||||
- `task` (Any): The task to be assigned.
|
||||
|
||||
**Returns:**
|
||||
- `Dict`: A dictionary indicating the status of the task assignment.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
worker = swarm.get_worker_by_name("Worker1")
|
||||
task = "Perform data analysis"
|
||||
result = swarm.assign_task(worker, task)
|
||||
```
|
||||
|
||||
### `get_all_tasks(worker: "AbstractWorker", task: Any)` <a name="get_all_tasks"></a>
|
||||
|
||||
The `get_all_tasks()` method retrieves all tasks assigned to a specific worker.
|
||||
|
||||
**Parameters:**
|
||||
- `worker` (AbstractWorker): The worker for whom tasks should be retrieved.
|
||||
- `task` (Any): The task to be retrieved.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
worker = swarm.get_worker_by_name("Worker1")
|
||||
tasks = swarm.get_all_tasks(worker, "data analysis")
|
||||
```
|
||||
|
||||
### `get_finished_tasks() -> List[Dict]` <a name="get_finished_tasks"></a>
|
||||
|
||||
The `get_finished_tasks()` method retrieves all tasks that have been completed by the workers in the swarm.
|
||||
|
||||
**Returns:**
|
||||
- `List[Dict]`: A list of dictionaries representing finished tasks.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
finished_tasks = swarm.get_finished_tasks()
|
||||
```
|
||||
|
||||
### `get_pending_tasks() -> List[Dict]` <a name="get_pending_tasks"></a>
|
||||
|
||||
The `get_pending_tasks()` method retrieves all tasks that are pending or yet to be completed by the workers in the swarm.
|
||||
|
||||
**Returns:**
|
||||
- `List[Dict]`: A list of dictionaries representing pending tasks.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
pending_tasks = swarm.get_pending_tasks()
|
||||
```
|
||||
|
||||
### `pause_worker(worker: "AbstractWorker", worker_id: str)` <a name="pause_worker"></a>
|
||||
|
||||
The `pause_worker()` method pauses a specific worker, temporarily suspending their activities.
|
||||
|
||||
**Parameters:**
|
||||
- `worker` (AbstractWorker): The worker to be paused.
|
||||
- `worker_id` (str): The ID of the worker to be paused.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
worker = swarm.get_worker_by_name("Worker1")
|
||||
worker_id = "worker_123"
|
||||
swarm.pause_worker(worker, worker_id)
|
||||
```
|
||||
|
||||
### `resume_worker(worker: "AbstractWorker", worker_id: str)` <a name="resume_worker"></a>
|
||||
|
||||
The `resume_worker()` method resumes a paused worker, allowing them to continue their activities.
|
||||
|
||||
**Parameters:**
|
||||
- `worker` (AbstractWorker): The worker to be resumed.
|
||||
- `worker_id` (str): The ID of the worker to be resumed.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
worker = swarm.get_worker_by_name("Worker1")
|
||||
worker_id = "worker_123"
|
||||
swarm.resume_worker(worker, worker_id)
|
||||
```
|
||||
|
||||
### `stop_worker(worker: "AbstractWorker", worker_id: str)` <a name="stop_worker"></a>
|
||||
|
||||
The `stop_worker()` method stops a specific worker, terminating their activities.
|
||||
|
||||
**Parameters:**
|
||||
- `worker` (AbstractWorker): The worker to be stopped.
|
||||
- `worker_id` (str): The ID of the worker to be stopped.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
worker = swarm.get_worker_by_name("Worker1")
|
||||
worker_id = "worker_123"
|
||||
swarm.stop_worker(worker, worker_id)
|
||||
```
|
||||
|
||||
### `restart_worker(worker: "AbstractWorker")` <a name="restart_worker"></a>
|
||||
|
||||
The `restart_worker()` method restarts a worker, resetting them to their initial state.
|
||||
|
||||
**Parameters:**
|
||||
- `worker` (AbstractWorker): The worker to be restarted.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
worker = swarm.get_worker_by_name("Worker1")
|
||||
swarm.restart_worker(worker)
|
||||
```
|
||||
|
||||
### `scale_up(num_worker: int)` <a name="scale_up"></a>
|
||||
|
||||
The `scale_up()` method increases the number of workers in the swarm.
|
||||
|
||||
**Parameters:**
|
||||
- `num_worker` (int): The number of workers to add to the swarm.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm.scale_up(5)
|
||||
```
|
||||
|
||||
### `scale_down(num_worker: int)` <a name="scale_down"></a>
|
||||
|
||||
The `scale_down()` method decreases the number of workers in the swarm.
|
||||
|
||||
**Parameters:**
|
||||
- `num_worker` (int): The number of workers to remove from the swarm.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm.scale_down(3)
|
||||
```
|
||||
|
||||
### `scale_to(num_worker: int)` <a name="scale_to"></a>
|
||||
|
||||
The `scale_to()` method scales the swarm to a specific number of workers.
|
||||
|
||||
**Parameters:**
|
||||
- `num_worker` (int): The desired number of workers.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm.scale_to(10)
|
||||
```
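The three scaling methods relate to each other in a simple way: `scale_to()` can be expressed in terms of `scale_up()` and `scale_down()`. The sketch below illustrates that relationship with a hypothetical `ScalableSwarm` class and `worker_factory` argument; it is not the library's implementation.

```python
from typing import Callable, List


class ScalableSwarm:
    """Hypothetical swarm that only tracks a list of workers."""

    def __init__(self, worker_factory: Callable[[], object]) -> None:
        self.worker_factory = worker_factory
        self.workers: List[object] = []

    def scale_up(self, num_worker: int) -> None:
        # Create and append the requested number of new workers
        self.workers.extend(self.worker_factory() for _ in range(num_worker))

    def scale_down(self, num_worker: int) -> None:
        # Remove workers from the end; never go below zero workers
        keep = max(len(self.workers) - num_worker, 0)
        del self.workers[keep:]

    def scale_to(self, num_worker: int) -> None:
        # Grow or shrink by the difference to the target size
        difference = num_worker - len(self.workers)
        if difference > 0:
            self.scale_up(difference)
        elif difference < 0:
            self.scale_down(-difference)


scalable = ScalableSwarm(worker_factory=object)
scalable.scale_up(5)
scalable.scale_down(3)
size_after_down = len(scalable.workers)
scalable.scale_to(10)
size_after_to = len(scalable.workers)
print(size_after_down, size_after_to)
```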
|
||||
|
||||
### `get_all_workers() -> List["AbstractWorker"]` <a name="get_all_workers"></a>
|
||||
|
||||
The `get_all_workers()` method retrieves a list of all workers in the swarm.
|
||||
|
||||
**Returns:**
|
||||
- `List[AbstractWorker]`: A list of all workers in the swarm.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
all_workers = swarm.get_all_workers()
|
||||
```
|
||||
|
||||
### `get_swarm_size() -> int` <a name="get_swarm_size"></a>
|
||||
|
||||
The `get_swarm_size()` method returns the size of the swarm, which is the total number of workers.
|
||||
|
||||
**Returns:**
|
||||
- `int`: The size of the swarm.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm_size = swarm.get_swarm_size()
|
||||
```
|
||||
|
||||
### `get_swarm_status() -> Dict` <a name="get_swarm_status"></a>
|
||||
|
||||
The `get_swarm_status()` method provides information about the current status of the swarm.
|
||||
|
||||
**Returns:**
|
||||
- `Dict`: A dictionary containing various status indicators for the swarm.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm_status = swarm.get_swarm_status()
|
||||
```
|
||||
|
||||
### `save_swarm_state()` <a name="save_swarm_state"></a>
|
||||
|
||||
The `save_swarm_state()` method allows you to save the current state of the swarm, including worker configurations and task assignments.
|
||||
|
||||
**Usage Example:**
|
||||
|
||||
```python
|
||||
swarm = YourSwarmClass(workers)
|
||||
swarm.save_swarm_state()
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
This comprehensive documentation covers the Swarms library, including the `BaseSwarm` class and its methods. You can use this documentation as a guide to understanding and effectively utilizing the Swarms framework for swarm simulation architectures. Feel free to explore further and adapt the library to your specific use cases.
|
@ -0,0 +1,163 @@
|
||||
# `Agent` Documentation
|
||||
|
||||
## Overview
|
||||
|
||||
The `Agent` class is a Python class designed to facilitate interactions with a language model, particularly one that operates as an autonomous agent. It is part of a larger framework for building conversational agents on top of language models such as GPT-3. It lets you establish a conversational loop with the model, generate responses, collect feedback, and control the flow of the conversation.
|
||||
|
||||
In this documentation, you will learn how to use the `Agent` class effectively, its purpose, and how it can be integrated into your projects.
|
||||
|
||||
## Purpose
|
||||
|
||||
The `Agent` class serves several key purposes:
|
||||
|
||||
1. **Conversational Loop**: It establishes a conversational loop with a language model. This means it allows you to interact with the model in a back-and-forth manner, taking turns in the conversation.
|
||||
|
||||
2. **Feedback Collection**: The class allows users to provide feedback on the responses generated by the model. This feedback can be valuable for training and improving the model's responses over time.
|
||||
|
||||
3. **Stoppable Conversation**: You can define custom stopping conditions for the conversation, allowing you to stop the interaction based on specific criteria. For example, you can stop the conversation if a certain keyword is detected in the responses.
|
||||
|
||||
4. **Retry Mechanism**: The class includes a retry mechanism that can be helpful if there are issues generating responses from the model. It attempts to generate a response multiple times before raising an error.
|
||||
|
||||
## Class Definition
|
||||
|
||||
The `Agent` class has the following constructor:
|
||||
|
||||
```python
|
||||
class Agent:
|
||||
    def __init__(
|
||||
        self,
|
||||
        llm: Any,
|
||||
        max_loops: int = 5,
|
||||
        stopping_condition: Optional[Callable[[str], bool]] = None,
|
||||
        loop_interval: int = 1,
|
||||
        retry_attempts: int = 3,
|
||||
        retry_interval: int = 1,
|
||||
        interactive: bool = False,
|
||||
        **kwargs: Any,
|
||||
    ): ...
|
||||
```
|
||||
|
||||
### Parameters
|
||||
|
||||
- `llm` (Any): The language model with which you want to interact.
|
||||
- `max_loops` (int): The maximum number of conversation loops. Default is 5.
|
||||
- `stopping_condition` (Optional[Callable[[str], bool]]): A custom stopping condition function. Default is `None`.
|
||||
- `loop_interval` (int): The time interval (in seconds) between conversation loops. Default is 1 second.
|
||||
- `retry_attempts` (int): The number of retry attempts if response generation fails. Default is 3.
|
||||
- `retry_interval` (int): The time interval (in seconds) between retry attempts. Default is 1 second.
|
||||
- `interactive` (bool): Set to `True` if the conversation is interactive, meaning the user is involved. Default is `False`.
|
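To make the interaction between `max_loops`, `stopping_condition`, and `loop_interval` concrete, here is a minimal sketch of a conversational loop. It is an illustration under assumptions, not the `Agent` implementation: `run_loop` and `counting_llm` are hypothetical names, and the model is assumed to be a plain callable that maps a prompt string to a response string.

```python
import time
from typing import Callable, Optional


def run_loop(
    llm: Callable[[str], str],
    task: str,
    max_loops: int = 5,
    stopping_condition: Optional[Callable[[str], bool]] = None,
    loop_interval: float = 0,
) -> str:
    """Hypothetical loop: feed each response back to the model until told to stop."""
    response = task
    for loop_index in range(max_loops):
        response = llm(response)
        # Stop early as soon as the caller-supplied predicate matches.
        if stopping_condition is not None and stopping_condition(response):
            break
        # Pause between turns, but not after the final one.
        if loop_interval and loop_index < max_loops - 1:
            time.sleep(loop_interval)
    return response


def counting_llm(prompt: str) -> str:
    # Stand-in model: appends a marker so that each turn is visible.
    return prompt + "!"


full_run = run_loop(counting_llm, "hi", max_loops=3)
early_stop = run_loop(
    counting_llm,
    "hi",
    max_loops=5,
    stopping_condition=lambda response: response.endswith("!!"),
)
```

Here `full_run` uses all three loops, while `early_stop` ends after the second turn because the stopping condition matches before `max_loops` is reached.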
||||
|
||||
## Usage
|
||||
|
||||
The `Agent` class can be used to create a conversational loop with the language model. Here's how you can use it:
|
||||
|
||||
```python
|
||||
from swarms.structs import Agent
|
||||
|
||||
agent = Agent(llm=my_language_model, max_loops=5)
|
||||
|
||||
# Define a starting task or message
|
||||
initial_task = "Generate a 10,000 word blog on health and wellness."
|
||||
|
||||
# Run the conversation loop
|
||||
final_response = agent.run(initial_task)
|
||||
```
|
||||
|
||||
### Feedback
|
||||
|
||||
You can collect feedback during the conversation using the `provide_feedback` method:
|
||||
|
||||
```python
|
||||
agent.provide_feedback(
|
||||
"Generate an SOP for new sales employees on the best cold sales practices"
|
||||
)
|
||||
```
|
||||
|
||||
### Stopping Condition
|
||||
|
||||
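You can define a custom stopping condition using a function that receives the response and returns `True` when the conversation should end. For example, you can stop the conversation if the response contains the word "stop" (matched case-insensitively):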
|
||||
|
||||
```python
|
||||
from swarms.structs import Agent
|
||||
|
||||
|
||||
def stop_when_repeats(response: str) -> bool:
|
||||
    return "stop" in response.lower()
|
||||
|
||||
|
||||
agent = Agent(llm=my_language_model, max_loops=5, stopping_condition=stop_when_repeats)
|
||||
```
|
||||
|
||||
### Retry Mechanism
|
||||
|
||||
If the response generation fails, the class will retry up to the specified number of attempts:
|
||||
|
||||
```python
|
||||
agent = Agent(llm=my_language_model, max_loops=5, retry_attempts=3)
|
||||
```
|
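The retry behaviour described above can be sketched as a small helper. This is an assumption-laden illustration rather than the library's code: `generate_with_retry` and `flaky_model` are hypothetical names, and the choice of `RuntimeError` for the final failure is an assumption, since the source only says that an error is raised.

```python
import time
from typing import Callable, Optional


def generate_with_retry(
    generate: Callable[[str], str],
    prompt: str,
    retry_attempts: int = 3,
    retry_interval: float = 1.0,
) -> str:
    """Call generate(prompt), retrying on failure up to retry_attempts times."""
    last_error: Optional[Exception] = None
    for attempt in range(retry_attempts):
        try:
            return generate(prompt)
        except Exception as error:
            last_error = error
            # Wait before the next attempt, but not after the last one.
            if attempt < retry_attempts - 1:
                time.sleep(retry_interval)
    raise RuntimeError(
        f"Generation failed after {retry_attempts} attempts"
    ) from last_error


calls = {"count": 0}


def flaky_model(prompt: str) -> str:
    # Stand-in model that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return f"echo: {prompt}"


result = generate_with_retry(flaky_model, "hello", retry_attempts=3, retry_interval=0)
```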
||||
|
||||
## Additional Information
|
||||
|
||||
- To save the conversation history to a file, you can use the `save` method.
|
||||
|
||||
- To load a previously saved conversation history, you can use the `load` method.
|
||||
|
||||
- The class includes methods for bulk running conversations with multiple input sets.
|
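The exact signatures of `save` and `load` are not shown here, so the following is only a sketch of the idea of persisting a conversation history. The helpers `save_history` and `load_history`, the JSON file format, and the list-of-strings history are all assumptions made for illustration.

```python
import json
import os
import tempfile
from typing import List


def save_history(history: List[str], path: str) -> None:
    # Write the conversation turns to disk as a JSON array.
    with open(path, "w", encoding="utf-8") as file:
        json.dump(history, file, indent=2)


def load_history(path: str) -> List[str]:
    # Read the conversation turns back from a JSON array.
    with open(path, "r", encoding="utf-8") as file:
        return json.load(file)


history = ["User: Summarize the report.", "Agent: Here is a summary."]
path = os.path.join(tempfile.mkdtemp(), "history.json")
save_history(history, path)
restored = load_history(path)
```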
||||
|
||||
## Examples
|
||||
|
||||
Here are three usage examples:
|
||||
|
||||
### Example 1: Simple Conversation
|
||||
|
||||
```python
|
||||
# Select any Language model from the models folder
|
||||
from swarms.models import Mistral, OpenAIChat
|
||||
from swarms.structs import Agent
|
||||
|
||||
llm = Mistral()
|
||||
# llm = OpenAIChat()
|
||||
|
||||
agent = Agent(llm=llm, max_loops=5)
|
||||
|
||||
# Define a starting task or message
|
||||
initial_task = "Generate a long-form analysis of the transformer model architecture."
|
||||
|
||||
# Run the conversation loop
|
||||
final_response = agent.run(initial_task)
|
||||
```
|
||||
|
||||
### Example 2: Custom Stopping Condition
|
||||
|
||||
```python
|
||||
from swarms.structs import Agent
|
||||
|
||||
|
||||
def stop_when_repeats(response: str) -> bool:
|
||||
    return "stop" in response.lower()
|
||||
|
||||
|
||||
agent = Agent(llm=llm, max_loops=5, stopping_condition=stop_when_repeats)
|
||||
```
|
||||
|
||||
### Example 3: Interactive Conversation
|
||||
|
||||
```python
|
||||
from swarms.structs import Agent
|
||||
|
||||
agent = Agent(llm=llm, max_loops=5, interactive=True)
|
||||
|
||||
# Provide initial task
|
||||
initial_task = "Rank and prioritize the following financial documents and cut out 30% of our expenses"
|
||||
|
||||
# Run the conversation loop
|
||||
final_response = agent.run(initial_task)
|
||||
```
|
||||
|
||||
## References and Resources
|
||||
|
||||
- [GitHub Repository](https://github.com/kyegomez/swarms)
|
||||
|
||||
## Conclusion
|
||||
|
||||
The `Agent` class provides a powerful way to interact with language models in a conversational manner. By defining custom stopping conditions, collecting feedback, and controlling the flow of the conversation, you can create engaging and interactive applications that make use of advanced language models.
|
@ -0,0 +1,274 @@
|
||||
# Documentation for `AgentRearrange` Class
|
||||
-----
|
||||
|
||||
The `AgentRearrange` class represents a swarm of agents for rearranging tasks. It allows you to create a swarm of agents, add or remove agents from the swarm, and run the swarm to process tasks based on a specified flow pattern.
|
||||
|
||||
## Attributes
|
||||
----------
|
||||
|
||||
| Attribute | Type | Description |
|
||||
| --- | --- | --- |
|
||||
| `agents` | `dict` | A dictionary of agents, where the key is the agent's name and the value is the agent object. |
|
||||
| `flow` | `str` | The flow pattern of the tasks. |
|
||||
| `max_loops` | `int` | The maximum number of loops for the agents to run. |
|
||||
| `verbose` | `bool` | Whether to enable verbose logging or not. |
|
||||
|
||||
## Methods
|
||||
-------
|
||||
|
||||
### `__init__(self, agents: List[Agent] = None, flow: str = None, max_loops: int = 1, verbose: bool = True)`
|
||||
|
||||
Initializes the `AgentRearrange` object.
|
||||
|
||||
| Parameter | Type | Description |
|
||||
| --- | --- | --- |
|
||||
| `agents` | `List[Agent]` (optional) | A list of `Agent` objects. Defaults to `None`. |
|
||||
| `flow` | `str` (optional) | The flow pattern of the tasks. Defaults to `None`. |
|
||||
| `max_loops` | `int` (optional) | The maximum number of loops for the agents to run. Defaults to `1`. |
|
||||
| `verbose` | `bool` (optional) | Whether to enable verbose logging or not. Defaults to `True`. |
|
||||
|
||||
### `add_agent(self, agent: Agent)`
|
||||
|
||||
Adds an agent to the swarm.
|
||||
|
||||
| Parameter | Type | Description |
|
||||
| --- | --- | --- |
|
||||
| `agent` | `Agent` | The agent to be added. |
|
||||
|
||||
### `remove_agent(self, agent_name: str)`
|
||||
|
||||
Removes an agent from the swarm.
|
||||
|
||||
| Parameter | Type | Description |
|
||||
| --- | --- | --- |
|
||||
| `agent_name` | `str` | The name of the agent to be removed. |
|
||||
|
||||
### `add_agents(self, agents: List[Agent])`
|
||||
|
||||
Adds multiple agents to the swarm.
|
||||
|
||||
| Parameter | Type | Description |
|
||||
| --- | --- | --- |
|
||||
| `agents` | `List[Agent]` | A list of `Agent` objects. |
|
||||
|
||||
### `validate_flow(self)`
|
||||
|
||||
Validates the flow pattern.
|
||||
|
||||
**Raises:**
|
||||
|
||||
- `ValueError`: If the flow pattern is incorrectly formatted or contains duplicate agent names.
|
||||
|
||||
**Returns:**
|
||||
|
||||
- `bool`: `True` if the flow pattern is valid.
|
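As an illustration of what such validation might involve, the sketch below parses a flow string and checks each name. It is not the `AgentRearrange` source: the standalone `parse_flow` and `validate_flow` functions, the `registered` argument, and the wording of the empty-name and duplicate-name messages are assumptions. Only the "not registered" message follows the error quoted later in this document.

```python
from typing import Iterable, List


def parse_flow(flow: str) -> List[List[str]]:
    """Split 'A -> B, C' into ordered steps: [['A'], ['B', 'C']]."""
    return [
        [name.strip() for name in step.split(",")]
        for step in flow.split("->")
    ]


def validate_flow(flow: str, registered: Iterable[str]) -> bool:
    """Return True if every name in the flow is known and used only once."""
    known = set(registered)
    seen = set()
    for step in parse_flow(flow):
        for name in step:
            if not name:
                raise ValueError("Flow contains an empty agent name.")
            if name not in known:
                raise ValueError(f"Agent '{name}' is not registered.")
            if name in seen:
                raise ValueError(f"Duplicate agent name in flow: '{name}'.")
            seen.add(name)
    return True


names = ["Director", "Worker1", "Worker2"]
is_valid = validate_flow("Director -> Worker1 -> Worker2", names)
```

Calling `validate_flow("Director->Worker1,Worker2->Worker3", names)` with the same names raises `ValueError` because `Worker3` is unknown.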
||||
|
||||
### `run(self, task: str, *args, **kwargs)`
|
||||
|
||||
Runs the swarm to rearrange the tasks.
|
||||
|
||||
| Parameter | Type | Description |
|
||||
| --- | --- | --- |
|
||||
| `task` | `str` | The initial task to be processed. |
|
||||
| `*args` | - | Additional positional arguments. |
|
||||
| `**kwargs` | - | Additional keyword arguments. |
|
||||
|
||||
**Returns:**
|
||||
|
||||
- `str`: The final processed task.
|
||||
|
||||
## Documentation for `rearrange` Function
|
||||
--------------------------------------
|
||||
|
||||
The `rearrange` function is a helper function that rearranges the given list of agents based on the specified flow.
|
||||
|
||||
## Parameters
|
||||
----------
|
||||
|
||||
| Parameter | Type | Description |
|
||||
| --- | --- | --- |
|
||||
| `agents` | `List[Agent]` | The list of agents to be rearranged. |
|
||||
| `flow` | `str` | The flow used for rearranging the agents. |
|
||||
| `task` | `str` (optional) | The task to be performed during rearrangement. Defaults to `None`. |
|
||||
| `*args` | - | Additional positional arguments. |
|
||||
| `**kwargs` | - | Additional keyword arguments. |
|
||||
|
||||
## Returns
|
||||
-------
|
||||
|
||||
The result of running the agent system with the specified task.
|
||||
|
||||
### Example
|
||||
-------
|
||||
|
||||
```python
|
||||
agents = [agent1, agent2, agent3]
|
||||
flow = "agent1 -> agent2, agent3"
|
||||
task = "Perform a task"
|
||||
rearrange(agents, flow, task)
|
||||
```
|
||||
|
||||
### Example Usage
|
||||
-------------
|
||||
|
||||
Here's an example of how to use the `AgentRearrange` class and the `rearrange` function:
|
||||
|
||||
```python
|
||||
from swarms import Agent, AgentRearrange, rearrange
|
||||
from typing import List
|
||||
|
||||
# Initialize the director agent
|
||||
director = Agent(
|
||||
agent_name="Director",
|
||||
system_prompt="Directs the tasks for the workers",
|
||||
llm=Anthropic(),
|
||||
max_loops=1,
|
||||
dashboard=False,
|
||||
streaming_on=True,
|
||||
verbose=True,
|
||||
stopping_token="<DONE>",
|
||||
state_save_file_type="json",
|
||||
saved_state_path="director.json",
|
||||
)
|
||||
|
||||
# Initialize worker 1
|
||||
worker1 = Agent(
|
||||
agent_name="Worker1",
|
||||
system_prompt="Generates a transcript for a youtube video on what swarms are",
|
||||
llm=Anthropic(),
|
||||
max_loops=1,
|
||||
dashboard=False,
|
||||
streaming_on=True,
|
||||
verbose=True,
|
||||
stopping_token="<DONE>",
|
||||
state_save_file_type="json",
|
||||
saved_state_path="worker1.json",
|
||||
)
|
||||
|
||||
# Initialize worker 2
|
||||
worker2 = Agent(
|
||||
agent_name="Worker2",
|
||||
system_prompt="Summarizes the transcript generated by Worker1",
|
||||
llm=Anthropic(),
|
||||
max_loops=1,
|
||||
dashboard=False,
|
||||
streaming_on=True,
|
||||
verbose=True,
|
||||
stopping_token="<DONE>",
|
||||
state_save_file_type="json",
|
||||
saved_state_path="worker2.json",
|
||||
)
|
||||
|
||||
# Create a list of agents
|
||||
agents = [director, worker1, worker2]
|
||||
|
||||
# Define the flow pattern
|
||||
flow = "Director -> Worker1 -> Worker2"
|
||||
|
||||
# Using AgentRearrange class
|
||||
agent_system = AgentRearrange(agents=agents, flow=flow)
|
||||
output = agent_system.run("Create a format to express and communicate swarms of llms in a structured manner for youtube")
|
||||
print(output)
|
||||
|
||||
# Using rearrange function
|
||||
output = rearrange(agents, flow, "Create a format to express and communicate swarms of llms in a structured manner for youtube")
|
||||
print(output)
|
||||
|
||||
```
|
||||
|
||||
In this example, we first initialize three agents: `director`, `worker1`, and `worker2`. Then, we create a list of these agents and define the flow pattern `"Director -> Worker1 -> Worker2"`.
|
||||
|
||||
We can use the `AgentRearrange` class by creating an instance of it with the list of agents and the flow pattern. We then call the `run` method with the initial task, and it will execute the agents in the specified order, passing the output of one agent as the input to the next agent.
|
||||
|
||||
Alternatively, we can use the `rearrange` function by passing the list of agents, the flow pattern, and the initial task as arguments.
|
||||
|
||||
Both the `AgentRearrange` class and the `rearrange` function will return the final output after processing the task through the agents according to the specified flow pattern.
|
||||
|
||||
## Error Handling
|
||||
--------------
|
||||
|
||||
The `AgentRearrange` class includes error handling mechanisms to validate the flow pattern. If the flow pattern is incorrectly formatted or contains duplicate agent names, a `ValueError` will be raised with an appropriate error message.
|
||||
|
||||
### Example:
|
||||
|
||||
```python
|
||||
# Invalid flow pattern
|
||||
invalid_flow = "Director->Worker1,Worker2->Worker3"
|
||||
agent_system = AgentRearrange(agents=agents, flow=invalid_flow)
|
||||
output = agent_system.run("Some task")
|
||||
```
|
||||
|
||||
This will raise a `ValueError` with the message `"Agent 'Worker3' is not registered."`.
|
||||
|
||||
|
||||
## Parallel and Sequential Processing
|
||||
----------------------------------
|
||||
|
||||
The `AgentRearrange` class supports both parallel and sequential processing of tasks based on the specified flow pattern. Steps are separated by `->` and run one after another. If a step lists multiple agents separated by commas (e.g., `"agent1, agent2"`), those agents are executed in parallel and their outputs are concatenated with a semicolon (`;`). If a step names a single agent, that agent runs on its own before the flow moves to the next step.
|
||||
|
||||
|
||||
### Parallel processing
|
||||
`parallel_flow = "Worker1, Worker2 -> Director"`
|
||||
|
||||
### Sequential processing
|
||||
`sequential_flow = "Worker1 -> Worker2 -> Director"`
|
||||
|
||||
In the `parallel_flow` example, `Worker1` and `Worker2` will be executed in parallel, and their outputs will be concatenated and passed to `Director`. In the `sequential_flow` example, `Worker1` will be executed first, and its output will be passed to `Worker2`, and then the output of `Worker2` will be passed to `Director`.
|
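The two flows above can be sketched with plain callables standing in for agents. This is a simplified illustration, not the `AgentRearrange.run` implementation: `run_flow` is a hypothetical function, the agents are assumed to be callables from string to string, the use of a thread pool for the parallel step is an assumption, and the exact separator (`"; "`) is a guess at how the semicolon concatenation looks.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_flow(agents: Dict[str, Callable[[str], str]], flow: str, task: str) -> str:
    """Run each '->' step in order; comma-separated names in a step run in parallel."""
    current = task
    for step in flow.split("->"):
        names = [name.strip() for name in step.split(",")]
        if len(names) > 1:
            # Every agent in a parallel step receives the same input.
            with ThreadPoolExecutor(max_workers=len(names)) as pool:
                # pool.map returns results in the order of `names`.
                outputs = list(pool.map(lambda name: agents[name](current), names))
            current = "; ".join(outputs)
        else:
            current = agents[names[0]](current)
    return current


# Stand-in agents that wrap their input so the call order is visible.
agents = {
    "Worker1": lambda text: f"W1({text})",
    "Worker2": lambda text: f"W2({text})",
    "Director": lambda text: f"D({text})",
}

parallel_result = run_flow(agents, "Worker1, Worker2 -> Director", "task")
sequential_result = run_flow(agents, "Worker1 -> Worker2 -> Director", "task")
```

With these stand-ins, the parallel flow hands `Director` both worker outputs joined together, while the sequential flow nests each output inside the next agent's call.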
||||
|
||||
## Logging
|
||||
-------
|
||||
|
||||
The `AgentRearrange` class includes logging capabilities using the `loguru` library. If `verbose` is set to `True` during initialization, a log file named `agent_rearrange.log` will be created, and log messages will be written to it. You can use this log file to track the execution of the agents and any potential issues or errors that may occur.
|
||||
|
||||
|
||||
```bash
|
||||
2023-05-08 10:30:15.456 | INFO | agent_rearrange:__init__:34 - Adding agent Director to the swarm.
|
||||
2023-05-08 10:30:15.457 | INFO | agent_rearrange:__init__:34 - Adding agent Worker1 to the swarm.
|
||||
2023-05-08 10:30:15.457 | INFO | agent_rearrange:__init__:34 - Adding agent Worker2 to the swarm.
|
||||
2023-05-08 10:30:15.458 | INFO | agent_rearrange:run:118 - Running agents in parallel: ['Worker1', 'Worker2']
|
||||
2023-05-08 10:30:15.459 | INFO | agent_rearrange:run:121 - Running agents sequentially: ['Director']
|
||||
```
|
||||
|
||||
## Additional Parameters
|
||||
---------------------
|
||||
|
||||
The `AgentRearrange` class also accepts additional parameters that can be passed to the `run` method using `*args` and `**kwargs`. These parameters will be forwarded to the individual agents during execution.
|
||||
|
||||
`agent_system = AgentRearrange(agents=agents, flow=flow)`
|
||||
`output = agent_system.run("Some task", max_tokens=200, temperature=0.7)`
|
||||
|
||||
In this example, the `max_tokens` and `temperature` parameters will be passed to each agent during execution.
|
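The forwarding of `*args` and `**kwargs` can be pictured with the sketch below. It assumes agents are plain callables and uses hypothetical names (`run_sequence`, `truncating_agent`). How a real agent interprets `max_tokens` or `temperature` depends on its underlying model, so the truncation here is only a stand-in.

```python
from typing import Any, Callable, List


def run_sequence(
    agents: List[Callable[..., str]], task: str, *args: Any, **kwargs: Any
) -> str:
    """Pass the task through each agent, forwarding the extra arguments unchanged."""
    current = task
    for agent in agents:
        current = agent(current, *args, **kwargs)
    return current


def truncating_agent(task: str, max_tokens: int = 100, temperature: float = 1.0) -> str:
    # Stand-in agent: keeps at most max_tokens whitespace-separated words.
    # temperature is accepted but unused in this toy example.
    return " ".join(task.split()[:max_tokens])


output = run_sequence(
    [truncating_agent, truncating_agent],
    "one two three four",
    max_tokens=2,
    temperature=0.7,
)
```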
||||
|
||||
## Customization
|
||||
-------------
|
||||
|
||||
The `AgentRearrange` class and the `rearrange` function can be customized and extended to suit specific use cases. For example, you can create custom agents by inheriting from the `Agent` class and implementing custom logic for task processing. You can then add these custom agents to the swarm and define the flow pattern accordingly.
|
||||
|
||||
Additionally, you can modify the `run` method of the `AgentRearrange` class to implement custom logic for task processing and agent interaction.
|
||||
|
||||
|
||||
## Limitations
|
||||
-----------
|
||||
|
||||
It's important to note that the `AgentRearrange` class and the `rearrange` function rely on the individual agents to process tasks correctly. The quality of the output will depend on the capabilities and configurations of the agents used in the swarm. Additionally, the `AgentRearrange` class does not provide any mechanisms for task prioritization or load balancing among the agents.
|
||||
|
||||
## Future Improvements
|
||||
-------------------
|
||||
|
||||
Here are some potential future improvements for the `AgentRearrange` class and the `rearrange` function:
|
||||
|
||||
- **Task Prioritization**: Implement a mechanism to prioritize tasks based on factors such as urgency, importance, or resource availability.
|
||||
- **Load Balancing**: Incorporate load balancing algorithms to distribute tasks among agents more efficiently, taking into account factors such as agent availability, performance, and resource utilization.
|
||||
- **Dynamic Flow Reconfiguration**: Allow for dynamic reconfiguration of the flow pattern during runtime, enabling the addition, removal, or reordering of agents based on specific conditions or events.
|
||||
- **Error Handling and Fault Tolerance**: Enhance error handling and fault tolerance mechanisms to gracefully handle agent failures, task timeouts, or other exceptional situations.
|
||||
- **Monitoring and Metrics**: Implement monitoring and metrics collection to track the performance and efficiency of the swarm, as well as individual agent performance.
|
||||
- **Scalability**: Enhance the scalability of the system to handle larger numbers of agents and tasks efficiently.
|
||||
|
||||
## Conclusion
|
||||
----------
|
||||
|
||||
The `AgentRearrange` class and the `rearrange` function provide a flexible and extensible framework for orchestrating swarms of agents to process tasks based on a specified flow pattern. By combining the capabilities of individual agents, you can create complex workflows and leverage the strengths of different agents to tackle various tasks efficiently.
|
||||
|
||||
While the current implementation offers basic functionality for agent rearrangement, there is room for future improvements and customizations to enhance the system's capabilities and cater to more specific use cases.
|
||||
|
||||
Whether you're working on natural language processing tasks, data analysis, or any other domain where agent-based systems can be beneficial, the `AgentRearrange` class and the `rearrange` function provide a solid foundation for building and experimenting with swarm-based solutions.
|