Merge branch 'master' into dockerize

Former-commit-id: b285542e107c7255896a60c19b1291c4b48c498b
pull/88/head
Zack 1 year ago
commit 1b65ca9627

@@ -0,0 +1,12 @@
# this is a config file for the github action labeler
# Add the 'example_change' label to any changes within the 'example' folder or any subfolders
example_change:
- example/**
# Add the 'example2_change' label to any file changes within the 'example2' folder
example2_change: example2/*
# Add the 'text_files' label to any change to .txt files within the entire repository. Quotation marks are required for the leading asterisk
text_files:
- '**/*.txt'

@@ -9,6 +9,7 @@ on:
jobs:
build:
name: 👋 Welcome
permissions: write-all
runs-on: ubuntu-latest
steps:
- uses: actions/first-interaction@v1.2.0

.gitignore

@@ -28,6 +28,7 @@ error.txt
# C extensions
*.so
.ruff_cache
errors.txt

@@ -0,0 +1,128 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
kye@apac.ai.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series
of actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or
permanent ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within
the community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see the FAQ at
https://www.contributor-covenant.org/faq. Translations are available at
https://www.contributor-covenant.org/translations.

@@ -100,6 +100,35 @@ You can learn more about mkdocs on the [mkdocs website](https://www.mkdocs.org/)
- Run all the tests in the tests folder
`find ./tests -name '*.py' -exec pytest {} \;`
## Code Quality
`quality.sh` runs four code-quality tools (autopep8, Black, Ruff, and YAPF) for a reliable code cleanup.
1. Open your terminal.
2. Change directory to where `quality.sh` is located using `cd` command:
```sh
cd /path/to/directory
```
3. Make sure the script has execute permissions:
```sh
chmod +x quality.sh
```
4. Run the script:
```sh
./quality.sh
```
If the script requires administrative privileges, you might need to run it with `sudo`:
```sh
sudo ./quality.sh
```
Please replace `/path/to/directory` with the actual path where the `quality.sh` script is located on your system.
To change what the script runs, such as pointing YAPF or the other tools at a different directory, edit `quality.sh` to include the desired commands; its contents dictate exactly what happens when you execute it.
## 📄 License
By contributing, you agree that your contributions will be licensed under an [MIT license](https://github.com/kyegomez/swarms/blob/develop/LICENSE.md).

@@ -1,10 +1,9 @@
FROM python:3.8-slim-buster
WORKDIR /home/zack/code/swarms/*
WORKDIR /usr/src/app
ADD . /home/zack/code/swarms/*
ADD . .
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 8000

@@ -39,7 +39,10 @@ Book a [1-on-1 Session with Kye](https://calendly.com/swarm-corp/30min), the Cre
We have a small gallery of examples to run here, [for more check out the docs to build your own agent and/or swarms!](https://docs.apac.ai)
### `Flow` Example
- The `Flow` is a superior iteration of the `LLMChain` from LangChain. Our intent with `Flow` is to create the most reliable loop structure that gives agents their "autonomy" through three main methods of interaction: user-specified loops, dynamic loops where the agent parses a <DONE> token, an interactive human-input version, or a mix of all three.
- Reliable structure that provides LLMs autonomy
- Extremely customizable with stopping conditions, interactivity, dynamic temperature, loop intervals, and so much more
- Enterprise Grade + Production Grade: `Flow` is designed and optimized for automating real-world tasks at scale!
```python
from swarms.models import OpenAIChat
@@ -47,71 +50,86 @@ from swarms.structs import Flow
api_key = ""
# Initialize the language model,
# This model can be swapped out with Anthropic, ETC, Huggingface Models like Mistral, ETC
# Initialize the language model, this model can be swapped out with Anthropic, ETC, Huggingface Models like Mistral, ETC
llm = OpenAIChat(
# model_name="gpt-4"
openai_api_key=api_key,
temperature=0.5,
# max_tokens=100,
)
# Initialize the flow
## Initialize the workflow
flow = Flow(
llm=llm,
max_loops=5,
max_loops=2,
dashboard=True,
# stopping_condition=None, # You can define a stopping condition as needed.
# loop_interval=1,
# retry_attempts=3,
# retry_interval=1,
# interactive=False, # Set to 'True' for interactive mode.
# dynamic_temperature=False, # Set to 'True' for dynamic temperature handling.
)
out = flow.run("Generate a 10,000 word blog, say Stop when done")
print(out)
# out = flow.load_state("flow_state.json")
# temp = flow.dynamic_temperature()
# filter = flow.add_response_filter("Trump")
out = flow.run("Generate a 10,000 word blog on health and wellness.")
# out = flow.validate_response(out)
# out = flow.analyze_feedback(out)
# out = flow.print_history_and_memory()
# # out = flow.save_state("flow_state.json")
# print(out)
```
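For example, the commented-out `stopping_condition` parameter above takes a callable. A minimal sketch, assuming `Flow` passes the agent's latest response string to that callable (the signature is an assumption, not confirmed by this diff):

```python
# Hypothetical stopping condition; assumes Flow calls it with the latest response text.
def stop_when_done(response: str) -> bool:
    return "Stop" in response

flow_with_stop = Flow(
    llm=llm,
    max_loops=5,  # upper bound; the condition below can end the loop earlier
    stopping_condition=stop_when_done,
)
```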
------
## `GodMode`
- A powerful tool for concurrent execution of tasks using multiple Language Model (LLM) instances.

```python
from swarms.swarms import GodMode
from swarms.models import OpenAIChat

api_key = ""

llm = OpenAIChat(openai_api_key=api_key)

llms = [llm, llm, llm]

god_mode = GodMode(llms)

task = "Generate a 10,000 word blog on health and wellness."

out = god_mode.run(task)
god_mode.print_responses(task)
```
------
### `SequentialWorkflow`
- A sequential swarm of autonomous agents where each agent's outputs are fed into the next agent
- Save and restore workflow states!
- Integrate Flows with various LLMs and multi-modality models

```python
from swarms.models import OpenAIChat
from swarms.structs import Flow
from swarms.structs.sequential_workflow import SequentialWorkflow

# Example usage
api_key = ""  # Your actual API key here

# Initialize the language model
llm = OpenAIChat(
    openai_api_key=api_key,
    temperature=0.5,
    max_tokens=3000,
)

# Initialize the Flow with the language model
flow1 = Flow(llm=llm, max_loops=1, dashboard=False)

# Create another Flow for a different task
flow2 = Flow(llm=llm, max_loops=1, dashboard=False)

# Create the workflow
workflow = SequentialWorkflow(max_loops=1)

# Add tasks to the workflow
workflow.add("Generate a 10,000 word blog on health and wellness.", flow1)

# Suppose the next task takes the output of the first task as input
workflow.add("Summarize the generated blog", flow2)

# Run the workflow
workflow.run()

# Output the results
for task in workflow.tasks:
    print(f"Task: {task.description}, Result: {task.result}")
```
------
### `OmniModalAgent`
- The OmniModal Agent is an LLM that has access to 10+ multi-modal encoders and diffusers! It can generate images, videos, speech, music, and so much more. Get started with:

```python
from swarms.models import OpenAIChat
from swarms.agents import OmniModalAgent

api_key = "SK-"

llm = OpenAIChat(model_name="gpt-4", openai_api_key=api_key)

agent = OmniModalAgent(llm)

agent.run("Create a video of a swarm of fish")
```
@@ -122,8 +140,10 @@ agent.run("Create a video of a swarm of fish")
## Contribute
- We're always looking for contributors to help us improve and expand this project. If you're interested, please check out our [Contributing Guidelines](CONTRIBUTING.md) and our [contributing board](https://github.com/users/kyegomez/projects/1)
We're always looking for contributors to help us improve and expand this project. If you're interested, please check out our [Contributing Guidelines](CONTRIBUTING.md) and our [contributing board](https://github.com/users/kyegomez/projects/1)
## Community
- [Join the Swarms community here on Discord!](https://discord.gg/AJazBmhKnr)
# License

@@ -0,0 +1,32 @@
# Security Policy

## Supported Versions

| Version | Supported |
| --- | --- |
| 2.0.5 | :white_check_mark: |
| 2.0.4 | :white_check_mark: |
| 2.0.3 | :white_check_mark: |
| 2.0.2 | :white_check_mark: |
| 2.0.1 | :white_check_mark: |
| 2.0.0 | :white_check_mark: |

## Reporting a Vulnerability
If you discover a security vulnerability in any of the above versions, please report it immediately to our security team by sending an email to kye@apac.ai. We take security vulnerabilities seriously and appreciate your efforts in disclosing them responsibly.
Please provide detailed information on the vulnerability, including steps to reproduce, potential impact, and any known mitigations. Our security team will acknowledge receipt of your report within 24 hours and will provide regular updates on the progress of the investigation.
Once the vulnerability has been thoroughly assessed, we will take the necessary steps to address it. This may include releasing a security patch, issuing a security advisory, or implementing other appropriate mitigations.
We aim to respond to all vulnerability reports in a timely manner and work towards resolving them as quickly as possible. We thank you for your contribution to the security of our software.
Please note that any vulnerability reports that are not related to the specified versions or do not provide sufficient information may be declined.

@@ -0,0 +1,19 @@
#!/bin/bash
# Navigate to the directory containing the 'swarms' folder
# cd /path/to/your/code/directory
# Run autopep8 with high aggressiveness (--aggressive twice) and in-place
# modification on all Python files (*.py) under the 'swarms' directory.
# Note: --list-fixes only prints the available fix codes and exits, so it is omitted here.
autopep8 --in-place --aggressive --aggressive --recursive --experimental swarms/
# Run black with experimental string processing enabled; black has no
# aggressiveness levels, so no further flags are needed.
# Black will format all Python files it finds in the 'swarms' directory.
black --experimental-string-processing swarms/
# Run ruff on the 'swarms' directory.
# Add any additional flags if needed according to your version of ruff.
ruff swarms/
# YAPF
yapf --recursive --in-place --verbose --style=google --parallel swarms

@@ -0,0 +1,35 @@
from swarms.structs import Flow
from swarms.models import OpenAIChat
from swarms.models import LayoutLMDocumentQA

# URL of the image of the financial document
IMAGE_OF_FINANCIAL_DOC_URL = "bank_statement_2.jpg"
# Example usage
api_key = ""
# Initialize the language flow
llm = OpenAIChat(
openai_api_key=api_key,
)
# LayoutLM Document QA
pdf_analyzer = LayoutLMDocumentQA()
question = "What is the total amount of expenses?"
answer = pdf_analyzer(
question,
IMAGE_OF_FINANCIAL_DOC_URL,
)
# Initialize the Flow with the language flow
agent = Flow(llm=llm)
SUMMARY_AGENT_PROMPT = f"""
Generate an actionable summary of this financial document. Be very specific and precise, provide bullet points, and provide methods of lowering expenses: {answer}
"""

# Run the summary agent on the analyzer's answer
summary_agent = agent.run(SUMMARY_AGENT_PROMPT)
print(summary_agent)


@@ -0,0 +1,101 @@
import re
from concurrent.futures import ThreadPoolExecutor, as_completed
from swarms.models import OpenAIChat
class AutoTempAgent:
"""
AutoTemp is a tool for automatically selecting the best temperature setting for a given task.
Flow:
1. Generate outputs at a range of temperature settings.
2. Evaluate each output using the default temperature setting.
3. Select the best output based on the evaluation score.
4. Return the best output.
Args:
temperature (float, optional): The default temperature setting to use. Defaults to 0.5.
api_key (str, optional): Your OpenAI API key. Defaults to None.
alt_temps ([type], optional): A list of alternative temperature settings to try. Defaults to None.
auto_select (bool, optional): If True, the best temperature setting will be automatically selected. Defaults to True.
max_workers (int, optional): The maximum number of workers to use when generating outputs. Defaults to 6.
Returns:
str: The best output and its score if auto_select is True, otherwise all outputs with their scores.
Examples:
>>> from swarms.demos.autotemp import AutoTempAgent
>>> autotemp = AutoTempAgent()
>>> autotemp.run("Generate a 10,000 word blog on mental clarity and the benefits of meditation.", "0.4,0.6,0.8,1.0,1.2,1.4")
Best AutoTemp Output (Temp 0.4 | Score: 100.0):
Generate a 10,000 word blog on mental clarity and the benefits of meditation.
"""
def __init__(
self,
temperature: float = 0.5,
api_key: str = None,
alt_temps=None,
auto_select=True,
max_workers=6,
):
# Fall back to a default temperature sweep when none is provided.
self.alt_temps = alt_temps if alt_temps else [0.4, 0.6, 0.8, 1.0, 1.2, 1.4]
self.auto_select = auto_select
self.max_workers = max_workers
self.temperature = temperature
self.llm = OpenAIChat(
openai_api_key=api_key,
temperature=temperature,
)
def evaluate_output(self, output: str):
"""Evaluate the output using the default temperature setting."""
eval_prompt = f"""
Evaluate the following output which was generated at a temperature setting of {self.temperature}.
Provide a precise score from 0.0 to 100.0, considering the criteria of relevance, clarity, utility, pride, and delight.
Output to evaluate:
---
{output}
---
"""
score_text = self.llm(prompt=eval_prompt)
score_match = re.search(r"\b\d+(\.\d)?\b", score_text)
return round(float(score_match.group()), 1) if score_match else 0.0
def run(self, task: str, temperature_string):
"""Run the AutoTemp agent."""
temperature_list = [
float(temp.strip()) for temp in temperature_string.split(",")
]
outputs = {}
scores = {}
with ThreadPoolExecutor(max_workers=self.max_workers) as executor:
future_to_temp = {
executor.submit(self.llm.generate, task, temp): temp
for temp in temperature_list
}
for future in as_completed(future_to_temp):
temp = future_to_temp[future]
output_text = future.result()
outputs[temp] = output_text
scores[temp] = self.evaluate_output(output_text)
if not scores:
return "No valid outputs generated.", None
sorted_scores = sorted(scores.items(), key=lambda item: item[1], reverse=True)
best_temp, best_score = sorted_scores[0]
best_output = outputs[best_temp]
return (
f"Best AutoTemp Output (Temp {best_temp} | Score: {best_score}):\n{best_output}"
if self.auto_select
else "\n".join(
f"Temp {temp} | Score: {score}:\n{outputs[temp]}"
for temp, score in sorted_scores
)
)
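A minimal usage sketch, mirroring the class docstring's example (requires a valid OpenAI API key):

```python
# Mirrors the docstring example; replace with a real OpenAI API key.
agent = AutoTempAgent(api_key="YOUR_API_KEY")

result = agent.run(
    "Generate a 10,000 word blog on mental clarity and the benefits of meditation.",
    "0.4,0.6,0.8,1.0,1.2,1.4",
)
print(result)
```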

@@ -0,0 +1,30 @@
from swarms.structs import Flow
from swarms.models import Idefics
# Multi Modality Auto Agent
llm = Idefics(max_length=2000)
task = "User: What is in this image? https://upload.wikimedia.org/wikipedia/commons/8/86/Id%C3%A9fix.JPG"
## Initialize the workflow
flow = Flow(
llm=llm,
max_loops=2,
dashboard=True,
# stopping_condition=None, # You can define a stopping condition as needed.
# loop_interval=1,
# retry_attempts=3,
# retry_interval=1,
# interactive=False, # Set to 'True' for interactive mode.
# dynamic_temperature=False, # Set to 'True' for dynamic temperature handling.
)
# out = flow.load_state("flow_state.json")
# temp = flow.dynamic_temperature()
# filter = flow.add_response_filter("Trump")
out = flow.run(task)
# out = flow.validate_response(out)
# out = flow.analyze_feedback(out)
# out = flow.print_history_and_memory()
# # out = flow.save_state("flow_state.json")
# print(out)

@@ -23,7 +23,7 @@ Distribution Agent:
"""
from swarms import OpenAIChat
from swarms.models import OpenAIChat
from termcolor import colored
TOPIC_GENERATOR = f"""

@@ -0,0 +1,5 @@
"""
Autonomous swarm that optimizes UI autonomously
GPT4Vision ->> GPT4 ->> UI
"""

@@ -0,0 +1,63 @@
# 20+ Autonomous Agent Blogs
1. **The Ultimate Guide to Deploying Production-Ready Autonomous Agents with Swarms**
- A comprehensive start-to-finish guide on implementing Swarms in a production environment.
2. **5 Steps to Elevate Your AI with Swarms Multi-Modal Autonomous Agents**
- A walkthrough highlighting the simplicity of Swarms setup and deployment for various AI applications.
3. **Integrating Swarms Into Your Enterprise Workflow: A Step-By-Step Tutorial**
- A practical guide focusing on integrating Swarms into existing enterprise systems.
4. **Swarms Flow: Streamlining AI Deployment in Your Business**
- Exploring the benefits and technicalities of using the Flow feature to simplify complex AI workflows.
5. **From Zero to Hero: Building Your First Enterprise-Grade AI Agent with Swarms**
- A beginner-friendly walkthrough for building and deploying an AI agent using Swarms.
6. **Scaling AI with Swarms: Managing Multi-Agent Systems Efficiently**
- Strategies and best practices for scaling multi-agent systems in enterprise settings.
7. **Creating Resilient AI Systems with Swarms' Autonomous Agents**
- Discussing the robustness of Swarms agents and how they maintain performance under stress.
8. **Unlocking New Capabilities: Advanced Features of Swarms for AI Engineers**
- Diving into the more sophisticated features of Swarms and how they can be leveraged in complex projects.
9. **Swarms Quick Wins: Implementing AI Agents in Less Than 5 Lines of Code**
- A focused guide on rapidly deploying functional AI agents with minimal coding.
10. **Benchmarking Your AI: Performance Metrics with Swarms**
- How to use Swarms to measure and optimize the performance of AI agents.
11. **Swarms Case Studies: Real-World Success Stories from AI Engineers**
- Sharing stories and testimonials of how various organizations successfully implemented Swarms.
12. **Effortless Multi-Modal Model Deployment: A Swarms Walkthrough**
- Explaining how to use Swarms to deploy multi-modal models with ease.
13. **Future-Proof Your AI: Adapting to New Tech with Swarms**
- How Swarms' flexible architecture allows for easy updates and adaptation to new AI technologies.
14. **Enterprise AI Security: Ensuring Your Swarms Agents are Hack-Proof**
- Best practices for securing autonomous agents in enterprise applications.
15. **Migrating to Swarms: Transitioning From Legacy Systems**
- A guide for AI engineers on migrating existing AI systems to Swarms without downtime.
16. **Multi-Agent Collaboration: How Swarms Facilitates Teamwork Among AI**
- An insight into how Swarms allows for multiple AI agents to work together seamlessly.
17. **The Engineer's Toolkit: Swarms' Features Every AI Developer Must Know**
- Highlighting the most useful tools and features of Swarms from an AI developer's perspective.
18. **Swarms for Different Industries: Customizing AI Agents for Niche Markets**
- Exploring how Swarms can be tailored to fit the needs of various industries such as healthcare, finance, and retail.
19. **Building Intelligent Workflows with Swarms Flow**
- A tutorial on using the Flow feature to create intelligent, responsive AI-driven workflows.
20. **Troubleshooting Common Issues When Deploying Swarms Autonomous Agents**
- A problem-solving guide for AI engineers on overcoming common challenges when implementing Swarms agents.
Each blog or walkthrough can be structured to not only showcase the functionality and benefits of the Swarms framework but also to establish the brand as a thought leader in the space of enterprise AI solutions.

@@ -0,0 +1,239 @@
# Enterprise-Grade Workflow Automation With Autonomous Agents
========================================================================
Welcome to this comprehensive walkthrough of the SequentialWorkflow feature of the Swarms Framework! In this tutorial, we will explore the purpose, usage, and key concepts of the SequentialWorkflow class, which is part of the swarms package. Whether you are a beginner, intermediate, or expert developer, this tutorial will give you a clear understanding of how to use the SequentialWorkflow class effectively in your projects.
AI engineering is a dynamic and evolving field that involves the development and deployment of intelligent systems and applications. In this ever-changing landscape, AI engineers often face the challenge of orchestrating complex sequences of tasks, managing data flows, and ensuring the smooth execution of AI workflows. This is where the Workflow Class, such as the SequentialWorkflow class we discussed earlier, plays a pivotal role in enabling AI engineers to achieve their goals efficiently and effectively.
## The Versatile World of AI Workflows
AI workflows encompass a wide range of tasks and processes, from data preprocessing and model training to natural language understanding and decision-making. These workflows are the backbone of AI systems, guiding them through intricate sequences of actions to deliver meaningful results. Here are some of the diverse use cases where the Workflow Class can empower AI engineers:
### 1. Natural Language Processing (NLP) Pipelines
AI engineers often build NLP pipelines that involve multiple stages such as text preprocessing, tokenization, feature extraction, model inference, and post-processing. The Workflow Class enables the orderly execution of these stages, ensuring that textual data flows seamlessly through each step, resulting in accurate and coherent NLP outcomes.
### 2. Data Ingestion and Transformation
AI projects frequently require the ingestion of diverse data sources, including structured databases, unstructured text, and multimedia content. The Workflow Class can be used to design data ingestion workflows that extract, transform, and load (ETL) data efficiently, making it ready for downstream AI tasks like training and analysis.
### 3. Autonomous Agents and Robotics
In autonomous robotics and intelligent agent systems, workflows are essential for decision-making, sensor fusion, motion planning, and control. AI engineers can use the Workflow Class to create structured sequences of actions that guide robots and agents through dynamic environments, enabling them to make informed decisions and accomplish tasks autonomously.
### 4. Machine Learning Model Training
Training machine learning models involves a series of steps, including data preprocessing, feature engineering, model selection, hyperparameter tuning, and evaluation. The Workflow Class simplifies the orchestration of these steps, allowing AI engineers to experiment with different configurations and track the progress of model training.
### 5. Content Generation and Summarization
AI-driven content generation tasks, such as generating articles, reports, or summaries, often require multiple steps, including content creation and post-processing. The Workflow Class can be used to create content generation workflows, ensuring that the generated content meets quality and coherence criteria.
### 6. Adaptive Decision-Making
In AI systems that make real-time decisions based on changing data and environments, workflows facilitate adaptive decision-making. Engineers can use the Workflow Class to design decision-making pipelines that take into account the latest information and make informed choices.
## Enabling Efficiency and Maintainability
The Workflow Class provides AI engineers with a structured and maintainable approach to building, executing, and managing complex AI workflows. It offers the following advantages:
- Modularity: Workflows can be modularly designed, allowing engineers to focus on individual task implementations and ensuring code reusability.
- Debugging and Testing: The Workflow Class simplifies debugging and testing by providing a clear sequence of tasks and well-defined inputs and outputs for each task.
- Scalability: As AI projects grow in complexity, the Workflow Class can help manage and scale workflows by adding or modifying tasks as needed.
- Error Handling: The class supports error handling strategies, enabling engineers to define how to handle unexpected failures gracefully.
- Maintainability: With structured workflows, AI engineers can easily maintain and update AI systems as requirements evolve or new data sources become available.
The Workflow Class, such as the SequentialWorkflow class, is an indispensable tool in the toolkit of AI engineers. It empowers engineers to design, execute, and manage AI workflows across a diverse range of use cases. By providing structure, modularity, and maintainability to AI projects, the Workflow Class contributes significantly to the efficiency and success of AI engineering endeavors. As the field of AI continues to advance, harnessing the power of workflow orchestration will remain a key ingredient in building intelligent and adaptable systems. Now let's get started with the SequentialWorkflow.
## Official Swarms Links
Here is the Swarms website:
Here is the Swarms Github:
Here are the Swarms docs:
And, join the Swarm community!
Book a call with The Swarm Corporation here if you're interested in high-performance custom swarms!
Now let's begin…
## Installation
Before we dive into the tutorial, make sure you have the following prerequisites in place:
- Python installed on your system.
- The swarms library installed. You can install it via pip using the following command:
`pip3 install --upgrade swarms`
Additionally, you will need an API key for the OpenAIChat model to run the provided code examples. Replace "YOUR_API_KEY" with your actual API key in the code examples where applicable.
## Getting Started
Let's start by importing the necessary modules and initializing the OpenAIChat model, which we will use in our workflow tasks.
```python
from swarms.models import OpenAIChat
from swarms.structs import Flow
from swarms.structs.sequential_workflow import SequentialWorkflow

# Replace "YOUR_API_KEY" with your actual OpenAI API key
api_key = "YOUR_API_KEY"

# Initialize the language model flow (e.g., GPT-3)
llm = OpenAIChat(
    openai_api_key=api_key,
    temperature=0.5,
    max_tokens=3000,
)
```
We have initialized the OpenAIChat model, which will be used as a callable object in our tasks. Now, let's proceed to create the SequentialWorkflow.
## Creating a SequentialWorkflow
To create a SequentialWorkflow, follow these steps:
```python
# Initialize Flows for individual tasks
flow1 = Flow(llm=llm, max_loops=1, dashboard=False)
flow2 = Flow(llm=llm, max_loops=1, dashboard=False)

# Create the Sequential Workflow
workflow = SequentialWorkflow(max_loops=1)
```
In this code snippet, we have initialized two Flow instances (flow1 and flow2) representing individual tasks within our workflow. These flows will use the OpenAIChat model we initialized earlier. We then create a SequentialWorkflow instance named workflow with a maximum loop count of 1. The max_loops parameter determines how many times the entire workflow can be run, and we set it to 1 for this example.
## Adding Tasks to the SequentialWorkflow
Now that we have created the SequentialWorkflow, let's add tasks to it. In our example, we'll create two tasks: one for generating a 10,000-word blog on "health and wellness" and another for summarizing the generated blog.
```python
# Add tasks to the workflow
workflow.add("Generate a 10,000 word blog on health and wellness.", flow1)
workflow.add("Summarize the generated blog", flow2)
```
The workflow.add() method is used to add tasks to the workflow. Each task is described using a human-readable description, such as "Generate a 10,000 word blog on health and wellness," and is associated with a flow (callable object) that will be executed as the task. In our example, flow1 and flow2 represent the tasks.
## Running the SequentialWorkflow
With tasks added to the SequentialWorkflow, we can now run the workflow sequentially using the workflow.run() method.
```python
# Run the workflow
workflow.run()
```
Executing workflow.run() will start the execution of tasks in the order they were added to the workflow. In our example, it will first generate the blog and then summarize it.
## Accessing Task Results
After running the workflow, you can access the results of each task using the get_task_results() method.
```python
# Get and display the results of each task in the workflow
results = workflow.get_task_results()
for task_description, result in results.items():
    print(f"Task: {task_description}, Result: {result}")
```
The workflow.get_task_results() method returns a dictionary where the keys are task descriptions, and the values are the corresponding results. You can then iterate through the results and print them, as shown in the code snippet.
## Resetting a SequentialWorkflow
Sometimes, you might need to reset a SequentialWorkflow to start fresh. You can use the workflow.reset_workflow() method for this purpose.
```python
# Reset the workflow
workflow.reset_workflow()
```
Resetting the workflow clears the results of each task, allowing you to rerun the workflow from the beginning without reinitializing it.
## Updating Task Arguments
You can also update the arguments of a specific task in the workflow using the workflow.update_task() method.
```python
# Update the arguments of a specific task in the workflow
workflow.update_task("Generate a 10,000 word blog on health and wellness.", max_loops=2)
```
In this example, we update the max_loops argument of the task with the description "Generate a 10,000 word blog on health and wellness" to 2. This can be useful if you want to change the behavior of a specific task without recreating the entire workflow.
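The SequentialWorkflow also advertises saving and restoring workflow states (see the "Save and Restore Workflow states!" bullet in the README). A hypothetical sketch, assuming the methods are named `save_workflow_state` and `load_workflow_state` and take a JSON file path; check the sequential_workflow module for the exact API:
```python
# Hypothetical state persistence; method names and signatures are assumptions.
workflow.save_workflow_state("workflow_state.json")

# Later, restore the saved state and continue where the workflow left off.
workflow.load_workflow_state("workflow_state.json")
```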
# Conclusion: Mastering Workflow Orchestration in AI Engineering
In the ever-evolving landscape of artificial intelligence (AI), where the pace of innovation and complexity of tasks are ever-increasing, harnessing the power of workflow orchestration is paramount. In this comprehensive walkthrough guide, we've embarked on a journey through the world of workflow orchestration, focusing on the Workflow Class, with a specific emphasis on the SequentialWorkflow class. As we conclude this exploration, we've delved deep into the intricacies of orchestrating AI workflows, and it's time to reflect on the valuable insights gained and the immense potential that this knowledge unlocks for AI engineers.
## The Art of Workflow Orchestration
At its core, workflow orchestration is the art of designing, managing, and executing sequences of tasks or processes in a structured and efficient manner. In the realm of AI engineering, where tasks can range from data preprocessing and model training to decision-making and autonomous actions, mastering workflow orchestration is a game-changer. It empowers AI engineers to streamline their work, ensure reliable execution, and deliver impactful results.
The Workflow Class, and particularly the SequentialWorkflow class we've explored, acts as a guiding light in this intricate journey. It provides AI engineers with a toolbox of techniques to conquer the challenges of orchestrating AI workflows effectively. Through a disciplined approach and adherence to best practices, AI engineers can achieve the following:
### 1. Structured Workflow Design
A well-structured workflow is the cornerstone of any successful AI project. The Workflow Class encourages AI engineers to break down complex tasks into manageable units. Each task becomes a building block that contributes to the overarching goal. Whether it's preprocessing data, training a machine learning model, or generating content, structured workflow design ensures clarity, modularity, and maintainability.
### 2. Efficient Task Sequencing
In AI, the order of tasks often matters. One task's output can be another task's input, and ensuring the correct sequence of execution is crucial. The SequentialWorkflow class enforces this sequential execution, eliminating the risk of running tasks out of order. It ensures that the workflow progresses systematically, following the predefined sequence of tasks.
### 3. Error Resilience and Recovery
AI systems must be resilient in the face of unexpected errors and failures. The Workflow Class equips AI engineers with error handling strategies, such as retries and fallbacks. These strategies provide the ability to gracefully handle issues, recover from failures, and continue the workflow's execution without disruption.
### 4. Code Modularity and Reusability
Building AI workflows often involves implementing various tasks, each with its own logic. The Workflow Class encourages code modularity, allowing AI engineers to encapsulate tasks as separate units. This modularity promotes code reusability, making it easier to adapt and expand workflows as AI projects evolve.
### 5. Efficient Debugging and Testing
Debugging and testing AI workflows can be challenging without clear structure and boundaries. The Workflow Class provides a clear sequence of tasks with well-defined inputs and outputs. This structure simplifies the debugging process, as AI engineers can isolate and test individual tasks, ensuring that each component functions as intended.
### 6. Scalability and Adaptability
As AI projects grow in complexity, the Workflow Class scales effortlessly. AI engineers can add or modify tasks as needed, accommodating new data sources, algorithms, or requirements. This scalability ensures that workflows remain adaptable to changing demands and evolving AI landscapes.
### 7. Maintainability and Future-Proofing
Maintaining AI systems over time is a crucial aspect of engineering. The Workflow Class fosters maintainability by providing a clear roadmap of tasks and their interactions. AI engineers can revisit, update, and extend workflows with confidence, ensuring that AI systems remain effective and relevant in the long run.
## Empowering AI Engineers
The knowledge and skills gained from this walkthrough guide go beyond technical proficiency. They empower AI engineers to be architects of intelligent systems, capable of orchestrating AI workflows that solve real-world problems. The Workflow Class is a versatile instrument in their hands, enabling them to tackle diverse use cases and engineering challenges.
## Diverse Use Cases for Workflow Class
Throughout this guide, we explored a myriad of use cases where the Workflow Class shines:
- Natural Language Processing (NLP) Pipelines: In NLP, workflows involve multiple stages, and the Workflow Class ensures orderly execution, resulting in coherent NLP outcomes.
- Data Ingestion and Transformation: Data is the lifeblood of AI, and structured data workflows ensure efficient data preparation for downstream tasks.
- Autonomous Agents and Robotics: For robots and intelligent agents, workflows enable autonomous decision-making and task execution.
- Machine Learning Model Training: Model training workflows encompass numerous steps, and structured orchestration simplifies the process.
- Content Generation and Summarization: Workflows for content generation ensure that generated content meets quality and coherence criteria.
- Adaptive Decision-Making: In dynamic environments, workflows facilitate adaptive decision-making based on real-time data.
## Efficiency and Maintainability
AI engineers not only have the tools to tackle these use cases but also the means to do so efficiently. The Workflow Class fosters efficiency and maintainability, making AI engineering endeavors more manageable:
- Modularity: Encapsulate tasks as separate units, promoting code reusability and maintainability.
- Debugging and Testing: Streamline debugging and testing through clear task boundaries and well-defined inputs and outputs.
- Scalability: As AI projects grow, workflows scale with ease, accommodating new components and requirements.
- Error Handling: Gracefully handle errors and failures, ensuring that AI systems continue to operate smoothly.
- Maintainability: AI systems remain adaptable and maintainable, even as the AI landscape evolves and requirements change.
## The Future of AI Engineering
As AI engineering continues to advance, workflow orchestration will play an increasingly pivotal role. The Workflow Class is not a static tool; it is a dynamic enabler of innovation. In the future, we can expect further enhancements and features to meet the evolving demands of AI engineering:
### 1. Asynchronous Support
Support for asynchronous task execution will improve the efficiency of workflows, especially when tasks involve waiting for external events or resources.
### 2. Context Managers
Introducing context manager support for tasks can simplify resource management, such as opening and closing files or database connections.
### 3. Workflow History
Maintaining a detailed history of workflow execution, including timestamps, task durations, and input/output data, will facilitate debugging and performance analysis.
### 4. Parallel Processing
Enhancing the module to support parallel processing with a pool of workers can significantly speed up the execution of tasks, especially for computationally intensive workflows.
### 5. Error Handling Strategies
Providing built-in error handling strategies, such as retries, fallbacks, and circuit breakers, will further enhance the resilience of workflows.
## Closing Thoughts
In conclusion, the journey through workflow orchestration in AI engineering has been both enlightening and empowering. The Workflow Class, and particularly the SequentialWorkflow class, has proven to be an invaluable ally in the AI engineer's toolkit. It offers structure, modularity, and efficiency, ensuring that AI projects progress smoothly from inception to deployment.
As AI continues to permeate every aspect of our lives, the skills acquired in this guide will remain highly relevant and sought after. AI engineers armed with workflow orchestration expertise will continue to push the boundaries of what is possible, solving complex problems, and driving innovation.
But beyond the technical aspects, this guide also emphasizes the importance of creativity, adaptability, and problem-solving. AI engineering is not just about mastering tools; it's about using them to make a meaningful impact on the world.
So, whether you're just starting your journey into AI engineering or you're a seasoned professional seeking to expand your horizons, remember that the power of workflow orchestration lies not only in the code but in the limitless potential it unlocks for you as an AI engineer. As you embark on your own AI adventures, may this guide serve as a reliable companion, illuminating your path and inspiring your journey towards AI excellence.
The world of AI is waiting for your innovation and creativity. With workflow orchestration as your guide, you have the tools to shape the future. The possibilities are boundless, and the future is yours to create.

@@ -1,93 +0,0 @@
Create multi-page long and explicit professional pytorch-like documentation for the swarms code below follow the outline for the swarms library, provide many examples and teach the user about the code, provide examples for every function, make the documentation 10,000 words, provide many usage examples and note this is markdown docs, create the documentation for the code to document.
Now make the professional documentation for this code, provide the architecture and how the class works and why it works that way, its purpose, provide args, their types, 3 ways of usage examples, in examples use from shapeless import x
BE VERY EXPLICIT AND THOROUGH, MAKE IT DEEP AND USEFUL
########
Step 1: Understand the purpose and functionality of the module or framework
Read and analyze the description provided in the documentation to understand the purpose and functionality of the module or framework.
Identify the key features, parameters, and operations performed by the module or framework.
Step 2: Provide an overview and introduction
Start the documentation by providing a brief overview and introduction to the module or framework.
Explain the importance and relevance of the module or framework in the context of the problem it solves.
Highlight any key concepts or terminology that will be used throughout the documentation.
Step 3: Provide a class or function definition
Provide the class or function definition for the module or framework.
Include the parameters that need to be passed to the class or function and provide a brief description of each parameter.
Specify the data types and default values for each parameter.
Step 4: Explain the functionality and usage
Provide a detailed explanation of how the module or framework works and what it does.
Describe the steps involved in using the module or framework, including any specific requirements or considerations.
Provide code examples to demonstrate the usage of the module or framework.
Explain the expected inputs and outputs for each operation or function.
Step 5: Provide additional information and tips
Provide any additional information or tips that may be useful for using the module or framework effectively.
Address any common issues or challenges that developers may encounter and provide recommendations or workarounds.
Step 6: Include references and resources
Include references to any external resources or research papers that provide further information or background on the module or framework.
Provide links to relevant documentation or websites for further exploration.
Example Template for the given documentation:
# Module/Function Name: MultiheadAttention
class torch.nn.MultiheadAttention(embed_dim, num_heads, dropout=0.0, bias=True, add_bias_kv=False, add_zero_attn=False, kdim=None, vdim=None, batch_first=False, device=None, dtype=None):
"""
Creates a multi-head attention module for joint information representation from the different subspaces.
Parameters:
- embed_dim (int): Total dimension of the model.
- num_heads (int): Number of parallel attention heads. The embed_dim will be split across num_heads.
- dropout (float): Dropout probability on attn_output_weights. Default: 0.0 (no dropout).
- bias (bool): If specified, adds bias to input/output projection layers. Default: True.
- add_bias_kv (bool): If specified, adds bias to the key and value sequences at dim=0. Default: False.
- add_zero_attn (bool): If specified, adds a new batch of zeros to the key and value sequences at dim=1. Default: False.
- kdim (int): Total number of features for keys. Default: None (uses kdim=embed_dim).
- vdim (int): Total number of features for values. Default: None (uses vdim=embed_dim).
- batch_first (bool): If True, the input and output tensors are provided as (batch, seq, feature). Default: False.
- device (torch.device): If specified, the tensors will be moved to the specified device.
- dtype (torch.dtype): If specified, the tensors will have the specified dtype.
"""
def forward(query, key, value, key_padding_mask=None, need_weights=True, attn_mask=None, average_attn_weights=True, is_causal=False):
"""
Forward pass of the multi-head attention module.
Parameters:
- query (Tensor): Query embeddings of shape (L, E_q) for unbatched input, (L, N, E_q) when batch_first=False, or (N, L, E_q) when batch_first=True.
- key (Tensor): Key embeddings of shape (S, E_k) for unbatched input, (S, N, E_k) when batch_first=False, or (N, S, E_k) when batch_first=True.
- value (Tensor): Value embeddings of shape (S, E_v) for unbatched input, (S, N, E_v) when batch_first=False, or (N, S, E_v) when batch_first=True.
- key_padding_mask (Optional[Tensor]): If specified, a mask indicating elements to be ignored in key for attention computation.
- need_weights (bool): If specified, returns attention weights in addition to attention outputs. Default: True.
- attn_mask (Optional[Tensor]): If specified, a mask preventing attention to certain positions.
- average_attn_weights (bool): If true, returns averaged attention weights per head. Otherwise, returns attention weights separately per head. Note that this flag only has an effect when need_weights=True. Default: True.
- is_causal (bool): If specified, applies a causal mask as the attention mask. Default: False.
Returns:
Tuple[Tensor, Optional[Tensor]]:
- attn_output (Tensor): Attention outputs of shape (L, E) for unbatched input, (L, N, E) when batch_first=False, or (N, L, E) when batch_first=True.
- attn_output_weights (Optional[Tensor]): Attention weights of shape (L, S) when unbatched or (N, L, S) when batched. Optional, only returned when need_weights=True.
"""
# Implementation of the forward pass of the attention module goes here
return attn_output, attn_output_weights
# Usage example:
multihead_attn = nn.MultiheadAttention(embed_dim, num_heads)
attn_output, attn_output_weights = multihead_attn(query, key, value)
Note:
The above template includes the class or function definition, parameters, description, and usage example.
To replicate the documentation for any other module or framework, follow the same structure and provide the specific details for that module or framework.
############# CODE TO DOCUMENT, DOCUMENT THE

@@ -53,7 +53,7 @@ The `BaseChunker` class is the core component of the `BaseChunker` module. It is
#### Parameters:
- `separators` (list[ChunkSeparator]): Specifies a list of `ChunkSeparator` objects used to split the text into chunks.
- `tokenizer` (OpenAiTokenizer): Defines the tokenizer to be used for counting tokens in the text.
- `tokenizer` (OpenAITokenizer): Defines the tokenizer to be used for counting tokens in the text.
- `max_tokens` (int): Sets the maximum token limit for each chunk.
### 4.2. Examples <a name="examples"></a>

@@ -52,7 +52,7 @@ The `PdfChunker` class is the core component of the `PdfChunker` module. It is u
#### Parameters:
- `separators` (list[ChunkSeparator]): Specifies a list of `ChunkSeparator` objects used to split the PDF text content into chunks.
- `tokenizer` (OpenAiTokenizer): Defines the tokenizer used for counting tokens in the text.
- `tokenizer` (OpenAITokenizer): Defines the tokenizer used for counting tokens in the text.
- `max_tokens` (int): Sets the maximum token limit for each chunk.
### 4.2. Examples <a name="examples"></a>

@@ -70,17 +70,18 @@ class Anthropic:
```python
# Import necessary modules and classes
from swarms.models import Anthropic
import torch
# Initialize an instance of the Anthropic class
anthropic_instance = Anthropic()
model = Anthropic(
anthropic_api_key=""
)
# Using the generate method
completion_1 = anthropic_instance.generate("What is the capital of France?")
# Using the run method
completion_1 = model.run("What is the capital of France?")
print(completion_1)
# Using the __call__ method
completion_2 = anthropic_instance("How far is the moon from the earth?", stop=["miles", "km"])
completion_2 = model("How far is the moon from the earth?", stop=["miles", "km"])
print(completion_2)
```

@@ -0,0 +1,261 @@
# `Dalle3` Documentation
## Table of Contents
1. [Introduction](#introduction)
2. [Installation](#installation)
3. [Quick Start](#quick-start)
4. [Dalle3 Class](#dalle3-class)
- [Attributes](#attributes)
- [Methods](#methods)
5. [Usage Examples](#usage-examples)
6. [Error Handling](#error-handling)
7. [Advanced Usage](#advanced-usage)
8. [References](#references)
---
## Introduction<a name="introduction"></a>
The Dalle3 library is a Python module that provides an easy-to-use interface for generating images from text descriptions using the DALL·E 3 model by OpenAI. DALL·E 3 is a powerful image generation model capable of converting textual prompts into images. This documentation will guide you through the installation, setup, and usage of the Dalle3 library.
---
## Installation<a name="installation"></a>
To use the Dalle3 model, you must first install swarms:
```bash
pip install swarms
```
---
## Quick Start<a name="quick-start"></a>
Let's get started with a quick example of using the Dalle3 library to generate an image from a text prompt:
```python
from swarms.models.dalle3 import Dalle3
# Create an instance of the Dalle3 class
dalle3 = Dalle3()
# Define a text prompt
task = "A painting of a dog"
# Generate an image from the text prompt
image_url = dalle3(task)
# Print the generated image URL
print(image_url)
```
This example demonstrates the basic usage of the Dalle3 library to convert a text prompt into an image. The generated image URL will be printed to the console.
---
## Dalle3 Class<a name="dalle3-class"></a>
The Dalle3 library provides a `Dalle3` class that allows you to interact with the DALL·E 3 model. This class has several attributes and methods for generating images from text prompts.
### Attributes<a name="attributes"></a>
- `model` (str): The name of the DALL·E 3 model. Default: "dall-e-3".
- `img` (str): The image URL generated by the Dalle3 API.
- `size` (str): The size of the generated image. Default: "1024x1024".
- `max_retries` (int): The maximum number of API request retries. Default: 3.
- `quality` (str): The quality of the generated image. Default: "standard".
- `n` (int): The number of variations to create. Default: 4.
### Methods<a name="methods"></a>
#### `__call__(self, task: str) -> Dalle3`
This method makes a call to the Dalle3 API and returns the image URL generated from the provided text prompt.
Parameters:
- `task` (str): The text prompt to be converted to an image.
Returns:
- `Dalle3`: An instance of the Dalle3 class with the image URL generated by the Dalle3 API.
#### `create_variations(self, img: str)`
This method creates variations of an image using the Dalle3 API.
Parameters:
- `img` (str): The image to be used for the API request.
Returns:
- `img` (str): The image URL of the generated variations.
---
## Usage Examples<a name="usage-examples"></a>
### Example 1: Basic Image Generation
```python
from swarms.models.dalle3 import Dalle3
# Create an instance of the Dalle3 class
dalle3 = Dalle3()
# Define a text prompt
task = "A painting of a dog"
# Generate an image from the text prompt
image_url = dalle3(task)
# Print the generated image URL
print(image_url)
```
### Example 2: Creating Image Variations
```python
from swarms.models.dalle3 import Dalle3
# Create an instance of the Dalle3 class
dalle3 = Dalle3()
# Define the URL of an existing image
img_url = "https://images.unsplash.com/photo-1694734479898-6ac4633158ac?q=80&w=1287&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D
# Create variations of the image
variations_url = dalle3.create_variations(img_url)
# Print the URLs of the generated variations
print(variations_url)
```
Here are additional examples that cover various edge cases and methods of the `Dalle3` class:
### Example 3: Customizing Image Size
You can customize the size of the generated image by specifying the `size` parameter when creating an instance of the `Dalle3` class. Here's how to generate a smaller image:
```python
from swarms.models.dalle3 import Dalle3
# Create an instance of the Dalle3 class with a custom image size
dalle3 = Dalle3(size="512x512")
# Define a text prompt
task = "A small painting of a cat"
# Generate a smaller image from the text prompt
image_url = dalle3(task)
# Print the generated image URL
print(image_url)
```
### Example 4: Adjusting Retry Limit
You can adjust the maximum number of API request retries using the `max_retries` parameter. Here's how to increase the retry limit:
```python
from swarms.models.dalle3 import Dalle3
# Create an instance of the Dalle3 class with a higher retry limit
dalle3 = Dalle3(max_retries=5)
# Define a text prompt
task = "An image of a landscape"
# Generate an image with a higher retry limit
image_url = dalle3(task)
# Print the generated image URL
print(image_url)
```
### Example 5: Generating Image Variations
To create variations of an existing image, you can use the `create_variations` method. Here's an example:
```python
from swarms.models.dalle3 import Dalle3
# Create an instance of the Dalle3 class
dalle3 = Dalle3()
# Define the URL of an existing image
img_url = "https://images.unsplash.com/photo-1677290043066-12eccd944004?q=80&w=1287&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D"
# Create variations of the image
variations_url = dalle3.create_variations(img_url)
# Print the URLs of the generated variations
print(variations_url)
```
### Example 6: Handling API Errors
The Dalle3 library provides error handling for API-related issues. Here's how to handle and display API errors:
```python
from swarms.models.dalle3 import Dalle3
# Create an instance of the Dalle3 class
dalle3 = Dalle3()
# Define a text prompt
task = "Invalid prompt that may cause an API error"
try:
# Attempt to generate an image with an invalid prompt
image_url = dalle3(task)
print(image_url)
except Exception as e:
print(f"Error occurred: {str(e)}")
```
### Example 7: Customizing Image Quality
You can customize the quality of the generated image by specifying the `quality` parameter. Here's how to generate a high-quality image:
```python
from swarms.models.dalle3 import Dalle3
# Create an instance of the Dalle3 class with high quality
dalle3 = Dalle3(quality="high")
# Define a text prompt
task = "A high-quality image of a sunset"
# Generate a high-quality image from the text prompt
image_url = dalle3(task)
# Print the generated image URL
print(image_url)
```
---
## Error Handling<a name="error-handling"></a>
The Dalle3 library provides error handling for API-related issues. If an error occurs during API communication, the library will handle it and provide detailed error messages. Make sure to handle exceptions appropriately in your code.
---
## Advanced Usage<a name="advanced-usage"></a>
For advanced usage and customization of the Dalle3 library, you can explore the attributes and methods of the `Dalle3` class. Adjusting parameters such as `size`, `max_retries`, and `quality` allows you to fine-tune the image generation process to your specific needs.
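As a sketch, these parameters can be combined in a single initialization; the values below are illustrative, not recommended defaults:
```python
from swarms.models.dalle3 import Dalle3

# Combine size, retry, and quality settings in one instance
dalle3 = Dalle3(
    size="1024x1024",  # output resolution
    max_retries=5,     # retry transient API failures
    quality="high",    # favor image quality over speed
)

image_url = dalle3("A watercolor painting of a lighthouse at dawn")
print(image_url)
```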
---
## References<a name="references"></a>
For more information about the DALL·E 3 model and the Dalle3 library, you can refer to the official OpenAI documentation and resources.
- [OpenAI API Documentation](https://beta.openai.com/docs/)
- [DALL·E 3 Model Information](https://openai.com/research/dall-e-3)
- [Dalle3 GitHub Repository](https://github.com/openai/dall-e-3)
---
This concludes the documentation for the Dalle3 library. You can now use the library to generate images from text prompts and explore its advanced features for various applications.

@ -0,0 +1,123 @@
# DistilWhisperModel Documentation
## Overview
The `DistilWhisperModel` is a Python class designed to handle English speech recognition tasks. It leverages the capabilities of the Whisper model, which is fine-tuned for speech-to-text processes. It is designed for both synchronous and asynchronous transcription of audio inputs, offering flexibility for real-time applications or batch processing.
## Installation
Before you can use `DistilWhisperModel`, ensure you have the required libraries installed:
```sh
pip3 install --upgrade swarms
```
## Initialization
The `DistilWhisperModel` class is initialized with the following parameters:
| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `model_id` | `str` | The identifier for the pre-trained Whisper model | `"distil-whisper/distil-large-v2"` |
Example of initialization:
```python
from swarms.models import DistilWhisperModel
# Initialize with default model
model_wrapper = DistilWhisperModel()
# Initialize with a specific model ID
model_wrapper = DistilWhisperModel(model_id='distil-whisper/distil-large-v2')
```
## Attributes
After initialization, the `DistilWhisperModel` has several attributes:
| Attribute | Type | Description |
|-----------|------|-------------|
| `device` | `str` | The device used for computation (`"cuda:0"` for GPU or `"cpu"`). |
| `torch_dtype` | `torch.dtype` | The data type used for the Torch tensors. |
| `model_id` | `str` | The model identifier string. |
| `model` | `torch.nn.Module` | The actual Whisper model loaded from the identifier. |
| `processor` | `transformers.AutoProcessor` | The processor for handling input data. |
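A quick way to inspect these attributes after initialization (the printed values depend on your hardware and the model ID you chose):
```python
from swarms.models import DistilWhisperModel

model_wrapper = DistilWhisperModel()
print(model_wrapper.device)    # "cuda:0" if a GPU is available, otherwise "cpu"
print(model_wrapper.model_id)  # "distil-whisper/distil-large-v2"
```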
## Methods
### `transcribe`
Transcribes audio input synchronously.
**Arguments**:
| Argument | Type | Description |
|----------|------|-------------|
| `inputs` | `Union[str, dict]` | File path or audio data dictionary. |
**Returns**: `str` - The transcribed text.
**Usage Example**:
```python
# Synchronous transcription
transcription = model_wrapper.transcribe('path/to/audio.mp3')
print(transcription)
```
### `async_transcribe`
Transcribes audio input asynchronously.
**Arguments**:
| Argument | Type | Description |
|----------|------|-------------|
| `inputs` | `Union[str, dict]` | File path or audio data dictionary. |
**Returns**: `Coroutine` - A coroutine that when awaited, returns the transcribed text.
**Usage Example**:
```python
import asyncio
# Asynchronous transcription
transcription = asyncio.run(model_wrapper.async_transcribe('path/to/audio.mp3'))
print(transcription)
```
### `real_time_transcribe`
Simulates real-time transcription of an audio file.
**Arguments**:
| Argument | Type | Description |
|----------|------|-------------|
| `audio_file_path` | `str` | Path to the audio file. |
| `chunk_duration` | `int` | Duration of audio chunks in seconds. |
**Usage Example**:
```python
# Real-time transcription simulation
model_wrapper.real_time_transcribe('path/to/audio.mp3', chunk_duration=5)
```
## Error Handling
The `DistilWhisperModel` class incorporates error handling for file not found errors and generic exceptions during the transcription process. If a non-recoverable exception is raised, it is printed to the console in red to indicate failure.
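A minimal defensive sketch around a transcription call; the exact exception types surfaced to the caller are an assumption here, so the broad `Exception` catch is deliberate:
```python
from swarms.models import DistilWhisperModel

model_wrapper = DistilWhisperModel()

try:
    transcription = model_wrapper.transcribe("path/to/audio.mp3")
    print(transcription)
except FileNotFoundError:
    print("Audio file not found - check the path and read permissions.")
except Exception as e:
    print(f"Transcription failed: {e}")
```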
## Conclusion
The `DistilWhisperModel` offers a convenient interface to the powerful Whisper model for speech recognition. Its design supports both batch and real-time transcription, catering to different application needs. The class's error handling and retry logic make it robust for real-world applications.
## Additional Notes
- Ensure you have appropriate permissions to read audio files when using file paths.
- Transcription quality depends on the audio quality and the Whisper model's performance on your dataset.
- Adjust `chunk_duration` according to the processing power of your system for real-time transcription.
For a full list of models supported by `transformers.AutoModelForSpeechSeq2Seq`, visit the [Hugging Face Model Hub](https://huggingface.co/models).

@ -42,13 +42,6 @@ from swarms.models import Fuyu
fuyu = Fuyu()
```
### Example 1 - Initialization
```python
from swarms.models import Fuyu
fuyu = Fuyu()
```
2. Generate Text with Fuyu:

@ -0,0 +1,251 @@
# `GPT4Vision` Documentation
## Table of Contents
- [Overview](#overview)
- [Installation](#installation)
- [Initialization](#initialization)
- [Methods](#methods)
- [process_img](#process_img)
- [__call__](#__call__)
- [run](#run)
- [arun](#arun)
- [Configuration Options](#configuration-options)
- [Usage Examples](#usage-examples)
- [Additional Tips](#additional-tips)
- [References and Resources](#references-and-resources)
---
## Overview
The GPT4Vision Model API is designed to provide an easy-to-use interface for interacting with the OpenAI GPT-4 Vision model. This model can generate textual descriptions for images and answer questions related to visual content. Whether you want to describe images or perform other vision-related tasks, GPT4Vision makes it simple and efficient.
The library offers a straightforward way to send images and tasks to the GPT-4 Vision model and retrieve the generated responses. It handles API communication, authentication, and retries, making it a powerful tool for developers working with computer vision and natural language processing tasks.
## Installation
To use the GPT4Vision Model API, you need to install the required dependencies and configure your environment. Follow these steps to get started:
1. Install the required Python package:
```bash
pip3 install --upgrade swarms
```
2. Make sure you have an OpenAI API key. You can obtain one by signing up on the [OpenAI platform](https://beta.openai.com/signup/).
3. Set your OpenAI API key as an environment variable. You can do this in your code or your environment configuration. Alternatively, you can provide the API key directly when initializing the `GPT4Vision` class.
## Initialization
To start using the GPT4Vision Model API, you need to create an instance of the `GPT4Vision` class. You can customize its behavior by providing various configuration options, but it also comes with sensible defaults.
Here's how you can initialize the `GPT4Vision` class:
```python
from swarms.models.gpt4v import GPT4Vision
gpt4vision = GPT4Vision(
api_key="Your Key"
)
```
The above code initializes the `GPT4Vision` class with default settings. You can adjust these settings as needed.
## Methods
### `process_img`
The `process_img` method is used to preprocess an image before sending it to the GPT-4 Vision model. It takes the image path as input and returns the processed image in a format suitable for API requests.
```python
processed_img = gpt4vision.process_img(img_path)
```
- `img_path` (str): The file path or URL of the image to be processed.
### `__call__`
The `__call__` method is the main method for interacting with the GPT-4 Vision model. It sends the image and tasks to the model and returns the generated response.
```python
response = gpt4vision(img, tasks)
```
- `img` (Union[str, List[str]]): Either a single image URL or a list of image URLs to be used for the API request.
- `tasks` (List[str]): A list of tasks or questions related to the image(s).
This method returns a `GPT4VisionResponse` object, which contains the generated answer.
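Because `img` also accepts a list of URLs, multiple images can be sent in one call. A short sketch, with placeholder URLs and tasks:
```python
from swarms.models.gpt4v import GPT4Vision

gpt4vision = GPT4Vision()

imgs = [
    "https://example.com/photo_a.jpg",
    "https://example.com/photo_b.jpg",
]
tasks = ["Compare these two images.", "What do they have in common?"]

response = gpt4vision(imgs, tasks)
print(response.answer)
```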
### `run`
The `run` method is an alternative way to interact with the GPT-4 Vision model. It takes a single task and image URL as input and returns the generated response.
```python
response = gpt4vision.run(task, img)
```
- `task` (str): The task or question related to the image.
- `img` (str): The image URL to be used for the API request.
This method simplifies interactions when dealing with a single task and image.
### `arun`
The `arun` method is an asynchronous version of the `run` method. It allows for asynchronous processing of API requests, which can be useful in certain scenarios.
```python
import asyncio
async def main():
response = await gpt4vision.arun(task, img)
print(response)
asyncio.run(main())
```
- `task` (str): The task or question related to the image.
- `img` (str): The image URL to be used for the API request.
## Configuration Options
The `GPT4Vision` class provides several configuration options that allow you to customize its behavior:
- `max_retries` (int): The maximum number of retries to make to the API. Default: 3
- `backoff_factor` (float): The backoff factor to use for exponential backoff. Default: 2.0
- `timeout_seconds` (int): The timeout in seconds for the API request. Default: 10
- `api_key` (str): The API key to use for the API request. Default: None (set via environment variable)
- `quality` (str): The quality of the image to generate. Options: 'low' or 'high'. Default: 'low'
- `max_tokens` (int): The maximum number of tokens to use for the API request. Default: 200
## Usage Examples
### Example 1: Generating Image Descriptions
```python
gpt4vision = GPT4Vision()
img = "https://example.com/image.jpg"
tasks = ["Describe this image."]
response = gpt4vision(img, tasks)
print(response.answer)
```
In this example, we create an instance of `GPT4Vision`, provide an image URL, and ask the model to describe the image. The response contains the generated description.
### Example 2: Custom Configuration
```python
custom_config = {
"max_retries": 5,
"timeout_seconds": 20,
"quality": "high",
"max_tokens": 300,
}
gpt4vision = GPT4Vision(**custom_config)
img = "https://example.com/another_image.jpg"
tasks = ["What objects can you identify in this image?"]
response = gpt4vision(img, tasks)
print(response.answer)
```
In this example, we create an instance of `GPT4Vision` with custom configuration options. We set a higher timeout, request high-quality images, and allow more tokens in the response.
### Example 3: Using the `run` Method
```python
gpt4vision = GPT4Vision()
img = "https://example.com/image.jpg"
task = "Describe this image in detail."
response = gpt4vision.run(task, img)
print(response)
```
In this example, we use the `run` method to simplify the interaction by providing a single task and image URL.
# Model Usage and Image Understanding
The GPT-4 Vision model can process one or more images per request and answer questions about them jointly or about each image independently. Here's an overview:
| Purpose | Description |
| --------------------------------------- | ---------------------------------------------------------------------------------------------------------------- |
| Image Understanding | The model can be shown multiple images in a single request and can answer questions about them jointly or about each image independently. |
# Image Detail Control
You have control over how the model processes the image and generates textual understanding by using the `detail` parameter, which has two options: `low` and `high`.
| Detail | Description |
| -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| low | Disables "high-res" mode. The model receives a low-res 512 x 512 version of the image and represents it with a fixed budget of 85 tokens. Ideal for use cases that do not require high detail. |
| high | Enables "high-res" mode. The model first sees the low-res image and then processes detailed crops of the input image as 512px squares. Each crop costs 170 tokens on top of the 85 base tokens. |
# Managing Images
To use the Chat Completions API effectively, you must manage the images you pass to the model. Here are some key considerations:
| Management Aspect | Description |
| ------------------------- | ------------------------------------------------------------------------------------------------- |
| Image Reuse | To pass the same image multiple times, include the image with each API request. |
| Image Size Optimization | Improve latency by downsizing images to meet the expected size requirements. |
| Image Deletion | After processing, images are deleted from OpenAI servers and not retained. No data is used for training. |
# Limitations
While GPT-4 with Vision is powerful, it has some limitations:
| Limitation | Description |
| -------------------------------------------- | --------------------------------------------------------------------------------------------------- |
| Medical Images | Not suitable for interpreting specialized medical images like CT scans. |
| Non-English Text | May not perform optimally when handling non-Latin alphabets, such as Japanese or Korean. |
| Large Text in Images | Enlarge text within images for readability, but avoid cropping important details. |
| Rotated or Upside-Down Text/Images | May misinterpret rotated or upside-down text or images. |
| Complex Visual Elements | May struggle to understand complex graphs or text with varying colors or styles. |
| Spatial Reasoning | Struggles with tasks requiring precise spatial localization, such as identifying chess positions. |
| Accuracy | May generate incorrect descriptions or captions in certain scenarios. |
| Panoramic and Fisheye Images | Struggles with panoramic and fisheye images. |
# Calculating Costs
Image inputs are metered and charged in tokens. The token cost depends on the image size and detail option.
| Example | Token Cost |
| --------------------------------------------- | ----------- |
| 1024 x 1024 square image in detail: high mode | 765 tokens |
| 2048 x 4096 image in detail: high mode | 1105 tokens |
| 4096 x 8192 image in detail: low mode | 85 tokens |
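For reference, a sketch of that tiling arithmetic, assuming OpenAI's published pricing rules (85 base tokens plus 170 per 512px tile in high-detail mode); it reproduces the figures in the table above:
```python
import math

def gpt4v_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate the token cost of one image input under the published rules."""
    if detail == "low":
        return 85  # low detail is a flat cost regardless of size
    # Scale the image to fit within a 2048 x 2048 square.
    if max(width, height) > 2048:
        scale = 2048 / max(width, height)
        width, height = int(width * scale), int(height * scale)
    # Scale so the shortest side is 768px.
    scale = 768 / min(width, height)
    width, height = int(width * scale), int(height * scale)
    # Count 512px tiles: 170 tokens per tile plus 85 base tokens.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 170 * tiles + 85

print(gpt4v_image_tokens(1024, 1024, "high"))  # 765
print(gpt4v_image_tokens(2048, 4096, "high"))  # 1105
print(gpt4v_image_tokens(4096, 8192, "low"))   # 85
```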
# FAQ
Here are some frequently asked questions about GPT-4 with Vision:
| Question | Answer |
| -------------------------------------------- | -------------------------------------------------------------------------------------------------- |
| Fine-Tuning Image Capabilities | No, fine-tuning the image capabilities of GPT-4 is not supported at this time. |
| Generating Images | GPT-4 is used for understanding images, not generating them. |
| Supported Image File Types | Supported image file types include PNG (.png), JPEG (.jpeg and .jpg), WEBP (.webp), and non-animated GIF (.gif). |
| Image Size Limitations | Image uploads are restricted to 20MB per image. |
| Image Deletion | Uploaded images are automatically deleted after processing by the model. |
| Learning More | For more details about GPT-4 with Vision, refer to the GPT-4 with Vision system card. |
| CAPTCHA Submission | CAPTCHAs are blocked for safety reasons. |
| Rate Limits | Image processing counts toward your tokens per minute (TPM) limit. Refer to the calculating costs section for details. |
| Image Metadata | The model does not receive image metadata. |
| Handling Unclear Images | If an image is unclear, the model will do its best to interpret it, but results may be less accurate. |
## Additional Tips
- Make sure to handle potential exceptions and errors when making API requests. The library includes retries and error handling, but it's essential to handle exceptions gracefully in your code.
- Experiment with different configuration options to optimize the trade-off between response quality and response time based on your specific requirements.
## References and Resources
- [OpenAI Platform](https://beta.openai.com/signup/): Sign up for an OpenAI API key.
- [OpenAI API Documentation](https://platform.openai.com/docs/api-reference/chat/create): Official API documentation for the GPT-4 Vision model.
Now you have a comprehensive understanding of the GPT4Vision Model API, its configuration options, and how to use it for various computer vision and natural language processing tasks. Start experimenting and integrating it into your projects to leverage the power of GPT-4 Vision for image-related tasks.
# Conclusion
With GPT-4 Vision, you have a powerful tool for understanding and generating textual descriptions for images. By considering its capabilities, limitations, and cost calculations, you can effectively leverage this model for various image-related tasks.

@ -1,4 +1,4 @@
# Swarms Documentation
# `Mistral` Documentation
## Table of Contents
@ -133,9 +133,7 @@ Mistral provides two methods for running the model:
The `run` method is used to generate text-based responses to a given task or input. It takes a single string parameter, `task`, and returns the generated text as a string.
```python
def run
(self, task: str) -> str:
def run(self, task: str) -> str:
"""
Run the model on a given task.
@ -236,6 +234,8 @@ In this section, we provide practical examples to illustrate how to use Mistral
In this example, we initialize the Mistral AI agent with custom settings:
```python
from swarms.models import Mistral
model = Mistral(
ai_name="My AI Assistant",
device="cpu",

@ -108,8 +108,13 @@ Here are three usage examples:
```python
from swarms.structs import Flow
# Select any Language model from the models folder
from swarms.models import Mistral, OpenAIChat
flow = Flow(llm=my_language_model, max_loops=5)
llm = Mistral()
# llm = OpenAIChat()
flow = Flow(llm=llm, max_loops=5)
# Define a starting task or message
initial_task = "Generate an long form analysis on the transformer model architecture."
@ -126,7 +131,7 @@ from swarms.structs import Flow
def stop_when_repeats(response: str) -> bool:
return "Stop" in response.lower()
flow = Flow(llm=my_language_model, max_loops=5, stopping_condition=stop_when_repeats)
flow = Flow(llm=llm, max_loops=5, stopping_condition=stop_when_repeats)
```
### Example 3: Interactive Conversation
@ -134,7 +139,7 @@ flow = Flow(llm=my_language_model, max_loops=5, stopping_condition=stop_when_rep
```python
from swarms.structs import Flow
flow = Flow(llm=my_language_model, max_loops=5, interactive=True)
flow = Flow(llm=llm, max_loops=5, interactive=True)
# Provide initial task
initial_task = "Rank and prioritize the following financial documents and cut out 30% of our expenses"

@ -0,0 +1,614 @@
# `SequentialWorkflow` Documentation
The **SequentialWorkflow** class is a Python module designed to facilitate the execution of a sequence of tasks in a sequential manner. It is a part of the `swarms.structs` package and is particularly useful for orchestrating the execution of various callable objects, such as functions or models, in a predefined order. This documentation will provide an in-depth understanding of the **SequentialWorkflow** class, including its purpose, architecture, usage, and examples.
## Purpose and Relevance
The **SequentialWorkflow** class is essential for managing and executing a series of tasks or processes, where each task may depend on the outcome of the previous one. It is commonly used in various application scenarios, including but not limited to:
1. **Natural Language Processing (NLP) Workflows:** In NLP workflows, multiple language models are employed sequentially to process and generate text. Each model may depend on the results of the previous one, making sequential execution crucial.
2. **Data Analysis Pipelines:** Data analysis often involves a series of tasks such as data preprocessing, transformation, and modeling steps. These tasks must be performed sequentially to ensure data consistency and accuracy.
3. **Task Automation:** In task automation scenarios, there is a need to execute a series of automated tasks in a specific order. Sequential execution ensures that each task is performed in a predefined sequence, maintaining the workflow's integrity.
By providing a structured approach to managing these tasks, the **SequentialWorkflow** class helps developers streamline their workflow execution and improve code maintainability.
## Key Concepts and Terminology
Before delving into the details of the **SequentialWorkflow** class, let's define some key concepts and terminology that will be used throughout the documentation:
### Task
A **task** refers to a specific unit of work that needs to be executed as part of the workflow. Each task is associated with a description and can be implemented as a callable object, such as a function or a model.
### Flow
A **flow** represents a callable object that can be a task within the **SequentialWorkflow**. Flows encapsulate the logic and functionality of a particular task. Flows can be functions, models, or any callable object that can be executed.
### Sequential Execution
Sequential execution refers to the process of running tasks one after the other in a predefined order. In a **SequentialWorkflow**, tasks are executed sequentially, meaning that each task starts only after the previous one has completed.
### Workflow
A **workflow** is a predefined sequence of tasks that need to be executed in a specific order. It represents the overall process or pipeline that the **SequentialWorkflow** manages.
### Dashboard (Optional)
A **dashboard** is an optional feature of the **SequentialWorkflow** that provides real-time monitoring and visualization of the workflow's progress. It displays information such as the current task being executed, task results, and other relevant metadata.
### Max Loops
The **maximum number of times** the entire workflow can be run. This parameter allows developers to control how many times the workflow is executed.
### Autosaving
**Autosaving** is a feature that allows the **SequentialWorkflow** to automatically save its state to a file at specified intervals. This feature helps in resuming a workflow from where it left off, even after interruptions.
Now that we have a clear understanding of the key concepts and terminology, let's explore the architecture and usage of the **SequentialWorkflow** class in more detail.
## Architecture of SequentialWorkflow
The architecture of the **SequentialWorkflow** class is designed to provide a structured and flexible way to define, manage, and execute a sequence of tasks. It comprises the following core components:
1. **Task**: The **Task** class represents an individual unit of work within the workflow. Each task has a description, which serves as a human-readable identifier for the task. Tasks can be implemented as callable objects, allowing for great flexibility in defining their functionality.
2. **Workflow**: The **SequentialWorkflow** class itself represents the workflow. It manages a list of tasks in the order they should be executed. Workflows can be run sequentially or asynchronously, depending on the use case.
3. **Task Execution**: Task execution is the process of running each task in the workflow. Tasks are executed one after another in the order they were added to the workflow. Task results can be passed as inputs to subsequent tasks.
4. **Dashboard (Optional)**: The **SequentialWorkflow** optionally includes a dashboard feature. The dashboard provides a visual interface for monitoring the progress of the workflow. It displays information about the current task, task results, and other relevant metadata.
5. **State Management**: The **SequentialWorkflow** supports state management, allowing developers to save and load the state of the workflow to and from JSON files. This feature is valuable for resuming workflows after interruptions or for sharing workflow configurations.
## Usage of SequentialWorkflow
The **SequentialWorkflow** class is versatile and can be employed in a wide range of applications. Its usage typically involves the following steps:
1. **Initialization**: Begin by initializing any callable objects or flows that will serve as tasks in the workflow. These callable objects can include functions, models, or any other Python objects that can be executed.
2. **Workflow Creation**: Create an instance of the **SequentialWorkflow** class. Specify the maximum number of loops the workflow should run and whether a dashboard should be displayed.
3. **Task Addition**: Add tasks to the workflow using the `add` method. Each task should be described using a human-readable description, and the associated flow (callable object) should be provided. Additional arguments and keyword arguments can be passed to the task.
4. **Task Execution**: Execute the workflow using the `run` method. The tasks within the workflow will be executed sequentially, with task results passed as inputs to subsequent tasks.
5. **Accessing Results**: After running the workflow, you can access the results of each task using the `get_task_results` method or by directly accessing the `result` attribute of each task.
6. **Optional Features**: Optionally, you can enable features such as autosaving of the workflow state and utilize the dashboard for real-time monitoring.
## Installation
Before using the Sequential Workflow library, you need to install it. You can install it via pip:
```bash
pip3 install --upgrade swarms
```
## Quick Start
Let's begin with a quick example to demonstrate how to create and run a Sequential Workflow. In this example, we'll create a workflow that generates a 10,000-word blog on "health and wellness" using an AI model and then summarizes the generated content.
```python
from swarms.models import OpenAIChat
from swarms.structs import Flow
from swarms.structs.sequential_workflow import SequentialWorkflow
# Initialize the language model flow (e.g., GPT-3)
llm = OpenAIChat(
openai_api_key="YOUR_API_KEY",
temperature=0.5,
max_tokens=3000,
)
# Initialize flows for individual tasks
flow1 = Flow(llm=llm, max_loops=1, dashboard=False)
flow2 = Flow(llm=llm, max_loops=1, dashboard=False)
# Create the Sequential Workflow
workflow = SequentialWorkflow(max_loops=1)
# Add tasks to the workflow
workflow.add("Generate a 10,000 word blog on health and wellness.", flow1)
workflow.add("Summarize the generated blog", flow2)
# Run the workflow
workflow.run()
# Output the results
for task in workflow.tasks:
print(f"Task: {task.description}, Result: {task.result}")
```
This quick example demonstrates the basic usage of the Sequential Workflow. It creates two tasks and executes them sequentially.
## Class: `Task`
### Description
The `Task` class represents an individual task in the workflow. A task is essentially a callable object, such as a function or a class, that can be executed sequentially. Tasks can have arguments and keyword arguments.
### Class Definition
```python
class Task:
def __init__(self, description: str, flow: Union[Callable, Flow], args: List[Any] = [], kwargs: Dict[str, Any] = {}, result: Any = None, history: List[Any] = [])
```
### Parameters
- `description` (str): A description of the task.
- `flow` (Union[Callable, Flow]): The callable object representing the task. It can be a function, class, or a `Flow` instance.
- `args` (List[Any]): A list of positional arguments to pass to the task when executed. Default is an empty list.
- `kwargs` (Dict[str, Any]): A dictionary of keyword arguments to pass to the task when executed. Default is an empty dictionary.
- `result` (Any): The result of the task's execution. Default is `None`.
- `history` (List[Any]): A list to store the historical results of the task. Default is an empty list.
### Methods
#### `execute()`
Execute the task.
```python
def execute(self):
```
This method executes the task and updates the `result` and `history` attributes of the task. It checks if the task is a `Flow` instance and if the 'task' argument is needed.
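A minimal sketch of constructing and executing a `Task` directly with a plain callable. The import path is an assumption based on the `SequentialWorkflow` import used elsewhere in this document, and the sketch assumes `execute` forwards `args` to the callable:
```python
from swarms.structs.sequential_workflow import Task

def shout(text: str) -> str:
    # Toy stand-in for a model call.
    return text.upper()

task = Task(description="Uppercase the input", flow=shout, args=["hello, workflow"])
task.execute()
print(task.result)  # HELLO, WORKFLOW
```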
## Class: `SequentialWorkflow`
### Description
The `SequentialWorkflow` class is responsible for managing a sequence of tasks and executing them in a sequential order. It provides methods for adding tasks, running the workflow, and managing the state of the tasks.
### Class Definition
```python
class SequentialWorkflow:
def __init__(self, max_loops: int = 1, autosave: bool = False, saved_state_filepath: Optional[str] = "sequential_workflow_state.json", restore_state_filepath: Optional[str] = None, dashboard: bool = False, tasks: List[Task] = [])
```
### Parameters
- `max_loops` (int): The maximum number of times to run the workflow sequentially. Default is `1`.
- `autosave` (bool): Whether to enable autosaving of the workflow state. Default is `False`.
- `saved_state_filepath` (Optional[str]): The file path to save the workflow state when autosave is enabled. Default is `"sequential_workflow_state.json"`.
- `restore_state_filepath` (Optional[str]): The file path to restore the workflow state when initializing. Default is `None`.
- `dashboard` (bool): Whether to display a dashboard with workflow information. Default is `False`.
- `tasks` (List[Task]): A list of `Task` instances representing the tasks in the workflow. Default is an empty list.
### Methods
#### `add(task: str, flow: Union[Callable, Flow], *args, **kwargs)`
Add a task to the workflow.
```python
def add(self, task: str, flow: Union[Callable, Flow], *args, **kwargs) -> None:
```
This method adds a new task to the workflow. You can provide a description of the task, the callable object (function, class, or `Flow` instance), and any additional positional or keyword arguments required for the task.
#### `reset_workflow()`
Reset the workflow by clearing the results of each task.
```python
def reset_workflow(self) -> None:
```
This method clears the results of each task in the workflow, allowing you to start fresh without reinitializing the workflow.
#### `get_task_results()`
Get the results of each task in the workflow.
```python
def get_task_results(self) -> Dict[str, Any]:
```
This method returns a dictionary containing the results of each task in the workflow, where the keys are task descriptions, and the values are the corresponding results.
#### `remove_task(task_description: str)`
Remove a task from the workflow.
```python
def remove_task(self, task_description: str) -> None:
```
This method removes a specific task from the workflow based on its description.
#### `update_task(task_description: str, **updates)`
Update the arguments of a task in the workflow.
```python
def update_task(self, task_description: str, **updates) -> None:
```
This method allows you to update the arguments and keyword arguments of a task in the workflow. You specify the task's description and provide the updates as keyword arguments.
#### `save_workflow_state(filepath: Optional[str] = "sequential_workflow_state.json", **kwargs)`
Save the workflow state to a JSON file.
```python
def save_workflow_state(self, filepath: Optional[str] = "sequential_workflow_state.json", **kwargs) -> None:
```
This method saves the current state of the workflow, including the results and history of each task, to a JSON file. You can specify the file path for saving the state.
#### `load_workflow_state(filepath: str = None, **kwargs)`
Load the workflow state from a JSON file and restore the workflow state.
```python
def load_workflow_state(self, filepath: str = None, **kwargs) -> None:
```
This method loads a previously saved workflow state from a JSON file
and restores the state, allowing you to continue the workflow from where it was saved. You can specify the file path for loading the state.
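A short sketch of a save-and-restore cycle using these two methods, continuing from a `workflow` built as in the Quick Start (the file name is arbitrary):
```python
# Persist the current workflow state after a run.
workflow.save_workflow_state("my_workflow_state.json")

# Later, restore that state into a fresh workflow instance.
new_workflow = SequentialWorkflow(max_loops=1)
new_workflow.load_workflow_state("my_workflow_state.json")
```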
#### `run()`
Run the workflow sequentially.
```python
def run(self) -> None:
```
This method executes the tasks in the workflow sequentially. It checks if a task is a `Flow` instance and handles the flow of data between tasks accordingly.
#### `arun()`
Asynchronously run the workflow.
```python
async def arun(self) -> None:
```
This method asynchronously executes the tasks in the workflow sequentially. It's suitable for use cases where asynchronous execution is required. It also handles data flow between tasks.
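A sketch of driving the asynchronous variant from a script, assuming `arun` is awaitable as documented and `workflow` is built as in the Quick Start:
```python
import asyncio

async def main():
    await workflow.arun()

asyncio.run(main())
```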
#### `workflow_bootup(**kwargs)`
Display a bootup message for the workflow.
```python
def workflow_bootup(self, **kwargs) -> None:
```
This method displays a bootup message when the workflow is initialized. You can customize the message by providing additional keyword arguments.
#### `workflow_dashboard(**kwargs)`
Display a dashboard for the workflow.
```python
def workflow_dashboard(self, **kwargs) -> None:
```
This method displays a dashboard with information about the workflow, such as the number of tasks, maximum loops, and autosave settings. You can customize the dashboard by providing additional keyword arguments.
## Examples
Let's explore some examples to illustrate how to use the Sequential Workflow library effectively.
### Example 1: Adding Tasks to a Sequential Workflow
In this example, we'll create a Sequential Workflow and add tasks to it.
```python
from swarms.models import OpenAIChat
from swarms.structs import Flow
from swarms.structs.sequential_workflow import SequentialWorkflow
# Example usage
api_key = (
"" # Your actual API key here
)
# Initialize the language flow
llm = OpenAIChat(
openai_api_key=api_key,
temperature=0.5,
max_tokens=3000,
)
# Initialize Flows for individual tasks
flow1 = Flow(llm=llm, max_loops=1, dashboard=False)
flow2 = Flow(llm=llm, max_loops=1, dashboard=False)
# Create the Sequential Workflow
workflow = SequentialWorkflow(max_loops=1)
# Add tasks to the workflow
workflow.add("Generate a 10,000 word blog on health and wellness.", flow1)
workflow.add("Summarize the generated blog", flow2)
# Output the list of tasks in the workflow
print("Tasks in the workflow:")
for task in workflow.tasks:
print(f"Task: {task.description}")
```
In this example, we create a Sequential Workflow and add two tasks to it.
### Example 2: Resetting a Sequential Workflow
In this example, we'll create a Sequential Workflow, add tasks to it, and then reset it.
```python
from swarms.models import OpenAIChat
from swarms.structs import Flow
from swarms.structs.sequential_workflow import SequentialWorkflow
# Example usage
api_key = (
"" # Your actual API key here
)
# Initialize the language flow
llm = OpenAIChat(
openai_api_key=api_key,
temperature=0.5,
max_tokens=3000,
)
# Initialize Flows for individual tasks
flow1 = Flow(llm=llm, max_loops=1, dashboard=False)
flow2 = Flow(llm=llm, max_loops=1, dashboard=False)
# Create the Sequential Workflow
workflow = SequentialWorkflow(max_loops=1)
# Add tasks to the workflow
workflow.add("Generate a 10,000 word blog on health and wellness.", flow1)
workflow.add("Summarize the generated blog", flow2)
# Reset the workflow
workflow.reset_workflow()
# Output the list of tasks in the workflow after resetting
print("Tasks in the workflow after resetting:")
for task in workflow.tasks:
print(f"Task: {task.description}")
```
In this example, we create a Sequential Workflow, add two tasks to it, and then reset the workflow, clearing all task results.
### Example 3: Getting Task Results from a Sequential Workflow
In this example, we'll create a Sequential Workflow, add tasks to it, run the workflow, and then retrieve the results of each task.
```python
from swarms.models import OpenAIChat
from swarms.structs import Flow
from swarms.structs.sequential_workflow import SequentialWorkflow
# Example usage
api_key = (
"" # Your actual API key here
)
# Initialize the language flow
llm = OpenAIChat(
openai_api_key=api_key,
temperature=0.5,
max_tokens=3000,
)
# Initialize Flows for individual tasks
flow1 = Flow(llm=llm, max_loops=1, dashboard=False)
flow2 = Flow(llm=llm, max_loops=1, dashboard=False)
# Create the Sequential Workflow
workflow = SequentialWorkflow(max_loops=1)
# Add tasks to the workflow
workflow.add("Generate a 10,000 word blog on health and wellness.", flow1)
workflow.add("Summarize the generated blog", flow2)
# Run the workflow
workflow.run()
# Get and display the results of each task in the workflow
results = workflow.get_task_results()
for task_description, result in results.items():
print(f"Task: {task_description}, Result: {result}")
```
In this example, we create a Sequential Workflow, add two tasks to it, run the workflow, and then retrieve and display the results of each task.
### Example 4: Removing a Task from a Sequential Workflow
In this example, we'll create a Sequential Workflow, add tasks to it, and then remove a specific task from the workflow.
```python
from swarms.models import OpenAIChat
from swarms.structs import Flow
from swarms.structs.sequential_workflow import SequentialWorkflow
# Example usage
api_key = (
"" # Your actual API key here
)
# Initialize the language flow
llm = OpenAIChat(
openai_api_key=api_key,
temperature=0.5,
max_tokens=3000,
)
# Initialize Flows for individual tasks
flow1 = Flow(llm=llm, max_loops=1, dashboard=False)
flow2 = Flow(llm=llm, max_loops=1, dashboard=False)
# Create the Sequential Workflow
workflow = SequentialWorkflow(max_loops=1)
# Add tasks to the workflow
workflow.add("Generate a 10,000 word blog on health and wellness.", flow1)
workflow.add("Summarize the generated blog", flow2)
# Remove a specific task from the workflow
workflow.remove_task("Generate a 10,000 word blog on health and wellness.")
# Output the list of tasks in the workflow after removal
print("Tasks in the workflow after removing a task:")
for task in workflow.tasks:
print(f"Task: {task.description}")
```
In this example, we create a Sequential Workflow, add two tasks to it, and then remove a specific task from the workflow.
### Example 5: Updating Task Arguments in a Sequential Workflow
In this example, we'll create a Sequential Workflow, add tasks to it, and then update the arguments of a specific task in the workflow.
```python
from swarms.models import OpenAIChat
from swarms.structs import Flow
from swarms.structs.sequential_workflow import SequentialWorkflow
# Example usage
api_key = (
"" # Your actual API key here
)
# Initialize the language flow
llm = OpenAIChat(
openai_api_key=api_key,
temperature=0.5,
max_tokens=3000,
)
# Initialize Flows for individual tasks
flow1 = Flow(llm=llm, max_loops=1, dashboard=False)
flow2 = Flow(llm=llm, max_loops=1, dashboard=False)
# Create the Sequential Workflow
workflow = SequentialWorkflow(max_loops=1)
# Add tasks to the workflow
workflow.add("Generate a 10,000 word blog on health and wellness.", flow1)
workflow.add("Summarize the generated blog", flow2)
# Update the arguments of a specific task in the workflow
workflow.update_task("Generate a 10,000 word blog on health and wellness.", max_loops=2)
# Output the list of tasks in the workflow after updating task arguments
print("Tasks in the workflow after updating task arguments:")
for task in workflow.tasks:
print(f"Task: {task.description}, Arguments: {
task.arguments}")
```
In this example, we create a Sequential Workflow, add two tasks to it, and then update the arguments of a specific task in the workflow.
These examples demonstrate various operations and use cases for working with a Sequential Workflow.
# Why `SequentialWorkflow`?
## Enhancing Autonomous Agent Development
The development of autonomous agents, whether they are conversational AI, robotic systems, or any other AI-driven application, often involves complex workflows that require a sequence of tasks to be executed in a specific order. Managing and orchestrating these tasks efficiently is crucial for building reliable and effective agents. The Sequential Workflow module serves as a valuable tool for AI engineers in achieving this goal.
## Reliability and Coordination
One of the primary challenges in autonomous agent development is ensuring that tasks are executed in the correct sequence and that the results of one task can be used as inputs for subsequent tasks. The Sequential Workflow module simplifies this process by allowing AI engineers to define and manage workflows in a structured and organized manner.
By using the Sequential Workflow module, AI engineers can achieve the following benefits:
### 1. Improved Reliability
Reliability is a critical aspect of autonomous agents. The ability to handle errors gracefully and recover from failures is essential for building robust systems. The Sequential Workflow module offers a systematic approach to task execution, making it easier to handle errors, retry failed tasks, and ensure that the agent continues to operate smoothly.
### 2. Task Coordination
Coordinating tasks in the correct order is essential for achieving the desired outcome. The Sequential Workflow module enforces task sequencing, ensuring that each task is executed only when its dependencies are satisfied. This eliminates the risk of executing tasks out of order, which can lead to incorrect results.
### 3. Code Organization
Managing complex workflows can become challenging without proper organization. The Sequential Workflow module encourages AI engineers to structure their code in a modular and maintainable way. Each task can be encapsulated as a separate unit, making it easier to understand, modify, and extend the agent's behavior.
### 4. Workflow Visualization
Visualization is a powerful tool for understanding and debugging workflows. The Sequential Workflow module can be extended to include a visualization dashboard, allowing AI engineers to monitor the progress of tasks, track results, and identify bottlenecks or performance issues.
## TODO: Future Features
While the Sequential Workflow module offers significant advantages, there are opportunities for further enhancement. Here is a list of potential features and improvements that can be added to make it even more versatile and adaptable for various AI engineering tasks:
### 1. Asynchronous Support
Adding support for asynchronous task execution can improve the efficiency of workflows, especially when dealing with tasks that involve waiting for external events or resources.
### 2. Context Managers
Introducing context manager support for tasks can simplify resource management, such as opening and closing files, database connections, or network connections within a task's context.
### 3. Workflow History
Maintaining a detailed history of workflow execution, including timestamps, task durations, and input/output data, can facilitate debugging and performance analysis.
### 4. Parallel Processing
Enhancing the module to support parallel processing with a pool of workers can significantly speed up the execution of tasks, especially for computationally intensive workflows.
### 5. Error Handling Strategies
Providing built-in error handling strategies, such as retries, fallbacks, and custom error handling functions, can make the module more robust in handling unexpected failures.
## Conclusion
The Sequential Workflow module is a valuable tool for AI engineers working on autonomous agents and complex AI-driven applications. It offers a structured and reliable approach to defining and executing workflows, ensuring that tasks are performed in the correct sequence. By using this module, AI engineers can enhance the reliability, coordination, and maintainability of their agents.
As the field of AI continues to evolve, the demand for efficient workflow management tools will only increase. The Sequential Workflow module is a step towards meeting these demands and empowering AI engineers to create more reliable and capable autonomous agents. With future enhancements and features, it has the potential to become an indispensable asset in the AI engineer's toolkit.
In summary, the Sequential Workflow module provides a foundation for orchestrating complex tasks and workflows, enabling AI engineers to focus on designing intelligent agents that can perform tasks with precision and reliability.
## Frequently Asked Questions (FAQs)
### Q1: What is the difference between a task and a flow in Sequential Workflows?
**A1:** In Sequential Workflows, a **task** refers to a specific unit of work that needs to be executed. It can be implemented as a callable object, such as a Python function, and is the fundamental building block of a workflow.
A **flow**, on the other hand, is an encapsulation of a task within the workflow. Flows define the order in which tasks are executed and can be thought of as task containers. They allow you to specify dependencies, error handling, and other workflow-related configurations.
### Q2: Can I run tasks in parallel within a Sequential Workflow?
**A2:** Yes, you can run tasks in parallel within a Sequential Workflow by using parallel execution techniques. This advanced feature allows you to execute multiple tasks concurrently, improving performance and efficiency. You can explore this feature further in the guide's section on "Parallel Execution."
### Q3: How do I handle errors within Sequential Workflows?
**A3:** Error handling within Sequential Workflows can be implemented by adding error-handling logic within your task functions. You can catch exceptions and handle errors gracefully, ensuring that your workflow can recover from unexpected scenarios. The guide also covers more advanced error handling strategies, such as retrying failed tasks and handling specific error types.
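As one such pattern, a task callable can be wrapped in retry logic before it is added to a workflow; the wrapper below is an illustrative sketch, not part of the library:
```python
import time

def with_retries(fn, max_retries: int = 3, delay: float = 1.0):
    """Wrap a task callable so transient failures are retried."""
    def wrapper(*args, **kwargs):
        for attempt in range(1, max_retries + 1):
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                if attempt == max_retries:
                    raise  # exhausted retries; let the workflow see the error
                print(f"Attempt {attempt} failed ({exc}); retrying...")
                time.sleep(delay)
    return wrapper

# Hypothetical usage: fetch_data is your own task function.
# workflow.add("Fetch remote data", with_retries(fetch_data))
```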
### Q4: What are some real-world use cases for Sequential Workflows?
**A4:** Sequential Workflows can be applied to a wide range of real-world use cases, including:
- **Data ETL (Extract, Transform, Load) Processes:** Automating data pipelines that involve data extraction, transformation, and loading into databases or data warehouses.
- **Batch Processing:** Running batch jobs that process large volumes of data or perform data analysis.
- **Automation of DevOps Tasks:** Streamlining DevOps processes such as deployment, provisioning, and monitoring.
- **Cross-system Integrations:** Automating interactions between different systems, services, or APIs.
- **Report Generation:** Generating reports and documents automatically based on data inputs.
- **Workflow Orchestration:** Orchestrating complex workflows involving multiple steps and dependencies.
- **Resource Provisioning:** Automatically provisioning and managing cloud resources.
These are just a few examples, and Sequential Workflows can be tailored to various automation needs across industries.

@ -0,0 +1,167 @@
# Swarms Framework Documentation
---
## Overview
The Swarms framework is a Python library designed to facilitate the creation and management of a simulated group chat environment. This environment can be used for a variety of purposes, such as training conversational agents, role-playing games, or simulating dialogues for machine learning purposes. The core functionality revolves around managing the flow of messages between different agents within the chat, as well as handling the selection and responses of these agents based on the conversation's context.
### Purpose
The purpose of the Swarms framework, and specifically the `GroupChat` and `GroupChatManager` classes, is to simulate a dynamic and interactive conversation between multiple agents. This simulates a real-time chat environment where each participant is represented by an agent with a specific role and behavioral patterns. These agents interact within the rules of the group chat, controlled by the `GroupChatManager`.
### Key Features
- **Agent Interaction**: Allows multiple agents to communicate within a group chat scenario.
- **Message Management**: Handles the storage and flow of messages within the group chat.
- **Role Play**: Enables agents to assume specific roles and interact accordingly.
- **Conversation Context**: Maintains the context of the conversation for appropriate responses by agents.
---
## GroupChat Class
The `GroupChat` class is the backbone of the Swarms framework's chat simulation. It maintains the list of agents participating in the chat, the messages that have been exchanged, and the logic to reset the chat and determine the next speaker.
### Class Definition
#### Parameters
| Parameter | Type | Description | Default Value |
|------------|---------------------|--------------------------------------------------------------|---------------|
| agents | List[Flow] | List of agent flows participating in the group chat. | None |
| messages | List[Dict] | List of message dictionaries exchanged in the group chat. | None |
| max_round | int | Maximum number of rounds/messages allowed in the group chat. | 10 |
| admin_name | str | The name of the admin agent in the group chat. | "Admin" |
#### Class Properties and Methods
- `agent_names`: Returns a list of the names of the agents in the group chat.
- `reset()`: Clears all messages from the group chat.
- `agent_by_name(name: str) -> Flow`: Finds and returns an agent by name.
- `next_agent(agent: Flow) -> Flow`: Returns the next agent in the list.
- `select_speaker_msg() -> str`: Returns the message for selecting the next speaker.
- `select_speaker(last_speaker: Flow, selector: Flow) -> Flow`: Logic to select the next speaker based on the last speaker and the selector agent.
- `_participant_roles() -> str`: Returns a string listing all participant roles.
- `format_history(messages: List[Dict]) -> str`: Formats the history of messages for display or processing.
### Usage Examples
#### Example 1: Initializing a GroupChat
```python
from swarms.structs.flow import Flow
from swarms.groupchat import GroupChat
# Assuming Flow objects (flow1, flow2, flow3) are initialized and configured
agents = [flow1, flow2, flow3]
group_chat = GroupChat(agents=agents, messages=[], max_round=10)
```
#### Example 2: Resetting a GroupChat
```python
group_chat.reset()
```
#### Example 3: Selecting a Speaker
```python
last_speaker = agents[0] # Assuming this is a Flow object representing the last speaker
selector = agents[1] # Assuming this is a Flow object with the selector role
next_speaker = group_chat.select_speaker(last_speaker, selector)
```
---
## GroupChatManager Class
The `GroupChatManager` class acts as a controller for the `GroupChat` instance. It orchestrates the interaction between agents, prompts for tasks, and manages the rounds of conversation.
### Class Definition
#### Constructor Parameters
| Parameter | Type | Description |
|------------|-------------|------------------------------------------------------|
| groupchat | GroupChat | The GroupChat instance that the manager will handle. |
| selector | Flow | The Flow object that selects the next speaker. |
#### Methods
- `__call__(task: str)`: Invokes the GroupChatManager with a given task string to start the conversation.
### Usage Examples
#### Example 1: Initializing GroupChatManager
```python
from swarms.groupchat import GroupChat, GroupChatManager
from swarms.structs.flow import Flow
# Initialize your agents and group chat as shown in previous examples
chat_manager = GroupChatManager(groupchat=group_chat, selector=manager)
```
#### Example 2: Starting a Conversation
```python
# Start the group chat with a task
chat_history = chat_manager("Start a conversation about space exploration.")
```
#### Example 3: Using the Call Method
```python
# The call method is the same as starting a conversation
chat_history = chat_manager.__call__("Discuss recent advances in AI.")
```
---
## Conclusion
In summary, the Swarms framework offers a unique and effective solution for simulating group chat environments. Its `GroupChat` and `GroupChatManager` classes provide the necessary infrastructure to create dynamic conversations between agents, manage messages, and maintain the context of the dialogue. This framework can be instrumental in developing more sophisticated conversational agents, experimenting with social dynamics in chat environments, and providing a rich dataset for machine learning applications.
By leveraging the framework's features, users can create complex interaction scenarios that closely mimic real-world group communication. This can prove to be a valuable asset in the fields of artificial intelligence, computational social science, and beyond.
---
### Frequently Asked Questions (FAQ)
**Q: Can the Swarms framework handle real-time interactions between agents?**
A: The Swarms framework is designed to simulate group chat environments. While it does not handle real-time interactions as they would occur on a network, it can simulate the flow of conversation in a way that mimics real-time communication.
**Q: Is the Swarms framework capable of natural language processing?**
A: The framework itself is focused on the structure and management of group chats. It does not inherently include natural language processing (NLP) capabilities. However, it can be integrated with NLP tools to enhance the simulation with language understanding and generation features.
**Q: Can I customize the roles and behaviors of agents within the framework?**
A: Yes, the framework is designed to be flexible. You can define custom roles and behaviors for agents to fit the specific requirements of your simulation scenario.
**Q: What are the limitations of the Swarms framework?**
A: The framework is constrained by its design to simulate text-based group chats. It is not suitable for voice or video communication simulations. Additionally, its effectiveness depends on the sophistication of the agents' decision-making logic, which sits outside the framework itself.
**Q: Is it possible to integrate the Swarms framework with other chat services?**
A: The framework does not integrate with chat services out of the box. However, it could potentially be adapted to work with chat service APIs, where the agents could be used to simulate user behavior within a real chat application.
**Q: How does the `GroupChatManager` select the next speaker?**
A: The `GroupChatManager` uses a selection mechanism, which is typically based on the conversation's context and the roles of the agents, to determine the next speaker. The specifics of this mechanism can be customized to match the desired flow of the conversation.
**Q: Can I contribute to the Swarms framework or suggest features?**
A: As with many open-source projects, contributions and feature suggestions can usually be made through the project's repository on platforms like GitHub. It's best to check with the maintainers of the Swarms framework for their contribution guidelines.
**Q: Are there any tutorials or community support for new users of the Swarms framework?**
A: Documentation and usage examples are provided with the framework. Community support may be available through forums, chat groups, or the platform where the framework is hosted. Tutorials may also be available from third-party educators or in official documentation.
**Q: What programming skills do I need to use the Swarms framework effectively?**
A: You should have a good understanding of Python programming, including experience with classes and methods. Familiarity with the principles of agent-based modeling and conversational AI would also be beneficial.

@ -1,24 +1,39 @@
from swarms.models import OpenAIChat
from swarms.structs import Flow

api_key = ""

# Initialize the language model; it can be swapped out for Anthropic or Hugging Face models such as Mistral
llm = OpenAIChat(
    # model_name="gpt-4",
    openai_api_key=api_key,
    temperature=0.5,
    # max_tokens=100,
)

## Initialize the workflow
flow = Flow(
    llm=llm,
    max_loops=5,
    dashboard=True,
    # stopping_condition=None,  # You can define a stopping condition as needed.
    # loop_interval=1,
    # retry_attempts=3,
    # retry_interval=1,
    # interactive=False,  # Set to 'True' for interactive mode.
    # dynamic_temperature=False,  # Set to 'True' for dynamic temperature handling.
)

# out = flow.load_state("flow_state.json")
# temp = flow.dynamic_temperature()
# filter = flow.add_response_filter("Trump")
out = flow.run(
    "Generate a 10,000 word blog on mental clarity and the benefits of meditation."
)
# out = flow.validate_response(out)
# out = flow.analyze_feedback(out)
# out = flow.print_history_and_memory()
# # out = flow.save_state("flow_state.json")
# print(out)

@ -1,16 +0,0 @@
from swarms.swarms import GodMode
from swarms.models import OpenAIChat
api_key = ""
llm = OpenAIChat(openai_api_key=api_key)
llms = [llm, llm, llm]
god_mode = GodMode(llms)
task = "Generate a 10,000 word blog on health and wellness."
out = god_mode.run(task)
god_mode.print_responses(task)

@ -1,109 +1,49 @@
from swarms import OpenAI, Flow
from swarms.swarms.groupchat import GroupChatManager, GroupChat

api_key = ""

# Initialize the language model
llm = OpenAI(
    openai_api_key=api_key,
    temperature=0.5,
    max_tokens=3000,
)

# Initialize the flows, one per agent role
flow1 = Flow(
    llm=llm,
    max_loops=1,
    system_prompt="YOU ARE SILLY, YOU OFFER NOTHING OF VALUE",
    name="silly",
    dashboard=True,
)
flow2 = Flow(
    llm=llm,
    max_loops=1,
    system_prompt="YOU ARE VERY SMART AND ANSWER RIDDLES",
    name="detective",
    dashboard=True,
)
flow3 = Flow(
    llm=llm,
    max_loops=1,
    system_prompt="YOU MAKE RIDDLES",
    name="riddler",
    dashboard=True,
)
manager = Flow(
    llm=llm,
    max_loops=1,
    system_prompt="YOU ARE A GROUP CHAT MANAGER",
    name="manager",
    dashboard=True,
)

# Example usage:
agents = [flow1, flow2, flow3]

group_chat = GroupChat(agents=agents, messages=[], max_round=10)
chat_manager = GroupChatManager(groupchat=group_chat, selector=manager)
chat_history = chat_manager("Write me a riddle")

@ -61,34 +61,19 @@ nav:
- Home:
- Overview: "index.md"
- Contributing: "contributing.md"
- FAQ: "faq.md"
- Purpose: "purpose.md"
- Roadmap: "roadmap.md"
- Weaknesses: "failures.md"
- Design: "design.md"
- Flywheel: "flywheel.md"
- Bounties: "bounties.md"
- Metric: "metric.md"
- Distribution: "distribution"
- Research: "research.md"
- Demos: "demos.md"
- Architecture: "architecture.md"
- Checklist: "checklist.md"
- Hiring: "hiring.md"
- Swarms:
- Overview: "swarms/index.md"
- swarms.swarms:
- AbstractSwarm: "swarms/swarms/abstractswarm.md"
- AutoScaler: "swarms/swarms/autoscaler.md"
- GodMode: "swarms/swarms/godmode.md"
- Groupchat: "swarms/swarms/groupchat.md"
- swarms.workers:
- AbstractWorker: "swarms/workers/base.md"
- Overview: "swarms/workers/index.md"
- AbstractWorker: "swarms/workers/abstract_worker.md"
- swarms.agents:
- AbstractAgent: "swarms/agents/abstract_agent.md"
- OmniModalAgent: "swarms/agents/omni_agent.md"
- Idea2Image: "swarms/agents/idea_to_image.md"
- swarms.models:
- Language:
- Overview: "swarms/models/index.md"
@ -98,6 +83,7 @@ nav:
- Zephyr: "swarms/models/zephyr.md"
- BioGPT: "swarms/models/biogpt.md"
- MPT7B: "swarms/models/mpt.md"
- Mistral: "swarms/models/mistral.md"
- MultiModal:
- Fuyu: "swarms/models/fuyu.md"
- Vilt: "swarms/models/vilt.md"
@ -105,28 +91,45 @@ nav:
- BingChat: "swarms/models/bingchat.md"
- Kosmos: "swarms/models/kosmos.md"
- Nougat: "swarms/models/nougat.md"
- Dalle3: "swarms/models/dalle3.md"
- GPT4V: "swarms/models/gpt4v.md"
- LayoutLMDocumentQA: "swarms/models/layoutlm_document_qa.md"
- DistilWhisperModel: "swarms/models/distilled_whisperx.md"
- swarms.structs:
- Overview: "swarms/structs/overview.md"
- Workflow: "swarms/structs/workflow.md"
- Flow: "swarms/structs/flow.md"
- SequentialWorkflow: 'swarms/structs/sequential_workflow.md'
- swarms.memory:
- PineconeVectorStore: "swarms/memory/pinecone.md"
- PGVectorStore: "swarms/memory/pg.md"
- swarms.chunkers:
- BaseChunker: "swarms/chunkers/basechunker.md"
- PdfChunker: "swarms/chunkers/pdf_chunker.md"
- Guides:
- Overview: "examples/index.md"
- Agents:
- Flow: "examples/flow.md"
- SequentialWorkflow: "examples/reliable_autonomous_agents.md"
- OmniAgent: "examples/omni_agent.md"
- Worker:
- Basic: "examples/worker.md"
- StackedWorker: "examples/stacked_worker.md"
- 20+ Autonomous Agent Blogs: "examples/ideas.md"
- Applications:
- CustomerSupport:
- Overview: "applications/customer_support.md"
- Marketing:
- Overview: "applications/marketing_agencies.md"
- Corporate:
- FAQ: "corporate/faq.md"
- Purpose: "corporate/purpose.md"
- Roadmap: "corporate/roadmap.md"
- Weaknesses: "corporate/failures.md"
- Design: "corporate/design.md"
- Flywheel: "corporate/flywheel.md"
- Bounties: "corporate/bounties.md"
- Metric: "corporate/metric.md"
- Distribution: "corporate/distribution"
- Research: "corporate/research.md"
- Demos: "corporate/demos.md"
- Architecture: "corporate/architecture.md"
- Checklist: "corporate/checklist.md"
- Hiring: "corporate/hiring.md"

@ -0,0 +1,9 @@
from swarms.models.anthropic import Anthropic
model = Anthropic(anthropic_api_key="")
task = "What is quantum field theory? What are 3 books on the field?"
print(model(task))

Binary file not shown.


@ -0,0 +1,6 @@
from swarms.models.dalle3 import Dalle3
model = Dalle3()
task = "A painting of a dog"
img = model(task)

@ -1,4 +0,0 @@
from swarms.models import Fuyu
fuyu = Fuyu()
fuyu("Hello, my name is", "images/github-banner-swarms.png")

@ -0,0 +1,7 @@
from swarms.models.fuyu import Fuyu
img = "dalle3.jpeg"
fuyu = Fuyu()
fuyu("What is this image", img)

@ -0,0 +1,15 @@
from swarms.models.gpt4v import GPT4Vision
api_key = ""
gpt4vision = GPT4Vision(
openai_api_key=api_key,
)
img = "https://upload.wikimedia.org/wikipedia/commons/thumb/0/0d/VFPt_Solenoid_correct2.svg/640px-VFPt_Solenoid_correct2.svg.png"
task = "What is this image"
answer = gpt4vision.run(task, img)
print(answer)

@ -0,0 +1,7 @@
from swarms.models.gpt4v import GPT4Vision
gpt4vision = GPT4Vision(api_key="")
task = "What is the following image about?"
img = "https://cdn.openai.com/dall-e/encoded/feats/feats_01J9J5ZKJZJY9.png"
answer = gpt4vision.run(task, img)

@ -1,56 +0,0 @@
from swarms.models import OpenAIChat  # Replace with your actual OpenAIChat import


class MultiTempAgent:
    def __init__(self, api_key, default_temp=0.5, alt_temps=(0.2, 0.7, 0.9)):
        self.api_key = api_key
        self.default_temp = default_temp
        self.alt_temps = alt_temps  # tuple default avoids shared mutable state

    def ask_user_feedback(self, text):
        print(f"Generated text: {text}")
        feedback = input("Are you satisfied with this output? (yes/no): ")
        return feedback.lower() == "yes"

    def present_options_to_user(self, outputs):
        print("Alternative outputs:")
        for temp, output in outputs.items():
            print(f"Temperature {temp}: {output}")
        chosen_temp = float(input("Choose the temperature of the output you like: "))
        return outputs.get(chosen_temp, "Invalid temperature chosen.")

    def run(self, prompt):
        try:
            llm = OpenAIChat(openai_api_key=self.api_key, temperature=self.default_temp)
            initial_output = llm(prompt)  # Using llm as a callable
        except Exception as e:
            print(f"Error generating initial output: {e}")
            initial_output = None

        if self.ask_user_feedback(initial_output):
            return initial_output

        outputs = {}
        for temp in self.alt_temps:
            try:
                llm = OpenAIChat(openai_api_key=self.api_key, temperature=temp)  # Re-initializing
                outputs[temp] = llm(prompt)  # Using llm as a callable
            except Exception as e:
                print(f"Error generating text at temperature {temp}: {e}")
                outputs[temp] = None
        return self.present_options_to_user(outputs)


if __name__ == "__main__":
    api_key = ""  # Your OpenAI API key here
    agent = MultiTempAgent(api_key)
    prompt = "Write a blog post about health and wellness"
    final_output = agent.run(prompt)
    print("Final chosen output:")
    print(final_output)

@ -2,5 +2,5 @@ from swarms.models.openai_models import OpenAIChat
openai = OpenAIChat(openai_api_key="", verbose=False)
chat = openai("Are quantum fields everywhere?")
chat = openai("What are quantum fields?")
print(chat)

@ -0,0 +1,35 @@
from swarms.models import OpenAIChat
from swarms.structs import Flow
api_key = ""
# Initialize the language model; it can be swapped out for Anthropic or Hugging Face models such as Mistral
llm = OpenAIChat(
# model_name="gpt-4"
openai_api_key=api_key,
temperature=0.5,
# max_tokens=100,
)
## Initialize the workflow
flow = Flow(
llm=llm,
max_loops=2,
dashboard=True,
# stopping_condition=None, # You can define a stopping condition as needed.
# loop_interval=1,
# retry_attempts=3,
# retry_interval=1,
# interactive=False, # Set to 'True' for interactive mode.
# dynamic_temperature=False, # Set to 'True' for dynamic temperature handling.
)
# out = flow.load_state("flow_state.json")
# temp = flow.dynamic_temperature()
# filter = flow.add_response_filter("Trump")
out = flow.run("Generate a 10,000 word blog on health and wellness.")
# out = flow.validate_response(out)
# out = flow.analyze_feedback(out)
# out = flow.print_history_and_memory()
# # out = flow.save_state("flow_state.json")
# print(out)

@ -0,0 +1,31 @@
from swarms.models import OpenAIChat
from swarms.structs import Flow
from swarms.structs.sequential_workflow import SequentialWorkflow
# Example usage
llm = OpenAIChat(
temperature=0.5,
max_tokens=3000,
)
# Initialize the Flow with the language flow
flow1 = Flow(llm=llm, max_loops=1, dashboard=False)
# Create another Flow for a different task
flow2 = Flow(llm=llm, max_loops=1, dashboard=False)
# Create the workflow
workflow = SequentialWorkflow(max_loops=1)
# Add tasks to the workflow
workflow.add("Generate a 10,000 word blog on health and wellness.", flow1)
# Suppose the next task takes the output of the first task as input
workflow.add("Summarize the generated blog", flow2)
# Run the workflow
workflow.run()
# Output the results
for task in workflow.tasks:
print(f"Task: {task.description}, Result: {task.result}")

@ -1,39 +1,16 @@
from swarms.swarms import GodMode
from swarms.models import OpenAIChat

api_key = ""

llm = OpenAIChat(openai_api_key=api_key)

# GodMode runs the same task across several LLM instances in parallel
llms = [llm, llm, llm]

god_mode = GodMode(llms)

task = "Generate a 10,000 word blog on health and wellness."

out = god_mode.run(task)
god_mode.print_responses(task)

@ -1,61 +1,49 @@
from swarms import OpenAI, Flow
from swarms.swarms.groupchat import GroupChatManager, GroupChat

api_key = ""

llm = OpenAI(
    openai_api_key=api_key,
    temperature=0.5,
    max_tokens=3000,
)

# Initialize the flow
flow1 = Flow(
    llm=llm,
    max_loops=1,
    system_message="YOU ARE SILLY, YOU OFFER NOTHING OF VALUE",
    name="silly",
    dashboard=True,
)
flow2 = Flow(
    llm=llm,
    max_loops=1,
    system_message="YOU ARE VERY SMART AND ANSWER RIDDLES",
    name="detective",
    dashboard=True,
)
flow3 = Flow(
    llm=llm,
    max_loops=1,
    system_message="YOU MAKE RIDDLES",
    name="riddler",
    dashboard=True,
)
manager = Flow(
    llm=llm,
    max_loops=1,
    system_message="YOU ARE A GROUP CHAT MANAGER",
    name="manager",
    dashboard=True,
)

# Example usage:
agents = [flow1, flow2, flow3]

group_chat = GroupChat(agents=agents, messages=[], max_round=10)
chat_manager = GroupChatManager(groupchat=group_chat, selector=manager)
chat_history = chat_manager("Write me a riddle")

@ -1,5 +1,5 @@
from swarms import Workflow
from swarms.models import ChatOpenAI
workflow = Workflow(ChatOpenAI)

@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
[tool.poetry]
name = "swarms"
version = "1.9.3"
version = "2.1.6"
description = "Swarms - Pytorch"
license = "MIT"
authors = ["Kye Gomez <kye@apac.ai>"]
@ -28,7 +28,6 @@ openai = "*"
langchain = "*"
asyncio = "*"
nest_asyncio = "*"
einops = "*"
google-generativeai = "*"
torch = "*"
@ -38,21 +37,22 @@ duckduckgo-search = "*"
faiss-cpu = "*"
datasets = "*"
diffusers = "*"
accelerate = "*"
sentencepiece = "*"
wget = "*"
griptape = "*"
httpx = "*"
tiktoken = "*"
attrs = "*"
ggl = "*"
ratelimit = "*"
beautifulsoup4 = "*"
huggingface-hub = "*"
pydantic = "*"
tenacity = "*"
redis = "*"
Pillow = "*"
chromadb = "*"
agent-protocol = "*"
open-interpreter = "*"
tabulate = "*"
termcolor = "*"
black = "*"

@ -28,12 +28,15 @@ google-generativeai
sentencepiece
duckduckgo-search
agent-protocol
accelerate
chromadb
tiktoken
open-interpreter
tabulate
colored
griptape
addict
ratelimit
albumentations
basicsr
termcolor

@ -0,0 +1,35 @@
from swarms.models import OpenAIChat
from swarms.structs import Flow
from swarms.structs.sequential_workflow import SequentialWorkflow
# Example usage
api_key = ""
# Initialize the language flow
llm = OpenAIChat(
openai_api_key=api_key,
temperature=0.5,
max_tokens=3000,
)
# Initialize the Flow with the language flow
flow1 = Flow(llm=llm, max_loops=1, dashboard=False)
# Create another Flow for a different task
flow2 = Flow(llm=llm, max_loops=1, dashboard=False)
# Create the workflow
workflow = SequentialWorkflow(max_loops=1)
# Add tasks to the workflow
workflow.add("Generate a 10,000 word blog on health and wellness.", flow1)
# Suppose the next task takes the output of the first task as input
workflow.add("Summarize the generated blog", flow2)
# Run the workflow
workflow.run()
# Output the results
for task in workflow.tasks:
print(f"Task: {task.description}, Result: {task.result}")

@ -6,12 +6,9 @@ warnings.filterwarnings("ignore", category=UserWarning)
# disable tensorflow warnings
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
from swarms.agents import *
from swarms.swarms import *
from swarms.structs import *
from swarms.models import *
from swarms.chunkers import *
from swarms.workers import *

@ -4,9 +4,9 @@ from swarms.agents.message import Message
# from swarms.agents.stream_response import stream
from swarms.agents.base import AbstractAgent
from swarms.agents.registry import Registry
# from swarms.agents.idea_to_image_agent import Idea2Image
from swarms.agents.simple_agent import SimpleAgent
"""Agent Infrastructure, models, memory, utils, tools"""
@ -16,6 +16,6 @@ __all__ = [
"Message",
"AbstractAgent",
"Registry",
"Idea2Image",
# "Idea2Image",
"SimpleAgent",
]

@ -34,7 +34,6 @@ from langchain_experimental.autonomous_agents.autogpt.prompt_generator import (
)
from langchain_experimental.pydantic_v1 import BaseModel, ValidationError
# PROMPT
FINISH_NAME = "finish"
@ -111,8 +110,7 @@ class AutoGPTPrompt(BaseChatPromptTemplate, BaseModel): # type: ignore[misc]
[self.token_counter(doc) for doc in relevant_memory]
)
content_format = (
f"This reminds you of these events "
f"from your past:\n{relevant_memory}\n\n"
f"This reminds you of these events from your past:\n{relevant_memory}\n\n"
)
memory_message = SystemMessage(content=content_format)
used_tokens += self.token_counter(memory_message.content)
@ -233,14 +231,14 @@ class PromptGenerator:
formatted_response_format = json.dumps(self.response_format, indent=4)
prompt_string = (
f"Constraints:\n{self._generate_numbered_list(self.constraints)}\n\n"
f"Commands:\n"
"Commands:\n"
f"{self._generate_numbered_list(self.commands, item_type='command')}\n\n"
f"Resources:\n{self._generate_numbered_list(self.resources)}\n\n"
f"Performance Evaluation:\n"
"Performance Evaluation:\n"
f"{self._generate_numbered_list(self.performance_evaluation)}\n\n"
f"You should only respond in JSON format as described below "
"You should only respond in JSON format as described below "
f"\nResponse Format: \n{formatted_response_format} "
f"\nEnsure the response can be parsed by Python json.loads"
"\nEnsure the response can be parsed by Python json.loads"
)
return prompt_string
@ -419,13 +417,11 @@ class AutoGPT:
else:
result = (
f"Unknown command '{action.name}'. "
f"Please refer to the 'COMMANDS' list for available "
f"commands and only respond in the specified JSON format."
"Please refer to the 'COMMANDS' list for available "
"commands and only respond in the specified JSON format."
)
memory_to_add = f"Assistant Reply: {assistant_reply} \nResult: {result} "
if self.feedback_tool is not None:
feedback = f"\n{self.feedback_tool.run('Input: ')}"
if feedback in {"q", "stop"}:

@ -75,7 +75,8 @@ class OpenAI:
except openai_model.error.RateLimitError as e:
sleep_duratoin = os.environ.get("OPENAI_RATE_TIMEOUT", 30)
print(
f"{str(e)}, sleep for {sleep_duratoin}s, set it by env OPENAI_RATE_TIMEOUT"
f"{str(e)}, sleep for {sleep_duratoin}s, set it by env"
" OPENAI_RATE_TIMEOUT"
)
time.sleep(sleep_duratoin)

@ -53,10 +53,12 @@ def record(agent_name: str, autotab_ext_path: Optional[str] = None):
file.write(data)
print(
"\033[34mYou have the Python debugger open, you can run commands in it like you would in a normal Python shell.\033[0m"
"\033[34mYou have the Python debugger open, you can run commands in it like you"
" would in a normal Python shell.\033[0m"
)
print(
"\033[34mTo exit, type 'q' and press enter. For a list of commands type '?' and press enter.\033[0m"
"\033[34mTo exit, type 'q' and press enter. For a list of commands type '?' and"
" press enter.\033[0m"
)
breakpoint()
@ -116,7 +118,8 @@ def open_plugin_and_login(driver: AutotabChromeDriver):
raise Exception("Invalid API key")
else:
raise Exception(
f"Error {response.status_code} from backend while logging you in with your API key: {response.text}"
f"Error {response.status_code} from backend while logging you in"
f" with your API key: {response.text}"
)
cookie["name"] = cookie["key"]
del cookie["key"]
@ -144,7 +147,8 @@ def get_driver(
options = webdriver.ChromeOptions()
options.add_argument("--no-sandbox") # Necessary for running
options.add_argument(
"--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
"--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
" (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
)
options.add_argument("--enable-webgl")
options.add_argument("--enable-3d-apis")
@ -371,7 +375,10 @@ def _login_with_google(driver, url: str, google_credentials: SiteCredentials):
)
main_window = driver.current_window_handle
xpath = "//*[contains(text(), 'Continue with Google') or contains(text(), 'Sign in with Google') or contains(@title, 'Sign in with Google')]"
xpath = (
"//*[contains(text(), 'Continue with Google') or contains(text(), 'Sign in with"
" Google') or contains(@title, 'Sign in with Google')]"
)
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, xpath)))
driver.find_element(
@ -477,8 +484,6 @@ def play(agent_name: Optional[str] = None):
if __name__ == "__main__":
play()
"""

@ -0,0 +1,4 @@
"""
Companion agents converse with the user about the agent the user wants to create, then create that agent with the desired attributes, traits, tools, and configurations.
"""

@ -19,7 +19,6 @@ from transformers.utils import is_offline_mode, is_openai_available, logging
# utils
logger = logging.get_logger(__name__)
if is_openai_available():
import openai
@ -28,7 +27,6 @@ else:
_tools_are_initialized = False
BASE_PYTHON_TOOLS = {
"print": print,
"range": range,
@ -48,7 +46,6 @@ class PreTool:
HUGGINGFACE_DEFAULT_TOOLS = {}
HUGGINGFACE_DEFAULT_TOOLS_FROM_HUB = [
"image-transformation",
"text-download",
@ -229,12 +226,14 @@ class Agent:
if len(replacements) > 1:
names = "\n".join([f"- {n}: {t}" for n, t in replacements.items()])
logger.warning(
f"The following tools have been replaced by the ones provided in `additional_tools`:\n{names}."
"The following tools have been replaced by the ones provided in"
f" `additional_tools`:\n{names}."
)
elif len(replacements) == 1:
name = list(replacements.keys())[0]
logger.warning(
f"{name} has been replaced by {replacements[name]} as provided in `additional_tools`."
f"{name} has been replaced by {replacements[name]} as provided in"
" `additional_tools`."
)
self.prepare_for_new_chat()
@ -425,9 +424,9 @@ class HFAgent(Agent):
api_key = os.environ.get("OPENAI_API_KEY", None)
if api_key is None:
raise ValueError(
"You need an openai key to use `OpenAIAgent`. You can get one here: Get one here "
"https://openai.com/api/`. If you have one, set it in your env with `os.environ['OPENAI_API_KEY'] = "
"xxx."
"You need an openai key to use `OpenAIAgent`. You can get one here: Get"
" one here https://openai.com/api/`. If you have one, set it in your"
" env with `os.environ['OPENAI_API_KEY'] = xxx."
)
else:
openai.api_key = api_key
@ -540,8 +539,9 @@ class AzureOpenAI(Agent):
api_key = os.environ.get("AZURE_OPENAI_API_KEY", None)
if api_key is None:
raise ValueError(
"You need an Azure openAI key to use `AzureOpenAIAgent`. If you have one, set it in your env with "
"`os.environ['AZURE_OPENAI_API_KEY'] = xxx."
"You need an Azure openAI key to use `AzureOpenAIAgent`. If you have"
" one, set it in your env with `os.environ['AZURE_OPENAI_API_KEY'] ="
" xxx."
)
else:
openai.api_key = api_key
@ -549,8 +549,9 @@ class AzureOpenAI(Agent):
resource_name = os.environ.get("AZURE_OPENAI_RESOURCE_NAME", None)
if resource_name is None:
raise ValueError(
"You need a resource_name to use `AzureOpenAIAgent`. If you have one, set it in your env with "
"`os.environ['AZURE_OPENAI_RESOURCE_NAME'] = xxx."
"You need a resource_name to use `AzureOpenAIAgent`. If you have one,"
" set it in your env with `os.environ['AZURE_OPENAI_RESOURCE_NAME'] ="
" xxx."
)
else:
openai.api_base = f"https://{resource_name}.openai.azure.com"

@ -1,7 +1,7 @@
import os
import logging
from dataclasses import dataclass
from swarms.models.dalle3 import Dalle
from swarms.models import OpenAIChat

@ -270,10 +270,12 @@ class InstructPix2Pix:
@prompts(
name="Instruct Image Using Text",
description="useful when you want to the style of the image to be like the text. "
"like: make it look like a painting. or make it like a robot. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the text. ",
description=(
"useful when you want to the style of the image to be like the text. "
"like: make it look like a painting. or make it like a robot. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the text. "
),
)
def inference(self, inputs):
"""Change style of image."""
@ -286,8 +288,8 @@ class InstructPix2Pix:
updated_image_path = get_new_image_name(image_path, func_name="pix2pix")
image.save(updated_image_path)
print(
f"\nProcessed InstructPix2Pix, Input Image: {image_path}, Instruct Text: {text}, "
f"Output Image: {updated_image_path}"
f"\nProcessed InstructPix2Pix, Input Image: {image_path}, Instruct Text:"
f" {text}, Output Image: {updated_image_path}"
)
return updated_image_path
@ -309,9 +311,12 @@ class Text2Image:
@prompts(
name="Generate Image From User Input Text",
description="useful when you want to generate an image from a user input text and save it to a file. "
"like: generate an image of an object or something, or generate an image that includes some objects. "
"The input to this tool should be a string, representing the text used to generate image. ",
description=(
"useful when you want to generate an image from a user input text and save"
" it to a file. like: generate an image of an object or something, or"
" generate an image that includes some objects. The input to this tool"
" should be a string, representing the text used to generate image. "
),
)
def inference(self, text):
image_filename = os.path.join("image", f"{str(uuid.uuid4())[:8]}.png")
@ -319,7 +324,8 @@ class Text2Image:
image = self.pipe(prompt, negative_prompt=self.n_prompt).images[0]
image.save(image_filename)
print(
f"\nProcessed Text2Image, Input Text: {text}, Output Image: {image_filename}"
f"\nProcessed Text2Image, Input Text: {text}, Output Image:"
f" {image_filename}"
)
return image_filename
@ -338,8 +344,11 @@ class ImageCaptioning:
@prompts(
name="Get Photo Description",
description="useful when you want to know what is inside the photo. receives image_path as input. "
"The input to this tool should be a string, representing the image_path. ",
description=(
"useful when you want to know what is inside the photo. receives image_path"
" as input. The input to this tool should be a string, representing the"
" image_path. "
),
)
def inference(self, image_path):
inputs = self.processor(Image.open(image_path), return_tensors="pt").to(
@ -348,7 +357,8 @@ class ImageCaptioning:
out = self.model.generate(**inputs)
captions = self.processor.decode(out[0], skip_special_tokens=True)
print(
f"\nProcessed ImageCaptioning, Input Image: {image_path}, Output Text: {captions}"
f"\nProcessed ImageCaptioning, Input Image: {image_path}, Output Text:"
f" {captions}"
)
return captions
@ -361,10 +371,12 @@ class Image2Canny:
@prompts(
name="Edge Detection On Image",
description="useful when you want to detect the edge of the image. "
"like: detect the edges of this image, or canny detection on image, "
"or perform edge detection on this image, or detect the canny image of this image. "
"The input to this tool should be a string, representing the image_path",
description=(
"useful when you want to detect the edge of the image. like: detect the"
" edges of this image, or canny detection on image, or perform edge"
" detection on this image, or detect the canny image of this image. The"
" input to this tool should be a string, representing the image_path"
),
)
def inference(self, inputs):
image = Image.open(inputs)
@ -376,7 +388,8 @@ class Image2Canny:
updated_image_path = get_new_image_name(inputs, func_name="edge")
canny.save(updated_image_path)
print(
f"\nProcessed Image2Canny, Input Image: {inputs}, Output Text: {updated_image_path}"
f"\nProcessed Image2Canny, Input Image: {inputs}, Output Text:"
f" {updated_image_path}"
)
return updated_image_path
@ -410,11 +423,14 @@ class CannyText2Image:
@prompts(
name="Generate Image Condition On Canny Image",
description="useful when you want to generate a new real image from both the user description and a canny image."
" like: generate a real image of a object or something from this canny image,"
" or generate a new real image of a object or something from this edge image. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the user description. ",
description=(
"useful when you want to generate a new real image from both the user"
" description and a canny image. like: generate a real image of a object or"
" something from this canny image, or generate a new real image of a object"
" or something from this edge image. The input to this tool should be a"
" comma separated string of two, representing the image_path and the user"
" description. "
),
)
def inference(self, inputs):
image_path, instruct_text = inputs.split(",")[0], ",".join(
@ -435,8 +451,8 @@ class CannyText2Image:
updated_image_path = get_new_image_name(image_path, func_name="canny2image")
image.save(updated_image_path)
print(
f"\nProcessed CannyText2Image, Input Canny: {image_path}, Input Text: {instruct_text}, "
f"Output Text: {updated_image_path}"
f"\nProcessed CannyText2Image, Input Canny: {image_path}, Input Text:"
f" {instruct_text}, Output Text: {updated_image_path}"
)
return updated_image_path
@ -448,10 +464,13 @@ class Image2Line:
@prompts(
name="Line Detection On Image",
description="useful when you want to detect the straight line of the image. "
"like: detect the straight lines of this image, or straight line detection on image, "
"or perform straight line detection on this image, or detect the straight line image of this image. "
"The input to this tool should be a string, representing the image_path",
description=(
"useful when you want to detect the straight line of the image. like:"
" detect the straight lines of this image, or straight line detection on"
" image, or perform straight line detection on this image, or detect the"
" straight line image of this image. The input to this tool should be a"
" string, representing the image_path"
),
)
def inference(self, inputs):
image = Image.open(inputs)
@ -459,7 +478,8 @@ class Image2Line:
updated_image_path = get_new_image_name(inputs, func_name="line-of")
mlsd.save(updated_image_path)
print(
f"\nProcessed Image2Line, Input Image: {inputs}, Output Line: {updated_image_path}"
f"\nProcessed Image2Line, Input Image: {inputs}, Output Line:"
f" {updated_image_path}"
)
return updated_image_path
@ -492,12 +512,14 @@ class LineText2Image:
@prompts(
name="Generate Image Condition On Line Image",
description="useful when you want to generate a new real image from both the user description "
"and a straight line image. "
"like: generate a real image of a object or something from this straight line image, "
"or generate a new real image of a object or something from this straight lines. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the user description. ",
description=(
"useful when you want to generate a new real image from both the user"
" description and a straight line image. like: generate a real image of a"
" object or something from this straight line image, or generate a new real"
" image of a object or something from this straight lines. The input to"
" this tool should be a comma separated string of two, representing the"
" image_path and the user description. "
),
)
def inference(self, inputs):
image_path, instruct_text = inputs.split(",")[0], ",".join(
@ -518,8 +540,8 @@ class LineText2Image:
updated_image_path = get_new_image_name(image_path, func_name="line2image")
image.save(updated_image_path)
print(
f"\nProcessed LineText2Image, Input Line: {image_path}, Input Text: {instruct_text}, "
f"Output Text: {updated_image_path}"
f"\nProcessed LineText2Image, Input Line: {image_path}, Input Text:"
f" {instruct_text}, Output Text: {updated_image_path}"
)
return updated_image_path
@ -531,10 +553,13 @@ class Image2Hed:
@prompts(
name="Hed Detection On Image",
description="useful when you want to detect the soft hed boundary of the image. "
"like: detect the soft hed boundary of this image, or hed boundary detection on image, "
"or perform hed boundary detection on this image, or detect soft hed boundary image of this image. "
"The input to this tool should be a string, representing the image_path",
description=(
"useful when you want to detect the soft hed boundary of the image. like:"
" detect the soft hed boundary of this image, or hed boundary detection on"
" image, or perform hed boundary detection on this image, or detect soft"
" hed boundary image of this image. The input to this tool should be a"
" string, representing the image_path"
),
)
def inference(self, inputs):
image = Image.open(inputs)
@ -542,7 +567,8 @@ class Image2Hed:
updated_image_path = get_new_image_name(inputs, func_name="hed-boundary")
hed.save(updated_image_path)
print(
f"\nProcessed Image2Hed, Input Image: {inputs}, Output Hed: {updated_image_path}"
f"\nProcessed Image2Hed, Input Image: {inputs}, Output Hed:"
f" {updated_image_path}"
)
return updated_image_path
@ -575,12 +601,14 @@ class HedText2Image:
@prompts(
name="Generate Image Condition On Soft Hed Boundary Image",
description="useful when you want to generate a new real image from both the user description "
"and a soft hed boundary image. "
"like: generate a real image of a object or something from this soft hed boundary image, "
"or generate a new real image of a object or something from this hed boundary. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the user description",
description=(
"useful when you want to generate a new real image from both the user"
" description and a soft hed boundary image. like: generate a real image of"
" a object or something from this soft hed boundary image, or generate a"
" new real image of a object or something from this hed boundary. The input"
" to this tool should be a comma separated string of two, representing the"
" image_path and the user description"
),
)
def inference(self, inputs):
image_path, instruct_text = inputs.split(",")[0], ",".join(
@ -601,8 +629,8 @@ class HedText2Image:
updated_image_path = get_new_image_name(image_path, func_name="hed2image")
image.save(updated_image_path)
print(
f"\nProcessed HedText2Image, Input Hed: {image_path}, Input Text: {instruct_text}, "
f"Output Image: {updated_image_path}"
f"\nProcessed HedText2Image, Input Hed: {image_path}, Input Text:"
f" {instruct_text}, Output Image: {updated_image_path}"
)
return updated_image_path
@ -614,10 +642,12 @@ class Image2Scribble:
@prompts(
name="Sketch Detection On Image",
description="useful when you want to generate a scribble of the image. "
"like: generate a scribble of this image, or generate a sketch from this image, "
"detect the sketch from this image. "
"The input to this tool should be a string, representing the image_path",
description=(
"useful when you want to generate a scribble of the image. like: generate a"
" scribble of this image, or generate a sketch from this image, detect the"
" sketch from this image. The input to this tool should be a string,"
" representing the image_path"
),
)
def inference(self, inputs):
image = Image.open(inputs)
@ -625,7 +655,8 @@ class Image2Scribble:
updated_image_path = get_new_image_name(inputs, func_name="scribble")
scribble.save(updated_image_path)
print(
f"\nProcessed Image2Scribble, Input Image: {inputs}, Output Scribble: {updated_image_path}"
f"\nProcessed Image2Scribble, Input Image: {inputs}, Output Scribble:"
f" {updated_image_path}"
)
return updated_image_path
@ -659,10 +690,12 @@ class ScribbleText2Image:
@prompts(
name="Generate Image Condition On Sketch Image",
description="useful when you want to generate a new real image from both the user description and "
"a scribble image or a sketch image. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the user description",
description=(
"useful when you want to generate a new real image from both the user"
" description and a scribble image or a sketch image. The input to this"
" tool should be a comma separated string of two, representing the"
" image_path and the user description"
),
)
def inference(self, inputs):
image_path, instruct_text = inputs.split(",")[0], ",".join(
@ -683,8 +716,8 @@ class ScribbleText2Image:
updated_image_path = get_new_image_name(image_path, func_name="scribble2image")
image.save(updated_image_path)
print(
f"\nProcessed ScribbleText2Image, Input Scribble: {image_path}, Input Text: {instruct_text}, "
f"Output Image: {updated_image_path}"
f"\nProcessed ScribbleText2Image, Input Scribble: {image_path}, Input Text:"
f" {instruct_text}, Output Image: {updated_image_path}"
)
return updated_image_path
@ -696,9 +729,11 @@ class Image2Pose:
@prompts(
name="Pose Detection On Image",
description="useful when you want to detect the human pose of the image. "
"like: generate human poses of this image, or generate a pose image from this image. "
"The input to this tool should be a string, representing the image_path",
description=(
"useful when you want to detect the human pose of the image. like: generate"
" human poses of this image, or generate a pose image from this image. The"
" input to this tool should be a string, representing the image_path"
),
)
def inference(self, inputs):
image = Image.open(inputs)
@ -706,7 +741,8 @@ class Image2Pose:
updated_image_path = get_new_image_name(inputs, func_name="human-pose")
pose.save(updated_image_path)
print(
f"\nProcessed Image2Pose, Input Image: {inputs}, Output Pose: {updated_image_path}"
f"\nProcessed Image2Pose, Input Image: {inputs}, Output Pose:"
f" {updated_image_path}"
)
return updated_image_path
@ -742,12 +778,13 @@ class PoseText2Image:
@prompts(
name="Generate Image Condition On Pose Image",
description="useful when you want to generate a new real image from both the user description "
"and a human pose image. "
"like: generate a real image of a human from this human pose image, "
"or generate a new real image of a human from this pose. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the user description",
description=(
"useful when you want to generate a new real image from both the user"
" description and a human pose image. like: generate a real image of a"
" human from this human pose image, or generate a new real image of a human"
" from this pose. The input to this tool should be a comma separated string"
" of two, representing the image_path and the user description"
),
)
def inference(self, inputs):
image_path, instruct_text = inputs.split(",")[0], ",".join(
@ -768,8 +805,8 @@ class PoseText2Image:
updated_image_path = get_new_image_name(image_path, func_name="pose2image")
image.save(updated_image_path)
print(
f"\nProcessed PoseText2Image, Input Pose: {image_path}, Input Text: {instruct_text}, "
f"Output Image: {updated_image_path}"
f"\nProcessed PoseText2Image, Input Pose: {image_path}, Input Text:"
f" {instruct_text}, Output Image: {updated_image_path}"
)
return updated_image_path
@ -802,11 +839,14 @@ class SegText2Image:
@prompts(
name="Generate Image Condition On Segmentations",
description="useful when you want to generate a new real image from both the user description and segmentations. "
"like: generate a real image of a object or something from this segmentation image, "
"or generate a new real image of a object or something from these segmentations. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the user description",
description=(
"useful when you want to generate a new real image from both the user"
" description and segmentations. like: generate a real image of a object or"
" something from this segmentation image, or generate a new real image of a"
" object or something from these segmentations. The input to this tool"
" should be a comma separated string of two, representing the image_path"
" and the user description"
),
)
def inference(self, inputs):
image_path, instruct_text = inputs.split(",")[0], ",".join(
@ -827,8 +867,8 @@ class SegText2Image:
updated_image_path = get_new_image_name(image_path, func_name="segment2image")
image.save(updated_image_path)
print(
f"\nProcessed SegText2Image, Input Seg: {image_path}, Input Text: {instruct_text}, "
f"Output Image: {updated_image_path}"
f"\nProcessed SegText2Image, Input Seg: {image_path}, Input Text:"
f" {instruct_text}, Output Image: {updated_image_path}"
)
return updated_image_path
@ -840,9 +880,12 @@ class Image2Depth:
@prompts(
name="Predict Depth On Image",
description="useful when you want to detect depth of the image. like: generate the depth from this image, "
"or detect the depth map on this image, or predict the depth for this image. "
"The input to this tool should be a string, representing the image_path",
description=(
"useful when you want to detect depth of the image. like: generate the"
" depth from this image, or detect the depth map on this image, or predict"
" the depth for this image. The input to this tool should be a string,"
" representing the image_path"
),
)
def inference(self, inputs):
image = Image.open(inputs)
@ -854,7 +897,8 @@ class Image2Depth:
updated_image_path = get_new_image_name(inputs, func_name="depth")
depth.save(updated_image_path)
print(
f"\nProcessed Image2Depth, Input Image: {inputs}, Output Depth: {updated_image_path}"
f"\nProcessed Image2Depth, Input Image: {inputs}, Output Depth:"
f" {updated_image_path}"
)
return updated_image_path
@ -888,11 +932,14 @@ class DepthText2Image:
@prompts(
name="Generate Image Condition On Depth",
description="useful when you want to generate a new real image from both the user description and depth image. "
"like: generate a real image of a object or something from this depth image, "
"or generate a new real image of a object or something from the depth map. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the user description",
description=(
"useful when you want to generate a new real image from both the user"
" description and depth image. like: generate a real image of a object or"
" something from this depth image, or generate a new real image of a object"
" or something from the depth map. The input to this tool should be a comma"
" separated string of two, representing the image_path and the user"
" description"
),
)
def inference(self, inputs):
image_path, instruct_text = inputs.split(",")[0], ",".join(
@ -913,8 +960,8 @@ class DepthText2Image:
updated_image_path = get_new_image_name(image_path, func_name="depth2image")
image.save(updated_image_path)
print(
f"\nProcessed DepthText2Image, Input Depth: {image_path}, Input Text: {instruct_text}, "
f"Output Image: {updated_image_path}"
f"\nProcessed DepthText2Image, Input Depth: {image_path}, Input Text:"
f" {instruct_text}, Output Image: {updated_image_path}"
)
return updated_image_path
@ -929,9 +976,11 @@ class Image2Normal:
@prompts(
name="Predict Normal Map On Image",
description="useful when you want to detect norm map of the image. "
"like: generate normal map from this image, or predict normal map of this image. "
"The input to this tool should be a string, representing the image_path",
description=(
"useful when you want to detect norm map of the image. like: generate"
" normal map from this image, or predict normal map of this image. The"
" input to this tool should be a string, representing the image_path"
),
)
def inference(self, inputs):
image = Image.open(inputs)
@ -954,7 +1003,8 @@ class Image2Normal:
updated_image_path = get_new_image_name(inputs, func_name="normal-map")
image.save(updated_image_path)
print(
f"\nProcessed Image2Normal, Input Image: {inputs}, Output Depth: {updated_image_path}"
f"\nProcessed Image2Normal, Input Image: {inputs}, Output Depth:"
f" {updated_image_path}"
)
return updated_image_path
@ -988,11 +1038,14 @@ class NormalText2Image:
@prompts(
name="Generate Image Condition On Normal Map",
description="useful when you want to generate a new real image from both the user description and normal map. "
"like: generate a real image of a object or something from this normal map, "
"or generate a new real image of a object or something from the normal map. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the user description",
description=(
"useful when you want to generate a new real image from both the user"
" description and normal map. like: generate a real image of a object or"
" something from this normal map, or generate a new real image of a object"
" or something from the normal map. The input to this tool should be a"
" comma separated string of two, representing the image_path and the user"
" description"
),
)
def inference(self, inputs):
image_path, instruct_text = inputs.split(",")[0], ",".join(
@ -1013,8 +1066,8 @@ class NormalText2Image:
updated_image_path = get_new_image_name(image_path, func_name="normal2image")
image.save(updated_image_path)
print(
f"\nProcessed NormalText2Image, Input Normal: {image_path}, Input Text: {instruct_text}, "
f"Output Image: {updated_image_path}"
f"\nProcessed NormalText2Image, Input Normal: {image_path}, Input Text:"
f" {instruct_text}, Output Image: {updated_image_path}"
)
return updated_image_path
@ -1031,9 +1084,12 @@ class VisualQuestionAnswering:
@prompts(
name="Answer Question About The Image",
description="useful when you need an answer for a question based on an image. "
"like: what is the background color of the last image, how many cats in this figure, what is in this figure. "
"The input to this tool should be a comma separated string of two, representing the image_path and the question",
description=(
"useful when you need an answer for a question based on an image. like:"
" what is the background color of the last image, how many cats in this"
" figure, what is in this figure. The input to this tool should be a comma"
" separated string of two, representing the image_path and the question"
),
)
def inference(self, inputs):
image_path, question = inputs.split(",")[0], ",".join(inputs.split(",")[1:])
@ -1044,8 +1100,8 @@ class VisualQuestionAnswering:
out = self.model.generate(**inputs)
answer = self.processor.decode(out[0], skip_special_tokens=True)
print(
f"\nProcessed VisualQuestionAnswering, Input Image: {image_path}, Input Question: {question}, "
f"Output Answer: {answer}"
f"\nProcessed VisualQuestionAnswering, Input Image: {image_path}, Input"
f" Question: {question}, Output Answer: {answer}"
)
return answer
@ -1245,12 +1301,13 @@ class Segmenting:
@prompts(
name="Segment the Image",
description="useful when you want to segment all the part of the image, but not segment a certain object."
"like: segment all the object in this image, or generate segmentations on this image, "
"or segment the image,"
"or perform segmentation on this image, "
"or segment all the object in this image."
"The input to this tool should be a string, representing the image_path",
description=(
"useful when you want to segment all the part of the image, but not segment"
" a certain object.like: segment all the object in this image, or generate"
" segmentations on this image, or segment the image,or perform segmentation"
" on this image, or segment all the object in this image.The input to this"
" tool should be a string, representing the image_path"
),
)
def inference_all(self, image_path):
image = cv2.imread(image_path)
@ -1401,9 +1458,12 @@ class Text2Box:
@prompts(
name="Detect the Give Object",
description="useful when you only want to detect or find out given objects in the picture"
"The input to this tool should be a comma separated string of two, "
"representing the image_path, the text description of the object to be found",
description=(
"useful when you only want to detect or find out given objects in the"
" pictureThe input to this tool should be a comma separated string of two,"
" representing the image_path, the text description of the object to be"
" found"
),
)
def inference(self, inputs):
image_path, det_prompt = inputs.split(",")
@ -1427,8 +1487,8 @@ class Text2Box:
updated_image = image_with_box.resize(size)
updated_image.save(updated_image_path)
print(
f"\nProcessed ObejectDetecting, Input Image: {image_path}, Object to be Detect {det_prompt}, "
f"Output Image: {updated_image_path}"
f"\nProcessed ObejectDetecting, Input Image: {image_path}, Object to be"
f" Detect {det_prompt}, Output Image: {updated_image_path}"
)
return updated_image_path
@ -1483,7 +1543,8 @@ class InfinityOutPainting:
out = self.ImageVQA.model.generate(**inputs)
answer = self.ImageVQA.processor.decode(out[0], skip_special_tokens=True)
print(
f"\nProcessed VisualQuestionAnswering, Input Question: {question}, Output Answer: {answer}"
f"\nProcessed VisualQuestionAnswering, Input Question: {question}, Output"
f" Answer: {answer}"
)
return answer
@ -1499,9 +1560,9 @@ class InfinityOutPainting:
def check_prompt(self, prompt):
check = (
f"Here is a paragraph with adjectives. "
"Here is a paragraph with adjectives. "
f"{prompt} "
f"Please change all plural forms in the adjectives to singular forms. "
"Please change all plural forms in the adjectives to singular forms. "
)
return self.llm(check)
@ -1512,13 +1573,12 @@ class InfinityOutPainting:
)
style = self.get_BLIP_vqa(image, "what is the style of this image")
imagine_prompt = (
f"let's pretend you are an excellent painter and now "
f"there is an incomplete painting with {BLIP_caption} in the center, "
f"please imagine the complete painting and describe it"
f"you should consider the background color is {background_color}, the style is {style}"
f"You should make the painting as vivid and realistic as possible"
f"You can not use words like painting or picture"
f"and you should use no more than 50 words to describe it"
"let's pretend you are an excellent painter and now there is an incomplete"
f" painting with {BLIP_caption} in the center, please imagine the complete"
" painting and describe ityou should consider the background color is"
f" {background_color}, the style is {style}You should make the painting as"
" vivid and realistic as possibleYou can not use words like painting or"
" pictureand you should use no more than 50 words to describe it"
)
caption = self.llm(imagine_prompt) if imagine else BLIP_caption
caption = self.check_prompt(caption)
@ -1580,9 +1640,12 @@ class InfinityOutPainting:
@prompts(
name="Extend An Image",
description="useful when you need to extend an image into a larger image."
"like: extend the image into a resolution of 2048x1024, extend the image into 2048x1024. "
"The input to this tool should be a comma separated string of two, representing the image_path and the resolution of widthxheight",
description=(
"useful when you need to extend an image into a larger image.like: extend"
" the image into a resolution of 2048x1024, extend the image into"
" 2048x1024. The input to this tool should be a comma separated string of"
" two, representing the image_path and the resolution of widthxheight"
),
)
def inference(self, inputs):
image_path, resolution = inputs.split(",")
@ -1594,8 +1657,8 @@ class InfinityOutPainting:
updated_image_path = get_new_image_name(image_path, func_name="outpainting")
out_painted_image.save(updated_image_path)
print(
f"\nProcessed InfinityOutPainting, Input Image: {image_path}, Input Resolution: {resolution}, "
f"Output Image: {updated_image_path}"
f"\nProcessed InfinityOutPainting, Input Image: {image_path}, Input"
f" Resolution: {resolution}, Output Image: {updated_image_path}"
)
return updated_image_path
@ -1610,12 +1673,13 @@ class ObjectSegmenting:
@prompts(
name="Segment the given object",
description="useful when you only want to segment the certain objects in the picture"
"according to the given text"
"like: segment the cat,"
"or can you segment an obeject for me"
"The input to this tool should be a comma separated string of two, "
"representing the image_path, the text description of the object to be found",
description=(
"useful when you only want to segment the certain objects in the"
" pictureaccording to the given textlike: segment the cat,or can you"
" segment an obeject for meThe input to this tool should be a comma"
" separated string of two, representing the image_path, the text"
" description of the object to be found"
),
)
def inference(self, inputs):
image_path, det_prompt = inputs.split(",")
@ -1627,8 +1691,8 @@ class ObjectSegmenting:
image_pil, image_path, boxes_filt, pred_phrases
)
print(
f"\nProcessed ObejectSegmenting, Input Image: {image_path}, Object to be Segment {det_prompt}, "
f"Output Image: {updated_image_path}"
f"\nProcessed ObejectSegmenting, Input Image: {image_path}, Object to be"
f" Segment {det_prompt}, Output Image: {updated_image_path}"
)
return updated_image_path
@ -1710,10 +1774,12 @@ class ImageEditing:
@prompts(
name="Remove Something From The Photo",
description="useful when you want to remove and object or something from the photo "
"from its description or location. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the object need to be removed. ",
description=(
"useful when you want to remove and object or something from the photo "
"from its description or location. "
"The input to this tool should be a comma separated string of two, "
"representing the image_path and the object need to be removed. "
),
)
def inference_remove(self, inputs):
image_path, to_be_removed_txt = inputs.split(",")[0], ",".join(
@ -1725,10 +1791,12 @@ class ImageEditing:
@prompts(
name="Replace Something From The Photo",
description="useful when you want to replace an object from the object description or "
"location with another object from its description. "
"The input to this tool should be a comma separated string of three, "
"representing the image_path, the object to be replaced, the object to be replaced with ",
description=(
"useful when you want to replace an object, identified by its description"
" or location, with another object from its description. The input to this"
" tool should be a comma separated string of three, representing the"
" image_path, the object to be replaced, and the object to replace it with "
),
)
def inference_replace_sam(self, inputs):
image_path, to_be_replaced_txt, replace_with_txt = inputs.split(",")
@ -1758,8 +1826,9 @@ class ImageEditing:
updated_image = updated_image.resize(image_pil.size)
updated_image.save(updated_image_path)
print(
f"\nProcessed ImageEditing, Input Image: {image_path}, Replace {to_be_replaced_txt} to {replace_with_txt}, "
f"Output Image: {updated_image_path}"
f"\nProcessed ImageEditing, Input Image: {image_path}, Replace"
f" {to_be_replaced_txt} to {replace_with_txt}, Output Image:"
f" {updated_image_path}"
)
return updated_image_path
@ -1782,8 +1851,10 @@ class BackgroundRemoving:
@prompts(
name="Remove the background",
description="useful when you want to extract the object or remove the background,"
"the input should be a string image_path",
description=(
"useful when you want to extract the object or remove the background,"
"the input should be a string image_path"
),
)
def inference(self, image_path):
"""
@ -1833,7 +1904,8 @@ class MultiModalVisualAgent:
if "ImageCaptioning" not in load_dict:
raise ValueError(
"You have to load ImageCaptioning as a basic function for MultiModalVisualAgent"
"You have to load ImageCaptioning as a basic function for"
" MultiModalVisualAgent"
)
self.models = {}
@ -1944,10 +2016,21 @@ class MultiModalVisualAgent:
description = self.models["ImageCaptioning"].inference(image_filename)
if lang == "Chinese":
Human_prompt = f'\nHuman: 提供一张名为 {image_filename}的图片。它的描述是: {description}。 这些信息帮助你理解这个图像,但是你应该使用工具来完成下面的任务,而不是直接从我的描述中想象。 如果你明白了, 说 "收到". \n'
Human_prompt = (
f"\nHuman: 提供一张名为 {image_filename}的图片。它的描述是:"
f" {description}。 这些信息帮助你理解这个图像,"
"但是你应该使用工具来完成下面的任务,而不是直接从我的描述中想象。"
' 如果你明白了, 说 "收到". \n'
)
AI_prompt = "收到。 "
else:
Human_prompt = f'\nHuman: provide a figure named {image_filename}. The description is: {description}. This information helps you to understand this image, but you should use tools to finish following tasks, rather than directly imagine from my description. If you understand, say "Received". \n'
Human_prompt = (
f"\nHuman: provide a figure named {image_filename}. The description is:"
f" {description}. This information helps you to understand this image,"
" but you should use tools to finish following tasks, rather than"
" directly imagine from my description. If you understand, say"
' "Received". \n'
)
AI_prompt = "Received. "
self.agent.memory.buffer = (

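Every tool reflowed above is registered through the `@prompts(name=..., description=...)` decorator, so these description strings are exactly what the agent matches on at tool-selection time. The decorator itself is not part of this diff; the following is a minimal sketch of the pattern it implies, with illustrative names only.

# Hypothetical sketch of the @prompts pattern used above; the real
# decorator lives elsewhere in the repo and may differ.
def prompts(name: str, description: str):
    def decorator(func):
        func.name = name
        func.description = description
        return func
    return decorator

@prompts(
    name="Example Tool",
    description="useful when you need to demonstrate the registration pattern",
)
def inference(inputs: str) -> str:
    return inputs

print(inference.name, "-", inference.description)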
@ -16,7 +16,6 @@ from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from pydantic import BaseModel, Field
from swarms.prompts.sales import SALES_AGENT_TOOLS_PROMPT, conversation_stages
from swarms.tools.interpreter_tool import compile
# classes
@ -164,14 +163,10 @@ def get_tools(product_catalog):
Tool(
name="ProductSearch",
func=knowledge_base.run,
description="useful for when you need to answer questions about product information",
description=(
"useful for when you need to answer questions about product information"
),
),
# Interpreter
Tool(
name="Code Interepeter",
func=compile,
description="Useful when you need to run code locally, such as Python, Javascript, Shell, and more.",
)
# omnimodal agent
]
@ -231,7 +226,10 @@ class SalesConvoOutputParser(AgentOutputParser):
# TODO - this is not entirely reliable, sometimes results in an error.
return AgentFinish(
{
"output": "I apologize, I was unable to find the answer to your question. Is there anything else I can help with?"
"output": (
"I apologize, I was unable to find the answer to your question."
" Is there anything else I can help with?"
)
},
text,
)
@ -257,21 +255,62 @@ class ProfitPilot(Chain, BaseModel):
use_tools: bool = False
conversation_stage_dict: Dict = {
"1": "Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional. Your greeting should be welcoming. Always clarify in your greeting the reason why you are contacting the prospect.",
"2": "Qualification: Qualify the prospect by confirming if they are the right person to talk to regarding your product/service. Ensure that they have the authority to make purchasing decisions.",
"3": "Value proposition: Briefly explain how your product/service can benefit the prospect. Focus on the unique selling points and value proposition of your product/service that sets it apart from competitors.",
"4": "Needs analysis: Ask open-ended questions to uncover the prospect's needs and pain points. Listen carefully to their responses and take notes.",
"5": "Solution presentation: Based on the prospect's needs, present your product/service as the solution that can address their pain points.",
"6": "Objection handling: Address any objections that the prospect may have regarding your product/service. Be prepared to provide evidence or testimonials to support your claims.",
"7": "Close: Ask for the sale by proposing a next step. This could be a demo, a trial or a meeting with decision-makers. Ensure to summarize what has been discussed and reiterate the benefits.",
"1": (
"Introduction: Start the conversation by introducing yourself and your"
" company. Be polite and respectful while keeping the tone of the"
" conversation professional. Your greeting should be welcoming. Always"
" clarify in your greeting the reason why you are contacting the prospect."
),
"2": (
"Qualification: Qualify the prospect by confirming if they are the right"
" person to talk to regarding your product/service. Ensure that they have"
" the authority to make purchasing decisions."
),
"3": (
"Value proposition: Briefly explain how your product/service can benefit"
" the prospect. Focus on the unique selling points and value proposition of"
" your product/service that sets it apart from competitors."
),
"4": (
"Needs analysis: Ask open-ended questions to uncover the prospect's needs"
" and pain points. Listen carefully to their responses and take notes."
),
"5": (
"Solution presentation: Based on the prospect's needs, present your"
" product/service as the solution that can address their pain points."
),
"6": (
"Objection handling: Address any objections that the prospect may have"
" regarding your product/service. Be prepared to provide evidence or"
" testimonials to support your claims."
),
"7": (
"Close: Ask for the sale by proposing a next step. This could be a demo, a"
" trial or a meeting with decision-makers. Ensure to summarize what has"
" been discussed and reiterate the benefits."
),
}
salesperson_name: str = "Ted Lasso"
salesperson_role: str = "Business Development Representative"
company_name: str = "Sleep Haven"
company_business: str = "Sleep Haven is a premium mattress company that provides customers with the most comfortable and supportive sleeping experience possible. We offer a range of high-quality mattresses, pillows, and bedding accessories that are designed to meet the unique needs of our customers."
company_values: str = "Our mission at Sleep Haven is to help people achieve a better night's sleep by providing them with the best possible sleep solutions. We believe that quality sleep is essential to overall health and well-being, and we are committed to helping our customers achieve optimal sleep by offering exceptional products and customer service."
conversation_purpose: str = "find out whether they are looking to achieve better sleep via buying a premier mattress."
company_business: str = (
"Sleep Haven is a premium mattress company that provides customers with the"
" most comfortable and supportive sleeping experience possible. We offer a"
" range of high-quality mattresses, pillows, and bedding accessories that are"
" designed to meet the unique needs of our customers."
)
company_values: str = (
"Our mission at Sleep Haven is to help people achieve a better night's sleep by"
" providing them with the best possible sleep solutions. We believe that"
" quality sleep is essential to overall health and well-being, and we are"
" committed to helping our customers achieve optimal sleep by offering"
" exceptional products and customer service."
)
conversation_purpose: str = (
"find out whether they are looking to achieve better sleep via buying a premier"
" mattress."
)
conversation_type: str = "call"
def retrieve_conversation_stage(self, key):
@ -419,14 +458,32 @@ config = dict(
salesperson_name="Ted Lasso",
salesperson_role="Business Development Representative",
company_name="Sleep Haven",
company_business="Sleep Haven is a premium mattress company that provides customers with the most comfortable and supportive sleeping experience possible. We offer a range of high-quality mattresses, pillows, and bedding accessories that are designed to meet the unique needs of our customers.",
company_values="Our mission at Sleep Haven is to help people achieve a better night's sleep by providing them with the best possible sleep solutions. We believe that quality sleep is essential to overall health and well-being, and we are committed to helping our customers achieve optimal sleep by offering exceptional products and customer service.",
conversation_purpose="find out whether they are looking to achieve better sleep via buying a premier mattress.",
company_business=(
"Sleep Haven is a premium mattress company that provides customers with the"
" most comfortable and supportive sleeping experience possible. We offer a"
" range of high-quality mattresses, pillows, and bedding accessories that are"
" designed to meet the unique needs of our customers."
),
company_values=(
"Our mission at Sleep Haven is to help people achieve a better night's sleep by"
" providing them with the best possible sleep solutions. We believe that"
" quality sleep is essential to overall health and well-being, and we are"
" committed to helping our customers achieve optimal sleep by offering"
" exceptional products and customer service."
),
conversation_purpose=(
"find out whether they are looking to achieve better sleep via buying a premier"
" mattress."
),
conversation_history=[],
conversation_type="call",
conversation_stage=conversation_stages.get(
"1",
"Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.",
(
"Introduction: Start the conversation by introducing yourself and your"
" company. Be polite and respectful while keeping the tone of the"
" conversation professional."
),
),
use_tools=True,
product_catalog="sample_product_catalog.txt",

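The stage strings reflowed above are resolved with `conversation_stages.get("1", ...)`, so an unknown stage key silently falls back to the introduction text rather than raising. A minimal sketch of that lookup; the two-stage dict below is illustrative, not the full seven-stage table.

# Minimal sketch of the stage lookup used by the config above.
conversation_stages = {
    "1": "Introduction: Start the conversation by introducing yourself and your company.",
    "2": "Qualification: Confirm the prospect can make purchasing decisions.",
}

def retrieve_conversation_stage(key: str) -> str:
    # Unknown keys fall back to stage "1" instead of raising KeyError.
    return conversation_stages.get(key, conversation_stages["1"])

print(retrieve_conversation_stage("2"))   # qualification text
print(retrieve_conversation_stage("99"))  # falls back to the introduction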
@ -19,7 +19,8 @@ class Registry(BaseModel):
def build(self, type: str, **kwargs):
if type not in self.entries:
raise ValueError(
f'{type} is not registered. Please register with the .register("{type}") method provided in {self.name} registry'
f"{type} is not registered. Please register with the"
f' .register("{type}") method provided in {self.name} registry'
)
return self.entries[type](**kwargs)
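The reworded error message tells callers to use `.register("{type}")` before `.build(type, ...)`. A self-contained sketch of that registry pattern, assuming `entries` maps type names to classes; the decorator form below is an assumption based on the message text, not imported from swarms.

# Minimal registry sketch mirroring Registry.build above.
class MiniRegistry:
    def __init__(self, name: str):
        self.name = name
        self.entries = {}

    def register(self, type: str):
        def decorator(cls):
            self.entries[type] = cls
            return cls
        return decorator

    def build(self, type: str, **kwargs):
        if type not in self.entries:
            raise ValueError(
                f"{type} is not registered. Please register with the"
                f' .register("{type}") method provided in {self.name} registry'
            )
        return self.entries[type](**kwargs)

registry = MiniRegistry("models")

@registry.register("echo")
class Echo:
    def __init__(self, text: str):
        self.text = text

print(registry.build("echo", text="hello").text)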

@ -3,7 +3,6 @@
# from swarms.chunkers.text import TextChunker
# from swarms.chunkers.pdf import PdfChunker
# __all__ = [
# "BaseChunker",
# "ChunkSeparator",

@ -1,10 +1,13 @@
from __future__ import annotations
from abc import ABC
from typing import Optional
from attr import define, field, Factory
from attr import Factory, define, field
from griptape.artifacts import TextArtifact
from swarms.chunkers.chunk_seperators import ChunkSeparator
from griptape.tokenizers import OpenAiTokenizer
from swarms.chunkers.chunk_seperator import ChunkSeparator
from swarms.models.openai_tokenizer import OpenAITokenizer
@define
@ -16,6 +19,24 @@ class BaseChunker(ABC):
Usage:
--------------
from swarms.chunkers.base import BaseChunker
from swarms.chunkers.chunk_seperator import ChunkSeparator
class PdfChunker(BaseChunker):
DEFAULT_SEPARATORS = [
ChunkSeparator("\n\n"),
ChunkSeparator(". "),
ChunkSeparator("! "),
ChunkSeparator("? "),
ChunkSeparator(" "),
]
# Example
pdf = "swarmdeck.pdf"
chunker = PdfChunker()
chunks = chunker.chunk(pdf)
print(chunks)
"""
@ -26,10 +47,10 @@ class BaseChunker(ABC):
default=Factory(lambda self: self.DEFAULT_SEPARATORS, takes_self=True),
kw_only=True,
)
tokenizer: OpenAiTokenizer = field(
tokenizer: OpenAITokenizer = field(
default=Factory(
lambda: OpenAiTokenizer(
model=OpenAiTokenizer.DEFAULT_OPENAI_GPT_3_CHAT_MODEL
lambda: OpenAITokenizer(
model=OpenAITokenizer.DEFAULT_OPENAI_GPT_3_CHAT_MODEL
)
),
kw_only=True,
@ -47,7 +68,7 @@ class BaseChunker(ABC):
def _chunk_recursively(
self, chunk: str, current_separator: Optional[ChunkSeparator] = None
) -> list[str]:
token_count = self.tokenizer.token_count(chunk)
token_count = self.tokenizer.count_tokens(chunk)
if token_count <= self.max_tokens:
return [chunk]

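This hunk swaps griptape's `OpenAiTokenizer.token_count` for the in-repo `OpenAITokenizer.count_tokens`, so `_chunk_recursively` now gates recursion on `count_tokens`. A small stand-in showing the guard; the word-count tokenizer below is a deliberate simplification, not the real tokenizer.

# Stand-in for the count_tokens guard in _chunk_recursively above.
class FakeTokenizer:
    def count_tokens(self, text: str) -> int:
        return len(text.split())  # crude stand-in for a real tokenizer

def fits_in_one_chunk(chunk: str, max_tokens: int = 8) -> bool:
    return FakeTokenizer().count_tokens(chunk) <= max_tokens

print(fits_in_one_chunk("a short chunk"))            # True
print(fits_in_one_chunk("word " * 20, max_tokens=8)) # False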
@ -15,3 +15,10 @@ class MarkdownChunker(BaseChunker):
ChunkSeparator("? "),
ChunkSeparator(" "),
]
# # Example using chunker to chunk a markdown file
# file = open("README.md", "r")
# text = file.read()
# chunker = MarkdownChunker()
# chunks = chunker.chunk(text)

@ -0,0 +1,117 @@
"""
Omni Chunker is a chunker that chunks any file into string chunks of a fixed size (`chunk_size` characters)
Usage:
--------------
from swarms.chunkers.omni_chunker import OmniChunker
# Example
pdf = "swarmdeck.pdf"
chunker = OmniChunker(chunk_size=1000, beautify=True)
chunks = chunker(pdf)
print(chunks)
"""
from dataclasses import dataclass
from typing import List, Optional, Callable
from termcolor import colored
import os
import sys
@dataclass
class OmniChunker:
""" """
chunk_size: int = 1000
beautify: bool = False
use_tokenizer: bool = False
tokenizer: Optional[Callable[[str], List[str]]] = None
def __call__(self, file_path: str) -> List[str]:
"""
Chunk the given file into parts of size `chunk_size`.
Args:
file_path (str): The path to the file to chunk.
Returns:
List[str]: A list of string chunks from the file.
"""
if not os.path.isfile(file_path):
print(colored("The file does not exist.", "red"))
return []
file_extension = os.path.splitext(file_path)[1]
try:
with open(file_path, "rb") as file:
content = file.read()
# Decode content based on MIME type or file extension
decoded_content = self.decode_content(content, file_extension)
chunks = self.chunk_content(decoded_content)
return chunks
except Exception as e:
print(colored(f"Error reading file: {e}", "red"))
return []
def decode_content(self, content: bytes, file_extension: str) -> str:
"""
Decode the content of the file based on its MIME type or file extension.
Args:
content (bytes): The content of the file.
file_extension (str): The file extension of the file.
Returns:
str: The decoded content of the file.
"""
# Add logic to handle different file types based on the extension
# For simplicity, this example assumes text files encoded in utf-8
try:
return content.decode("utf-8")
except UnicodeDecodeError as e:
print(
colored(
f"Could not decode file with extension {file_extension}: {e}",
"yellow",
)
)
return ""
def chunk_content(self, content: str) -> List[str]:
"""
Split the content into chunks of size `chunk_size`.
Args:
content (str): The content to chunk.
Returns:
List[str]: The list of chunks.
"""
return [
content[i : i + self.chunk_size]
for i in range(0, len(content), self.chunk_size)
]
def __str__(self):
return f"OmniChunker(chunk_size={self.chunk_size}, beautify={self.beautify})"
def metrics(self):
return {
"chunk_size": self.chunk_size,
"beautify": self.beautify,
}
def print_dashboard(self):
print(
colored(
f"""
Omni Chunker
------------
{self.metrics()}
""",
"cyan",
)
)

@ -10,3 +10,10 @@ class PdfChunker(BaseChunker):
ChunkSeparator("? "),
ChunkSeparator(" "),
]
# # Example
# pdf = "swarmdeck.pdf"
# chunker = PdfChunker()
# chunks = chunker.chunk(pdf)
# print(chunks)

@ -15,7 +15,6 @@ if TYPE_CHECKING:
from haystack.schema import Document as HaystackDocument
from semantic_kernel.memory.memory_record import MemoryRecord
####
DEFAULT_TEXT_NODE_TMPL = "{metadata_str}\n\n{content}"
DEFAULT_METADATA_TMPL = "{key}: {value}"
@ -125,7 +124,6 @@ class BaseNode(BaseComponent):
embedding: Optional[List[float]] = Field(
default=None, description="Embedding of the node."
)
""""
metadata fields
- injected as part of the text shown to LLMs as context

@ -460,7 +460,7 @@ class Chroma(VectorStore):
"""
if self._embedding_function is None:
raise ValueError(
"For MMR search, you must specify an embedding function on" "creation."
"For MMR search, you must specify an embedding function oncreation."
)
embedding = self._embedding_function.embed_query(query)

@ -111,7 +111,10 @@ class Step(StepRequestBody):
output: Optional[str] = Field(
None,
description="Output of the task step.",
example="I am going to use the write_to_file command and write Washington to a file called output.txt <write_to_file('output.txt', 'Washington')",
example=(
"I am going to use the write_to_file command and write Washington to a file"
" called output.txt <write_to_file('output.txt', 'Washington')"
),
)
additional_output: Optional[StepOutput] = None
artifacts: List[Artifact] = Field(

@ -0,0 +1,4 @@
"""
Weaviate API Client
"""

@ -9,7 +9,6 @@ from swarms.models.huggingface import HuggingfaceLLM
from swarms.models.wizard_storytelling import WizardLLMStoryTeller
from swarms.models.mpt import MPT7B
# MultiModal Models
from swarms.models.idefics import Idefics
from swarms.models.kosmos_two import Kosmos
@ -17,14 +16,17 @@ from swarms.models.vilt import Vilt
from swarms.models.nougat import Nougat
from swarms.models.layoutlm_document_qa import LayoutLMDocumentQA
# from swarms.models.gpt4v import GPT4Vision
# from swarms.models.dalle3 import Dalle3
# from swarms.models.distilled_whisperx import DistilWhisperModel
# from swarms.models.fuyu import Fuyu # Not working, wait until they update
import sys
# Uncomment for clean output
# log_file = open("errors.txt", "w")
# sys.stderr = log_file
__all__ = [
"Anthropic",
"Petals",
@ -42,4 +44,7 @@ __all__ = [
"HuggingfaceLLM",
"MPT7B",
"WizardLLMStoryTeller",
# "GPT4Vision",
# "Dalle3",
# "Fuyu",
]

@ -1,41 +1,292 @@
import requests
import os
import contextlib
import datetime
import functools
import importlib
import re
import warnings
from importlib.metadata import version
from typing import (
Any,
AsyncIterator,
Callable,
Dict,
Iterator,
List,
Mapping,
Optional,
Set,
Tuple,
Union,
)
from langchain.callbacks.manager import (
AsyncCallbackManagerForLLMRun,
CallbackManagerForLLMRun,
)
from langchain.llms.base import LLM
from langchain.pydantic_v1 import Field, SecretStr, root_validator
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.output import GenerationChunk
from langchain.schema.prompt import PromptValue
from langchain.utils import (
check_package_version,
get_from_dict_or_env,
get_pydantic_field_names,
)
from packaging.version import parse
from requests import HTTPError, Response
def xor_args(*arg_groups: Tuple[str, ...]) -> Callable:
"""Validate specified keyword args are mutually exclusive."""
def decorator(func: Callable) -> Callable:
@functools.wraps(func)
def wrapper(*args: Any, **kwargs: Any) -> Any:
"""Validate exactly one arg in each group is not None."""
counts = [
sum(1 for arg in arg_group if kwargs.get(arg) is not None)
for arg_group in arg_groups
]
invalid_groups = [i for i, count in enumerate(counts) if count != 1]
if invalid_groups:
invalid_group_names = [", ".join(arg_groups[i]) for i in invalid_groups]
raise ValueError(
"Exactly one argument in each of the following"
" groups must be defined:"
f" {', '.join(invalid_group_names)}"
)
return func(*args, **kwargs)
return wrapper
return decorator
def raise_for_status_with_text(response: Response) -> None:
"""Raise an error with the response text."""
try:
response.raise_for_status()
except HTTPError as e:
raise ValueError(response.text) from e
@contextlib.contextmanager
def mock_now(dt_value): # type: ignore
"""Context manager for mocking out datetime.now() in unit tests.
Example:
with mock_now(datetime.datetime(2011, 2, 3, 10, 11)):
assert datetime.datetime.now() == datetime.datetime(2011, 2, 3, 10, 11)
"""
class MockDateTime(datetime.datetime):
"""Mock datetime.datetime.now() with a fixed datetime."""
@classmethod
def now(cls): # type: ignore
# Create a copy of dt_value.
return datetime.datetime(
dt_value.year,
dt_value.month,
dt_value.day,
dt_value.hour,
dt_value.minute,
dt_value.second,
dt_value.microsecond,
dt_value.tzinfo,
)
real_datetime = datetime.datetime
datetime.datetime = MockDateTime
try:
yield datetime.datetime
finally:
datetime.datetime = real_datetime
def guard_import(
module_name: str, *, pip_name: Optional[str] = None, package: Optional[str] = None
) -> Any:
"""Dynamically imports a module and raises a helpful exception if the module is not
installed."""
try:
module = importlib.import_module(module_name, package)
except ImportError:
raise ImportError(
f"Could not import {module_name} python package. "
f"Please install it with `pip install {pip_name or module_name}`."
)
return module
def check_package_version(
package: str,
lt_version: Optional[str] = None,
lte_version: Optional[str] = None,
gt_version: Optional[str] = None,
gte_version: Optional[str] = None,
) -> None:
"""Check the version of a package."""
imported_version = parse(version(package))
if lt_version is not None and imported_version >= parse(lt_version):
raise ValueError(
f"Expected {package} version to be < {lt_version}. Received "
f"{imported_version}."
)
if lte_version is not None and imported_version > parse(lte_version):
raise ValueError(
f"Expected {package} version to be <= {lte_version}. Received "
f"{imported_version}."
)
if gt_version is not None and imported_version <= parse(gt_version):
raise ValueError(
f"Expected {package} version to be > {gt_version}. Received "
f"{imported_version}."
)
if gte_version is not None and imported_version < parse(gte_version):
raise ValueError(
f"Expected {package} version to be >= {gte_version}. Received "
f"{imported_version}."
)
def get_pydantic_field_names(pydantic_cls: Any) -> Set[str]:
"""Get field names, including aliases, for a pydantic class.
Args:
pydantic_cls: Pydantic class."""
all_required_field_names = set()
for field in pydantic_cls.__fields__.values():
all_required_field_names.add(field.name)
if field.has_alias:
all_required_field_names.add(field.alias)
return all_required_field_names
def build_extra_kwargs(
extra_kwargs: Dict[str, Any],
values: Dict[str, Any],
all_required_field_names: Set[str],
) -> Dict[str, Any]:
"""Build extra kwargs from values and extra_kwargs.
Args:
extra_kwargs: Extra kwargs passed in by user.
values: Values passed in by user.
all_required_field_names: All required field names for the pydantic class.
"""
for field_name in list(values):
if field_name in extra_kwargs:
raise ValueError(f"Found {field_name} supplied twice.")
if field_name not in all_required_field_names:
warnings.warn(
f"""WARNING! {field_name} is not default parameter.
{field_name} was transferred to model_kwargs.
Please confirm that {field_name} is what you intended."""
)
extra_kwargs[field_name] = values.pop(field_name)
invalid_model_kwargs = all_required_field_names.intersection(extra_kwargs.keys())
if invalid_model_kwargs:
raise ValueError(
f"Parameters {invalid_model_kwargs} should be specified explicitly. "
"Instead they were passed in as part of `model_kwargs` parameter."
)
return extra_kwargs
def convert_to_secret_str(value: Union[SecretStr, str]) -> SecretStr:
"""Convert a string to a SecretStr if needed."""
if isinstance(value, SecretStr):
return value
return SecretStr(value)
class _AnthropicCommon(BaseLanguageModel):
client: Any = None #: :meta private:
async_client: Any = None #: :meta private:
model: str = Field(default="claude-2", alias="model_name")
"""Model name to use."""
max_tokens_to_sample: int = Field(default=256, alias="max_tokens")
"""Denotes the number of tokens to predict per generation."""
temperature: Optional[float] = None
"""A non-negative float that tunes the degree of randomness in generation."""
top_k: Optional[int] = None
"""Number of most likely tokens to consider at each step."""
top_p: Optional[float] = None
"""Total probability mass of tokens to consider at each step."""
streaming: bool = False
"""Whether to stream the results."""
default_request_timeout: Optional[float] = None
"""Timeout for requests to Anthropic Completion API. Default is 600 seconds."""
anthropic_api_url: Optional[str] = None
anthropic_api_key: Optional[SecretStr] = None
HUMAN_PROMPT: Optional[str] = None
AI_PROMPT: Optional[str] = None
count_tokens: Optional[Callable[[str], int]] = None
model_kwargs: Dict[str, Any] = Field(default_factory=dict)
@root_validator(pre=True)
def build_extra(cls, values: Dict) -> Dict:
extra = values.get("model_kwargs", {})
all_required_field_names = get_pydantic_field_names(cls)
values["model_kwargs"] = build_extra_kwargs(
extra, values, all_required_field_names
)
return values
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
values["anthropic_api_key"] = convert_to_secret_str(
get_from_dict_or_env(values, "anthropic_api_key", "ANTHROPIC_API_KEY")
)
# Get custom api url from environment.
values["anthropic_api_url"] = get_from_dict_or_env(
values,
"anthropic_api_url",
"ANTHROPIC_API_URL",
default="https://api.anthropic.com",
)
try:
import anthropic
check_package_version("anthropic", gte_version="0.3")
values["client"] = anthropic.Anthropic(
base_url=values["anthropic_api_url"],
api_key=values["anthropic_api_key"].get_secret_value(),
timeout=values["default_request_timeout"],
)
values["async_client"] = anthropic.AsyncAnthropic(
base_url=values["anthropic_api_url"],
api_key=values["anthropic_api_key"].get_secret_value(),
timeout=values["default_request_timeout"],
)
values["HUMAN_PROMPT"] = anthropic.HUMAN_PROMPT
values["AI_PROMPT"] = anthropic.AI_PROMPT
values["count_tokens"] = values["client"].count_tokens
except ImportError:
raise ImportError(
"Could not import anthropic python package. "
"Please it install it with `pip install anthropic`."
)
return values
@property
def _default_params(self) -> Mapping[str, Any]:
"""Get the default parameters for calling Anthropic API."""
d = {
"max_tokens_to_sample": self.max_tokens_to_sample,
@ -47,32 +298,229 @@ class Anthropic:
d["top_k"] = self.top_k
if self.top_p is not None:
d["top_p"] = self.top_p
return {**d, **self.model_kwargs}
@property
def _identifying_params(self) -> Mapping[str, Any]:
"""Get the identifying parameters."""
return {**{}, **self._default_params}
def _get_anthropic_stop(self, stop: Optional[List[str]] = None) -> List[str]:
if not self.HUMAN_PROMPT or not self.AI_PROMPT:
raise NameError("Please ensure the anthropic package is loaded")
if stop is None:
stop = []
# Never want model to invent new turns of Human / Assistant dialog.
stop.extend([self.HUMAN_PROMPT])
return stop
class Anthropic(LLM, _AnthropicCommon):
"""Anthropic large language models.
To use, you should have the ``anthropic`` python package installed, and the
environment variable ``ANTHROPIC_API_KEY`` set with your API key, or pass
it as a named parameter to the constructor.
Example:
.. code-block:: python
import anthropic
from langchain.llms import Anthropic
model = Anthropic(model="<model_name>", anthropic_api_key="my-api-key")
# Simplest invocation, automatically wrapped with HUMAN_PROMPT
# and AI_PROMPT.
response = model("What are the biggest risks facing humanity?")
# Or if you want to use the chat mode, build a few-shot-prompt, or
# put words in the Assistant's mouth, use HUMAN_PROMPT and AI_PROMPT:
raw_prompt = "What are the biggest risks facing humanity?"
prompt = f"{anthropic.HUMAN_PROMPT} {prompt}{anthropic.AI_PROMPT}"
response = model(prompt)
"""
class Config:
"""Configuration for this pydantic object."""
allow_population_by_field_name = True
arbitrary_types_allowed = True
@root_validator()
def raise_warning(cls, values: Dict) -> Dict:
"""Raise warning that this class is deprecated."""
warnings.warn(
"This Anthropic LLM is deprecated. "
"Please use `from langchain.chat_models import ChatAnthropic` instead"
)
return values
@property
def _llm_type(self) -> str:
"""Return type of llm."""
return "anthropic-llm"
def _wrap_prompt(self, prompt: str) -> str:
if not self.HUMAN_PROMPT or not self.AI_PROMPT:
raise NameError("Please ensure the anthropic package is loaded")
if prompt.startswith(self.HUMAN_PROMPT):
return prompt # Already wrapped.
# Guard against common errors in specifying wrong number of newlines.
corrected_prompt, n_subs = re.subn(r"^\n*Human:", self.HUMAN_PROMPT, prompt)
if n_subs == 1:
return corrected_prompt
# As a last resort, wrap the prompt ourselves to emulate instruct-style.
return f"{self.HUMAN_PROMPT} {prompt}{self.AI_PROMPT} Sure, here you go:\n"
def _call(
self,
prompt: str,
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> str:
r"""Call out to Anthropic's completion endpoint.
Args:
prompt: The prompt to pass into the model.
stop: Optional list of stop words to use when generating.
Returns:
The string generated by the model.
Example:
.. code-block:: python
prompt = "What are the biggest risks facing humanity?"
prompt = f"\n\nHuman: {prompt}\n\nAssistant:"
response = model(prompt)
"""
if self.streaming:
completion = ""
for chunk in self._stream(
prompt=prompt, stop=stop, run_manager=run_manager, **kwargs
):
completion += chunk.text
return completion
stop = self._get_anthropic_stop(stop)
params = {**self._default_params, **kwargs}
response = self.client.completions.create(
prompt=self._wrap_prompt(prompt),
stop_sequences=stop,
**params,
)
return response.completion
def convert_prompt(self, prompt: PromptValue) -> str:
return self._wrap_prompt(prompt.to_string())
async def _acall(
self,
prompt: str,
stop: Optional[List[str]] = None,
run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> str:
"""Call out to Anthropic's completion endpoint asynchronously."""
if self.streaming:
completion = ""
async for chunk in self._astream(
prompt=prompt, stop=stop, run_manager=run_manager, **kwargs
):
completion += chunk.text
return completion
stop = self._get_anthropic_stop(stop)
params = {**self._default_params, **kwargs}
response = await self.async_client.completions.create(
prompt=self._wrap_prompt(prompt),
stop_sequences=stop,
**params,
)
return response.completion
def _stream(
self,
prompt: str,
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Iterator[GenerationChunk]:
r"""Call Anthropic completion_stream and return the resulting generator.
Args:
prompt: The prompt to pass into the model.
stop: Optional list of stop words to use when generating.
Returns:
A generator representing the stream of tokens from Anthropic.
Example:
.. code-block:: python
prompt = "Write a poem about a stream."
prompt = f"\n\nHuman: {prompt}\n\nAssistant:"
generator = anthropic.stream(prompt)
for token in generator:
yield token
"""
stop = self._get_anthropic_stop(stop)
params = {**self._default_params, **kwargs}
for token in self.client.completions.create(
prompt=self._wrap_prompt(prompt), stop_sequences=stop, stream=True, **params
):
chunk = GenerationChunk(text=token.completion)
yield chunk
if run_manager:
run_manager.on_llm_new_token(chunk.text, chunk=chunk)
async def _astream(
self,
prompt: str,
stop: Optional[List[str]] = None,
run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> AsyncIterator[GenerationChunk]:
r"""Call Anthropic completion_stream and return the resulting generator.
Args:
prompt: The prompt to pass into the model.
stop: Optional list of stop words to use when generating.
Returns:
A generator representing the stream of tokens from Anthropic.
Example:
.. code-block:: python
prompt = "Write a poem about a stream."
prompt = f"\n\nHuman: {prompt}\n\nAssistant:"
generator = anthropic.stream(prompt)
for token in generator:
yield token
"""
stop = self._get_anthropic_stop(stop)
params = {**self._default_params, **kwargs}
async for token in await self.async_client.completions.create(
prompt=self._wrap_prompt(prompt),
stop_sequences=stop,
stream=True,
**params,
):
chunk = GenerationChunk(text=token.completion)
yield chunk
if run_manager:
await run_manager.on_llm_new_token(chunk.text, chunk=chunk)
def get_num_tokens(self, text: str) -> int:
"""Calculate number of tokens."""
if not self.count_tokens:
raise NameError("Please ensure the anthropic package is loaded")
return self.count_tokens(text)
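End to end, the rewritten class is used as in its own docstring. A minimal sketch, assuming the `anthropic` package (>= 0.3) is installed and `ANTHROPIC_API_KEY` is set; the swarms import path is an assumption based on this repo's `__all__` export.

# Usage sketch based on the Anthropic docstring above.
from swarms.models import Anthropic

model = Anthropic(model="claude-2")
# _wrap_prompt adds HUMAN_PROMPT / AI_PROMPT automatically when missing.
response = model("What are the biggest risks facing humanity?")
print(response)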

@ -105,13 +105,15 @@ class BioGPT:
generator = pipeline(
"text-generation", model=self.model, tokenizer=self.tokenizer
)
return generator(
out = generator(
text,
max_length=self.max_length,
num_return_sequences=self.num_return_sequences,
do_sample=self.do_sample,
)
return out[0]["generated_text"]
def get_features(self, text):
"""
Get the features of a given text.

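The switch to `out[0]["generated_text"]` reflects the transformers text-generation pipeline return shape, a list with one dict per returned sequence. Illustrated with literal data:

# Shape of a transformers text-generation pipeline result, which is why
# the hunk above indexes out[0]["generated_text"].
out = [
    {"generated_text": "Once upon a time, a swarm of agents..."},
    {"generated_text": "Once upon a time, in a lab far away..."},
]
print(out[0]["generated_text"])  # first returned sequence only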
@ -0,0 +1,178 @@
import logging
import os
from dataclasses import dataclass
from io import BytesIO
import openai
from dotenv import load_dotenv
from openai import OpenAI
from PIL import Image
from pydantic import validator
from termcolor import colored
load_dotenv()
# api_key = os.getenv("OPENAI_API_KEY")
# Configure Logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
@dataclass
class Dalle3:
"""
Dalle3 model class
Attributes:
-----------
image_url: str
The image url generated by the Dalle3 API
Methods:
--------
__call__(self, task: str) -> Dalle3:
Makes a call to the Dalle3 API and returns the image url
Example:
--------
>>> dalle3 = Dalle3()
>>> task = "A painting of a dog"
>>> image_url = dalle3(task)
>>> print(image_url)
https://cdn.openai.com/dall-e/encoded/feats/feats_01J9J5ZKJZJY9.png
"""
model: str = "dall-e-3"
img: str = None
size: str = "1024x1024"
max_retries: int = 3
quality: str = "standard"
api_key: str = None
n: int = 4
client = OpenAI(
api_key=api_key,
max_retries=max_retries,
)
class Config:
"""Config class for the Dalle3 model"""
arbitrary_types_allowed = True
@validator("max_retries", "time_seconds")
def must_be_positive(cls, value):
if value <= 0:
raise ValueError("Must be positive")
return value
def read_img(self, img: str):
"""Read the image using pil"""
img = Image.open(img)
return img
def set_width_height(self, img: str, width: int, height: int):
"""Set the width and height of the image"""
img = self.read_img(img)
img = img.resize((width, height))
return img
def convert_to_bytesio(self, img: str, format: str = "PNG"):
"""Convert the image to an bytes io object"""
byte_stream = BytesIO()
img.save(byte_stream, format=format)
byte_array = byte_stream.getvalue()
return byte_array
# @lru_cache(maxsize=32)
def __call__(self, task: str):
"""
Text to image conversion using the Dalle3 API
Parameters:
-----------
task: str
The task to be converted to an image
Returns:
--------
str:
The image url generated by the Dalle3 API
Example:
--------
>>> dalle3 = Dalle3()
>>> task = "A painting of a dog"
>>> image_url = dalle3(task)
>>> print(image_url)
https://cdn.openai.com/dall-e/encoded/feats/feats_01J9J5ZKJZJY9.png
"""
try:
# Making a call to the Dalle3 API
response = self.client.images.generate(
model=self.model,
prompt=task,
size=self.size,
quality=self.quality,
n=self.n,
)
# Extracting the image url from the response
img = response.data[0].url
return img
except openai.OpenAIError as error:
# Handle exceptions and print the error's details
print(
colored(
(
f"Error running Dalle3: {error} try optimizing your api key and"
" or try again"
),
"red",
)
)
raise error
def create_variations(self, img: str):
"""
Create variations of an image using the Dalle3 API
Parameters:
-----------
img: str
The image to be used for the API request
Returns:
--------
img: str
The image url generated by the Dalle3 API
Example:
--------
>>> dalle3 = Dalle3()
>>> img = "https://cdn.openai.com/dall-e/encoded/feats/feats_01J9J5ZKJZJY9.png"
>>> img = dalle3.create_variations(img)
>>> print(img)
"""
try:
response = self.client.images.create_variation(
img=open(img, "rb"), n=self.n, size=self.size
)
img = response.data[0].url
return img
except (Exception, openai.OpenAIError) as error:
print(
colored(
(
f"Error running Dalle3: {error} try optimizing your api key and"
" or try again"
),
"red",
)
)
print(colored(f"Error running Dalle3: {error.http_status}", "red"))
print(colored(f"Error running Dalle3: {error.error}", "red"))
raise error
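Exercising both code paths matches the docstrings above. A minimal sketch assuming `OPENAI_API_KEY` is set in the environment; note the class attribute `client` is built at class-definition time, so the key must be available before import.

# Usage sketch from the Dalle3 docstrings above.
dalle3 = Dalle3()
image_url = dalle3("A painting of a dog")
print(image_url)

# create_variations expects a local file path it can open in binary mode.
variation_url = dalle3.create_variations("dog.png")  # hypothetical local file
print(variation_url)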

@ -1,3 +1,160 @@
"""
import asyncio
import os
import time
from functools import wraps
from typing import Union
"""
import torch
from termcolor import colored
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
def async_retry(max_retries=3, exceptions=(Exception,), delay=1):
"""
A decorator for adding retry logic to async functions.
:param max_retries: Maximum number of retries before giving up.
:param exceptions: A tuple of exceptions to catch and retry on.
:param delay: Delay between retries.
"""
def decorator(func):
@wraps(func)
async def wrapper(*args, **kwargs):
retries = max_retries
while retries:
try:
return await func(*args, **kwargs)
except exceptions as e:
retries -= 1
if retries <= 0:
raise
print(f"Retry after exception: {e}, Attempts remaining: {retries}")
await asyncio.sleep(delay)
return wrapper
return decorator
class DistilWhisperModel:
"""
This class encapsulates the Distil-Whisper model for English speech recognition.
It allows for both synchronous and asynchronous transcription of short and long-form audio.
Args:
model_id: The model ID to use. Defaults to "distil-whisper/distil-large-v2".
Attributes:
device: The device to use for inference.
torch_dtype: The torch data type to use for inference.
model_id: The model ID to use.
model: The model instance.
processor: The processor instance.
Usage:
model_wrapper = DistilWhisperModel()
transcription = model_wrapper('path/to/audio.mp3')
# For async usage
transcription = asyncio.run(model_wrapper.async_transcribe('path/to/audio.mp3'))
"""
def __init__(self, model_id="distil-whisper/distil-large-v2"):
self.device = "cuda:0" if torch.cuda.is_available() else "cpu"
self.torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
self.model_id = model_id
self.model = AutoModelForSpeechSeq2Seq.from_pretrained(
model_id,
torch_dtype=self.torch_dtype,
low_cpu_mem_usage=True,
use_safetensors=True,
).to(self.device)
self.processor = AutoProcessor.from_pretrained(model_id)
def __call__(self, inputs: Union[str, dict]):
return self.transcribe(inputs)
def transcribe(self, inputs: Union[str, dict]):
"""
Synchronously transcribe the given audio input using the Distil-Whisper model.
:param inputs: A string representing the file path or a dict with audio data.
:return: The transcribed text.
"""
pipe = pipeline(
"automatic-speech-recognition",
model=self.model,
tokenizer=self.processor.tokenizer,
feature_extractor=self.processor.feature_extractor,
max_new_tokens=128,
torch_dtype=self.torch_dtype,
device=self.device,
)
return pipe(inputs)["text"]
@async_retry()
async def async_transcribe(self, inputs: Union[str, dict]):
"""
Asynchronously transcribe the given audio input using the Distil-Whisper model.
:param inputs: A string representing the file path or a dict with audio data.
:return: The transcribed text.
"""
loop = asyncio.get_event_loop()
return await loop.run_in_executor(None, self.transcribe, inputs)
def real_time_transcribe(self, audio_file_path, chunk_duration=5):
"""
Simulates real-time transcription of an audio file, processing and printing results
in chunks with colored output for readability.
:param audio_file_path: Path to the audio file to be transcribed.
:param chunk_duration: Duration in seconds of each audio chunk to be processed.
"""
if not os.path.isfile(audio_file_path):
print(colored("The audio file was not found.", "red"))
return
# Assuming `chunk_duration` is in seconds and `processor` can handle chunk-wise processing
try:
with torch.no_grad():
# Load the whole audio file, but process and transcribe it in chunks
audio_input = self.processor.audio_file_to_array(audio_file_path)
sample_rate = audio_input.sampling_rate
total_duration = len(audio_input.array) / sample_rate
chunks = [
audio_input.array[i : i + sample_rate * chunk_duration]
for i in range(
0, len(audio_input.array), sample_rate * chunk_duration
)
]
print(colored("Starting real-time transcription...", "green"))
for i, chunk in enumerate(chunks):
# Process the current chunk
processed_inputs = self.processor(
chunk,
sampling_rate=sample_rate,
return_tensors="pt",
padding=True,
)
processed_inputs = processed_inputs.input_values.to(self.device)
# Generate transcription for the chunk
logits = self.model.generate(processed_inputs)
transcription = self.processor.batch_decode(
logits, skip_special_tokens=True
)[0]
# Print the chunk's transcription
print(
colored(f"Chunk {i+1}/{len(chunks)}: ", "yellow")
+ transcription
)
# Wait for the chunk's duration to simulate real-time processing
time.sleep(chunk_duration)
except Exception as e:
print(colored(f"An error occurred during transcription: {e}", "red"))
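The class docstring above already sketches usage; spelled out, with the retry decorator covering the async path (the audio path below is a placeholder):

# Usage sketch from the DistilWhisperModel docstring above.
import asyncio

model_wrapper = DistilWhisperModel()
print(model_wrapper("path/to/audio.mp3"))  # synchronous

# async_transcribe runs transcribe in an executor and retries on failure.
print(asyncio.run(model_wrapper.async_transcribe("path/to/audio.mp3")))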

File diff suppressed because it is too large Load Diff

@ -0,0 +1,81 @@
import json
import os
from typing import List
import numpy as np
import timm
import torch
from PIL import Image
from pydantic import BaseModel, StrictFloat, StrictInt, validator
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Load the classes for image classification
with open(os.path.join(os.path.dirname(__file__), "fast_vit_classes.json")) as f:
FASTVIT_IMAGENET_1K_CLASSES = json.load(f)
class ClassificationResult(BaseModel):
class_id: List[StrictInt]
confidence: List[StrictFloat]
@validator("class_id", "confidence", pre=True, each_item=True)
def check_list_contents(cls, v):
assert isinstance(v, int) or isinstance(v, float), "must be integer or float"
return v
class FastViT:
"""
FastViT model for image classification
Args:
img (str): path to the input image
confidence_threshold (float): confidence threshold for the model's predictions
Returns:
ClassificationResult: a pydantic BaseModel containing the class ids and confidences of the model's predictions
Example:
>>> fastvit = FastViT()
>>> result = fastvit(img="path_to_image.jpg", confidence_threshold=0.5)
To use, create a json file called: fast_vit_classes.json
"""
def __init__(self):
self.model = timm.create_model(
"hf_hub:timm/fastvit_s12.apple_in1k", pretrained=True
).to(DEVICE)
data_config = timm.data.resolve_model_data_config(self.model)
self.transforms = timm.data.create_transform(**data_config, is_training=False)
self.model.eval()
def __call__(
self, img: str, confidence_threshold: float = 0.5
) -> ClassificationResult:
"""classifies the input image and returns the top k classes and their probabilities"""
img = Image.open(img).convert("RGB")
img_tensor = self.transforms(img).unsqueeze(0).to(DEVICE)
with torch.no_grad():
output = self.model(img_tensor)
probabilities = torch.nn.functional.softmax(output, dim=1)
# Get top k classes and their probabilities
top_probs, top_classes = torch.topk(
probabilities, k=len(FASTVIT_IMAGENET_1K_CLASSES)
)
# Filter by confidence threshold
mask = top_probs > confidence_threshold
top_probs, top_classes = top_probs[mask], top_classes[mask]
# Convert to Python lists and map class indices to labels if needed
top_probs = top_probs.cpu().numpy().tolist()
top_classes = top_classes.cpu().numpy().tolist()
# top_class_labels = [FASTVIT_IMAGENET_1K_CLASSES[i] for i in top_classes] # Uncomment if class labels are needed
return ClassificationResult(class_id=top_classes, confidence=top_probs)
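Per its docstring, the model expects a `fast_vit_classes.json` next to the module; a minimal sketch of calling it once that file exists (the image path below is a placeholder):

# Usage sketch from the FastViT docstring above.
fastvit = FastViT()
result = fastvit(img="path_to_image.jpg", confidence_threshold=0.5)
print(result.class_id, result.confidence)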

@ -1,11 +1,13 @@
"""Fuyu model by Kye"""
from io import BytesIO
import requests
from PIL import Image
from transformers import (
FuyuForCausalLM,
AutoTokenizer,
FuyuProcessor,
FuyuForCausalLM,
FuyuImageProcessor,
FuyuProcessor,
)
from PIL import Image
class Fuyu:
@ -27,15 +29,15 @@ class Fuyu:
>>> fuyu = Fuyu()
>>> fuyu("Hello, my name is", "path/to/image.png")
"""
def __init__(
self,
pretrained_path: str = "adept/fuyu-8b",
device_map: str = "cuda:0",
max_new_tokens: int = 7,
device_map: str = "auto",
max_new_tokens: int = 500,
*args,
**kwargs,
):
self.pretrained_path = pretrained_path
self.device_map = device_map
@ -44,15 +46,22 @@ class Fuyu:
self.tokenizer = AutoTokenizer.from_pretrained(pretrained_path)
self.image_processor = FuyuImageProcessor()
self.processor = FuyuProcessor(
image_procesor=self.image_processor, tokenizer=self.tokenizer
image_processor=self.image_processor, tokenizer=self.tokenizer, **kwargs
)
self.model = FuyuForCausalLM.from_pretrained(
pretrained_path, device_map=device_map
pretrained_path,
device_map=device_map,
**kwargs,
)
def __call__(self, text: str, img_path: str):
def get_img(self, img: str):
"""Get the image from the path"""
image_pil = Image.open(img)
return image_pil
def __call__(self, text: str, img: str):
"""Call the model with text and img paths"""
image_pil = Image.open(img_path)
image_pil = Image.open(img)
model_inputs = self.processor(
text=text, images=[image_pil], device=self.device_map
)
@ -60,7 +69,12 @@ class Fuyu:
for k, v in model_inputs.items():
model_inputs[k] = v.to(self.device_map)
output = self.model.generate(
**model_inputs, max_new_tokens=self.fmax_new_tokens
)
output = self.model.generate(**model_inputs, max_new_tokens=self.max_new_tokens)
text = self.processor.batch_decode(output[:, -7:], skip_special_tokens=True)
return str(text)
def get_img_from_web(self, img_url: str):
"""Get the image from the web"""
response = requests.get(img_url)
image_pil = Image.open(BytesIO(response.content))
return image_pil
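With the processor keyword fixed (`image_processor=`) and `max_new_tokens` raised to 500, usage follows the class docstring; the URL below is a placeholder, not a real asset.

# Usage sketch from the Fuyu docstring above; downloads adept/fuyu-8b on
# first run, so sufficient GPU memory is assumed.
fuyu = Fuyu()
fuyu("Hello, my name is", "path/to/image.png")

# Remote images can be fetched first with the new helper:
image = fuyu.get_img_from_web("https://example.com/image.png")  # placeholder URL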

@ -0,0 +1,257 @@
import asyncio
import base64
import concurrent.futures
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple
import openai
import requests
from cachetools import TTLCache
from dotenv import load_dotenv
from openai import OpenAI
from ratelimit import limits, sleep_and_retry
from termcolor import colored
# ENV
load_dotenv()
@dataclass
class GPT4VisionResponse:
"""A response structure for GPT-4"""
answer: str
@dataclass
class GPT4Vision:
"""
GPT4Vision model class
Attributes:
-----------
max_retries: int
The maximum number of retries to make to the API
backoff_factor: float
The backoff factor to use for exponential backoff
timeout_seconds: int
The timeout in seconds for the API request
api_key: str
The API key to use for the API request
quality: str
The quality of the image to generate
max_tokens: int
The maximum number of tokens to use for the API request
Methods:
--------
process_img(self, img_path: str) -> str:
Processes the image to be used for the API request
run(self, task: str, img: str):
Makes a call to the GPT-4 Vision API and returns the model's answer
Example:
>>> gpt4vision = GPT4Vision()
>>> img = "https://cdn.openai.com/dall-e/encoded/feats/feats_01J9J5ZKJZJY9.png"
>>> tasks = ["A painting of a dog"]
>>> answer = gpt4vision(img, tasks)
>>> print(answer)
"""
max_retries: int = 3
model: str = "gpt-4-vision-preview"
backoff_factor: float = 2.0
timeout_seconds: int = 10
openai_api_key: Optional[str] = None
# "low" for fast responses, "high" for higher quality at the cost of more tokens
quality: str = "low"
# Max tokens to use for the API request; the exact upper bound is undocumented
max_tokens: int = 200
client = OpenAI(
api_key=openai_api_key,
)
dashboard: bool = True
call_limit: int = 1
period_seconds: int = 60
# Cache for storing API Responses
cache = TTLCache(maxsize=100, ttl=600) # Cache for 10 minutes
class Config:
"""Config class for the GPT4Vision model"""
arbitrary_types_allowed = True
def process_img(self, img: str) -> str:
"""Processes the image to be used for the API request"""
with open(img, "rb") as image_file:
return base64.b64encode(image_file.read()).decode("utf-8")
@sleep_and_retry
@limits(
calls=call_limit, period=period_seconds
)  # Rate limit of call_limit calls per period_seconds seconds
def run(self, task: str, img: str):
"""
Run the GPT-4 Vision model
Task: str
The task to run
Img: str
The image to run the task on
"""
if self.dashboard:
self.print_dashboard()
try:
response = self.client.chat.completions.create(
model="gpt-4-vision-preview",
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": task},
{
"type": "image_url",
"image_url": {
"url": str(img),
},
},
],
}
],
max_tokens=self.max_tokens,
)
out = response.choices[0]
print(out)
# out = self.clean_output(out)
return out
except openai.OpenAIError as e:
# logger.error(f"OpenAI API error: {e}")
return f"OpenAI API error: Could not process the image. {e}"
except Exception as e:
return f"Unexpected error occurred while processing the image. {e}"
def clean_output(self, output: str):
# Regex pattern to find the Choice object representation in the output
pattern = r"Choice\(.*?\(content=\"(.*?)\".*?\)\)"
match = re.search(pattern, output, re.DOTALL)
if match:
# Extract the content from the matched pattern
content = match.group(1)
# Replace escaped quotes to get the clean content
content = content.replace(r"\"", '"')
print(content)
else:
print("No content found in the output.")
async def arun(self, task: str, img: str):
"""
Arun is an async version of run
Task: str
The task to run
Img: str
The image to run the task on
"""
try:
response = await self.client.chat.completions.create(
model="gpt-4-vision-preview",
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": task},
{
"type": "image_url",
"image_url": {
"url": img,
},
},
],
}
],
max_tokens=self.max_tokens,
)
return response.choices[0]
except openai.OpenAIError as e:
# logger.error(f"OpenAI API error: {e}")
return f"OpenAI API error: Could not process the image. {e}"
except Exception as e:
return f"Unexpected error occurred while processing the image. {e}"
def run_batch(self, tasks_images: List[Tuple[str, str]]) -> List[str]:
"""Process a batch of tasks and images"""
with concurrent.futures.ThreadPoolExecutor() as executor:
futures = [
executor.submit(self.run, task, img) for task, img in tasks_images
]
results = [future.result() for future in futures]
return results
async def run_batch_async(self, tasks_images: List[Tuple[str, str]]) -> List[str]:
"""Process a batch of tasks and images asynchronously"""
loop = asyncio.get_event_loop()
futures = [
loop.run_in_executor(None, self.run, task, img)
for task, img in tasks_images
]
return await asyncio.gather(*futures)
async def run_batch_async_with_retries(
self, tasks_images: List[Tuple[str, str]]
) -> List[str]:
"""Process a batch of tasks and images asynchronously with retries"""
loop = asyncio.get_event_loop()
futures = [
loop.run_in_executor(None, self.run_with_retries, task, img)
for task, img in tasks_images
]
return await asyncio.gather(*futures)
def print_dashboard(self):
dashboard = colored(
f"""
GPT4Vision Dashboard
-------------------
Max Retries: {self.max_retries}
Model: {self.model}
Backoff Factor: {self.backoff_factor}
Timeout Seconds: {self.timeout_seconds}
Image Quality: {self.quality}
Max Tokens: {self.max_tokens}
""",
"green",
)
print(dashboard)
return dashboard
def health_check(self):
"""Health check for the GPT4Vision model"""
try:
response = requests.get("https://api.openai.com/v1/engines")
return response.status_code == 200
except requests.RequestException as error:
print(f"Health check failed: {error}")
return False
def sanitize_input(self, text: str) -> str:
"""
Sanitize input to prevent injection attacks.
Parameters:
text: str - The input text to be sanitized.
Returns:
The sanitized text.
"""
# Example of simple sanitization, this should be expanded based on the context and usage
sanitized_text = re.sub(r"[^\w\s]", "", text)
return sanitized_text
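Tying the pieces together, `run` takes a task string plus an image URL (or a base64 payload from `process_img`). A minimal sketch assuming `OPENAI_API_KEY` is set; the image URL is the example URL from the docstring.

# Usage sketch based on the GPT4Vision docstring above.
gpt4vision = GPT4Vision(max_tokens=200)
answer = gpt4vision.run(
    task="Describe this painting",
    img="https://cdn.openai.com/dall-e/encoded/feats/feats_01J9J5ZKJZJY9.png",
)
print(answer)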

@ -4,6 +4,7 @@ import torch
from torch.nn.parallel import DistributedDataParallel as DDP
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from termcolor import colored
class HuggingfaceLLM:
@ -23,7 +24,7 @@ class HuggingfaceLLM:
```
from swarms.models import HuggingfaceLLM
model_id = "gpt2-small"
model_id = "NousResearch/Yarn-Mistral-7b-128k"
inference = HuggingfaceLLM(model_id=model_id)
task = "Once upon a time"
@ -45,6 +46,8 @@ class HuggingfaceLLM:
decoding=False,
*args,
**kwargs,
):
self.logger = logging.getLogger(__name__)
self.device = (
@ -74,15 +77,22 @@ class HuggingfaceLLM:
bnb_config = BitsAndBytesConfig(**quantization_config)
try:
self.tokenizer = AutoTokenizer.from_pretrained(self.model_id)
self.tokenizer = AutoTokenizer.from_pretrained(
self.model_id, *args, **kwargs
)
self.model = AutoModelForCausalLM.from_pretrained(
self.model_id, quantization_config=bnb_config
self.model_id, quantization_config=bnb_config, *args, **kwargs
)
self.model # .to(self.device)
except Exception as e:
self.logger.error(f"Failed to load the model or the tokenizer: {e}")
raise
# self.logger.error(f"Failed to load the model or the tokenizer: {e}")
# raise
print(colored(f"Failed to load the model and or the tokenizer: {e}", "red"))
def print_error(self, error: str):
"""Print error"""
print(colored(f"Error: {error}", "red"))
def load_model(self):
"""Load the model"""
@ -106,12 +116,14 @@ class HuggingfaceLLM:
self.logger.error(f"Failed to load the model or the tokenizer: {error}")
raise
def run(self, task: str):
"""
Generate a response based on the prompt text.
Args:
- task (str): Text to prompt the model.
- max_length (int): Maximum length of the response.
Returns:
@ -123,8 +135,11 @@ class HuggingfaceLLM:
self.print_dashboard(task)
try:
inputs = self.tokenizer.encode(task, return_tensors="pt").to(self.device)
# self.log.start()
@ -157,7 +172,15 @@ class HuggingfaceLLM:
del inputs
return self.tokenizer.decode(outputs[0], skip_special_tokens=True)
except Exception as e:
self.logger.error(f"Failed to generate the text: {e}")
print(
colored(
(
f"HuggingfaceLLM could not generate text because of error: {e},"
" try optimizing your arguments"
),
"red",
)
)
raise
async def run_async(self, task: str, *args, **kwargs) -> str:
@ -183,12 +206,14 @@ class HuggingfaceLLM:
# Wrapping synchronous calls with async
return self.run(task, *args, **kwargs)
def __call__(self, task: str):
"""
Generate a response based on the prompt text.
Args:
- task (str): Text to prompt the model.
- max_length (int): Maximum length of the response.
Returns:
@ -198,10 +223,14 @@ class HuggingfaceLLM:
max_length = self.max_length
self.print_dashboard(task)
try:
inputs = self.tokenizer.encode(task, return_tensors="pt").to(self.device)
# self.log.start()
@ -314,3 +343,57 @@ class HuggingfaceLLM:
def clear_chat_history(self):
"""Clear chat history"""
self.chat_history = []
def print_dashboard(self, task: str):
"""Print dashboard"""
dashboard = colored(
f"""
HuggingfaceLLM Dashboard
--------------------------------------------
Model Name: {self.model_id}
Tokenizer: {self.tokenizer}
Model MaxLength: {self.max_length}
Model Device: {self.device}
Model Quantization: {self.quantize}
Model Quantization Config: {self.quantization_config}
Model Verbose: {self.verbose}
Model Distributed: {self.distributed}
Model Decoding: {self.decoding}
----------------------------------------
Metadata:
Task Memory Consumption: {self.memory_consumption()}
GPU Available: {self.gpu_available()}
----------------------------------------
Task Environment:
Task: {task}
""",
"red",
)
print(dashboard)
def set_device(self, device):
"""
Changes the device used for inference.
Parameters
----------
device : str
The new device to use for inference.
"""
self.device = device
self.model.to(self.device)
def set_max_length(self, max_length):
"""Set max_length"""
self.max_length = max_length
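Putting the setters above to use, a hedged sketch (the device string assumes a CUDA GPU; fall back to "cpu" otherwise):
```
inference = HuggingfaceLLM(model_id="NousResearch/Yarn-Mistral-7b-128k")
inference.set_device("cuda")  # or "cpu" when no GPU is available
inference.set_max_length(256)
# run() prints the dashboard shown above before generating.
output = inference.run("Once upon a time")
```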

@ -0,0 +1,100 @@
from typing import Any, List, Tuple
import numpy as np
from PIL import Image
from pydantic import BaseModel, root_validator, validator
from transformers import AutoModelForVision2Seq, AutoProcessor
# Assuming the Detections class represents the output of the model prediction
class Detections(BaseModel):
xyxy: List[Tuple[float, float, float, float]]
class_id: List[int]
confidence: List[float]
@root_validator
def check_length(cls, values):
assert (
len(values.get("xyxy"))
== len(values.get("class_id"))
== len(values.get("confidence"))
), "All fields must have the same length."
return values
@validator("xyxy", "class_id", "confidence", pre=True, each_item=True)
def check_not_empty(cls, v):
if isinstance(v, list) and len(v) == 0:
raise ValueError("List must not be empty")
return v
@classmethod
def empty(cls):
return cls(xyxy=[], class_id=[], confidence=[])
class Kosmos2(BaseModel):
model: Any  # instance returned by AutoModelForVision2Seq.from_pretrained
processor: Any  # instance returned by AutoProcessor.from_pretrained
class Config:
    arbitrary_types_allowed = True  # HF model/processor objects are not pydantic-native types
@classmethod
def initialize(cls):
model = AutoModelForVision2Seq.from_pretrained(
"ydshieh/kosmos-2-patch14-224", trust_remote_code=True
)
processor = AutoProcessor.from_pretrained(
"ydshieh/kosmos-2-patch14-224", trust_remote_code=True
)
return cls(model=model, processor=processor)
def __call__(self, img: str) -> Detections:
image = Image.open(img)
prompt = "<grounding>An image of"
inputs = self.processor(text=prompt, images=image, return_tensors="pt")
outputs = self.model.generate(**inputs, use_cache=True, max_new_tokens=64)
generated_text = self.processor.batch_decode(outputs, skip_special_tokens=True)[
0
]
# The actual processing of generated_text to entities would go here
# For the purpose of this example, assume a mock function 'extract_entities' exists:
entities = self.extract_entities(generated_text)
# Convert entities to detections format
detections = self.process_entities_to_detections(entities, image)
return detections
def extract_entities(
self, text: str
) -> List[Tuple[str, Tuple[float, float, float, float]]]:
# Placeholder function for entity extraction
# This should be replaced with the actual method of extracting entities
return []
def process_entities_to_detections(
self,
entities: List[Tuple[str, Tuple[float, float, float, float]]],
image: Image.Image,
) -> Detections:
if not entities:
return Detections.empty()
class_ids = [0] * len(entities) # Replace with actual class ID extraction logic
xyxys = [
(
e[1][0] * image.width,
e[1][1] * image.height,
e[1][2] * image.width,
e[1][3] * image.height,
)
for e in entities
]
confidences = [1.0] * len(entities) # Placeholder confidence
return Detections(xyxy=xyxys, class_id=class_ids, confidence=confidences)
# Usage:
# kosmos2 = Kosmos2.initialize()
# detections = kosmos2(img="path_to_image.jpg")
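A short sketch of how the `Detections` validators behave; the coordinates and scores below are invented for illustration:
```
from pydantic import ValidationError

# Mismatched field lengths trip the root validator.
try:
    Detections(xyxy=[(0.0, 0.0, 10.0, 10.0)], class_id=[0, 1], confidence=[0.9])
except ValidationError as err:
    print(err)  # "All fields must have the same length."

# Empty results stay representable via the classmethod.
empty = Detections.empty()
print(empty.xyxy, empty.class_id, empty.confidence)  # [] [] []
```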

@ -106,7 +106,10 @@ class Kosmos:
self.run(prompt, image_url)
def referring_expression_generation(self, phrase, image_url):
prompt = "<grounding><phrase> It</phrase><object><patch_index_0044><patch_index_0863></object> is"
prompt = (
"<grounding><phrase>"
" It</phrase><object><patch_index_0044><patch_index_0863></object> is"
)
self.run(prompt, image_url)
def grounded_vqa(self, question, image_url):

@ -3,10 +3,9 @@ LayoutLMDocumentQA is a multimodal model good for
visual question answering on real-world documents like invoices, PDFs, etc.
"""
from transformers import pipeline
from swarms.models.base import AbstractModel
class LayoutLMDocumentQA(AbstractModel):
class LayoutLMDocumentQA:
"""
LayoutLMDocumentQA for document question answering:
@ -25,9 +24,9 @@ class LayoutLMDocumentQA(AbstractModel):
def __init__(
self,
model_name: str = "impira/layoutlm-document-qa",
task: str = "document-question-answering",
task_type: str = "document-question-answering",
):
self.pipeline = pipeline(self.task, model=self.model_name)
self.model_name = model_name
self.task_type = task_type
self.pipeline = pipeline(self.task_type, model=self.model_name)
def __call__(self, task: str, img_path: str):
"""Call for model"""

@ -8,7 +8,7 @@ format
- Extracting metadata from pdfs
"""
import re
import torch
from PIL import Image
from transformers import NougatProcessor, VisionEncoderDecoderModel
@ -61,9 +61,28 @@ class Nougat:
pixel_values.to(self.device),
min_length=self.min_length,
max_new_tokens=self.max_new_tokens,
bad_words_ids=[[self.processor.tokenizer.unk_token_id]],
)
sequence = self.processor.batch_decode(outputs, skip_special_tokens=True)[0]
sequence = self.processor.post_process_generation(sequence, fix_markdown=False)
return sequence
print(sequence)
return sequence
def clean_nougat_output(raw_output):
# Define the pattern to extract the relevant data
daily_balance_pattern = (
r"\*\*(\d{2}/\d{2}/\d{4})\*\*\n\n\*\*([\d,]+\.\d{2})\*\*"
)
# Find all matches of the pattern
matches = re.findall(daily_balance_pattern, raw_output)
# Convert the matches to a readable format
cleaned_data = [
"Date: {}, Amount: {}".format(date, amount.replace(",", ""))
for date, amount in matches
]
# Join the cleaned data with new lines for readability
return "\n".join(cleaned_data)
