pull/491/head
Kye Gomez 7 months ago
parent 2c7a962d03
commit 47f5aa977c

@ -1,106 +0,0 @@
# **OmniModalAgent from Swarms: A Comprehensive Starting Guide**
---
**Table of Contents**
1. Introduction: The OmniModal Magic
2. The Mechanics: Unraveling the Underpinnings
3. The Installation Adventure: Setting the Stage
4. Practical Examples: Lets Get Our Hands Dirty!
5. Error Handling: Because Bumps on the Road are Inevitable
6. Dive Deeper: Advanced Features and Usage
7. Wrapping Up: The Road Ahead
---
**1. Introduction: The OmniModal Magic**
Imagine a world where you could communicate seamlessly across any modality, be it text, image, speech, or even video. Now, stop imagining because OmniModalAgent is here to turn that dream into reality. By leveraging advanced architecture and state-of-the-art tools, it can understand and generate any modality you can think of!
---
**2. The Mechanics: Unraveling the Underpinnings**
Dive into the world of OmniModalAgent and lets decipher how it works:
- **LLM (Language Model)**: Its the brain behind understanding and generating language-based interactions.
- **Chat Planner**: Think of it as the strategist. It lays out the plan for the user's input.
- **Task Executor**: The doer. Once the plan is ready, this component takes charge to execute tasks.
- **Tools**: A treasure chest full of tools, from image captioning to translation.
---
**3. The Installation Adventure: Setting the Stage**
Getting OmniModalAgent up and running is as easy as pie. Ready to bake?
```bash
pip install swarms
```
And voilà, your oven (system) is now equipped to bake any modality cake you desire!
---
**4. Practical Examples: Lets Get Our Hands Dirty!**
Lets embark on an exciting journey with OmniModalAgent:
**i. Basic Interaction**:
```python
from swarms.agents import OmniModalAgent
from swarms.models import OpenAIChat
llm = OpenAIChat(openai_api_key="sk-")
agent = OmniModalAgent(llm)
response = agent.run("Create an video of a swarm of fish concept art, game art")
print(response)
```
**ii. Dive into a Conversation**:
```python
agent = OmniModalAgent(llm)
print(agent.chat("What's the weather like?"))
```
---
**5. Error Handling: Because Bumps on the Road are Inevitable**
Errors are like rain, unpredictable but inevitable. Luckily, OmniModalAgent comes with an umbrella. If there's a hiccup during message processing, its gracious enough to let you know.
For instance, if there's a bump, youll receive:
```python
Error processing message: [Details of the error]
```
---
**6. Dive Deeper: Advanced Features and Usage**
The power of OmniModalAgent isnt just limited to basic interactions. Heres a sneak peek into its advanced capabilities:
**Streaming Responses**:
Imagine receiving responses as a gentle stream rather than a sudden splash. With the `_stream_response` method, you can achieve just that.
```python
for token in agent._stream_response(response):
print(token)
```
**The Treasure Chest: Tools**:
OmniModalAgent boasts a plethora of tools, from image captioning to speech-to-text. When you initialize the agent, it equips itself with these tools, ready to tackle any challenge you throw its way.
---
**7. Wrapping Up: The Road Ahead**
You've just scratched the surface of what OmniModalAgent can do. As you explore deeper, you'll discover more of its magic. The world of multi-modality is vast, and with OmniModalAgent as your companion, there's no limit to where you can go.
**Happy Exploring and Coding!** 🚀🎉

@ -12,7 +12,7 @@ Orchestrate enterprise-grade agents for multi-agent collaboration and orchestrat
</a>
</li>
<li>
<a href="./core-concepts/Memory">
<a href="swarms/memory/diy_memory.md">
Memory
</a>
</li>
@ -27,7 +27,7 @@ Orchestrate enterprise-grade agents for multi-agent collaboration and orchestrat
</a>
</li>
<li>
<a href="./core-concepts/Processes">
<a href="swarms/structs/agent_rearrange.md">
Multi-Agent Orchestration
</a>
</li>
@ -101,26 +101,6 @@ Orchestrate enterprise-grade agents for multi-agent collaboration and orchestrat
Self-Replicating Hiearchical Swarms
</a>
</li>
<li>
<a target='_blank' href="https://github.com/kyegomez/Swarms-examples/tree/main/stock_analysis">
Stock Analysis
</a>
</li>
<li>
<a target='_blank' href="https://github.com/kyegomez/Swarms-examples/tree/main/game-builder-crew">
Game Generator
</a>
</li>
<li>
<a target='_blank' href="https://github.com/kyegomez/Swarms-examples/tree/main/Swarms-LangGraph">
Drafting emails with LangGraph
</a>
</li>
<li>
<a target='_blank' href="https://github.com/kyegomez/Swarms-examples/tree/main/landing_page_generator">
Landing Page Generator
</a>
</li>
</ul>
</div>
<div style="width:25%">

@ -116,7 +116,6 @@ for name, prompt in zip(agent_names, system_prompts):
saved_state_path=f"{name.lower().replace(' ', '_')}.json",
# docs_folder="data", # Folder of docs to parse and add to the agent's memory
# long_term_memory=memory,
# dynamic_temperature_enabled=True,
# pdf_path="docs/medical_papers.pdf",
# list_of_pdf=["docs/medical_papers.pdf", "docs/medical_papers_2.pdf"],
# docs=["docs/medicalx_papers.pdf", "docs/medical_papers_2.txt"],

Loading…
Cancel
Save