From d5c0c6554813de0d9e438017640c48c3e19f7702 Mon Sep 17 00:00:00 2001
From: Kye
Date: Wed, 4 Oct 2023 16:14:41 -0400
Subject: [PATCH] clean up omni modal chat with streaming, etc

---
 docs/swarms/agents/omni_agent.md | 79 +++++++++++++++++++-------------
 1 file changed, 46 insertions(+), 33 deletions(-)

diff --git a/docs/swarms/agents/omni_agent.md b/docs/swarms/agents/omni_agent.md
index cf65b380..046586d2 100644
--- a/docs/swarms/agents/omni_agent.md
+++ b/docs/swarms/agents/omni_agent.md
@@ -9,55 +9,68 @@ The `OmniModalAgent` class is at the core of an architecture designed to facilit
 3. **Task Executor**: As the name suggests, it's responsible for executing the formulated tasks.
 4. **Tools**: A collection of tools and utilities used to process different types of tasks. They span across areas like image captioning, translation, and more.
 
+
+
 ## Structure & Organization
 
 ### Table of Contents:
-1. Introduction
-2. Architectural Analysis
-3. Methods
-   - Initialization (`__init__`)
-   - Agent Runner (`run`)
-4. Usage Examples
+1. Class Introduction and Architecture
+2. Constructor (`__init__`)
+3. Core Methods
+   - `run`
+   - `chat`
+   - `_stream_response`
+4. Example Usage
 5. Error Messages & Exception Handling
-6. Summary
-
-### Methods
+6. Summary & Further Reading
 
-#### Initialization (`__init__`):
-This method initializes the agent with a given language model and loads a plethora of tools.
-Parameters:
-- **llm (BaseLanguageModel)**: The language model for the agent.
+### Constructor (`__init__`):
+The agent is initialized with a language model (`llm`). During initialization, the agent loads a myriad of tools to facilitate a broad spectrum of tasks, from document querying to image transformations.
 
-During initialization, various tools like "document-question-answering", "image-captioning", and more are loaded.
+### Core Methods:
+#### 1. `run(self, input: str) -> str`:
+Executes the OmniAgent. The agent plans its actions based on the user's input, executes those actions, and then uses a response generator to construct its reply.
 
-#### Agent Runner (`run`):
-`run(self, input: str) -> str`:
+#### 2. `chat(self, msg: str, streaming: bool) -> str`:
+Facilitates an interactive chat with the agent. It processes user messages, handles exceptions, and returns a response, either in streaming format or as a whole string.
 
-This method represents the primary operation of the OmniModalAgent. It takes an input, devises a plan using the chat planner, executes the plan with the task executor, and finally, the response generator crafts a response based on the tasks executed.
-
-Parameters:
-- **input (str)**: The input string provided by the user.
-
-Returns:
-- **response (str)**: The generated response after executing the plan.
+#### 3. `_stream_response(self, response: str)`:
+For streaming mode, this function yields the response token by token, ensuring a smooth output flow.
 
 ## Examples & Use Cases
-
-### Usage:
+Initialize the `OmniModalAgent` and communicate with it:
 ```python
 from swarms import OmniModalAgent, OpenAIChat
-
-llm = OpenAIChat()
-agent = OmniModalAgent(llm)
-response = agent.run("Hello, how are you? Create an image of how you are doing!")
+llm_instance = OpenAIChat()
+agent = OmniModalAgent(llm_instance)
+response = agent.run("Translate 'Hello' to French.")
 print(response)
 ```
-This example showcases the instantiation of the OmniModalAgent with a language model and then running the agent with a sample input.
+
+For a chat-based interaction:
+```python
+agent = OmniModalAgent(llm_instance)
+print(agent.chat("How are you doing today?"))
+```
 
 ## Error Messages & Exception Handling
-Currently, the provided code does not specify particular errors or exceptions. However, future iterations might include error handling mechanisms to cater to issues like tool loading failures, task execution errors, etc.
+The `chat` method in `OmniModalAgent` incorporates exception handling. When an error arises during message processing, it returns a formatted error message detailing the exception. This approach ensures that users receive informative feedback in case of unexpected situations.
+
+For example, if there's an internal processing error, the chat function would return:
+```
+Error processing message: [Specific error details]
+```
 
 ## Summary
-The `OmniModalAgent` is a robust framework designed to assimilate multiple tools and processes into a singular architecture. It aids in understanding, planning, executing, and responding to user inputs in a comprehensive manner. Developers aiming to integrate advanced interactions spanning multiple domains will find this class invaluable.
+`OmniModalAgent` epitomizes the fusion of various AI tools, planners, and executors into one cohesive unit, providing a comprehensive interface for diverse tasks and modalities. The versatility and robustness of this agent make it indispensable for applications desiring to bridge multiple AI functionalities in a unified manner.
+
+For more extensive documentation, API references, and advanced use-cases, users are advised to refer to the primary documentation repository associated with the parent project. Regular updates, community feedback, and patches can also be found there.
+
+
+
+
+
+
+
+
 
-For further details on the internal tools and modules like `BaseLanguageModel`, `TaskExecutor`, etc., refer to their respective documentation.
\ No newline at end of file
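
Note on the documented `chat` contract: the patch describes `chat` as returning either a whole string or a token-by-token stream, and as returning a formatted error string when processing fails. The sketch below illustrates that contract only. `SketchAgent` and `fake_run` are hypothetical stand-ins, not part of `swarms`. Splitting on whitespace is an assumed tokenization, and the `streaming=False` default is an assumption inferred from the one-argument `agent.chat(...)` call in the example.

```python
from typing import Callable, Iterator, Union


class SketchAgent:
    """Hypothetical stand-in for the documented agent; not the swarms class."""

    def __init__(self, run_fn: Callable[[str], str]) -> None:
        # run_fn plays the role of plan -> execute -> generate response.
        self._run_fn = run_fn

    def run(self, input: str) -> str:
        return self._run_fn(input)

    def _stream_response(self, response: str) -> Iterator[str]:
        # Assumption: a "token" is a whitespace-separated word.
        for token in response.split():
            yield token

    def chat(self, msg: str, streaming: bool = False) -> Union[str, Iterator[str]]:
        try:
            response = self.run(msg)
        except Exception as error:
            # Mirrors the documented "Error processing message: ..." format.
            return f"Error processing message: {error}"
        if streaming:
            return self._stream_response(response)
        return response


def fake_run(text: str) -> str:
    # Illustrative backend: rejects empty input, otherwise echoes it.
    if not text:
        raise ValueError("empty input")
    return f"echo: {text}"


agent = SketchAgent(fake_run)
whole = agent.chat("hello there")
tokens = list(agent.chat("hello there", streaming=True))
failure = agent.chat("")
```

If the real method hands back the generator from `_stream_response` directly, as this sketch does, the `-> str` annotation in the documented signature would only describe the non-streaming path.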