[GPT4o]Docs]

1 year ago · 7f577acca3
parent 96e9cfd496
commit 7f577acca3
4 changed files with 507 additions and 569 deletions
--- a/docs/index.md
+++ b/docs/index.md
@ -1,6 +1,6 @@
 # Swarms Documentation
-Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, Swarms empowers agents to work together seamlessly, tackling complex tasks.
+Orchestrate enterprise-grade agents for multi-agent collaboration and orchestration to automate real-world problems.
 <div style="display:flex; margin:0 auto; justify-content: center;">
    <div style="width:25%">
@ -92,37 +92,37 @@ Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By
        <h2>Examples</h2>
        <ul>
            <li>
-                <a target='_blank' href="https://github.com/joaomdmoura/Swarms-examples/tree/main/prep-for-a-meeting">
+                <a target='_blank' href="https://github.com/kyegomez/Swarms-examples/tree/main/prep-for-a-meeting">
                    Prepare for meetings
                </a>
            </li>
            <li>
-                <a target='_blank' href="https://github.com/joaomdmoura/Swarms-examples/tree/main/trip_planner">
+                <a target='_blank' href="https://github.com/kyegomez/Swarms-examples/tree/main/trip_planner">
                    Trip Planner Crew
                </a>
            </li>
            <li>
-                <a target='_blank' href="https://github.com/joaomdmoura/Swarms-examples/tree/main/instagram_post">
+                <a target='_blank' href="https://github.com/kyegomez/Swarms-examples/tree/main/instagram_post">
                    Create Instagram Post
                </a>
            </li>
            <li>
-                <a target='_blank' href="https://github.com/joaomdmoura/Swarms-examples/tree/main/stock_analysis">
+                <a target='_blank' href="https://github.com/kyegomez/Swarms-examples/tree/main/stock_analysis">
                    Stock Analysis
                </a>
            </li>
            <li>
-                <a target='_blank' href="https://github.com/joaomdmoura/Swarms-examples/tree/main/game-builder-crew">
+                <a target='_blank' href="https://github.com/kyegomez/Swarms-examples/tree/main/game-builder-crew">
                    Game Generator
                </a>
            </li>
            <li>
-                <a target='_blank' href="https://github.com/joaomdmoura/Swarms-examples/tree/main/Swarms-LangGraph">
+                <a target='_blank' href="https://github.com/kyegomez/Swarms-examples/tree/main/Swarms-LangGraph">
                    Drafting emails with LangGraph
                </a>
            </li>
            <li>
-                <a target='_blank' href="https://github.com/joaomdmoura/Swarms-examples/tree/main/landing_page_generator">
+                <a target='_blank' href="https://github.com/kyegomez/Swarms-examples/tree/main/landing_page_generator">
                    Landing Page Generator
                </a>
            </li>
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@ -126,16 +126,17 @@ nav:
  - Contributors:
    - Contributing: "contributing.md"
 - Swarms Framework Reference:
  - Overview: "swarms/index.md"
  - swarms.models:
    - How to Create A Custom Language Model: "swarms/models/custom_model.md"
    - Deploying Azure OpenAI in Production A Comprehensive Guide: "swarms/models/azure_openai.md"
-    - Language Models Available:
+    - Language Models:
      - BaseLLM: "swarms/models/base_llm.md"
      - Overview: "swarms/models/index.md"
      - HuggingFaceLLM: "swarms/models/huggingface.md"
      - Anthropic: "swarms/models/anthropic.md"
      - OpenAIChat: "swarms/models/openai.md"
-    - MultiModal Models Available:
+    - MultiModal Models :
      - BaseMultiModalModel: "swarms/models/base_multimodal_model.md"
      - Fuyu: "swarms/models/fuyu.md"
      - Vilt: "swarms/models/vilt.md"
@ -144,6 +145,7 @@ nav:
      - Nougat: "swarms/models/nougat.md"
      - Dalle3: "swarms/models/dalle3.md"
      - GPT4VisionAPI: "swarms/models/gpt4v.md"
      - GPT4o: "swarms/models/gpt4o.md"
  - swarms.structs:
      - Foundational Structures:
        - Agent: "swarms/structs/agent.md"
--- a/docs/swarms/index.md
+++ b/docs/swarms/index.md
--- a/docs/swarms/models/gpt4o.md
+++ b/docs/swarms/models/gpt4o.md
@ -0,0 +1,150 @@
 # Documentation for GPT4o Module
 ## Overview and Introduction
 The `GPT4o` module is a multi-modal conversational model based on OpenAI's GPT-4 architecture. It extends the functionality of the `BaseMultiModalModel` class, enabling it to handle both text and image inputs for generating diverse and contextually rich responses. This module leverages the power of the GPT-4 model to enhance interactions by integrating visual information with textual prompts, making it highly relevant for applications requiring multi-modal understanding and response generation.
 ### Key Concepts
 - **Multi-Modal Model**: A model that can process and generate responses based on multiple types of inputs, such as text and images.
 - **System Prompt**: A predefined prompt to guide the conversation flow.
 - **Temperature**: A parameter that controls the randomness of the response generation.
 - **Max Tokens**: The maximum number of tokens (words or word pieces) in the generated response.
 ## Class Definition
 ### `GPT4o` Class
 ### Parameters
 | Parameter       | Type   | Description                                                                          |
 |-----------------|--------|--------------------------------------------------------------------------------------|
 | `system_prompt` | `str`  | The system prompt to be used in the conversation.                                     |
 | `temperature`   | `float`| The temperature parameter for generating diverse responses. Default is `0.1`.        |
 | `max_tokens`    | `int`  | The maximum number of tokens in the generated response. Default is `300`.            |
 | `openai_api_key`| `str`  | The API key for accessing the OpenAI GPT-4 API.                                       |
 | `*args`         |        | Additional positional arguments.                                                     |
 | `**kwargs`      |        | Additional keyword arguments.                                                        |
 ## Functionality and Usage
 ### `encode_image` Function
 The `encode_image` function is used to encode an image file into a base64 string format, which can then be included in the request to the GPT-4 API.
 #### Parameters
 | Parameter     | Type   | Description                                  |
 |---------------|--------|----------------------------------------------|
 | `image_path`  | `str`  | The local path to the image file to be encoded. |
 #### Returns
 | Return Type | Description                     |
 |-------------|---------------------------------|
 | `str`       | The base64 encoded string of the image. |
 ### `GPT4o.__init__` Method
 The constructor for the `GPT4o` class initializes the model with the specified parameters and sets up the OpenAI client.
 ### `GPT4o.run` Method
 The `run` method executes the GPT-4o model to generate a response based on the provided task and optional image.
 #### Parameters
 | Parameter     | Type   | Description                                        |
 |---------------|--------|----------------------------------------------------|
 | `task`        | `str`  | The task or user prompt for the conversation.      |
 | `local_img`   | `str`  | The local path to the image file.                  |
 | `img`         | `str`  | The URL of the image.                              |
 | `*args`       |        | Additional positional arguments.                   |
 | `**kwargs`    |        | Additional keyword arguments.                      |
 #### Returns
 | Return Type | Description                                      |
 |-------------|--------------------------------------------------|
 | `str`       | The generated response from the GPT-4o model.    |
 ## Usage Examples
 ### Example 1: Basic Text Prompt
 ```python
 from swarms import GPT4o
 # Initialize the model
 model = GPT4o(
    system_prompt="You are a helpful assistant.",
    temperature=0.7,
    max_tokens=150,
    openai_api_key="your_openai_api_key"
 )
 # Define the task
 task = "What is the capital of France?"
 # Generate response
 response = model.run(task)
 print(response)
 ```
 ### Example 2: Text Prompt with Local Image
 ```python
 from swarms import GPT4o
 # Initialize the model
 model = GPT4o(
    system_prompt="Describe the image content.",
    temperature=0.5,
    max_tokens=200,
    openai_api_key="your_openai_api_key"
 )
 # Define the task and image path
 task = "Describe the content of this image."
 local_img = "path/to/your/image.jpg"
 # Generate response
 response = model.run(task, local_img=local_img)
 print(response)
 ```
 ### Example 3: Text Prompt with Image URL
 ```python
 from swarms import GPT4o
 # Initialize the model
 model = GPT4o(
    system_prompt="You are a visual assistant.",
    temperature=0.6,
    max_tokens=250,
    openai_api_key="your_openai_api_key"
 )
 # Define the task and image URL
 task = "What can you tell about the scenery in this image?"
 img_url = "http://example.com/image.jpg"
 # Generate response
 response = model.run(task, img=img_url)
 print(response)
 ```
 ## Additional Information and Tips
 - **API Key Management**: Ensure that your OpenAI API key is securely stored and managed. Do not hard-code it in your scripts. Use environment variables or secure storage solutions.
 - **Image Encoding**: The `encode_image` function is crucial for converting images to a base64 format suitable for API requests. Ensure that the images are accessible and properly formatted.
 - **Temperature Parameter**: Adjust the `temperature` parameter to control the creativity of the model's responses. Lower values make the output more deterministic, while higher values increase randomness.
 - **Token Limit**: Be mindful of the `max_tokens` parameter to avoid exceeding the API's token limits. This parameter controls the length of the generated responses.
 ## References and Resources
 - [OpenAI API Documentation](https://beta.openai.com/docs/)
 - [Python Base64 Encoding](https://docs.python.org/3/library/base64.html)
 - [dotenv Documentation](https://saurabh-kumar.com/python-dotenv/)
 - [BaseMultiModalModel Documentation](https://swarms.apac.ai)