parent
4480ead6aa
commit
e236416bf9
@ -0,0 +1,124 @@
# `Idea2Image` Documentation

## Table of Contents

1. [Introduction](#introduction)
2. [Idea2Image Class](#idea2image-class)
   - [Initialization Parameters](#initialization-parameters)
3. [Methods and Usage](#methods-and-usage)
   - [llm_prompt Method](#llm-prompt-method)
   - [generate_image Method](#generate-image-method)
4. [Examples](#examples)
   - [Example 1: Generating an Image](#example-1-generating-an-image)
5. [Additional Information](#additional-information)
6. [References and Resources](#references-and-resources)

---

## 1. Introduction <a name="introduction"></a>

This guide documents the `Idea2Image` class from the Swarms library: what it is for, how to construct it, and how to use its methods to turn a text prompt into images.

### 1.1 Purpose

The `Idea2Image` class generates images from textual descriptions. It first refines the user's prompt with an OpenAI chat model (through `OpenAIChat`) and then passes the refined prompt to DALLE-3.

### 1.2 Key Features

- **Image Generation:** Swarms lets you generate images from natural language prompts, bridging textual descriptions and visual content.

- **Integration with DALLE-3:** The `Idea2Image` class uses DALLE-3 to create images that match the given textual description.

- **Language Model Integration:** The class uses an OpenAI chat model to refine the prompt, which makes the prompt sent to DALLE-3 more specific.

---

## 2. Idea2Image Class <a name="idea2image-class"></a>

The `Idea2Image` class is the Swarms component that generates images from text prompts.

### 2.1 Initialization Parameters <a name="initialization-parameters"></a>

The `Idea2Image` class accepts the following initialization parameters:

- `image` (str): Text prompt for the image to generate. Required.

- `openai_api_key` (str): OpenAI API key, used for prompt refinement. If not provided, the class falls back to the `OPENAI_API_KEY` environment variable.

- `cookie` (str): Cookie value for DALLE-3, used to interact with the DALLE-3 service. If not provided, the class falls back to the `BING_COOKIE` environment variable.

- `output_folder` (str): Folder in which to save the generated images. Defaults to `"images/"`.
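
The fallback behaviour described in the list above can be sketched with a small stand-in dataclass. `ImageJobSettings` is a hypothetical name used only for illustration and is not part of Swarms; the sketch assumes the fallback should be resolved when the object is created, which is why it uses `default_factory`. The key and cookie values are placeholders.

```python
import os
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ImageJobSettings:
    """Hypothetical stand-in mirroring the four documented parameters."""

    image: str
    # default_factory runs at instantiation, so the environment is read
    # each time an object is created rather than once at import time.
    openai_api_key: Optional[str] = field(
        default_factory=lambda: os.getenv("OPENAI_API_KEY") or None
    )
    cookie: Optional[str] = field(
        default_factory=lambda: os.getenv("BING_COOKIE") or None
    )
    output_folder: str = "images/"


# Placeholder value, set here only so the sketch runs on its own.
os.environ["BING_COOKIE"] = "cookie-from-env"

# An explicit argument wins; an omitted one falls back to the environment.
settings = ImageJobSettings(image="Happy fish.", openai_api_key="explicit-key")
print(settings.openai_api_key)
print(settings.cookie)
print(settings.output_folder)
```

Passing credentials through environment variables keeps them out of source files; the explicit arguments remain useful in tests and scripts that manage several accounts.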

### 2.2 Methods <a name="methods"></a>

The `Idea2Image` class provides the following methods:

- `llm_prompt()`: Returns the instruction prompt that asks the language model to refine the user's image description and make it more specific.

- `generate_image()`: Generates and downloads images for the prompt. It refines the prompt, opens the website with the query, retrieves the image URLs, and downloads the images to the specified folder.

---

## 3. Methods and Usage <a name="methods-and-usage"></a>

This section describes each method of the `Idea2Image` class in more detail.

### 3.1 `llm_prompt` Method <a name="llm-prompt-method"></a>

The `llm_prompt` method builds and returns the prompt that is sent to the language model. The prompt contains a guide for writing effective DALLE-3 prompts, followed by the user's original text from `image`. The method does not call the language model itself; the refinement happens when `generate_image` passes this prompt to the model.
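
As a rough illustration of this kind of prompt, the sketch below wraps a user's text in a short refinement guide. `build_refinement_prompt` and the abbreviated `GUIDE` are hypothetical and written for this example; the guide used by the class is longer.

```python
# Abbreviated, illustrative guide; not the full guide used by the class.
GUIDE = (
    "- Use natural language prompts up to 400 characters.\n"
    "- Describe the art style at the end of the prompt.\n"
)


def build_refinement_prompt(user_prompt: str, guide: str = GUIDE) -> str:
    """Wrap the user's text in instructions for the refining model."""
    return (
        "Refine the USER prompt to create a more precise image.\n\n"
        "###### FOLLOW THE GUIDE BELOW TO REFINE THE PROMPT ######\n\n"
        f"{guide}\n"
        "###### END OF GUIDE ######\n\n"
        f"Prompt to refine: {user_prompt}\n"
    )


prompt = build_refinement_prompt("Happy fish.")
print(prompt)
```

Placing the user's text last, after the guide, keeps the instructions and the material to be refined clearly separated.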

### 3.2 `generate_image` Method <a name="generate-image-method"></a>

The `generate_image` method runs the whole pipeline. It calls `llm_prompt` to build the refinement prompt, sends that prompt to the language model, submits the refined prompt to DALLE-3, retrieves the resulting image URLs, and downloads the images to `output_folder`.
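
The order of these steps can be sketched without any network access. In the sketch below, `FakeImageClient` is a hypothetical stand-in for the DALLE-3 client, `run_pipeline` is an illustrative free function rather than the class method, and `str.upper` stands in for the language model. Only the step order (refine, create, get URLs, download) is taken from the description above.

```python
from typing import Callable, List


class FakeImageClient:
    """Hypothetical stand-in for the DALLE-3 client; performs no network I/O."""

    def __init__(self) -> None:
        self.submitted: List[str] = []
        self.downloaded: List[str] = []

    def create(self, prompt: str) -> None:
        # Record the prompt instead of opening a website.
        self.submitted.append(prompt)

    def get_urls(self) -> List[str]:
        # Return made-up URLs on a reserved, non-resolving domain.
        return [f"https://example.invalid/image-{i}.png" for i in range(2)]

    def download(self, urls: List[str], folder: str) -> None:
        # Record the target paths instead of writing files.
        self.downloaded.extend(f"{folder}{url.rsplit('/', 1)[-1]}" for url in urls)


def run_pipeline(
    user_prompt: str,
    refine: Callable[[str], str],
    client: FakeImageClient,
    output_folder: str = "images/",
) -> List[str]:
    refined = refine(f"Prompt to refine: {user_prompt}")  # refine the prompt
    client.create(refined)                                # submit the query
    urls = client.get_urls()                              # collect image URLs
    client.download(urls, output_folder)                  # save the images
    return urls


client = FakeImageClient()
urls = run_pipeline("Happy fish.", refine=str.upper, client=client)
print(client.submitted)
print(client.downloaded)
```

Injecting the refiner and the client as arguments is a design choice made for this sketch; it lets the pipeline be exercised in tests without credentials.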

---

## 4. Examples <a name="examples"></a>

Let's dive into practical examples to demonstrate the usage of the `Idea2Image` class.

### 4.1 Example 1: Generating an Image <a name="example-1-generating-an-image"></a>

In this example, we create an instance of the `Idea2Image` class and use it to generate an image based on a text prompt:

```python
from swarms.agents import Idea2Image

# Create an instance of the Idea2Image class with your prompt and API keys
idea2image = Idea2Image(
    image="Fish hivemind swarm in light blue avatar anime in zen garden pond concept art anime art, happy fish, anime scenery",
    openai_api_key="your_openai_api_key_here",
    cookie="your_cookie_value_here",
)

# Generate and download the image
idea2image.generate_image()
```

---

## 5. Additional Information <a name="additional-information"></a>

Here are some additional tips for using the `Idea2Image` class effectively:

- Refining the prompt is a crucial step for influencing the style, composition, and mood of the generated image. Follow the guide embedded in the prompt returned by `llm_prompt` to write precise prompts.

- Experiment with different prompts, variations, and editing techniques to create unique and interesting images.

- You can combine separate DALLE-3 outputs into panoramas and murals by careful positioning and editing.

- Consider sharing your creations and exploring resources in communities like Reddit r/dalle2 for inspiration and tools.

- The `output_folder` parameter specifies the folder where generated images are saved. Ensure that you have permission to write to that folder.
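
One way to check the output folder before generating images is sketched below. `ensure_writable_folder` is a hypothetical helper written for this example and is not part of Swarms; it uses only the standard library and assumes that creating a missing folder is acceptable.

```python
import os
import tempfile


def ensure_writable_folder(path: str) -> str:
    """Create the folder if it is missing and confirm it can be written to."""
    os.makedirs(path, exist_ok=True)
    if not os.access(path, os.W_OK):
        raise PermissionError(f"Cannot write to output folder: {path}")
    return path


# Demonstrate on a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as base:
    folder = ensure_writable_folder(os.path.join(base, "images"))
    print(os.path.isdir(folder))
```

Running a check like this first surfaces a permissions problem before any time is spent on prompt refinement or image generation.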

---

## 6. References and Resources <a name="references-and-resources"></a>

For further information and resources related to the Swarms library and DALLE-3:

- [Bing Image Creator](https://www.bing.com/images/create): The Bing page for DALLE-3 image generation, which the unofficial DALLE-3 API interacts with using the `BING_COOKIE` value.

- [OpenAI API Documentation](https://beta.openai.com/docs/): The documentation for the OpenAI models used for prompt refinement.

This concludes the documentation for the `Idea2Image` class. You now have a guide to generating images from text prompts using DALLE-3 and an OpenAI language model with Swarms.
@ -0,0 +1,111 @@
import os
import logging
from dataclasses import dataclass
from dalle3 import Dalle
from swarms.models import OpenAIChat


@dataclass
class Idea2Image:
    """
    A class used to generate images from text prompts using DALLE-3.

    ...

    Attributes
    ----------
    image : str
        Text prompt for the image to generate
    openai_api_key : str
        OpenAI API key
    cookie : str
        Cookie value for DALLE-3
    output_folder : str
        Folder to save the generated images

    Methods
    -------
    llm_prompt():
        Returns a prompt for refining the image generation
    generate_image():
        Generates and downloads the image based on the prompt


    Usage:
    ------
    from dalle3 import Idea2Image

    idea2image = Idea2Image(
        image="Fish hivemind swarm in light blue avatar anime in zen garden pond concept art anime art, happy fish, anime scenery"
    )
    idea2image.generate_image()
    """

    image: str
    openai_api_key: str = os.getenv("OPENAI_API_KEY") or None
    cookie: str = os.getenv("BING_COOKIE") or None
    output_folder: str = "images/"

    def __post_init__(self):
        self.llm = OpenAIChat(openai_api_key=self.openai_api_key)
        self.dalle = Dalle(self.cookie)

    def llm_prompt(self):
        LLM_PROMPT = f"""
        Refine the USER prompt to create a more precise image tailored to the user's needs using
        an image generator like DALLE-3.

        ###### FOLLOW THE GUIDE BELOW TO REFINE THE PROMPT ######

        - Use natural language prompts up to 400 characters to describe the image you want to generate. Be as specific or vague as needed.

        - Frame your photographic prompts like camera position, lighting, film type, year, usage context. This implicitly suggests image qualities.

        - For illustrations, you can borrow photographic terms like "close up" and prompt for media, style, artist, animation style, etc.

        - Prompt hack: name a film/TV show genre + year to "steal the look" for costumes, lighting, etc without knowing technical details.

        - Try variations of a prompt, make edits, and do recursive uncropping to create interesting journeys and zoom-out effects.

        - Use an image editor like Photopea to uncrop DALL-E outputs and prompt again to extend the image.

        - Combine separate DALL-E outputs into panoramas and murals with careful positioning/editing.

        - Browse communities like Reddit r/dalle2 to get inspired and share your creations. See tools, free image resources, articles.

        - Focus prompts on size, structure, shape, mood, aesthetics to influence the overall vibe and composition.

        - Be more vague or detailed as needed - DALL-E has studied over 400M images and can riff creatively or replicate specific styles.

        - Be descriptive, describe the art style at the end like fusing concept art with anime art or game art or product design art.

        ###### END OF GUIDE ######

        Prompt to refine: {self.image}
        """
        return LLM_PROMPT

    def generate_image(self):
        """
        Generates and downloads the image based on the prompt.

        This method refines the prompt using the llm, opens the website with the query,
        gets the image URLs, and downloads the images to the specified folder.
        """
        # Set up logging
        logging.basicConfig(level=logging.INFO)

        # Refine the prompt using the llm
        image = self.llm_prompt()
        refined_prompt = self.llm(image)
        print(f"Refined prompt: {refined_prompt}")

        # Open the website with your query
        self.dalle.create(refined_prompt)

        # Get the image URLs
        urls = self.dalle.get_urls()

        # Download the images to your specified folder
        self.dalle.download(urls, self.output_folder)

@ -0,0 +1,59 @@
import pytest
import os
import shutil
from swarms.idea2image import Idea2Image

openai_key = os.getenv("OPENAI_API_KEY")
dalle_cookie = os.getenv("BING_COOKIE")

# Constants for testing
TEST_PROMPT = "Happy fish."
TEST_OUTPUT_FOLDER = "test_images/"
OPENAI_API_KEY = openai_key
DALLE_COOKIE = dalle_cookie


@pytest.fixture(scope="module")
def idea2image_instance():
    # Create an instance of the Idea2Image class
    idea2image = Idea2Image(
        image=TEST_PROMPT,
        openai_api_key=OPENAI_API_KEY,
        cookie=DALLE_COOKIE,
        output_folder=TEST_OUTPUT_FOLDER,
    )
    yield idea2image
    # Clean up the test output folder after testing
    if os.path.exists(TEST_OUTPUT_FOLDER):
        shutil.rmtree(TEST_OUTPUT_FOLDER)


def test_idea2image_instance(idea2image_instance):
    # Check if the instance is created successfully
    assert isinstance(idea2image_instance, Idea2Image)


def test_llm_prompt(idea2image_instance):
    # Test the llm_prompt method
    prompt = idea2image_instance.llm_prompt()
    assert isinstance(prompt, str)


def test_generate_image(idea2image_instance):
    # Test the generate_image method
    idea2image_instance.generate_image()
    # Check if the output folder is created
    assert os.path.exists(TEST_OUTPUT_FOLDER)
    # Check if files are downloaded (assuming DALLE-3 responds with URLs)
    files = os.listdir(TEST_OUTPUT_FOLDER)
    assert len(files) > 0


def test_invalid_openai_api_key():
    # Test with an invalid OpenAI API key
    with pytest.raises(Exception) as exc_info:
        Idea2Image(
            image=TEST_PROMPT,
            openai_api_key="invalid_api_key",
            cookie=DALLE_COOKIE,
            output_folder=TEST_OUTPUT_FOLDER,
        )
    # Inspect the exception only after the with block has exited
    assert "Failed to initialize OpenAIChat" in str(exc_info.value)


if __name__ == "__main__":
    pytest.main()