You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
swarms/docs/examples/jarvis_agent.md

6.3 KiB

Gemini Nano Banana with Jarvis Agent by Swarms

Overview

Gemini Nano Banana, powered by Google's Gemini 2.5 Flash Image Preview, is now integrated with Swarms agents via the Jarvis Agent. This enables advanced image generation, editing, and analysis in Swarms workflows, with secure, real-time processing and flexible deployment.

Industry Use Case Business Impact
Healthcare Medical image analysis, radiology report generation, patient data visualization Faster diagnosis, reduced errors, improved patient outcomes
Manufacturing Quality control inspection, defect detection, predictive maintenance visualization Reduced waste, improved efficiency, proactive maintenance
Retail Product catalog generation, visual search, inventory management Enhanced customer experience, automated workflows, reduced manual labor
Real Estate Property visualization, virtual tours, market analysis charts Improved sales conversion, remote property viewing, data-driven insights
Insurance Claims processing, damage assessment, fraud detection Accelerated claims settlement, reduced fraud losses, improved accuracy
Financial Services Risk visualization, market trend charts, document processing Enhanced decision-making, automated reporting, regulatory compliance

Get Started

Prerequisites

Before using Gemini Nano Banana models, ensure you have:

  1. Swarms Framework: Install the latest version of Swarms

    pip install swarms
    
  2. API Access: Configure your Gemini API credentials

    export GEMINI_API_KEY="your-api-key-here"
    
  3. Image Dependencies: For image processing tasks, ensure you have the necessary image libraries

    pip install pillow requests
    

Model Configuration

The Gemini models are accessed through the gemini/gemini-2.5-flash-image-preview model identifier, which provides optimized performance for image-related tasks.

Image Generation

Create a new Python file (e.g., img_gen_nano_banana.py) and use the following code to generate images from text prompts:

from swarms import Agent

IMAGE_GEN_SYSTEM_PROMPT = (
    "You are an advanced image generation agent. Given a textual description, generate a high-quality, photorealistic image that matches the prompt. "
    "Return only the generated image."
)

image_gen_agent = Agent(
    agent_name="Image-Generation-Agent",
    agent_description="Agent specialized in generating high-quality, photorealistic images from textual prompts.",
    model_name="gemini/gemini-2.5-flash-image-preview",  # Replace with your preferred image generation model if available
    dynamic_temperature_enabled=True,
    max_loops=1,
    dynamic_context_window=True,
    retry_interval=1,
)

image_gen_out = image_gen_agent.run(
    task=f"{IMAGE_GEN_SYSTEM_PROMPT} \n\n Generate a photorealistic image of a futuristic city skyline at sunset.",
)

print("Image Generation Output:")
print(image_gen_out)

Image Editing and Annotation

For image editing and annotation tasks, create a new file (e.g., jarvis_agent.py) with the following code:

from swarms import Agent


SYSTEM_PROMPT = (
    "You are a location-based AR experience generator. Highlight points of interest in this image and annotate relevant information about it. "
    "Return the image only."
)

# Agent for AR annotation
agent = Agent(
    agent_name="Tactical-Strategist-Agent",
    agent_description="Agent specialized in tactical strategy, scenario analysis, and actionable recommendations for complex situations.",
    model_name="gemini/gemini-2.5-flash-image-preview",
    dynamic_temperature_enabled=True,
    max_loops=1,
    dynamic_context_window=True,
    retry_interval=1,
)

out = agent.run(
    task=f"{SYSTEM_PROMPT} \n\n Annotate all the tallest buildings in the image",
    img="hk.jpg",
)

Additional Features

Custom System Prompts

Prompt Type Example Enterprise Use Case
Architectural Analysis "Analyze building structures and materials" Real estate due diligence
Medical Imaging "Highlight anomalies in medical scans" Healthcare diagnostics
Artistic Enhancement "Enhance colors and composition" Marketing content creation
Object Detection "Identify and label all objects" Quality control automation

Optimization Strategies

Parameter Current Setting Alternative Impact Use Case
max_loops 1 2-3 Higher quality but slower Complex image editing
dynamic_temperature_enabled True False More consistent results Production environments
retry_interval 1 2-5 Better error handling Unstable connections
dynamic_context_window True False Memory efficient Large images

Security and Compliance

Gemini Nano Banana offers robust enterprise security, including end-to-end data encryption, role-based access controls, and comprehensive audit logging to ensure compliance and privacy. The platform meets major industry certifications, making it suitable for healthcare, finance, and other regulated sectors.

Security Feature Certification
Data Encryption SOC 2 Type II
Access Controls (RBAC) HIPAA, PCI DSS
Audit Logging ISO 27001

Connect With Us

If you'd like technical support, join our Discord below and stay updated on our Twitter for new updates!

Platform Link Description
📚 Documentation docs.swarms.world Official documentation and guides
📝 Blog Medium Latest updates and technical articles
💬 Discord Join Discord Live chat and community support
🐦 Twitter @kyegomez Latest news and announcements
👥 LinkedIn The Swarm Corporation Professional network and updates
📺 YouTube Swarms Channel Tutorials and demos
🎫 Events Sign up here Join our community events