Available Models

| Model Name            | Description                                                                                             | Input Price  | Output Price | Use Cases                                                              |
|-----------------------|---------------------------------------------------------------------------------------------------------|--------------|--------------|------------------------------------------------------------------------|
| **Llama3-70b**        | Llama 3 is an auto-regressive language model that uses an optimized transformer architecture.          | $0.80/1M Tokens | $1.60/1M Tokens | General natural language processing tasks.                             |
| **Llava-Internlm2-20b** | LLaVA model fine-tuned from InternLM2-Chat-20B and CLIP-ViT-Large-patch14-336.                         | Contact for pricing | Contact for pricing | Enhanced language understanding integrated with visual processing.    |
| **Llama-3-Giraffe-70B** | Abacus.AI's "longer-necked" (extended-context) variant of Llama 3 70B.                                  | $1/1M Tokens | $2/1M Tokens | Extensive natural language tasks with a focus on depth and efficiency. |
| **Qwen-vl**            | Qwen VL for real-world multi-modal function calling.                                                   | $5/1M Tokens | $10/1M Tokens | Multi-modal interactions and function handling in complex environments.|
| **XComposer2-4khd-7b** | One of the highest-performing VLMs (vision-language models).                                           | $4/1M Tokens | $8/1M Tokens | High-resolution image processing and understanding.                     |
| **Llava-Llama-3**      | Llama3 with Multi-Modal Processing.                                                                    | $5/1M Tokens | $10/1M Tokens | Advanced multi-modal scenarios involving language and image processing. |
| **cogvlm-chat-17b**    | Multimodal model designed to understand and reason about visual elements in images.                    | $5/1M Tokens | $10/1M Tokens | Image-based chatbots and interactive systems.                           |
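
As a rough illustration of how the per-token prices above translate into a bill, the sketch below estimates the cost of a single workload. It assumes billing is strictly linear in input and output token counts, which this page does not confirm, and it says nothing about how image inputs are counted as tokens. The `PRICES_PER_MILLION` mapping and the `estimate_cost` helper are hypothetical names made up for this example; they are not part of any Swarms Cloud API.

```python
from decimal import Decimal

# (input price, output price) in USD per 1M tokens, copied from the table above.
# Llava-Internlm2-20b is left out because its pricing is "Contact for pricing".
PRICES_PER_MILLION = {
    "Llama3-70b": (Decimal("0.80"), Decimal("1.60")),
    "Llama-3-Giraffe-70B": (Decimal("1"), Decimal("2")),
    "Qwen-vl": (Decimal("5"), Decimal("10")),
    "XComposer2-4khd-7b": (Decimal("4"), Decimal("8")),
    "Llava-Llama-3": (Decimal("5"), Decimal("10")),
    "cogvlm-chat-17b": (Decimal("5"), Decimal("10")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload, assuming linear per-token billing."""
    input_price, output_price = PRICES_PER_MILLION[model]
    million = Decimal(1_000_000)
    return (input_price * input_tokens + output_price * output_tokens) / million


if __name__ == "__main__":
    # 250,000 input tokens and 50,000 output tokens on Llama3-70b.
    print(estimate_cost("Llama3-70b", 250_000, 50_000))
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding noise.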

What models should we add?

Book a call with us to learn more about your needs: