From 7ca26e0162415d0f14a99d907a5ac90b6652a1d1 Mon Sep 17 00:00:00 2001
From: Kye Gomez
Date: Sun, 6 Apr 2025 08:59:17 +0800
Subject: [PATCH] llama4 models

---
 docs/swarms/examples/llama4.md | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/docs/swarms/examples/llama4.md b/docs/swarms/examples/llama4.md
index 4367bc1c..1e2b9e77 100644
--- a/docs/swarms/examples/llama4.md
+++ b/docs/swarms/examples/llama4.md
@@ -19,9 +19,21 @@ load_dotenv()
 
 model = VLLM(model_name="meta-llama/Llama-4-Maverick-17B-128E")
 ```
 
-!!! tip "Environment Setup"
-    Make sure to set up your environment variables properly before running the code.
-    Create a `.env` file in your project root if needed.
+## Available Models
+
+| Model Name | Description | Type |
+|------------|-------------|------|
+| meta-llama/Llama-4-Maverick-17B-128E | Base model with 128 experts | Base |
+| meta-llama/Llama-4-Maverick-17B-128E-Instruct | Instruction-tuned version with 128 experts | Instruct |
+| meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | FP8 quantized instruction model | Instruct (Optimized) |
+| meta-llama/Llama-4-Scout-17B-16E | Base model with 16 experts | Base |
+| meta-llama/Llama-4-Scout-17B-16E-Instruct | Instruction-tuned version with 16 experts | Instruct |
+
+!!! tip "Model Selection"
+    - Choose Instruct models for better performance on instruction-following tasks
+    - FP8 models offer better memory efficiency with minimal performance impact
+    - Scout models (16E) are lighter but still powerful
+    - Maverick models (128E) offer maximum performance but require more resources
 
 ## Detailed Implementation
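The model table and selection tips added by the patch above can be encoded as a small helper. This is a hypothetical sketch, not part of the swarms API: the `LLAMA4_MODELS` mapping and `pick_llama4_model` function are illustrative names, with only the model identifiers taken from the patch.

```python
# Hypothetical lookup of the Llama 4 model names listed in the patched docs.
LLAMA4_MODELS = {
    ("maverick", "base"): "meta-llama/Llama-4-Maverick-17B-128E",
    ("maverick", "instruct"): "meta-llama/Llama-4-Maverick-17B-128E-Instruct",
    ("maverick", "instruct-fp8"): "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    ("scout", "base"): "meta-llama/Llama-4-Scout-17B-16E",
    ("scout", "instruct"): "meta-llama/Llama-4-Scout-17B-16E-Instruct",
}


def pick_llama4_model(family: str = "scout", variant: str = "instruct") -> str:
    """Return a Hugging Face model id.

    Per the docs' guidance: Scout (16E) is lighter, Maverick (128E) is
    stronger but needs more resources; FP8 trades little quality for memory.
    """
    return LLAMA4_MODELS[(family, variant)]


# The chosen id would then be passed to the VLLM wrapper shown in the docs,
# e.g. VLLM(model_name=pick_llama4_model("maverick", "instruct-fp8")).
print(pick_llama4_model("maverick", "instruct-fp8"))
```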