[swarms.utils][+++][DOCS] [TESTS]

pull/336/head
Kye 1 year ago
parent f8a2563228
commit a876838efa

@ -0,0 +1,86 @@
# check_device
# Module/Function Name: check_device
`check_device` is a utility function in `swarms.utils`, built on PyTorch, that identifies and returns the appropriate device(s) for computation. If CUDA is not available, it returns the CPU device. If CUDA is available, it returns a list of all available GPU devices.
The function examines the CUDA availability, checks for multiple GPUs, and finds additional properties for each device.
## Function Signature and Arguments
**Signature:**
```python
def check_device(
    log_level: Any = logging.INFO,
    memory_threshold: float = 0.8,
    capability_threshold: float = 3.5,
    return_type: str = "list",
) -> Union[torch.device, List[torch.device]]:
```
| Parameter | Data Type | Default Value | Description |
| ------------- | ------------- | ------------- | ------------- |
| `log_level` | Any | logging.INFO | The log level. |
| `memory_threshold` | float | 0.8 | It is used to check the threshold of memory used on the GPU(s). |
| `capability_threshold` | float | 3.5 | It is used to consider only those GPU(s) which have higher compute capability compared to the threshold. |
| `return_type` | str | "list" | Depending on the `return_type` either a list of devices can be returned or a single device. |
This function does not take any mandatory argument. However, it supports optional arguments such as `log_level`, `memory_threshold`, `capability_threshold`, and `return_type`.
**Returns:**
- A single `torch.device` or a list of `torch.device` objects, depending on `return_type` and on how many CUDA devices are available. If CUDA is not available, the CPU device is returned.
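Because the return value can be either one device or a list of devices, calling code often normalizes it before use. The following is a minimal sketch of that idea, not part of `swarms.utils`: the helper name `as_device_list` is hypothetical, and plain strings stand in for `torch.device` objects so the snippet runs without PyTorch or a GPU.

```python
from typing import Any, List, Union


def as_device_list(result: Union[Any, List[Any]]) -> List[Any]:
    """Normalize a single device or a list of devices to a list."""
    if isinstance(result, (list, tuple)):
        return list(result)
    return [result]


# Stand-in strings are used here so the sketch runs without PyTorch;
# in real code the values would be torch.device objects from check_device().
print(as_device_list("cpu"))                 # single device
print(as_device_list(["cuda:0", "cuda:1"]))  # several devices
```

With this shape, the rest of the code can always iterate over a list, whichever form `check_device` returned.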
## Usage and Examples
### Example 1: Basic Usage
```python
import torch
import logging
from swarms.utils import check_device
# Basic usage
device = check_device(
    log_level=logging.INFO,
    memory_threshold=0.8,
    capability_threshold=3.5,
    return_type="list",
)
```
### Example 2: Using CPU when CUDA is not available
```python
import torch
import logging
from swarms.utils import check_device
# When CUDA is not available
device = check_device()
print(device) # If CUDA is not available it should return torch.device('cpu')
```
### Example 3: Multiple GPU Available
```python
import torch
import logging
from swarms.utils import check_device
# When multiple GPUs are available
device = check_device()
print(device) # Should return a list of available GPU devices
```
## Tips and Additional Information
- This function is useful when you want to use CUDA for faster computation but are unsure which devices are available. It performs the necessary checks and provides a list of CUDA devices.
- The `memory_threshold` and `capability_threshold` are utilized to filter the GPU devices. The GPUs which have memory usage above the `memory_threshold` and compute capability below the `capability_threshold` are not considered.
- As of now, CPU does not have memory or capability values, therefore, in the respective cases, it will be returned as default without any comparison.
## Relevant Resources
- For more details about the CUDA properties functions used (`torch.cuda.get_device_capability, torch.cuda.get_device_properties`), please refer to the official PyTorch [CUDA semantics documentation](https://pytorch.org/docs/stable/notes/cuda.html).
- For more information about Torch device objects, you can refer to the official PyTorch [device documentation](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device).
- For a better understanding of how the `logging` module works in Python, see the official Python [logging documentation](https://docs.python.org/3/library/logging.html).

@ -0,0 +1,86 @@
# display_markdown_message
# Module Name: `display_markdown_message`
## Introduction
`display_markdown_message` is a useful utility function for creating visually-pleasing markdown messages within Python scripts. This function automatically manages multiline strings with lots of indentation and makes single-line messages with ">" tags easy to read, providing users with convenient and elegant logging or messaging capacity.
## Function Definition and Arguments
Function Definition:
```python
def display_markdown_message(message: str, color: str = "cyan"):
```
This function accepts two parameters:
|Parameter |Type |Default Value |Description |
|--- |--- |--- |--- |
|message |str |Required |The message to display. It must be a string and can contain markdown syntax.|
|color |str |"cyan" |This allows you to choose the color of the message. Default is "cyan". Accepts any valid color name.|
## Functionality and Usage
This utility function is used to display a markdown formatted message on the console. It accepts a message as a string and an optional color for the message. The function is ideal for generating stylized print outputs such as headers, status updates or pretty notifications.
By default, any text within the string which is enclosed within `>` tags or `---` is treated specially:
- Lines encased in `>` tags are rendered as a blockquote in markdown.
- Lines consisting of `---` are rendered as horizontal rules.
The function automatically strips off leading and trailing whitespaces from any line within the message, maintaining aesthetic consistency in your console output.
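The line handling described above can be sketched as follows. This is an illustration of the rules only (strip each line, then treat `---` and a leading `>` specially), not the library's implementation: `classify_lines` is a hypothetical helper, and it only labels lines instead of rendering them.

```python
from typing import List, Tuple


def classify_lines(message: str) -> List[Tuple[str, str]]:
    """Label each line of a message as blank, rule, blockquote, or text."""
    labelled = []
    for raw_line in message.split("\n"):
        line = raw_line.strip()  # drop leading and trailing whitespace
        if line == "":
            labelled.append(("blank", ""))
        elif line == "---":
            labelled.append(("rule", line))
        elif line.startswith(">"):
            labelled.append(("blockquote", line[1:].strip()))
        else:
            labelled.append(("text", line))
    return labelled


message = """
    > Header
    My normal message here.
    ---
"""
for kind, text in classify_lines(message):
    print(kind, repr(text))
```

Because every line is stripped first, an indented triple-quoted string like the one above is classified the same way as an unindented one.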
### Usage Examples
#### Basic Example
```python
display_markdown_message("> This is an important message", color="red")
```
Output:
```md
> **This is an important message**
```
This example will print out the string "This is an important message" in red color, enclosed in a blockquote tag.
#### Multiline Example
```python
message = """
> Header
My normal message here.
---
Another important information
"""
display_markdown_message(message, color="green")
```
Output:
```md
> **Header**
My normal message here.
_____
Another important information
```
The output is a green colored markdown styled text with the "Header" enclosed in a blockquote, followed by the phrase "My normal message here", a horizontal rule, and finally another phrase, "Another important information".
## Additional Information
Use newline characters `\n` to separate the lines of the message. Remember, each line of the message is stripped of leading and trailing whitespaces. If you have special markdown requirements, you may need to revise the input message string accordingly.
Also, keep in mind the console or terminal's ability to display the chosen color. If a particular console does not support the chosen color, the output may fallback to the default console color.
For a full list of color names supported by the `Console` module, refer to the official [Console documentation](http://console.readthedocs.io/).
## References and Resources
- Python Strings: https://docs.python.org/3/tutorial/introduction.html#strings
- Python Markdown: https://pypi.org/project/markdown/
- Console module: https://console.readthedocs.io/

@ -0,0 +1,114 @@
# extract_code_from_markdown
# swarms.utils Module
The `swarms.utils` module provides utility functions designed to facilitate specific tasks within the main Swarm codebase. The function `extract_code_from_markdown` is a critical function within this module that we will document in this example.
## Overview and Introduction
Many software projects use Markdown extensively for writing documentation, tutorials, and other text documents that can be easily rendered and viewed in different formats, including HTML.
The `extract_code_from_markdown` function plays a crucial role within the swarms.utils library. As developers write large volumes of Markdown, they often need to isolate code snippets from the whole Markdown file body. These isolated snippets can be used to generate test cases, transform into other languages, or analyze for metrics.
## Function Definition: `extract_code_from_markdown`
```python
import re


def extract_code_from_markdown(markdown_content: str) -> str:
    """
    Extracts code blocks from a Markdown string and returns them as a single string.

    Args:
    - markdown_content (str): The Markdown content as a string.

    Returns:
    - str: A single string containing all the code blocks separated by newlines.
    """
    # Regular expression for fenced code blocks
    pattern = r"```(?:\w+\n)?(.*?)```"
    matches = re.findall(pattern, markdown_content, re.DOTALL)
    # Concatenate all code blocks separated by newlines
    return "\n".join(code.strip() for code in matches)
```
### Arguments
The function `extract_code_from_markdown` takes one argument:
| Argument | Description | Type | Default Value |
|-----------------------|----------------------------------------|-------------|-------------------|
| markdown_content | The input markdown content as a string | str | N/A |
## Function Explanation and Usage
This function uses a regular expression to find all fenced code blocks in a Markdown string. The pattern ``r"```(?:\w+\n)?(.*?)```"`` matches an opening triple backtick, optionally followed by a language identifier and a newline (the `(?:\w+\n)?` part), and then lazily captures any characters, including newlines because of `re.DOTALL` (the `(.*?)` part), up to the next triple backtick.
Once we have the matches, we join all the code blocks into a single string, each block separated by a newline.
The method's functionality is particularly useful when we need to extract code blocks from markdown content for secondary processing, such as syntax highlighting or execution in a different environment.
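To make the pattern's behavior concrete, here is a small standalone sketch. It repeats the documented logic under the hypothetical name `extract_blocks` and builds the fence string at runtime, purely so that this example does not nest literal fences inside its own code block.

```python
import re

FENCE = "`" * 3  # built at runtime so this example does not nest literal fences
PATTERN = FENCE + r"(?:\w+\n)?(.*?)" + FENCE


def extract_blocks(markdown_content: str) -> str:
    """Same logic as the documented function, repeated so the sketch is standalone."""
    matches = re.findall(PATTERN, markdown_content, re.DOTALL)
    return "\n".join(code.strip() for code in matches)


# One block with a language tag and one without: both bodies are extracted.
tagged = f"Intro\n{FENCE}python\nprint('a')\n{FENCE}\n"
untagged = f"Middle\n{FENCE}\nx = 1\n{FENCE}\n"
print(extract_blocks(tagged + untagged))

# A tag with non-word characters is not consumed by \w+, so it stays in the output.
cpp = f"{FENCE}c++\nint x;\n{FENCE}"
print(extract_blocks(cpp))
```

The second call shows a limitation worth knowing: because the optional group only accepts word characters, a tag such as `c++` is returned as part of the extracted code.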
### Usage Examples
Below are three examples of how you might use this function:
#### Example 1:
Extracting code blocks from a simple markdown string.
````python
import re
from swarms.utils import extract_code_from_markdown

markdown_string = '''# Example
This is an example of a code block:
```python
print("Hello World!")
``` '''
print(extract_code_from_markdown(markdown_string))
````
#### Example 2:
Extracting code blocks from a markdown file.
```python
import re


def extract_code_from_markdown(markdown_content: str) -> str:
    pattern = r"```(?:\w+\n)?(.*?)```"
    matches = re.findall(pattern, markdown_content, re.DOTALL)
    return "\n".join(code.strip() for code in matches)


# Assume that 'example.md' contains multiple code blocks
with open('example.md', 'r') as file:
    markdown_content = file.read()
print(extract_code_from_markdown(markdown_content))
```
#### Example 3:
Using the function in a pipeline to extract and then analyze code blocks.
```python
import re


def extract_code_from_markdown(markdown_content: str) -> str:
    pattern = r"```(?:\w+\n)?(.*?)```"
    matches = re.findall(pattern, markdown_content, re.DOTALL)
    return "\n".join(code.strip() for code in matches)


def analyze_code_blocks(code: str):
    # Add your analysis logic here
    pass


# Assume that 'example.md' contains multiple code blocks
with open('example.md', 'r') as file:
    markdown_content = file.read()
code_blocks = extract_code_from_markdown(markdown_content)
analyze_code_blocks(code_blocks)
```
## Conclusion
This concludes the detailed documentation of the `extract_code_from_markdown` function from the swarms.utils module. With this documentation, you should be able to understand the function's purpose, how it works, its parameters, and see examples of how to use it effectively.

@ -0,0 +1,94 @@
# find_image_path
Firstly, we will divide this documentation into multiple sections.
# Overview
The module **swarms.utils** has the main goal of providing necessary utility functions that are crucial during the creation of the swarm intelligence frameworks. These utility functions can include common operations such as handling input-output operations for files, handling text parsing, and handling basic mathematical computations necessary during the creation of swarm intelligence models.
The current function `find_image_path` in the module is aimed at extracting an image path from a given text document.
# Function Detailed Explanation
## Definition
The function `find_image_path` takes a singular argument as an input:
```python
def find_image_path(text):
    # function body
    ...
```
## Parameter
The parameter `text` in the function is a string that represents the document or text from which the function is trying to extract all paths to the images present. The function scans the given text, looking for <em>absolute</em> or <em>relative</em> paths to image files (.png, .jpg, .jpeg) on the disk.
| Parameter Name | Data Type | Default Value | Description |
|:--------------:|:---------:|:-------------:|:--------:|
| `text` | `str` | - | The text content to scan for image paths |
## Return Value
The return value of the function `find_image_path` is a string that represents the longest existing image path extracted from the input text. If no image paths exist within the text, the function returns `None`.
| Return Value | Data Type | Description |
|:------------:|:-----------:|:-----------:|
| Path | `str` | Longest image path found in the text or `None` if no path found |
# Function's Code
The function `find_image_path` performs text parsing and pattern recognition to find image paths within the provided text. The function uses `regular expressions (re)` module to detect all potential paths.
```python
import os
import re


def find_image_path(text):
    # Match Windows-style (C:\...) or POSIX-style (/...) paths ending in an image extension
    pattern = r"([A-Za-z]:\\[^:\n]*?\.(png|jpg|jpeg|PNG|JPG|JPEG))|(/[^:\n]*?\.(png|jpg|jpeg|PNG|JPG|JPEG))"
    matches = [
        match.group()
        for match in re.finditer(pattern, text)
        if match.group()
    ]
    # Also try each candidate with backslashes removed
    matches += [match.replace("\\", "") for match in matches if match]
    # Keep only the candidates that exist on disk
    existing_paths = [
        match for match in matches if os.path.exists(match)
    ]
    return max(existing_paths, key=len) if existing_paths else None
```
# Usage Examples
Let's consider examples of how the function `find_image_path` can be used in different scenarios.
**Example 1:**
Consider the case where a text without any image path is provided.
```python
from swarms.utils import find_image_path
text = "There are no image paths in this text"
print(find_image_path(text)) # Outputs: None
```
**Example 2:**
Consider the case where the text has multiple image paths.
```python
from swarms.utils import find_image_path
text = "Here is an image path: /home/user/image1.png. Here is another one: C:\\Users\\User\\Documents\\image2.jpeg"
print(find_image_path(text)) # Outputs: the longest image path (depends on your file system and existing files)
```
**Example 3:**
In the final example, we consider a case where the text has an image path, but the file does not exist.
```python
from swarms.utils import find_image_path
text = "Here is an image path: /home/user/non_existant.png"
print(find_image_path(text)) # Outputs: None
```
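The examples above either return `None` or depend on files that happen to exist on your machine. As a sketch of the successful case, the snippet below creates a temporary placeholder file and then searches a sentence that mentions it. The function body is copied from the code shown earlier so that the snippet runs on its own; the file name `picture.png` is an arbitrary choice for illustration.

```python
import os
import re
import tempfile


def find_image_path(text):
    """Copy of the documented function so the sketch runs on its own."""
    pattern = r"([A-Za-z]:\\[^:\n]*?\.(png|jpg|jpeg|PNG|JPG|JPEG))|(/[^:\n]*?\.(png|jpg|jpeg|PNG|JPG|JPEG))"
    matches = [m.group() for m in re.finditer(pattern, text) if m.group()]
    matches += [m.replace("\\", "") for m in matches if m]
    existing = [m for m in matches if os.path.exists(m)]
    return max(existing, key=len) if existing else None


with tempfile.TemporaryDirectory() as tmp_dir:
    image_path = os.path.join(tmp_dir, "picture.png")
    with open(image_path, "wb") as handle:
        handle.write(b"")  # an empty placeholder file is enough for os.path.exists
    found = find_image_path(f"See the chart at {image_path} for details")
    print(found == image_path)
```

The check must run while the temporary directory still exists, because the function filters candidates with `os.path.exists`.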
# Closing Notes
In conclusion, the `find_image_path` function is crucial in the `swarms.utils` module as it supports a key operation of identifying image paths within given input text. This allows users to automate the extraction of such data from larger documents/text. However, it's important to note the function returns only existing paths in your file system and only the longest if multiple exist.

@ -0,0 +1,82 @@
# limit_tokens_from_string
## Introduction
The `swarms.utils` module contains utility functions used across code that handles machine learning and other operations. One of them is `limit_tokens_from_string()`, which limits the number of tokens in a given string.
# Function: limit_tokens_from_string()
Within the `swarms.utils` module, the function has the signature `limit_tokens_from_string(string: str, model: str = "gpt-4", limit: int = 500) -> str`.
## Description
The function `limit_tokens_from_string()` limits the number of tokens in a given string based on the specified threshold. It is primarily useful when you are handling large text data and need to chunk or limit your text to a certain length. Limiting token length could be useful in various scenarios such as when working with data with limited computational resources, or when dealing with models that accept a specific maximum limit of text.
## Parameters
| Parameter | Type | Default Value | Description |
| :-----------| :----------- | :------------ | :------------|
| `string` | `str` | Required | The input string whose tokens are to be limited. |
| `model` | `str` | `"gpt-4"` | The model whose encoding is used to encode and decode the tokens. Defaults to `gpt-4`, but you can specify any model supported by `tiktoken`. If the model is not found, the function falls back to the `gpt2` encoding. |
| `limit` | `int` | `500` | The maximum number of tokens to keep. Default limit is 500. |
## Returns
| Return | Type | Description |
| :-----------| :----------- | :------------ |
| `out` | `str` | A string reconstructed from the encoded tokens after they have been limited to a count of `limit`. |
## Method Detail and Usage Examples
The method `limit_tokens_from_string()` takes in three parameters - `string`, `model`, and `limit`.
First, it tries to get the encoding for the model specified in the `model` argument using `tiktoken.encoding_for_model(model)`. In case the specified model is not found, the function uses `gpt2` model encoding as a fallback.
Next, the input `string` is tokenized using the `encode` method of the `encoding` object. This produces `encoded`, a list of token IDs.
Then, the function slices `encoded` to keep the first `limit` tokens.
Finally, the function converts the tokens back into a string using the `decode` method of the `encoding` object. The resulting string `out` is returned.
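The encode, slice, decode sequence can be sketched without any external dependency. The snippet below is an illustration under stated assumptions: `WhitespaceEncoding` is a toy stand-in that assigns one token ID per whitespace-separated word, and `limit_tokens` is a hypothetical helper. The real function uses a `tiktoken` encoding, whose tokens generally do not align with whole words.

```python
from typing import Dict, List


class WhitespaceEncoding:
    """Toy stand-in for a tokenizer: one token ID per whitespace-separated word."""

    def __init__(self) -> None:
        self._id_by_word: Dict[str, int] = {}
        self._word_by_id: List[str] = []

    def encode(self, text: str) -> List[int]:
        ids = []
        for word in text.split():
            if word not in self._id_by_word:
                self._id_by_word[word] = len(self._word_by_id)
                self._word_by_id.append(word)
            ids.append(self._id_by_word[word])
        return ids

    def decode(self, ids: List[int]) -> str:
        return " ".join(self._word_by_id[i] for i in ids)


def limit_tokens(string: str, encoding: WhitespaceEncoding, limit: int = 500) -> str:
    encoded = encoding.encode(string)   # text -> list of token IDs
    truncated = encoded[:limit]         # keep the first `limit` tokens
    return encoding.decode(truncated)   # token IDs -> text


print(limit_tokens("one two three four five", WhitespaceEncoding(), limit=3))
```

If the input already has `limit` tokens or fewer, the slice keeps everything and the text is returned in full.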
### Example 1:
```python
from swarms.utils import limit_tokens_from_string
# longer input string
string = "This is a very long string that needs to be tokenized. This string might exceed the maximum token limit, so it will need to be truncated."
# lower token limit
limit = 10
output = limit_tokens_from_string(string, limit=limit)
```
### Example 2:
```python
from swarms.utils import limit_tokens_from_string
# longer input string with different model
string = "This string will be tokenized using gpt2 model. If the string is too long, it will be truncated."
# model
model = "gpt2"
output = limit_tokens_from_string(string, model=model)
```
### Example 3:
```python
from swarms.utils import limit_tokens_from_string
# try with a random model string
string = "In case the method does not find the specified model, it will fall back to gpt2 model."
# a made-up model name that tiktoken does not know, used only to trigger the fallback
model = "not-a-real-model"
output = limit_tokens_from_string(string, model=model)
```
**Note:** If you specify a model that `tiktoken` does not support, the function falls back to the `gpt2` encoding.

@ -0,0 +1,102 @@
# load_model_torch
# load_model_torch: Utility Function Documentation
## Introduction:
`load_model_torch` is a utility function in the `swarms.utils` library that loads a saved PyTorch model and moves it to a designated device. You can specify the model file location, the target device, whether the keys in the loaded state dictionary must strictly match the keys returned by the model's `state_dict()`, and how storage locations are remapped on load.
Moreover, if the saved model file only contains the state dictionary, but not the model architecture, you can pass the model architecture as an argument.
## Function Definition and Parameters:
```python
def load_model_torch(
    model_path: str = None,
    device: torch.device = None,
    model: nn.Module = None,
    strict: bool = True,
    map_location=None,
    *args,
    **kwargs,
) -> nn.Module:
```
The following table describes the parameters in detail:
| Name | Type | Default Value | Description |
| ------ | ------ | ------------- | ------------|
| model_path | str | None | A string specifying the path to the saved model file on disk. _Required_ |
| device | torch.device | None | A `torch.device` object that specifies the target device for the loaded model. If not provided, the function checks for the availability of a GPU and uses it if available. If not, it defaults to CPU. |
| model | nn.Module | None | An instance of `torch.nn.Module` representing the model's architecture. This parameter is required if the model file only contains the model's state dictionary and not the model architecture. |
| strict | bool | True | A boolean that determines whether to strictly enforce that the keys in the state dictionary match the keys returned by the model's `state_dict()` function. If set to `True`, loading fails when the keys do not match (PyTorch's `load_state_dict` raises a `RuntimeError` in that case). |
| map_location | callable | None | A function to remap the storage locations of the loaded model's parameters. Useful for loading models saved on a device type that is different from the current one. |
| `*args`, `**kwargs` | - | - | Additional arguments and keyword arguments to be passed to `torch.load`. |
Returns:
- `torch.nn.Module` - The loaded model after moving it to the desired device.
Raises:
- `FileNotFoundError` - If the saved model file is not found at the specified path.
- `RuntimeError` - If there was an error while loading the model.
## Example of Usage:
This function can be used directly inside your code as shown in the following examples:
### Example 1:
Loading a model without specifying a device lets the function choose an available device automatically (a GPU if one is available, otherwise the CPU).
```python
from swarms.utils import load_model_torch
import torch.nn as nn

# Assume `mymodel.pth` is in the current directory
model_path = "./mymodel.pth"


# Define your model architecture if the model file only contains state dict
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 2)

    def forward(self, x):
        return self.linear(x)


model = MyModel()
# Load the model
loaded_model = load_model_torch(model_path, model=model)
# Now you can use the loaded model for prediction or further training
```
### Example 2:
Explicitly specifying a device.
```python
import torch
from swarms.utils import load_model_torch

# Assume `mymodel.pth` is in the current directory
model_path = "./mymodel.pth"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Load the model
loaded_model = load_model_torch(model_path, device=device)
```
### Example 3:
Using a model file that contains only the state dictionary, not the model architecture.
```python
import torch
from swarms.utils import load_model_torch

# Assume `mymodel_state_dict.pth` is in the current directory
model_path = "./mymodel_state_dict.pth"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Define your model architecture (MyModel is the class defined in Example 1)
model = MyModel()
# Load the model
loaded_model = load_model_torch(model_path, device=device, model=model)
```
These examples show how to use the `load_model_torch` utility function from the `swarms.utils` library. Always pass the model path; the other arguments are optional and depend on your requirements. Handle the `FileNotFoundError` and `RuntimeError` exceptions listed above so that a missing or incompatible model file does not crash your application.

@ -1,99 +1,78 @@
# math_eval
## Introduction
The Math Evaluation Decorator is a utility function that helps you compare the output of two functions, `func1` and `func2`, when given the same input. This decorator is particularly useful for validating whether a generated function produces the same results as a ground truth function. This documentation provides a detailed explanation of the Math Evaluation Decorator, its purpose, usage, and examples.
## Purpose
The `math_eval` function is a Python decorator that runs two functions on the same inputs and compares their results. It can be used for testing functions that are expected to have equivalent functionality, or in situations where two different methods are used to calculate or retrieve a value and the results need to be compared.
The Math Evaluation Decorator serves the following purposes:
1. To compare the output of two functions, `func1` and `func2`, when given the same input.
2. To log any errors that may occur during the evaluation.
3. To provide a warning if the outputs of `func1` and `func2` do not match.
## Decorator Definition
The `math_eval` function accepts two functions as parameters, `func1` and `func2`, and returns a decorator. When applied to a function, the returned decorator replaces it with a wrapper that executes both `func1` and `func2` and compares the results.
```python
def math_eval(func1, func2):
    """Math evaluation decorator.

    Args:
        func1 (Callable): The first function to be evaluated.
        func2 (Callable): The second function to be evaluated.

    Example:
    >>> @math_eval(ground_truth, generated_func)
    ... def test_func(x):
    ...     return x
    >>> result1, result2 = test_func(5)
    >>> print(f"Result from ground_truth: {result1}")
    >>> print(f"Result from generated_func: {result2}")
    """
```
## Usage
This can be particularly useful when you are implementing a new function and want to compare its behavior and results with those of an existing one under the same set of input parameters. It also logs a warning when the results do not match, which can be useful during debugging.
The Math Evaluation Decorator is used as a decorator for a test function that you want to evaluate. Here's how to use it:
1. Define the two functions, `func1` and `func2`, that you want to compare.
2. Create a test function and decorate it with `@math_eval(func1, func2)`.
3. Call the test function with the input(s); they are passed to both `func1` and `func2`.
4. The decorator will compare the outputs of `func1` and `func2` when given the same input(s).
5. Any errors that occur during the evaluation will be logged.
6. If the outputs of `func1` and `func2` do not match, a warning will be generated.
## Usage Example
Let's say you have two functions, `ground_truth` and `generated_func`, that have similar functionalities or serve the same purpose. You are writing a new function called `test_func`, and you'd like to compare the results of `ground_truth` and `generated_func` when `test_func` is run. Here is how you would use the `math_eval` decorator:
```python
# Define the ground truth function
def ground_truth(x):
    return x * 2


# Define the generated function
def generated_func(x):
    return x - 10


# Create a test function and decorate it
@math_eval(ground_truth, generated_func)
def test_func(x):
    return x


# Evaluate the test function with an input
result1, result2 = test_func(5)
# Print the results
print(f"Result from ground_truth: {result1}")
print(f"Result from generated_func: {result2}")
```
In this example, the decorator compares the outputs of `ground_truth` and `generated_func` when given the input `5`. The outputs are `10` and `-5`, so they do not match and a warning is logged.
## Parameters
| Parameter | Data Type | Description |
| ---- | ---- | ---- |
| func1 | Callable | The first function whose result you want to compare. |
| func2 | Callable | The second function whose result you want to compare. |
`func1` and `func2` can be any Python function or other callable object, so no narrower type is specified. The implementation shown below does not check callability up front; any exception raised while calling either function is caught and logged inside the decorator.
## Return Values
The `math_eval` function does not return a direct value, since it is a decorator. When applied to a function, it alters the behavior of the wrapped function to return two values:
1. `result1`: The result of running `func1` with the given input parameters.
2. `result2`: The result of running `func2` with the given input parameters.
These two values are returned in that order as a tuple.
### Handling Errors
If an error occurs in either `func1` or `func2`, the decorator will log the error and set the corresponding result to `None`. This ensures that the evaluation continues even if one of the functions encounters an issue.
## Source Code
Here's how to implement the `math_eval` decorator:
```python
import functools
import logging


def math_eval(func1, func2):
    """Math evaluation decorator."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                result1 = func1(*args, **kwargs)
            except Exception as e:
                logging.error(f"Error in func1: {e}")
                result1 = None

            try:
                result2 = func2(*args, **kwargs)
            except Exception as e:
                logging.error(f"Error in func2: {e}")
                result2 = None

            if result1 != result2:
                logging.warning(
                    f"Outputs do not match: {result1} != {result2}"
                )

            return result1, result2

        return wrapper

    return decorator
```
Please note that the code logs exceptions to facilitate debugging, but how exceptions should actually be handled depends on how you want your application to respond to them. You may therefore want to customize the error handling to suit your application's requirements.
## Additional Information and Tips
- The Math Evaluation Decorator is a useful tool for comparing the outputs of functions, especially when validating machine learning models or generated code.
- Ensure that the functions `func1` and `func2` take the same input(s) so that the comparison is meaningful.
- Regularly check the logs for any errors or warnings generated during the evaluation.
- If the decorator logs a warning about mismatched outputs, investigate and debug the functions accordingly.
## References and Resources
- For more information on Python decorators, refer to the [Python Decorators Documentation](https://docs.python.org/3/glossary.html#term-decorator).
- Explore advanced use cases of the Math Evaluation Decorator in your projects to ensure code correctness and reliability.
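The error-handling path can be seen in a short standalone sketch. The decorator below is a copy of the one in the Source Code section, repeated so the snippet runs on its own; `reciprocal`, `safe_reciprocal`, and `compare` are hypothetical names chosen for illustration.

```python
import functools
import logging


def math_eval(func1, func2):
    """Copy of the decorator from the Source Code section."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                result1 = func1(*args, **kwargs)
            except Exception as e:
                logging.error(f"Error in func1: {e}")
                result1 = None
            try:
                result2 = func2(*args, **kwargs)
            except Exception as e:
                logging.error(f"Error in func2: {e}")
                result2 = None
            if result1 != result2:
                logging.warning(f"Outputs do not match: {result1} != {result2}")
            return result1, result2
        return wrapper
    return decorator


def reciprocal(x):       # hypothetical function; raises ZeroDivisionError for x == 0
    return 1 / x


def safe_reciprocal(x):  # hypothetical function; returns 0.0 instead of raising
    return 1 / x if x != 0 else 0.0


@math_eval(reciprocal, safe_reciprocal)
def compare(x):
    return x  # the body is not used by the wrapper shown above


print(compare(4))  # both functions succeed
print(compare(0))  # reciprocal raises, so its slot in the tuple is None
```

Note that a `None` result is indistinguishable from a function that legitimately returns `None`, so the logged error message is the reliable signal that an exception occurred.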

@ -0,0 +1,86 @@
# metrics_decorator
This documentation explains the use and functionality of the `metrics_decorator` function for LLM (Large Language Model) classes.
The `metrics_decorator` function is a standard Python decorator that augments a function by wrapping extra functionality around it. Decorators of this kind are commonly used for timing, logging, or memoization.
The `metrics_decorator` is designed to measure three key performance metrics during text generation:
1. `Time to First Token`: The elapsed time from the start of function execution until the generation of the first token.
2. `Generation Latency`: The total time taken for a complete run.
3. `Throughput`: The number of tokens produced per unit of time.
```python
import time
from functools import wraps
from typing import Callable


def metrics_decorator(func: Callable):
    """
    Metrics decorator for LLM

    Args:
        func (Callable): The function to be decorated.
    """

    @wraps(func)
    def wrapper(self, *args, **kwargs):
        """
        An inner function that wraps the decorated function. It calculates 'Time to First Token',
        'Generation Latency' and 'Throughput' metrics.

        Args:
            self : The object instance.
            *args : Variable length argument list of the decorated function.
            **kwargs : Arbitrary keyword arguments of the decorated function.
        """
        # Measure Time to First Token
        # (the timestamp is taken after func returns, so it is close to the total latency)
        start_time = time.time()
        result = func(self, *args, **kwargs)
        first_token_time = time.time()
        # Measure Generation Latency
        end_time = time.time()
        # Calculate Throughput (assuming the function returns a list of tokens)
        throughput = len(result) / (end_time - start_time)
        return f"""
        Time to First Token: {first_token_time - start_time}
        Generation Latency: {end_time - start_time}
        Throughput: {throughput}
        """

    return wrapper
```
## Example Usage
Now let's discuss the usage of the `metrics_decorator` function with an example.
Assume that we have a class with a language generation method called `text_generator()` that returns a list of tokens.
```python
class ClassName:
    @metrics_decorator
    def text_generator(self, text: str):
        """
        Args:
            text (str): The input text.
        Returns:
            A list of tokens generated from the input text.
        """
        # language generation implementation goes here;
        # whitespace splitting is only a stand-in for a real model
        tokens = text.split()
        return tokens


# Instantiate the class and call the decorated method
obj = ClassName()
print(obj.text_generator("Hello, world!"))
```
When the decorated `text_generator()` method is called, it measures the following and returns them as a formatted string:
- Time elapsed until the first token is generated.
- The total execution time of the function.
- The rate of token generation per unit time.
This example provides a basic overview of how a function can be decorated with the `metrics_decorator`. The provided `func` argument can be any method of any class, as long as it accepts `self` as its first argument. The decorated function must return a list of tokens (or another object that supports `len()`) for the `Throughput` metric to work correctly.
Note that, as implemented above, the wrapper returns the metrics string in place of the decorated function's result, so the generated tokens are not passed back to the caller. In addition, `first_token_time` is recorded only after `func` has returned, so the reported `Time to First Token` is effectively the same as the `Generation Latency`. With those caveats in mind, the decorator is a convenient utility for tracking the performance of your language models.
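Because the wrapper shown above ends with `return f"""..."""`, callers receive the metrics text rather than the tokens. The following is a minimal sketch of one possible alternative, not part of `swarms.utils`: the names `metrics_with_result` and `EchoModel` are hypothetical, and the sketch assumes the wrapped method returns an object that supports `len()`. It returns the original result together with a metrics dictionary.

```python
import time
from functools import wraps
from typing import Any, Callable, Dict, Tuple


def metrics_with_result(func: Callable) -> Callable:
    """Hypothetical variant: return (result, metrics) instead of a string."""

    @wraps(func)
    def wrapper(self, *args, **kwargs) -> Tuple[Any, Dict[str, float]]:
        start = time.perf_counter()
        result = func(self, *args, **kwargs)
        latency = time.perf_counter() - start
        # Guard against a zero elapsed time on very fast calls
        throughput = len(result) / latency if latency > 0 else float("inf")
        metrics = {"generation_latency": latency, "throughput": throughput}
        return result, metrics

    return wrapper


class EchoModel:
    """Toy stand-in for a model; it splits the input on whitespace."""

    @metrics_with_result
    def generate(self, text: str):
        return text.split()


tokens, metrics = EchoModel().generate("Hello, world!")
print(tokens)
print(sorted(metrics))
```

This sketch deliberately omits a time-to-first-token value, since measuring it would require a streaming interface that yields tokens one at a time.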

@ -0,0 +1,71 @@
# pdf_to_text
## Introduction
The function `pdf_to_text` is a Python utility for converting a PDF file into a string of text content. It leverages the `PyPDF2` library, an excellent Python library for processing PDF files. The function takes in a PDF file's path and reads its content, subsequently returning the extracted textual data.
This function can be very useful when you want to extract textual information from PDF files automatically. For instance, when processing a large number of documents, performing textual analysis, or when you're dealing with text data that is only available in PDF format.
## Class / Function Definition
`pdf_to_text` is a standalone function defined as follows:
```python
def pdf_to_text(pdf_path: str) -> str:
```
## Parameters
| Parameter | Type | Description |
|:-:|---|---|
| pdf_path | str | The path to the PDF file to be converted |
## Returns
| Return Value | Type | Description |
|:-:|---|---|
| text | str | The text extracted from the PDF file. |
## Raises
| Exception | Description |
|---|---|
| FileNotFoundError | If the PDF file is not found at the specified path. |
| Exception | If there is an error in reading the PDF file. |
## Function Description
`pdf_to_text` uses the `PdfReader` class from the `PyPDF2` library to read the PDF file. If the file does not exist at the specified path, or an error occurs while reading it, the corresponding exception is raised. Otherwise, the function iterates through each page of the PDF, calls the page's `extract_text` method to extract the text content, concatenates the results into a single string, and returns it.
## Usage Examples
To use this function, you first need to install the `PyPDF2` library. It can be installed via pip (prefix the command with `!` when running it in a notebook cell):
```bash
pip install pypdf2
```
Then, you should import the `pdf_to_text` function:
```python
from swarms.utils import pdf_to_text
```
Here is an example of how to use `pdf_to_text`:
```python
# Define the path to the pdf file
pdf_path = 'sample.pdf'
# Use the function to extract text
text = pdf_to_text(pdf_path)
# Print the extracted text
print(text)
```
## Tips and Additional Information
- Ensure that the PDF file path is valid and that the file exists at the specified location. If the file does not exist, a `FileNotFoundError` will be raised.
- This function reads the text from the PDF. It does not handle images, graphical elements, or any non-text content.
- If the PDF contains scanned images rather than textual data, the `extract_text` method may not be able to extract any text. In such cases, you need an OCR (Optical Character Recognition) tool to extract the text.
- Be aware of the possibility that the output string might contain special characters or escape sequences because they were part of the PDF's content. You might need to clean the resulting text according to your requirements.
- The function uses the PyPDF2 library to facilitate the PDF reading and text extraction. For any issues related to PDF manipulation, consult the [PyPDF2 library documentation](https://pythonhosted.org/PyPDF2/).
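The tip above about cleaning the resulting text can be illustrated with a small post-processing step. The sketch below is an assumption about what such cleanup might look like; `clean_extracted_text` is a hypothetical helper, not part of `swarms.utils` or `PyPDF2`, and it uses only the standard library.

```python
import re
import unicodedata


def clean_extracted_text(text: str) -> str:
    """Hypothetical helper: tidy text returned by a PDF extractor."""
    # Normalize compatibility characters such as ligatures ("ﬁ" -> "fi")
    text = unicodedata.normalize("NFKC", text)
    # Re-join words hyphenated across a line break ("exam-\nple" -> "example")
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space
    text = re.sub(r"[ \t]+", " ", text)
    # Limit consecutive blank lines to one
    text = re.sub(r"\n\s*\n+", "\n\n", text)
    return text.strip()


raw = "An exam-\nple  with   extra\tspaces.\n\n\n\nNext \ufb01le."
print(clean_extracted_text(raw))
```

The de-hyphenation rule is a heuristic: it also joins words that are genuinely hyphenated at a line end, so it may need adjusting for your documents.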

@ -0,0 +1,102 @@
# prep_torch_inference
```python
def prep_torch_inference(
    model_path: str = None,
    device: torch.device = None,
    *args,
    **kwargs,
):
    """
    Prepare a Torch model for inference.

    Args:
        model_path (str): Path to the model file.
        device (torch.device): Device to run the model on.
        *args: Additional positional arguments.
        **kwargs: Additional keyword arguments.

    Returns:
        torch.nn.Module: The prepared model.
    """
    try:
        model = load_model_torch(model_path, device)
        model.eval()
        return model
    except Exception as e:
        # Add error handling code here
        print(f"Error occurred while preparing Torch model: {e}")
        return None
```
This method is part of the 'swarms.utils' module. It accepts a model file path and a torch device as input and returns a model that is ready for inference.
## Detailed Functionality
The method loads a PyTorch model from the file specified by `model_path`. This model is then moved to the specified `device` if it is provided. Subsequently, the method sets the model to evaluation mode by calling `model.eval()`. This is a crucial step when preparing a model for inference, as certain layers like dropout or batch normalization behave differently during training vs during evaluation.
In the case of any exception (e.g., the model file not found or the device unavailable), it prints an error message and returns `None`.
## Parameters
| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| model_path | str | Path to the model file. | None |
| device | torch.device | Device to run the model on. | None |
| args | tuple | Additional positional arguments (accepted, but not used by the source shown above). | () |
| kwargs | dict | Additional keyword arguments (accepted, but not used by the source shown above). | {} |
## Returns
| Type | Description |
|------|-------------|
| torch.nn.Module | The prepared model ready for inference. Returns `None` if any exception occurs. |
## Usage Examples
Here are some examples of how you can use the `prep_torch_inference` method. Before that, you need to import the necessary modules as follows:
```python
import torch
from swarms.utils import prep_torch_inference, load_model_torch
```
### Example 1: Load a model for inference on CPU
```python
model_path = "saved_model.pth"
model = prep_torch_inference(model_path)
if model is not None:
    print("Model loaded successfully and is ready for inference.")
else:
    print("Failed to load the model.")
```
### Example 2: Load a model for inference on CUDA device
```python
model_path = "saved_model.pth"
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = prep_torch_inference(model_path, device)
if model is not None:
    print(f"Model loaded successfully on device {device} and is ready for inference.")
else:
    print("Failed to load the model.")
```
### Example 3: Passing additional keyword arguments
```python
model_path = "saved_model.pth"
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# prep_torch_inference accepts extra keyword arguments, but the source shown
# above does not forward them to load_model_torch, so map_location is ignored
model = prep_torch_inference(model_path, device, map_location=device)
if model is not None:
    print(f"Model loaded successfully on device {device} and is ready for inference.")
else:
    print("Failed to load the model.")
```
Please note that the model path must exist and the device must be available on your machine; otherwise `prep_torch_inference` prints an error message and returns `None`. Depending on the size and complexity of your model, loading it onto a specific device may take a while, so take this into account when designing your machine learning workflows.
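Since `prep_torch_inference` signals failure by returning `None` rather than raising, every caller has to repeat the `if model is not None` check shown in the examples. One way to centralize that check is sketched below. `load_or_raise` and the two stand-in loaders are hypothetical names introduced for illustration only; in practice the loader argument would be `prep_torch_inference`.

```python
from typing import Any, Callable, Optional


def load_or_raise(loader: Callable[..., Optional[Any]], *args, **kwargs) -> Any:
    """Call a loader that signals failure with None and raise instead."""
    model = loader(*args, **kwargs)
    if model is None:
        name = getattr(loader, "__name__", "loader")
        raise RuntimeError(
            f"{name} returned None; check the model path and device"
        )
    return model


# Stand-in loaders used only to demonstrate both outcomes
def fake_loader_ok(path: str) -> dict:
    return {"path": path}


def fake_loader_fail(path: str) -> None:
    return None


print(load_or_raise(fake_loader_ok, "saved_model.pth"))
try:
    load_or_raise(fake_loader_fail, "missing.pth")
except RuntimeError as err:
    print(f"Caught: {err}")
```

Raising early keeps a missing model from surfacing later as an unrelated `AttributeError` on `None`.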

@ -0,0 +1,110 @@
# print_class_parameters
# Module Function Name: print_class_parameters
The `print_class_parameters` function is a utility that retrieves the parameters of a class constructor and either prints them to standard output or, if `api_format` is set to `True`, returns them as a dictionary.
It uses the `inspect` module to fetch the signature of the class constructor and reads the parameters from that signature. Each parameter's name and annotated type are then output.
This lets developers inspect a class's constructor parameters without reading through the class definition, which eases testing and debugging.
__Function Definition:__
```python
def print_class_parameters(cls, api_format: bool = False):
```
__Parameters:__
| Parameter | Type | Description | Default value |
|---|---|---|---|
| cls | type | The Python class to inspect. | Required (no default) |
| api_format | bool | Flag to determine if the output should be returned in dictionary format (if set to True) or printed out (if set to False) | False |
__Functionality and Usage:__
Inside the `print_class_parameters` function, it starts by getting the signature of the constructor of the inputted class by invoking `inspect.signature(cls.__init__)`. It then extracts the parameters from the signature and stores it in the `params` variable.
If the `api_format` argument is set to `True`, instead of printing the parameters and their types, it stores them inside a dictionary where each key-value pair is a parameter name and its type. It then returns this dictionary.
If `api_format` is set to `False` or not set at all (defaulting to False), the function iterates over the parameters and prints the parameter name and its type. "self" parameters are excluded from the output as they are inherent to all class methods in Python.
An exception may occur during the call to `inspect.signature()`. If the constructor's signature cannot be retrieved, or any other error occurs during inspection, the exception is caught and an error message that includes the error details is printed; the function then returns `None`.
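The signature lookup described above can be tried in isolation. The sketch below is illustrative only: `Example` and `constructor_annotations` are hypothetical names, not part of `swarms.utils`. It shows that a parameter without a type hint reports `inspect.Parameter.empty` as its annotation, which `print_class_parameters` would display as `<class 'inspect._empty'>`; here it is mapped to `None` instead.

```python
import inspect


class Example:
    # Hypothetical class: `name` is annotated, `retries` is not
    def __init__(self, name: str, retries=3):
        self.name = name
        self.retries = retries


def constructor_annotations(cls) -> dict:
    """Map each constructor parameter (except self) to its annotation."""
    params = inspect.signature(cls.__init__).parameters
    return {
        name: (
            None
            if param.annotation is inspect.Parameter.empty
            else param.annotation
        )
        for name, param in params.items()
        if name != "self"
    }


print(constructor_annotations(Example))
```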
__Usage and Examples:__
Assuming the existence of a class:
```python
class Agent:
    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y
```
One could use `print_class_parameters` in its typical usage:
```python
print_class_parameters(Agent)
```
Results in:
```
Parameter: x, Type: <class 'int'>
Parameter: y, Type: <class 'int'>
```
Or, with `api_format` set to `True`
```python
output = print_class_parameters(Agent, api_format=True)
print(output)
```
Results in:
```
{'x': "<class 'int'>", 'y': "<class 'int'>"}
```
__Note:__
The function `print_class_parameters` is not limited to custom classes. It can inspect built-in Python classes such as `list`, `dict`, and others. However, it is most useful when inspecting custom-defined classes that aren't inherently documented in Python or third-party libraries.
__Source Code__
```python
import inspect


def print_class_parameters(cls, api_format: bool = False):
    """
    Print the parameters of a class constructor.

    Parameters:
        cls (type): The class to inspect.

    Example:
        >>> print_class_parameters(Agent)
        Parameter: x, Type: <class 'int'>
        Parameter: y, Type: <class 'int'>
    """
    try:
        # Get the parameters of the class constructor
        sig = inspect.signature(cls.__init__)
        params = sig.parameters

        if api_format:
            param_dict = {}
            for name, param in params.items():
                if name == "self":
                    continue
                param_dict[name] = str(param.annotation)
            return param_dict

        # Print the parameters
        for name, param in params.items():
            if name == "self":
                continue
            print(f"Parameter: {name}, Type: {param.annotation}")

    except Exception as e:
        print(f"An error occurred while inspecting the class: {e}")
```

@ -0,0 +1,12 @@
- pdf_to_text: "pdf_to_text.md"
- load_model_torch: "load_model_torch.md"
- metrics_decorator: "metrics_decorator.md"
- prep_torch_inference: "prep_torch_inference.md"
- find_image_path: "find_image_path.md"
- print_class_parameters: "print_class_parameters.md"
- extract_code_from_markdown: "extract_code_from_markdown.md"
- check_device: "check_device.md"
- display_markdown_message: "display_markdown_message.md"
- phoenix_tracer: "phoenix_tracer.md"
- limit_tokens_from_string: "limit_tokens_from_string.md"
- math_eval: "math_eval.md"

@ -114,6 +114,18 @@ nav:
- swarms.utils: - swarms.utils:
- phoenix_trace_decorator: "swarms/utils/phoenix_tracer.md" - phoenix_trace_decorator: "swarms/utils/phoenix_tracer.md"
- math_eval: "swarms/utils/math_eval.md" - math_eval: "swarms/utils/math_eval.md"
- pdf_to_text: "pdf_to_text.md"
- load_model_torch: "load_model_torch.md"
- metrics_decorator: "metrics_decorator.md"
- prep_torch_inference: "prep_torch_inference.md"
- find_image_path: "find_image_path.md"
- print_class_parameters: "print_class_parameters.md"
- extract_code_from_markdown: "extract_code_from_markdown.md"
- check_device: "check_device.md"
- display_markdown_message: "display_markdown_message.md"
- phoenix_tracer: "phoenix_tracer.md"
- limit_tokens_from_string: "limit_tokens_from_string.md"
- math_eval: "math_eval.md"
- Guides: - Guides:
- Overview: "examples/index.md" - Overview: "examples/index.md"
- Agents: - Agents:

@ -4,7 +4,6 @@ from dotenv import load_dotenv
# Import the OpenAIChat model and the Agent struct # Import the OpenAIChat model and the Agent struct
from swarms.models import OpenAIChat from swarms.models import OpenAIChat
from swarms.structs import Agent
# Load the environment variables # Load the environment variables
load_dotenv() load_dotenv()

@ -30,4 +30,4 @@ out = parallelizer.run(task)
# Print the responses 1 by 1 # Print the responses 1 by 1
for i in range(len(out)): for i in range(len(out)):
print(f"Response from LLM {i}: {out[i]}") print(f"Response from LLM {i}: {out[i]}")

@ -0,0 +1,77 @@
import inspect
import os
import sys
import threading
from dotenv import load_dotenv
from scripts.auto_tests_docs.docs import DOCUMENTATION_WRITER_SOP
from swarms import OpenAIChat
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
model = OpenAIChat(
model_name="gpt-4",
openai_api_key=api_key,
max_tokens=4000,
)
def process_documentation(item):
"""
Process the documentation for a given function using OpenAI model and save it in a Markdown file.
"""
doc = inspect.getdoc(item)
source = inspect.getsource(item)
input_content = (
f"Name: {item.__name__}\n\nDocumentation:\n{doc}\n\nSource"
f" Code:\n{source}"
)
print(input_content)
# Process with OpenAI model
processed_content = model(
DOCUMENTATION_WRITER_SOP(input_content, "swarms.utils")
)
doc_content = f"# {item.__name__}\n\n{processed_content}\n"
# Create the directory if it doesn't exist
dir_path = "docs/swarms/utils"
os.makedirs(dir_path, exist_ok=True)
# Write the processed documentation to a Markdown file
file_path = os.path.join(dir_path, f"{item.__name__.lower()}.md")
with open(file_path, "w") as file:
file.write(doc_content)
def main():
# Gathering all functions from the swarms.utils module
functions = [
obj
for name, obj in inspect.getmembers(
sys.modules["swarms.utils"]
)
if inspect.isfunction(obj)
]
threads = []
for func in functions:
thread = threading.Thread(
target=process_documentation, args=(func,)
)
threads.append(thread)
thread.start()
# Wait for all threads to complete
for thread in threads:
thread.join()
print("Documentation generated in 'docs/swarms/utils' directory.")
if __name__ == "__main__":
main()

@ -0,0 +1,85 @@
import inspect
import os
import sys
import threading
from dotenv import load_dotenv
from scripts.auto_tests_docs.docs import TEST_WRITER_SOP_PROMPT
from swarms import OpenAIChat
from swarms.utils.parse_code import extract_code_from_markdown
from swarms.utils import (
extract_code_from_markdown,
)
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
model = OpenAIChat(
model_name="gpt-4",
openai_api_key=api_key,
max_tokens=4000,
)
def process_documentation(item):
"""
Process the documentation for a given function using OpenAI model and save it in a Markdown file.
"""
doc = inspect.getdoc(item)
source = inspect.getsource(item)
input_content = (
f"Name: {item.__name__}\n\nDocumentation:\n{doc}\n\nSource"
f" Code:\n{source}"
)
# print(input_content)
# Process with OpenAI model
processed_content = model(
TEST_WRITER_SOP_PROMPT(
input_content, "swarms.utils", "swarms.utils"
)
)
processed_content = extract_code_from_markdown(processed_content)
print(processed_content)
doc_content = f"{processed_content}"
# Create the directory if it doesn't exist
dir_path = "tests/utils"
os.makedirs(dir_path, exist_ok=True)
# Write the processed documentation to a Markdown file
file_path = os.path.join(dir_path, f"{item.__name__.lower()}.py")
with open(file_path, "w") as file:
file.write(doc_content)
def main():
# Gathering all functions from the swarms.utils module
functions = [
obj
for name, obj in inspect.getmembers(
sys.modules["swarms.utils"]
)
if inspect.isfunction(obj)
]
threads = []
for func in functions:
thread = threading.Thread(
target=process_documentation, args=(func,)
)
threads.append(thread)
thread.start()
# Wait for all threads to complete
for thread in threads:
thread.join()
print("Tests generated in 'tests/utils' directory.")
if __name__ == "__main__":
main()

@ -0,0 +1,23 @@
import os
def generate_file_list(directory, output_file):
"""
Generate a list of files in a directory in the specified format and write it to a file.
Args:
directory (str): The directory to list the files from.
output_file (str): The file to write the output to.
"""
with open(output_file, 'w') as f:
for root, dirs, files in os.walk(directory):
for file in files:
if file.endswith('.md'):
# Remove the directory from the file path and replace slashes with dots
file_path = os.path.join(root, file).replace(directory + '/', '').replace('/', '.')
# Remove the file extension
file_name, _ = os.path.splitext(file)
# Write the file name and path to the output file
f.write(f"- {file_name}: \"{file_path}\"\n")
# Use the function to generate the file list
generate_file_list('docs/swarms/utils', 'file_list.txt')

@ -1,4 +1,4 @@
from abc import ABC, abstractmethod from abc import abstractmethod
from swarms.models.base_llm import AbstractLLM from swarms.models.base_llm import AbstractLLM
from diffusers.utils import export_to_video from diffusers.utils import export_to_video
from typing import Optional, List from typing import Optional, List

@ -1,84 +0,0 @@
import json
import os
from typing import List
import timm
import torch
from PIL import Image
from pydantic import BaseModel, StrictFloat, StrictInt, validator
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Load the classes for image classification
with open(
os.path.join(os.path.dirname(__file__), "fast_vit_classes.json")
) as f:
FASTVIT_IMAGENET_1K_CLASSES = json.load(f)
class ClassificationResult(BaseModel):
class_id: List[StrictInt]
confidence: List[StrictFloat]
@validator("class_id", "confidence", pre=True, each_item=True)
def check_list_contents(cls, v):
assert isinstance(v, int) or isinstance(
v, float
), "must be integer or float"
return v
class FastViT:
"""
FastViT model for image classification
Args:
img (str): path to the input image
confidence_threshold (float): confidence threshold for the model's predictions
Returns:
ClassificationResult: a pydantic BaseModel containing the class ids and confidences of the model's predictions
Example:
>>> fastvit = FastViT()
>>> result = fastvit(img="path_to_image.jpg", confidence_threshold=0.5)
To use, create a json file called: fast_vit_classes.json
"""
def __init__(self):
self.model = timm.create_model(
"hf_hub:timm/fastvit_s12.apple_in1k", pretrained=True
).to(DEVICE)
data_config = timm.data.resolve_model_data_config(self.model)
self.transforms = timm.data.create_transform(
**data_config, is_training=False
)
self.model.eval()
def __call__(
self, img: str, confidence_threshold: float = 0.5
) -> ClassificationResult:
"""Classifies the input image and returns the top k classes and their probabilities"""
img = Image.open(img).convert("RGB")
img_tensor = self.transforms(img).unsqueeze(0).to(DEVICE)
with torch.no_grad():
output = self.model(img_tensor)
probabilities = torch.nn.functional.softmax(output, dim=1)
# Get top k classes and their probabilities
top_probs, top_classes = torch.topk(
probabilities, k=FASTVIT_IMAGENET_1K_CLASSES
)
# Filter by confidence threshold
mask = top_probs > confidence_threshold
top_probs, top_classes = top_probs[mask], top_classes[mask]
# Convert to Python lists and map class indices to labels if needed
top_probs = top_probs.cpu().numpy().tolist()
top_classes = top_classes.cpu().numpy().tolist()
return ClassificationResult(
class_id=top_classes, confidence=top_probs
)

@ -25,7 +25,7 @@ from swarms.tools.tool import BaseTool
from swarms.tools.tool_func_doc_scraper import scrape_tool_func_docs from swarms.tools.tool_func_doc_scraper import scrape_tool_func_docs
from swarms.utils.code_interpreter import SubprocessCodeInterpreter from swarms.utils.code_interpreter import SubprocessCodeInterpreter
from swarms.utils.parse_code import ( from swarms.utils.parse_code import (
extract_code_in_backticks_in_string, extract_code_from_markdown,
) )
from swarms.utils.pdf_to_text import pdf_to_text from swarms.utils.pdf_to_text import pdf_to_text
from swarms.utils.token_count_tiktoken import limit_tokens_from_string from swarms.utils.token_count_tiktoken import limit_tokens_from_string
@ -1257,7 +1257,7 @@ class Agent:
""" """
text -> parse_code by looking for code inside 6 backticks `````-> run_code text -> parse_code by looking for code inside 6 backticks `````-> run_code
""" """
parsed_code = extract_code_in_backticks_in_string(code) parsed_code = extract_code_from_markdown(code)
run_code = self.code_executor.run(parsed_code) run_code = self.code_executor.run(parsed_code)
return run_code return run_code

@ -31,7 +31,7 @@ def check_for_update():
BOOL: Flag to indicate if there is an update BOOL: Flag to indicate if there is an update
""" """
# Fetch the latest version from the PyPI API # Fetch the latest version from the PyPI API
response = requests.get(f"https://pypi.org/pypi/swarms/json") response = requests.get("https://pypi.org/pypi/swarms/json")
latest_version = response.json()["info"]["version"] latest_version = response.json()["info"]["version"]
# Get the current version using pkg_resources # Get the current version using pkg_resources

@ -1,27 +1,30 @@
from swarms.utils.class_args_wrapper import print_class_parameters
from swarms.utils.code_interpreter import SubprocessCodeInterpreter from swarms.utils.code_interpreter import SubprocessCodeInterpreter
from swarms.utils.markdown_message import display_markdown_message
from swarms.utils.parse_code import (
extract_code_in_backticks_in_string,
)
from swarms.utils.pdf_to_text import pdf_to_text
from swarms.utils.math_eval import math_eval
from swarms.utils.llm_metrics_decorator import metrics_decorator
from swarms.utils.device_checker_cuda import check_device from swarms.utils.device_checker_cuda import check_device
from swarms.utils.find_img_path import find_image_path
from swarms.utils.llm_metrics_decorator import metrics_decorator
from swarms.utils.load_model_torch import load_model_torch from swarms.utils.load_model_torch import load_model_torch
from swarms.utils.markdown_message import display_markdown_message
from swarms.utils.math_eval import math_eval
from swarms.utils.parse_code import extract_code_from_markdown
from swarms.utils.pdf_to_text import pdf_to_text
from swarms.utils.prep_torch_model_inference import ( from swarms.utils.prep_torch_model_inference import (
prep_torch_inference, prep_torch_inference,
) )
from swarms.utils.find_img_path import find_image_path from swarms.utils.token_count_tiktoken import limit_tokens_from_string
__all__ = [ __all__ = [
"display_markdown_message",
"SubprocessCodeInterpreter", "SubprocessCodeInterpreter",
"extract_code_in_backticks_in_string", "display_markdown_message",
"pdf_to_text", "extract_code_from_markdown",
"find_image_path",
"limit_tokens_from_string",
"load_model_torch",
"math_eval", "math_eval",
"metrics_decorator", "metrics_decorator",
"check_device", "pdf_to_text",
"load_model_torch",
"prep_torch_inference", "prep_torch_inference",
"find_image_path", "print_class_parameters",
"check_device",
] ]

@ -206,10 +206,10 @@ class SubprocessCodeInterpreter:
self.output_queue.put({"output": line}) self.output_queue.put({"output": line})
interpreter = SubprocessCodeInterpreter() # interpreter = SubprocessCodeInterpreter()
interpreter.start_cmd = "python3" # interpreter.start_cmd = "python3"
for output in interpreter.run(""" # for output in interpreter.run("""
print("hello") # print("hello")
print("world") # print("world")
"""): # """):
print(output) # print(output)

@ -1,31 +1,19 @@
import re import re
# def extract_code_in_backticks_in_string(s: str) -> str:
# """
# Extracts code blocks from a markdown string.
# Args: def extract_code_from_markdown(markdown_content: str):
# s (str): The markdown string to extract code from.
# Returns:
# list: A list of tuples. Each tuple contains the language of the code block (if specified) and the code itself.
# """
# pattern = r"```([\w\+\#\-\.\s]*)\n(.*?)```"
# matches = re.findall(pattern, s, re.DOTALL)
# out = [(match[0], match[1].strip()) for match in matches]
# print(out)
def extract_code_in_backticks_in_string(s: str) -> str:
""" """
Extracts code blocks from a markdown string. Extracts code blocks from a Markdown string and returns them as a single string.
Args: Args:
s (str): The markdown string to extract code from. - markdown_content (str): The Markdown content as a string.
Returns: Returns:
str: A string containing all the code blocks. - str: A single string containing all the code blocks separated by newlines.
""" """
pattern = r"```([\w\+\#\-\.\s]*)(.*?)```" # Regular expression for fenced code blocks
matches = re.findall(pattern, s, re.DOTALL) pattern = r"```(?:\w+\n)?(.*?)```"
return "\n".join(match[1].strip() for match in matches) matches = re.findall(pattern, markdown_content, re.DOTALL)
# Concatenate all code blocks separated by newlines
return "\n".join(code.strip() for code in matches)

@ -113,13 +113,13 @@ def test_summary(capsys):
def test_enable_load_balancing(): def test_enable_load_balancing():
mp = ModelParallelizer([huggingface_llm]) mp = ModelParallelizer([huggingface_llm])
mp.enable_load_balancing() mp.enable_load_balancing()
assert mp.load_balancing == True assert mp.load_balancing is True
def test_disable_load_balancing(): def test_disable_load_balancing():
mp = ModelParallelizer([huggingface_llm]) mp = ModelParallelizer([huggingface_llm])
mp.disable_load_balancing() mp.disable_load_balancing()
assert mp.load_balancing == False assert mp.load_balancing is False
def test_concurrent_run(): def test_concurrent_run():

@ -0,0 +1,64 @@
import torch
import logging
from swarms.utils import check_device
# For the purpose of the test, we're assuming that the `memory_allocated`
# and `memory_reserved` function behave the same as `torch.cuda.memory_allocated`
# and `torch.cuda.memory_reserved`
def test_check_device_no_cuda(monkeypatch):
# Mock torch.cuda.is_available to always return False
monkeypatch.setattr(torch.cuda, "is_available", lambda: False)
result = check_device(log_level=logging.DEBUG)
assert result.type == "cpu"
def test_check_device_cuda_exception(monkeypatch):
# Mock torch.cuda.is_available to raise an exception
monkeypatch.setattr(
torch.cuda, "is_available", lambda: 1 / 0
) # Raises ZeroDivisionError
result = check_device(log_level=logging.DEBUG)
assert result.type == "cpu"
def test_check_device_one_cuda(monkeypatch):
# Mock torch.cuda.is_available to return True
monkeypatch.setattr(torch.cuda, "is_available", lambda: True)
# Mock torch.cuda.device_count to return 1
monkeypatch.setattr(torch.cuda, "device_count", lambda: 1)
# Mock torch.cuda.memory_allocated and torch.cuda.memory_reserved to return 0
monkeypatch.setattr(
torch.cuda, "memory_allocated", lambda device: 0
)
monkeypatch.setattr(
torch.cuda, "memory_reserved", lambda device: 0
)
result = check_device(log_level=logging.DEBUG)
assert len(result) == 1
assert result[0].type == "cuda"
assert result[0].index == 0
def test_check_device_multiple_cuda(monkeypatch):
# Mock torch.cuda.is_available to return True
monkeypatch.setattr(torch.cuda, "is_available", lambda: True)
# Mock torch.cuda.device_count to return 4
monkeypatch.setattr(torch.cuda, "device_count", lambda: 4)
# Mock torch.cuda.memory_allocated and torch.cuda.memory_reserved to return 0
monkeypatch.setattr(
torch.cuda, "memory_allocated", lambda device: 0
)
monkeypatch.setattr(
torch.cuda, "memory_reserved", lambda device: 0
)
result = check_device(log_level=logging.DEBUG)
assert len(result) == 4
for i in range(4):
assert result[i].type == "cuda"
assert result[i].index == i

@ -0,0 +1,65 @@
# import necessary modules
import pytest
from swarms.utils import display_markdown_message
from rich.console import Console
from rich.markdown import Markdown
from rich.rule import Rule
from unittest import mock
def test_basic_message():
# Test basic message functionality
with mock.patch.object(Console, "print") as mock_print:
display_markdown_message("This is a test")
mock_print.assert_called_once_with(
Markdown("This is a test", style="cyan")
)
def test_empty_message():
# Test how function handles empty input
with mock.patch.object(Console, "print") as mock_print:
display_markdown_message("")
mock_print.assert_called_once_with("")
@pytest.mark.parametrize("color", ["cyan", "red", "blue"])
def test_colors(color):
# Test different colors
with mock.patch.object(Console, "print") as mock_print:
display_markdown_message("This is a test", color)
mock_print.assert_called_once_with(
Markdown("This is a test", style=color)
)
def test_dash_line():
# Test how function handles "---"
with mock.patch.object(Console, "print") as mock_print:
display_markdown_message("---")
mock_print.assert_called_once_with(Rule(style="cyan"))
def test_message_with_whitespace():
# Test how function handles message with whitespaces
with mock.patch.object(Console, "print") as mock_print:
display_markdown_message(" \n Test \n --- \n Test \n")
calls = [
mock.call(""),
mock.call(Markdown("Test", style="cyan")),
mock.call(Rule(style="cyan")),
mock.call(Markdown("Test", style="cyan")),
mock.call(""),
]
mock_print.assert_has_calls(calls)
def test_message_start_with_greater_than():
# Test how function handles message line starting with ">"
with mock.patch.object(Console, "print") as mock_print:
display_markdown_message(">This is a test")
calls = [
mock.call(Markdown(">This is a test", style="cyan")),
mock.call(""),
]
mock_print.assert_has_calls(calls)

@ -0,0 +1,47 @@
import pytest
from swarms.utils import extract_code_from_markdown
@pytest.fixture
def markdown_content_with_code():
return """
# This is a markdown document
Some intro text here.
Some additional text.
"""
@pytest.fixture
def markdown_content_without_code():
return """
# This is a markdown document
There is no code in this document.
"""
def test_extract_code_from_markdown_with_code(
markdown_content_with_code,
):
extracted_code = extract_code_from_markdown(
markdown_content_with_code
)
assert "def my_func():" in extracted_code
assert 'print("This is my function.")' in extracted_code
assert "class MyClass:" in extracted_code
assert "pass" in extracted_code
def test_extract_code_from_markdown_without_code(
markdown_content_without_code,
):
extracted_code = extract_code_from_markdown(
markdown_content_without_code
)
assert extracted_code == ""
def test_extract_code_from_markdown_exception():
with pytest.raises(TypeError):
extract_code_from_markdown(None)

@ -0,0 +1,52 @@
# Filename: test_utils.py
import pytest
from swarms.utils import find_image_path
import os


def test_find_image_path_no_images():
    assert (
        find_image_path(
            "This is a test string without any image paths."
        )
        is None
    )


def test_find_image_path_one_image():
    text = "This is a string with one image path: sample_image.jpg."
    assert find_image_path(text) == "sample_image.jpg"


def test_find_image_path_multiple_images():
    text = "This string has two image paths: img1.png, and img2.jpg."
    assert (
        find_image_path(text) == "img2.jpg"
    )  # Assuming both images exist


def test_find_image_path_wrong_input():
    with pytest.raises(TypeError):
        find_image_path(123)


@pytest.mark.parametrize(
    "text, expected",
    [
        ("no image path here", None),
        ("image: sample.png", "sample.png"),
        ("image: sample.png, another: another.jpeg", "another.jpeg"),
    ],
)
def test_find_image_path_parameterized(text, expected):
    assert find_image_path(text) == expected


def mock_os_path_exists(path):
    return True


def test_find_image_path_mocking(monkeypatch):
    monkeypatch.setattr(os.path, "exists", mock_os_path_exists)
    assert find_image_path("image.jpg") == "image.jpg"

@ -0,0 +1,45 @@
import pytest
from swarms.utils import limit_tokens_from_string


def test_limit_tokens_from_string():
    sentence = (
        "This is a test sentence. It is used for testing the number"
        " of tokens."
    )
    limited = limit_tokens_from_string(sentence, limit=5)
    assert (
        len(limited.split()) <= 5
    ), "The output string has more than 5 tokens."


def test_limit_zero_tokens():
    sentence = "Expect empty result when limit is set to zero."
    limited = limit_tokens_from_string(sentence, limit=0)
    assert limited == "", "The output is not empty."


def test_negative_token_limit():
    sentence = (
        "This test will raise an exception when limit is negative."
    )
    with pytest.raises(Exception):
        limit_tokens_from_string(sentence, limit=-1)


@pytest.mark.parametrize(
    "sentence, model", [("Some sentence", "unavailable-model")]
)
def test_unknown_model(sentence, model):
    with pytest.raises(Exception):
        limit_tokens_from_string(sentence, model=model)


def test_string_token_limit_exceeded():
    sentence = (
        "This is a long sentence with more than twenty tokens which"
        " is used for testing. It checks whether the function"
        " correctly limits the tokens to a specified amount."
    )
    limited = limit_tokens_from_string(sentence, limit=20)
    assert len(limited.split()) <= 20, "The token limit is exceeded."

@ -0,0 +1,111 @@
import pytest
import torch
from torch import nn
from swarms.utils import load_model_torch


class DummyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 2)

    def forward(self, x):
        return self.fc(x)


# Test case 1: Test if model can be loaded successfully
def test_load_model_torch_success(tmp_path):
    model = DummyModel()
    # Save the model to a temporary directory
    model_path = tmp_path / "model.pt"
    torch.save(model.state_dict(), model_path)
    # Load the model
    model_loaded = load_model_torch(model_path, model=DummyModel())
    # Check if loaded model has the same architecture
    assert isinstance(
        model_loaded, DummyModel
    ), "Loaded model type mismatch."


# Test case 2: Test if function raises FileNotFoundError for non-existent file
def test_load_model_torch_file_not_found():
    with pytest.raises(FileNotFoundError):
        load_model_torch("non_existent_model.pt")


# Test case 3: Test if function catches and raises RuntimeError for invalid model file
def test_load_model_torch_invalid_file(tmp_path):
    file = tmp_path / "invalid_model.pt"
    file.write_text("Invalid model file.")
    with pytest.raises(RuntimeError):
        load_model_torch(file)


# Test case 4: Test for handling of 'strict' parameter
def test_load_model_torch_strict_handling(tmp_path):
    # Create a model and modify it to cause a mismatch
    model = DummyModel()
    model.fc = nn.Linear(10, 3)
    model_path = tmp_path / "model.pt"
    torch.save(model.state_dict(), model_path)
    # Try to load the modified model with 'strict' parameter set to True
    with pytest.raises(RuntimeError):
        load_model_torch(model_path, model=DummyModel(), strict=True)


# Test case 5: Test for 'device' parameter handling
def test_load_model_torch_device_handling(tmp_path):
    model = DummyModel()
    model_path = tmp_path / "model.pt"
    torch.save(model.state_dict(), model_path)
    # Define a device other than default and load the model to the specified device
    device = torch.device("cpu")
    model_loaded = load_model_torch(
        model_path, model=DummyModel(), device=device
    )
    assert (
        model_loaded.fc.weight.device == device
    ), "Model not loaded to specified device."


# Test case 6: Testing for correct handling of '*args' and '**kwargs'
def test_load_model_torch_args_kwargs_handling(monkeypatch, tmp_path):
    model = DummyModel()
    model_path = tmp_path / "model.pt"
    torch.save(model.state_dict(), model_path)

    def mock_torch_load(*args, **kwargs):
        assert (
            "pickle_module" in kwargs
        ), "Keyword arguments not passed to 'torch.load'."

    # Monkeypatch 'torch.load' to check if '*args' and '**kwargs' are passed correctly
    monkeypatch.setattr(torch, "load", mock_torch_load)
    load_model_torch(
        model_path, model=DummyModel(), pickle_module="dummy_module"
    )


# Test case 7: Test for model loading on CPU if no GPU is available
def test_load_model_torch_cpu(monkeypatch, tmp_path):
    model = DummyModel()
    model_path = tmp_path / "model.pt"
    torch.save(model.state_dict(), model_path)

    def mock_torch_cuda_is_available():
        return False

    # Use the monkeypatch fixture to simulate no GPU available;
    # calling setattr on the pytest.MonkeyPatch class itself would fail.
    monkeypatch.setattr(
        torch.cuda, "is_available", mock_torch_cuda_is_available
    )
    model_loaded = load_model_torch(model_path, model=DummyModel())
    # Ensure model is loaded on CPU
    assert next(model_loaded.parameters()).device.type == "cpu"

@ -1,89 +1,41 @@
-import pytest
-from swarms.utils.math_eval import math_eval
-
-def test_math_eval_same_output():
-    @math_eval(lambda x: x + 1, lambda x: x + 1)
-    def func(x):
-        return x
-    for i in range(20):
-        result1, result2 = func(i)
-        assert result1 == result2
-        assert result1 == i + 1
-
-def test_math_eval_different_output():
-    @math_eval(lambda x: x + 1, lambda x: x + 2)
-    def func(x):
-        return x
-    for i in range(20):
-        result1, result2 = func(i)
-        assert result1 != result2
-        assert result1 == i + 1
-        assert result2 == i + 2
-
-def test_math_eval_exception_in_func1():
-    @math_eval(lambda x: 1 / x, lambda x: x)
-    def func(x):
-        return x
-    with pytest.raises(ZeroDivisionError):
-        func(0)
-
-def test_math_eval_exception_in_func2():
-    @math_eval(lambda x: x, lambda x: 1 / x)
-    def func(x):
-        return x
-    with pytest.raises(ZeroDivisionError):
-        func(0)
-
-def test_math_eval_with_multiple_arguments():
-    @math_eval(lambda x, y: x + y, lambda x, y: y + x)
-    def func(x, y):
-        return x, y
-    for i in range(10):
-        for j in range(10):
-            result1, result2 = func(i, j)
-            assert result1 == result2
-            assert result1 == i + j
-
-def test_math_eval_with_kwargs():
-    @math_eval(lambda x, y=0: x + y, lambda x, y=0: y + x)
-    def func(x, y=0):
-        return x, y
-    for i in range(10):
-        for j in range(10):
-            result1, result2 = func(i, y=j)
-            assert result1 == result2
-            assert result1 == i + j
-
-def test_math_eval_with_no_arguments():
-    @math_eval(lambda: 1, lambda: 1)
-    def func():
-        return
-    result1, result2 = func()
-    assert result1 == result2
-    assert result1 == 1
-
-def test_math_eval_with_different_types():
-    @math_eval(lambda x: str(x), lambda x: x)
-    def func(x):
-        return x
-    for i in range(10):
-        result1, result2 = func(i)
-        assert result1 != result2
-        assert result1 == str(i)
-        assert result2 == i
+from swarms.utils import math_eval
+
+def func1_no_exception(x):
+    return x + 2
+
+def func2_no_exception(x):
+    return x + 2
+
+def func1_with_exception(x):
+    raise ValueError()
+
+def func2_with_exception(x):
+    raise ValueError()
+
+def test_same_results_no_exception(caplog):
+    @math_eval(func1_no_exception, func2_no_exception)
+    def test_func(x):
+        return x
+    result1, result2 = test_func(5)
+    assert result1 == result2 == 7
+    assert "Outputs do not match" not in caplog.text
+
+def test_func1_exception(caplog):
+    @math_eval(func1_with_exception, func2_no_exception)
+    def test_func(x):
+        return x
+    result1, result2 = test_func(5)
+    assert result1 is None
+    assert result2 == 7
+    assert "Error in func1:" in caplog.text
+
+# similar tests for func2_with_exception and when func1 and func2 return different results
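
The rewritten file leaves the `func2_with_exception` case and the mismatching-results case as a comment. A minimal sketch of both checks follows. It runs against `math_eval_stub`, a hypothetical stand-in whose behaviour (return `None` for a failing function, log `Error in funcN:` and `Outputs do not match`) is inferred only from the assertions in this diff and is not the `swarms.utils` implementation. A small list-backed logging handler stands in for pytest's `caplog` fixture so the example runs on its own.

```python
import functools
import logging

logger = logging.getLogger("math_eval_sketch")


def math_eval_stub(func1, func2):
    """Hypothetical stand-in inferred from the test assertions."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            results = []
            for label, candidate in (("func1", func1), ("func2", func2)):
                try:
                    results.append(candidate(*args, **kwargs))
                except Exception as error:
                    # A failing function is logged and contributes None.
                    logger.error("Error in %s: %s", label, error)
                    results.append(None)
            if results[0] != results[1]:
                logger.warning("Outputs do not match: %s != %s", *results)
            return tuple(results)

        return wrapper

    return decorator


class ListHandler(logging.Handler):
    """Collects formatted log messages, playing the role of caplog."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the example quiet on stderr


def func2_with_exception(x):
    raise ValueError()


@math_eval_stub(lambda x: x + 2, func2_with_exception)
def probe_exception(x):
    return x


@math_eval_stub(lambda x: x + 1, lambda x: x + 2)
def probe_mismatch(x):
    return x


exception_result = probe_exception(5)
mismatch_result = probe_mismatch(5)
```

In the real test module the same assertions would read `caplog.text` instead of `handler.messages`.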

@ -0,0 +1,84 @@
# pytest imports
import pytest
from unittest.mock import Mock

# Imports from your project
from swarms.utils import metrics_decorator
import time


# Basic successful test
def test_metrics_decorator_success():
    @metrics_decorator
    def decorated_func():
        time.sleep(0.1)
        return [1, 2, 3, 4, 5]

    metrics = decorated_func()
    assert "Time to First Token" in metrics
    assert "Generation Latency" in metrics
    assert "Throughput:" in metrics


@pytest.mark.parametrize(
    "wait_time, return_val",
    [
        (0, []),
        (0.1, [1, 2, 3]),
        (0.5, list(range(50))),
    ],
)
def test_metrics_decorator_with_various_wait_times_and_return_vals(
    wait_time, return_val
):
    @metrics_decorator
    def decorated_func():
        time.sleep(wait_time)
        return return_val

    metrics = decorated_func()
    assert "Time to First Token" in metrics
    assert "Generation Latency" in metrics
    assert "Throughput:" in metrics


# Test to ensure that mocked time function was called and throughputs are calculated as expected
def test_metrics_decorator_with_mocked_time(mocker):
    mocked_time = Mock()
    mocker.patch("time.time", mocked_time)
    mocked_time.side_effect = [0, 5, 10, 20]

    @metrics_decorator
    def decorated_func():
        return ["tok_1", "tok_2"]

    metrics = decorated_func()
    assert metrics == """
Time to First Token: 5
Generation Latency: 20
Throughput: 0.1
"""
    mocked_time.assert_any_call()


# Test to ensure that exceptions in the decorated function are propagated
def test_metrics_decorator_raises_exception():
    @metrics_decorator
    def decorated_func():
        raise ValueError("Oops!")

    with pytest.raises(ValueError, match="Oops!"):
        decorated_func()


# Test to ensure proper handling when decorated function returns non-list value
def test_metrics_decorator_with_non_list_return_val():
    @metrics_decorator
    def decorated_func():
        return "Hello, world!"

    metrics = decorated_func()
    assert "Time to First Token" in metrics
    assert "Generation Latency" in metrics
    assert "Throughput:" in metrics
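
`test_metrics_decorator_with_mocked_time` relies on how `Mock.side_effect` treats a list: each call returns the next item. The sketch below shows that behaviour in isolation, then reproduces the arithmetic behind the expected `Throughput: 0.1`. Which clock readings the decorator actually consumes is not shown in this diff, so the latency formula here is an assumption.

```python
from unittest.mock import Mock

# Each call to the mock returns the next item of side_effect, in order.
fake_clock = Mock(side_effect=[0, 5, 10, 20])
readings = [fake_clock() for _ in range(4)]

# Once the sequence is exhausted, a further call raises StopIteration.
try:
    fake_clock()
    exhausted = False
except StopIteration:
    exhausted = True

# Assumption (not confirmed by the source): latency is the last reading minus
# the first, and throughput is the number of tokens divided by that latency.
tokens = ["tok_1", "tok_2"]
latency = readings[-1] - readings[0]
throughput = len(tokens) / latency

print(readings, latency, throughput)
```

If the decorator calls `time.time` more than four times, the mocked test fails with `StopIteration` rather than an assertion error, which is worth keeping in mind when the implementation changes.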

@ -0,0 +1,40 @@
import pytest
import PyPDF2
from swarms.utils import pdf_to_text


@pytest.fixture
def pdf_file(tmpdir):
    pdf_writer = PyPDF2.PdfWriter()
    # add_blank_page creates the page and appends it to the writer in one call;
    # PyPDF2.pdf.PageObject.createBlankPage is not available alongside PdfWriter.
    pdf_writer.add_blank_page(width=200, height=200)
    pdf_file = tmpdir.join("temp.pdf")
    with open(pdf_file, "wb") as output:
        pdf_writer.write(output)
    return str(pdf_file)


def test_valid_pdf_to_text(pdf_file):
    result = pdf_to_text(pdf_file)
    assert isinstance(result, str)


def test_non_existing_file():
    with pytest.raises(FileNotFoundError):
        pdf_to_text("non_existing_file.pdf")


def test_passing_non_pdf_file(tmpdir):
    file = tmpdir.join("temp.txt")
    file.write("This is a test")
    with pytest.raises(
        Exception,
        match=r"An error occurred while reading the PDF file",
    ):
        pdf_to_text(str(file))


@pytest.mark.parametrize("invalid_pdf_file", [None, 123, {}, []])
def test_invalid_pdf_to_text(invalid_pdf_file):
    with pytest.raises(Exception):
        pdf_to_text(invalid_pdf_file)

@ -0,0 +1,49 @@
import unittest
import pytest
import torch
from unittest.mock import Mock
from swarms.utils import prep_torch_inference


def test_prep_torch_inference():
    model_path = "model_path"
    device = torch.device(
        "cuda" if torch.cuda.is_available() else "cpu"
    )
    model_mock = Mock()
    model_mock.eval = Mock()

    # Mocking the load_model_torch function to return our mock model.
    with unittest.mock.patch(
        "swarms.utils.load_model_torch", return_value=model_mock
    ) as _:
        model = prep_torch_inference(model_path, device)

    # Check if model was properly loaded and eval function was called
    assert model == model_mock
    model_mock.eval.assert_called_once()


@pytest.mark.parametrize(
    "model_path, device",
    [
        (
            "invalid_path",
            torch.device("cuda"),
        ),  # Invalid file path, valid device
        (None, torch.device("cuda")),  # None file path, valid device
        ("model_path", None),  # Valid file path, None device
        (None, None),  # None file path, None device
    ],
)
def test_prep_torch_inference_exceptions(model_path, device):
    with pytest.raises(Exception):
        prep_torch_inference(model_path, device)


def test_prep_torch_inference_return_none():
    model_path = "invalid_path"  # Invalid file path
    device = torch.device("cuda")  # Valid device

    # Since load_model_torch function will raise an exception, prep_torch_inference should return None
    assert prep_torch_inference(model_path, device) is None

@ -0,0 +1,119 @@
import pytest
from swarms.utils import print_class_parameters


class TestObject:
    def __init__(self, value1, value2: int):
        pass


class TestObject2:
    def __init__(self: "TestObject2", value1, value2: int = 5):
        pass


def test_class_with_complex_parameters():
    class ComplexArgs:
        def __init__(self, value1: list, value2: dict = {}):
            pass

    output = {"value1": "<class 'list'>", "value2": "<class 'dict'>"}
    assert (
        print_class_parameters(ComplexArgs, api_format=True) == output
    )


def test_empty_class():
    class Empty:
        pass

    with pytest.raises(Exception):
        print_class_parameters(Empty)


def test_class_with_no_annotations():
    class NoAnnotations:
        def __init__(self, value1, value2):
            pass

    output = {
        "value1": "<class 'inspect._empty'>",
        "value2": "<class 'inspect._empty'>",
    }
    assert (
        print_class_parameters(NoAnnotations, api_format=True)
        == output
    )


def test_class_with_partial_annotations():
    class PartialAnnotations:
        def __init__(self, value1, value2: int):
            pass

    output = {
        "value1": "<class 'inspect._empty'>",
        "value2": "<class 'int'>",
    }
    assert (
        print_class_parameters(PartialAnnotations, api_format=True)
        == output
    )


@pytest.mark.parametrize(
    "obj, expected",
    [
        (
            TestObject,
            {
                "value1": "<class 'inspect._empty'>",
                "value2": "<class 'int'>",
            },
        ),
        (
            TestObject2,
            {
                "value1": "<class 'inspect._empty'>",
                "value2": "<class 'int'>",
            },
        ),
    ],
)
def test_parametrized_class_parameters(obj, expected):
    assert print_class_parameters(obj, api_format=True) == expected


@pytest.mark.parametrize(
    "value",
    [
        int,
        float,
        str,
        list,
        set,
        dict,
        bool,
        tuple,
        complex,
        bytes,
        bytearray,
        memoryview,
        range,
        frozenset,
        slice,
        object,
    ],
)
def test_not_class_exception(value):
    with pytest.raises(Exception):
        print_class_parameters(value)


def test_api_format_flag():
    assert print_class_parameters(TestObject2, api_format=True) == {
        "value1": "<class 'inspect._empty'>",
        "value2": "<class 'int'>",
    }
    print_class_parameters(TestObject)
    # TODO: Capture printed output and assert correctness.
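
One way to close the TODO above is to capture standard output and assert on the printed lines. The sketch below does this with `contextlib.redirect_stdout` so it runs without pytest; inside the test module the built-in `capsys` fixture would serve the same purpose. `print_init_parameters` and its output format are hypothetical stand-ins, since the text printed by `print_class_parameters` without `api_format=True` is not shown in this diff.

```python
import contextlib
import inspect
import io


def print_init_parameters(cls):
    """Hypothetical stand-in: print each __init__ parameter with its annotation."""
    signature = inspect.signature(cls.__init__)
    for name, param in signature.parameters.items():
        if name == "self":
            continue
        print(f"Parameter: {name}, Type: {param.annotation}")


class Sample:
    def __init__(self, value1, value2: int):
        pass


def capture_stdout(func, *args, **kwargs):
    """Run func and return everything it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


captured = capture_stdout(print_init_parameters, Sample)
lines = captured.splitlines()
```

Asserting on `lines` rather than on the raw string keeps the check independent of the trailing newline that `print` adds.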