You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
119 lines
4.3 KiB
119 lines
4.3 KiB
# extract_code_from_markdown
|
|
|
|
# swarms.utils Module
|
|
|
|
The `swarms.utils` module provides utility functions designed to facilitate specific tasks within the main Swarm codebase. The function `extract_code_from_markdown` is a critical function within this module that we will document in this example.
|
|
|
|
## Overview and Introduction
|
|
|
|
Many software projects use Markdown extensively for writing documentation, tutorials, and other text documents that can be easily rendered and viewed in different formats, including HTML.
|
|
|
|
The `extract_code_from_markdown` function plays a crucial role within the swarms.utils library. As developers write large volumes of Markdown, they often need to isolate code snippets from the whole Markdown file body. These isolated snippets can be used to generate test cases, transform into other languages, or analyze for metrics.
|
|
|
|
## Function Definition: `extract_code_from_markdown`
|
|
|
|
```python
|
|
def extract_code_from_markdown(markdown_content: str) -> str:
|
|
"""
|
|
Extracts code blocks from a Markdown string and returns them as a single string.
|
|
|
|
Args:
|
|
- markdown_content (str): The Markdown content as a string.
|
|
|
|
Returns:
|
|
- str: A single string containing all the code blocks separated by newlines.
|
|
"""
|
|
# Regular expression for fenced code blocks
|
|
pattern = r"```(?:\w+\n)?(.*?)```"
|
|
matches = re.findall(pattern, markdown_content, re.DOTALL)
|
|
|
|
# Concatenate all code blocks separated by newlines
|
|
return "\n".join(code.strip() for code in matches)
|
|
```
|
|
|
|
### Arguments
|
|
|
|
The function `extract_code_from_markdown` takes one argument:
|
|
|
|
| Argument | Description | Type | Default Value |
|
|
|-----------------------|----------------------------------------|-------------|-------------------|
|
|
| markdown_content | The input markdown content as a string | str | N/A |
|
|
|
|
|
|
## Function Explanation and Usage
|
|
|
|
This function uses a regular expression to find all fenced code blocks in a Markdown string. The pattern `r"```(?:\w+\n)?(.*?)```"` matches strings that start and end with three backticks, optionally followed by a newline and then any number of any characters (the `.*?` part) until the first occurrence of another triple backtick set.
|
|
|
|
Once we have the matches, we join all the code blocks into a single string, each block separated by a newline.
|
|
|
|
The method's functionality is particularly useful when we need to extract code blocks from markdown content for secondary processing, such as syntax highlighting or execution in a different environment.
|
|
|
|
### Usage Examples
|
|
|
|
Below are three examples of how you might use this function:
|
|
|
|
#### Example 1:
|
|
|
|
Extracting code blocks from a simple markdown string.
|
|
|
|
```python
|
|
from swarms.utils import extract_code_from_markdown
|
|
|
|
markdown_string = """# Example
|
|
This is an example of a code block:
|
|
```python
|
|
print("Hello World!")
|
|
``` """
|
|
print(extract_code_from_markdown(markdown_string))
|
|
```
|
|
|
|
#### Example 2:
|
|
|
|
Extracting code blocks from a markdown file.
|
|
|
|
```python
|
|
import re
|
|
|
|
|
|
def extract_code_from_markdown(markdown_content: str) -> str:
|
|
pattern = r"```(?:\w+\n)?(.*?)```"
|
|
matches = re.findall(pattern, markdown_content, re.DOTALL)
|
|
return "\n".join(code.strip() for code in matches)
|
|
|
|
|
|
# Assume that 'example.md' contains multiple code blocks
|
|
with open("example.md") as file:
|
|
markdown_content = file.read()
|
|
print(extract_code_from_markdown(markdown_content))
|
|
```
|
|
|
|
#### Example 3:
|
|
|
|
Using the function in a pipeline to extract and then analyze code blocks.
|
|
|
|
```python
|
|
import re
|
|
|
|
|
|
def extract_code_from_markdown(markdown_content: str) -> str:
|
|
pattern = r"```(?:\w+\n)?(.*?)```"
|
|
matches = re.findall(pattern, markdown_content, re.DOTALL)
|
|
return "\n".join(code.strip() for code in matches)
|
|
|
|
|
|
def analyze_code_blocks(code: str):
|
|
# Add your analysis logic here
|
|
pass
|
|
|
|
|
|
# Assume that 'example.md' contains multiple code blocks
|
|
with open("example.md") as file:
|
|
markdown_content = file.read()
|
|
code_blocks = extract_code_from_markdown(markdown_content)
|
|
analyze_code_blocks(code_blocks)
|
|
```
|
|
|
|
## Conclusion
|
|
|
|
This concludes the detailed documentation of the `extract_code_from_markdown` function from the swarms.utils module. With this documentation, you should be able to understand the function's purpose, how it works, its parameters, and see examples of how to use it effectively.
|