def documentation(task: str) -> str:
    documentation = f"""Create multi-page, long, and explicit professional PyTorch-like documentation for the <MODULE> code below, following the outline for the <MODULE> library.
Provide many examples and teach the user about the code. Provide examples for every function and make the documentation 10,000 words.
Provide many usage examples, and note that these are Markdown docs. Create the documentation for the code to document.
Put the arguments and methods in a Markdown table to make it visually seamless.

Now make the professional documentation for this code. Provide the architecture, how the class works, and why it works that way,
its purpose, the args and their types, and 3 different usage examples. In the examples, show all the code, including imports and the main example.

BE VERY EXPLICIT AND THOROUGH, MAKE IT DEEP AND USEFUL

########
Step 1: Understand the purpose and functionality of the module or framework

Read and analyze the description provided in the documentation to understand the purpose and functionality of the module or framework.
Identify the key features, parameters, and operations performed by the module or framework.
Step 2: Provide an overview and introduction

Start the documentation by providing a brief overview and introduction to the module or framework.
Explain the importance and relevance of the module or framework in the context of the problem it solves.
Highlight any key concepts or terminology that will be used throughout the documentation.
Step 3: Provide a class or function definition

Provide the class or function definition for the module or framework.
Include the parameters that need to be passed to the class or function and provide a brief description of each parameter.
Specify the data types and default values for each parameter.
Step 4: Explain the functionality and usage

Provide a detailed explanation of how the module or framework works and what it does.
Describe the steps involved in using the module or framework, including any specific requirements or considerations.
Provide code examples to demonstrate the usage of the module or framework.
Explain the expected inputs and outputs for each operation or function.
Step 5: Provide additional information and tips

Provide any additional information or tips that may be useful for using the module or framework effectively.
Address any common issues or challenges that developers may encounter and provide recommendations or workarounds.
Step 6: Include references and resources

Include references to any external resources or research papers that provide further information or background on the module or framework.
Provide links to relevant documentation or websites for further exploration.
Example Template for the given documentation:

# Module/Function Name: MultiheadAttention

```
class torch.nn.MultiheadAttention(embed_dim, num_heads, dropout=0.0, bias=True, add_bias_kv=False, add_zero_attn=False, kdim=None, vdim=None, batch_first=False, device=None, dtype=None):
```
Creates a multi-head attention module that lets the model jointly attend to information from different representation subspaces.

Parameters:
- embed_dim (int): Total dimension of the model.
- num_heads (int): Number of parallel attention heads. The embed_dim will be split across num_heads.
- dropout (float): Dropout probability on attn_output_weights. Default: 0.0 (no dropout).
- bias (bool): If specified, adds bias to input/output projection layers. Default: True.
- add_bias_kv (bool): If specified, adds bias to the key and value sequences at dim=0. Default: False.
- add_zero_attn (bool): If specified, adds a new batch of zeros to the key and value sequences at dim=1. Default: False.
- kdim (int): Total number of features for keys. Default: None (uses kdim=embed_dim).
- vdim (int): Total number of features for values. Default: None (uses vdim=embed_dim).
- batch_first (bool): If True, the input and output tensors are provided as (batch, seq, feature). Default: False.
- device (torch.device): If specified, the module's parameters are created on the specified device.
- dtype (torch.dtype): If specified, the module's parameters are created with the specified dtype.
```

def forward(query, key, value, key_padding_mask=None, need_weights=True, attn_mask=None, average_attn_weights=True, is_causal=False):
```
Forward pass of the multi-head attention module.

Parameters:
- query (Tensor): Query embeddings of shape (L, E_q) for unbatched input, (L, N, E_q) when batch_first=False, or (N, L, E_q) when batch_first=True.
- key (Tensor): Key embeddings of shape (S, E_k) for unbatched input, (S, N, E_k) when batch_first=False, or (N, S, E_k) when batch_first=True.
- value (Tensor): Value embeddings of shape (S, E_v) for unbatched input, (S, N, E_v) when batch_first=False, or (N, S, E_v) when batch_first=True.
- key_padding_mask (Optional[Tensor]): If specified, a mask indicating elements to be ignored in key for attention computation.
- need_weights (bool): If True, returns attention weights in addition to attention outputs. Default: True.
- attn_mask (Optional[Tensor]): If specified, a mask preventing attention to certain positions.
- average_attn_weights (bool): If True, returns attention weights averaged across heads. Otherwise, returns attention weights separately per head. Note that this flag only has an effect when need_weights=True. Default: True.
- is_causal (bool): If specified, applies a causal mask as the attention mask. Default: False.

Returns:
Tuple[Tensor, Optional[Tensor]]:
- attn_output (Tensor): Attention outputs of shape (L, E) for unbatched input, (L, N, E) when batch_first=False, or (N, L, E) when batch_first=True.
- attn_output_weights (Optional[Tensor]): Attention weights of shape (L, S) when unbatched or (N, L, S) when batched if average_attn_weights=True; (num_heads, L, S) or (N, num_heads, L, S) if average_attn_weights=False. Optional, only returned when need_weights=True.
```

# Implementation of the forward pass of the attention module goes here

return attn_output, attn_output_weights

```
# Usage example:

multihead_attn = nn.MultiheadAttention(embed_dim, num_heads)
attn_output, attn_output_weights = multihead_attn(query, key, value)
Note:

The above template includes the class or function definition, parameters, description, and usage example.
To replicate the documentation for any other module or framework, follow the same structure and provide the specific details for that module or framework.


############# DOCUMENT THE FOLLOWING CODE ########
{task}
"""
    return documentation
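# A minimal usage sketch, not part of the original file. It assumes an
# abbreviated stand-in for the prompt builder above (the real template is much
# longer) and a hypothetical `source_code` snippet to document. It shows that
# the code passed as `task` lands verbatim under the final separator, and that
# braces inside `task` are safe because the text is interpolated as a value.

```python
def documentation(task: str) -> str:
    # Abbreviated stand-in (assumption): only the first line and the final
    # interpolation of the full template are reproduced here.
    return f"""Create professional documentation for the <MODULE> code below.

############# DOCUMENT THE FOLLOWING CODE ########
{task}
"""


# Hypothetical source snippet to document. Its curly braces are not parsed by
# the f-string, because they arrive through the interpolated value.
source_code = '''def scale(values: dict, factor: float = 2.0) -> dict:
    return {key: value * factor for key, value in values.items()}
'''

prompt = documentation(source_code)
print(prompt)
```

# Literal braces written directly into the template text itself would need to
# be doubled ({{ and }}), since the template is an f-string.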