Model¶
dnallm.models.model ¶
DNA Model loading and management utilities.
This module provides functions for downloading, loading, and managing DNA language models from various sources including Hugging Face Hub, ModelScope, and local storage.
Classes¶
BasicCNNHead ¶
BasicCNNHead(
input_dim,
num_classes,
task_type="binary",
num_filters=128,
kernel_sizes=None,
dropout=0.2,
**kwargs,
)
Bases: Module
A CNN-based head for processing Transformer output sequences. This head applies multiple 1D convolutional layers with different kernel sizes to capture local patterns in the sequence data, followed by a fully connected layer for classification or regression tasks.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Dimension of the input features | required |
| num_classes | int | Number of output classes (for classification tasks) | required |
| task_type | str | Type of task - 'binary', 'multiclass', 'multilabel', or 'regression' | 'binary' |
| num_filters | int | Number of filters in each 1D convolutional layer | 128 |
| kernel_sizes | list \| None | Kernel sizes of the parallel 1D convolutions | None |
| dropout | float | Dropout probability | 0.2 |
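A minimal usage sketch. Assumptions not stated on this page: the head's forward accepts transformer hidden states of shape (batch, seq_len, input_dim), and kernel_sizes takes a list of integers.

```python
import torch
from dnallm.models.model import BasicCNNHead

# Dummy transformer output: batch of 4 sequences, 512 tokens, 768-dim hidden states
hidden_states = torch.randn(4, 512, 768)

head = BasicCNNHead(
    input_dim=768,
    num_classes=2,
    task_type="binary",
    num_filters=128,
    kernel_sizes=[3, 5, 7],  # assumed: one 1D conv branch per kernel size
    dropout=0.2,
)
logits = head(hidden_states)  # assumed to return one logit vector per sequence
```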
BasicLSTMHead ¶
BasicLSTMHead(
input_dim,
num_classes,
task_type="binary",
hidden_size=256,
num_layers=1,
dropout=0.1,
bidirectional=True,
**kwargs,
)
Bases: Module
An LSTM-based head for processing Transformer output sequences. This head applies a multi-layer LSTM to capture sequential dependencies in the sequence data, followed by a fully connected layer for classification or regression tasks.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Dimension of the input features | required |
| num_classes | int | Number of output classes (for classification tasks) | required |
| task_type | str | Type of task - 'binary', 'multiclass', 'multilabel', or 'regression' | 'binary' |
| hidden_size | int | Number of features in the hidden state of the LSTM | 256 |
| num_layers | int | Number of recurrent layers in the LSTM | 1 |
| dropout | float | Dropout probability between LSTM layers | 0.1 |
| bidirectional | bool | Whether to use a bidirectional LSTM | True |
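As with the CNN head, a hedged sketch that assumes the forward accepts (batch, seq_len, input_dim) hidden states:

```python
import torch
from dnallm.models.model import BasicLSTMHead

hidden_states = torch.randn(4, 512, 768)  # dummy transformer output

head = BasicLSTMHead(
    input_dim=768,
    num_classes=3,
    task_type="multiclass",
    hidden_size=256,
    num_layers=2,
    dropout=0.1,        # applied between LSTM layers
    bidirectional=True,
)
logits = head(hidden_states)  # assumed shape: (batch, num_classes)
```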
BasicMLPHead ¶
BasicMLPHead(
input_dim,
num_classes=2,
task_type="binary",
hidden_dims=None,
activation_fn="relu",
use_normalization=True,
norm_type="layernorm",
dropout=0.1,
**kwargs,
)
Bases: Module
A universal and customizable MLP model designed to be appended after the embedding output of models like Transformers to perform various downstream tasks such as classification and regression.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Dimension of the input features | required |
| num_classes | int | Number of output classes (for classification tasks) | 2 |
| task_type | str | Type of task - 'binary', 'multiclass', 'multilabel', or 'regression' | 'binary' |
| hidden_dims | list \| None | List of hidden layer dimensions | None |
| activation_fn | str | Activation function to use ('relu', 'gelu', 'silu', 'tanh', 'sigmoid') | 'relu' |
| use_normalization | bool | Whether to use normalization layers | True |
| norm_type | str | Type of normalization - 'batchnorm' or 'layernorm' | 'layernorm' |
| dropout | float | Dropout probability | 0.1 |
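A sketch of the MLP head configured for regression. The input is assumed to be a pooled embedding of shape (batch, input_dim); that layout is not stated explicitly above.

```python
import torch
from dnallm.models.model import BasicMLPHead

pooled = torch.randn(8, 768)  # dummy pooled embeddings: batch of 8, 768-dim

head = BasicMLPHead(
    input_dim=768,
    num_classes=1,              # single output for regression
    task_type="regression",
    hidden_dims=[512, 128],     # two hidden layers
    activation_fn="gelu",
    use_normalization=True,
    norm_type="layernorm",
    dropout=0.1,
)
predictions = head(pooled)  # assumed shape: (batch, 1)
```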
BasicUNet1DHead ¶
BasicUNet1DHead(
input_dim,
num_classes,
task_type="binary",
num_layers=2,
initial_filters=64,
**kwargs,
)
Bases: Module
A U-Net architecture adapted for 1D sequence data, suitable for classification and regression tasks. This model consists of an encoder-decoder structure with skip connections, allowing it to capture both local and global features in the input.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | The number of input features (channels) in the inputs. | required |
| num_classes | int | The number of output classes for the classification task. | required |
| task_type | str | The type of task (e.g., "binary" or "multi-class"). | 'binary' |
| num_layers | int | The number of downsampling/upsampling layers in the U-Net. | 2 |
| initial_filters | int | The number of filters in the first convolutional layer. | 64 |
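A short sketch; the input layout (channels-first vs. channels-last) is an assumption, taken here as (batch, seq_len, input_dim) for consistency with the other heads.

```python
import torch
from dnallm.models.model import BasicUNet1DHead

hidden_states = torch.randn(4, 512, 768)  # dummy per-token embeddings

head = BasicUNet1DHead(
    input_dim=768,
    num_classes=2,
    task_type="binary",
    num_layers=2,        # two downsampling/upsampling stages
    initial_filters=64,
)
logits = head(hidden_states)
```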
DNALLMforSequenceClassification ¶
DNALLMforSequenceClassification(config, custom_model=None)
Bases: PreTrainedModel
An automated wrapper that selects an appropriate pooling strategy based on the underlying model architecture and appends a customizable MLP head for sequence classification or regression tasks.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model | | Pre-trained transformer model (e.g., BERT, GPT) | required |
| tokenizer | | Corresponding tokenizer for the model | required |
| mlp_head_config | | Configuration dictionary for the MLP head | required |
DoubleConv ¶
DoubleConv(in_channels, out_channels)
Bases: Module
(Convolution => [BatchNorm] => ReLU) * 2
MegaDNAMultiScaleHead ¶
MegaDNAMultiScaleHead(
embedding_dims=None,
num_classes=2,
task_type="binary",
hidden_dims=None,
dropout=0.2,
**kwargs,
)
Bases: Module
A classification head tailored for the multi-scale embedding outputs of the MegaDNA model. It takes a list of embedding tensors, pools each tensor, and concatenates the results before passing them to an MLP for classification.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| embedding_dims | list \| None | A list of integers representing the dimensions of the input embeddings. | None |
| num_classes | int | The number of output classes for classification. | 2 |
| task_type | str | The type of task (e.g., "binary" or "multi-class"). | 'binary' |
| hidden_dims | list \| None | A list of integers representing the sizes of hidden layers in the MLP. | None |
| dropout | float | Dropout probability for regularization. | 0.2 |
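Because this head consumes a list of embedding tensors (one per scale), a sketch might look like the following; the exact tensor shapes expected by the pooling step are an assumption.

```python
import torch
from dnallm.models.model import MegaDNAMultiScaleHead

# Dummy multi-scale embeddings: three scales with different dimensions
embeddings = [
    torch.randn(4, 512, 196),
    torch.randn(4, 64, 256),
    torch.randn(4, 8, 512),
]

head = MegaDNAMultiScaleHead(
    embedding_dims=[196, 256, 512],  # must match the embedding tensors
    num_classes=2,
    task_type="binary",
    hidden_dims=[256],
    dropout=0.2,
)
logits = head(embeddings)  # each tensor is pooled, concatenated, then passed to the MLP
```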
Functions¶
clear_model_cache ¶
clear_model_cache(source='huggingface')
Remove all cached models.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| source | str | Source to clear the model cache from ('huggingface', 'modelscope'), default 'huggingface' | 'huggingface' |
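Usage is a single call per cache source; both values below come from the parameter description above.

```python
from dnallm.models.model import clear_model_cache

clear_model_cache(source="huggingface")  # clear cached Hugging Face models
clear_model_cache(source="modelscope")   # ModelScope caches are cleared separately
```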
download_model ¶
download_model(
model_name, downloader, revision=None, max_try=10
)
Download a model with retry mechanism for network issues.
In case of network issues, this function will attempt to download the model multiple times before giving up.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | Name of the model to download | required |
| downloader | Any | Download function to use (e.g., snapshot_download) | required |
| max_try | int | Maximum number of download attempts, default 10 | 10 |

Returns:

| Type | Description |
|---|---|
| str | Path where the model files are stored |

Raises:

| Type | Description |
|---|---|
| ValueError | If model download fails after all attempts |
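A hedged sketch that pairs the function with huggingface_hub.snapshot_download, the downloader named in the parameter table; the model id is only an example.

```python
from huggingface_hub import snapshot_download
from dnallm.models.model import download_model

# Retries the download up to max_try times before raising ValueError
local_path = download_model(
    "zhihan1996/DNABERT-2-117M",  # example Hugging Face model id
    downloader=snapshot_download,
    max_try=5,
)
print("Model files stored at:", local_path)
```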
is_flash_attention_capable ¶
is_flash_attention_capable()
Check if Flash Attention has been installed.
Returns: True if Flash Attention is installed and the device supports it, False otherwise.
is_fp8_capable ¶
is_fp8_capable()
Check if the current CUDA device supports FP8 precision.
Returns:

| Type | Description |
|---|---|
| bool | True if the device supports FP8 (compute capability >= 9.0), False otherwise |
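A sketch of how the two capability checks might gate loading options. The "flash_attention_2" string is the standard Transformers attn_implementation name and is an assumption about downstream use, not something this module prescribes.

```python
from dnallm.models.model import is_flash_attention_capable, is_fp8_capable

# Choose loading options based on what the current hardware/software supports
attn_impl = "flash_attention_2" if is_flash_attention_capable() else "eager"
fp8_ok = is_fp8_capable()  # True only on GPUs with compute capability >= 9.0

print(f"attention implementation: {attn_impl}, FP8 available: {fp8_ok}")
```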
load_model_and_tokenizer ¶
load_model_and_tokenizer(
model_name,
task_config,
source="local",
use_mirror=False,
revision=None,
)
Load model and tokenizer from either HuggingFace or ModelScope.
This function handles loading of various model types based on the task configuration, including sequence classification, token classification, masked language modeling, and causal language modeling.
Args:
    model_name: Model name or path
    task_config: Task configuration object containing task type and label information
    source: Source to load model and tokenizer from ('local', 'huggingface', 'modelscope'), default 'local'
    use_mirror: Whether to use HuggingFace mirror (hf-mirror.com), default False

Returns:
    Tuple containing (model, tokenizer)

Raises:
    ValueError: If model is not found locally or loading fails
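A sketch of a typical call; task_config stands in for the library's task configuration object, whose construction is documented elsewhere and is assumed here, and the model id is only an example.

```python
from dnallm.models.model import load_model_and_tokenizer

task_config = ...  # placeholder: a dnallm task configuration object built elsewhere

model, tokenizer = load_model_and_tokenizer(
    "zhihan1996/DNABERT-2-117M",  # example Hugging Face model id
    task_config,
    source="huggingface",
    use_mirror=False,  # set True to route downloads through hf-mirror.com
)
```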
load_preset_model ¶
load_preset_model(model_name, task_config)
Load a preset model and tokenizer based on the task configuration.
This function loads models from the preset model registry, which contains pre-configured models for various DNA analysis tasks.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | Name or path of the model | required |
| task_config | | Task configuration object containing task type and label information | required |

Returns:

| Type | Description |
|---|---|
| tuple[Any, Any] \| int | Tuple containing (model, tokenizer) if successful, 0 if model not found |

Note

If the model is not found in preset models, the function will print a warning and return 0. Use `load_model_and_tokenizer` for custom model loading.
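The Note above suggests a preset-first, custom-fallback pattern; a sketch, with the same assumed task_config placeholder as before:

```python
from dnallm.models.model import load_preset_model, load_model_and_tokenizer

task_config = ...  # placeholder: a dnallm task configuration object built elsewhere

result = load_preset_model("preset-model-name", task_config)
if result == 0:
    # Not a preset model: fall back to custom loading by name or path
    model, tokenizer = load_model_and_tokenizer(
        "path/or/hub-id", task_config, source="huggingface"
    )
else:
    model, tokenizer = result
```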
peft_forward_compatiable ¶
peft_forward_compatiable(model)
Convert the base model's forward method to be compatible with HF.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model | Any | Base model | required |

Returns:

| Type | Description |
|---|---|
| Any | Model with changed forward function |