Model

dnallm.models.model

DNA Model loading and management utilities.

This module provides functions for downloading, loading, and managing DNA language models from various sources including Hugging Face Hub, ModelScope, and local storage.

Classes

BasicCNNHead

BasicCNNHead(
    input_dim,
    num_classes,
    task_type="binary",
    num_filters=128,
    kernel_sizes=None,
    dropout=0.2,
    **kwargs,
)

Bases: Module

A CNN-based head for processing Transformer output sequences. This head applies multiple 1D convolutional layers with different kernel sizes to capture local patterns in the sequence data, followed by a fully connected layer for classification or regression tasks.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_dim | int | Dimension of the input features | required |
| num_classes | int | Number of output classes (for classification tasks) | required |
| task_type | str | Type of task - 'binary', 'multiclass', 'multilabel', or 'regression' | 'binary' |
| num_filters | int | Number of convolutional filters applied for each kernel size | 128 |
| kernel_sizes | list \| None | Kernel sizes of the 1D convolutional layers | None |
| dropout | float | Dropout probability | 0.2 |
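
Example

A minimal usage sketch; the (batch, seq_len, hidden) input shape and the call convention of the forward pass are assumptions for illustration, not taken from this reference:

import torch
from dnallm.models.model import BasicCNNHead

# Hypothetical transformer output: 8 sequences, 512 tokens, 768-dim hidden states
hidden_states = torch.randn(8, 512, 768)

head = BasicCNNHead(input_dim=768, num_classes=2, task_type="binary", num_filters=128)
logits = head(hidden_states)  # assumed output shape: (8, num_classes)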

BasicLSTMHead

BasicLSTMHead(
    input_dim,
    num_classes,
    task_type="binary",
    hidden_size=256,
    num_layers=1,
    dropout=0.1,
    bidirectional=True,
    **kwargs,
)

Bases: Module

An LSTM-based head for processing Transformer output sequences. This head applies a multi-layer LSTM to capture sequential dependencies in the sequence data, followed by a fully connected layer for classification or regression tasks.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_dim | int | Dimension of the input features | required |
| num_classes | int | Number of output classes (for classification tasks) | required |
| task_type | str | Type of task - 'binary', 'multiclass', 'multilabel', or 'regression' | 'binary' |
| hidden_size | int | Number of features in the hidden state of the LSTM | 256 |
| num_layers | int | Number of recurrent layers in the LSTM | 1 |
| dropout | float | Dropout probability between LSTM layers | 0.1 |
| bidirectional | bool | Whether to use a bidirectional LSTM | True |
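
Example

A minimal sketch along the same lines, assuming the head consumes per-token transformer hidden states of shape (batch, seq_len, input_dim):

import torch
from dnallm.models.model import BasicLSTMHead

hidden_states = torch.randn(8, 512, 768)  # hypothetical transformer output

# Bidirectional LSTM over the token embeddings, then a classification layer
head = BasicLSTMHead(input_dim=768, num_classes=3, task_type="multiclass",
                     hidden_size=256, num_layers=2, bidirectional=True)
logits = head(hidden_states)  # assumed output shape: (8, num_classes)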

BasicMLPHead

BasicMLPHead(
    input_dim,
    num_classes=2,
    task_type="binary",
    hidden_dims=None,
    activation_fn="relu",
    use_normalization=True,
    norm_type="layernorm",
    dropout=0.1,
    **kwargs,
)

Bases: Module

A universal and customizable MLP model designed to be appended after the embedding output of models like Transformers to perform various downstream tasks such as classification and regression.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_dim | int | Dimension of the input features | required |
| num_classes | int | Number of output classes (for classification tasks) | 2 |
| task_type | str | Type of task - 'binary', 'multiclass', 'multilabel', or 'regression' | 'binary' |
| hidden_dims | list \| None | List of hidden layer dimensions | None |
| activation_fn | str | Activation function to use ('relu', 'gelu', 'silu', 'tanh', 'sigmoid') | 'relu' |
| use_normalization | bool | Whether to use normalization layers | True |
| norm_type | str | Type of normalization - 'batchnorm' or 'layernorm' | 'layernorm' |
| dropout | float | Dropout probability | 0.1 |
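
Example

A minimal sketch, assuming the head is applied to a pooled embedding of shape (batch, input_dim); the hidden layer sizes below are illustrative:

import torch
from dnallm.models.model import BasicMLPHead

pooled = torch.randn(16, 768)  # hypothetical pooled transformer embedding

head = BasicMLPHead(input_dim=768, num_classes=2, task_type="binary",
                    hidden_dims=[512, 128], activation_fn="gelu",
                    use_normalization=True, norm_type="layernorm", dropout=0.1)
logits = head(pooled)  # assumed output shape: (16, num_classes)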

BasicUNet1DHead

BasicUNet1DHead(
    input_dim,
    num_classes,
    task_type="binary",
    num_layers=2,
    initial_filters=64,
    **kwargs,
)

Bases: Module

A U-Net architecture adapted for 1D sequence data, suitable for classification and regression tasks. This model consists of an encoder-decoder structure with skip connections, allowing it to capture both local and global features in the inputs.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_dim | int | The number of input features (channels) in the inputs | required |
| num_classes | int | The number of output classes for the classification task | required |
| task_type | str | The type of task (e.g., 'binary' or 'multiclass') | 'binary' |
| num_layers | int | The number of downsampling/upsampling layers in the U-Net | 2 |
| initial_filters | int | The number of filters in the first convolutional layer | 64 |
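
Example

A minimal sketch; whether the head expects (batch, seq_len, input_dim) or (batch, input_dim, seq_len) inputs is not stated here, so the layout below is an assumption:

import torch
from dnallm.models.model import BasicUNet1DHead

hidden_states = torch.randn(4, 1024, 768)  # hypothetical per-token embeddings

head = BasicUNet1DHead(input_dim=768, num_classes=2, task_type="binary",
                       num_layers=2, initial_filters=64)
logits = head(hidden_states)  # assumed output shape: (4, num_classes)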

DNALLMforSequenceClassification

DNALLMforSequenceClassification(config, custom_model=None)

Bases: PreTrainedModel

An automated wrapper that selects an appropriate pooling strategy based on the underlying model architecture and appends a customizable MLP head for sequence classification or regression tasks.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config | | Model configuration object, including the settings used to build the classification head | required |
| custom_model | | Optional pre-loaded base model to wrap instead of instantiating one from the configuration | None |
Functions
from_base_model classmethod
from_base_model(model_name_or_path, config, module=None)

Handles weight transfer when loading the model from a pre-trained base model.
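
Example

A hedged sketch of building the wrapper from a pre-trained base model; the checkpoint path is a placeholder and the exact structure of the configuration object is not documented in this reference:

from dnallm.models.model import DNALLMforSequenceClassification

config = ...  # model/task configuration, including the classification head settings (structure assumed)
model = DNALLMforSequenceClassification.from_base_model(
    "path/to/pretrained-dna-model",  # placeholder checkpoint path
    config=config,
)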

DoubleConv

DoubleConv(in_channels, out_channels)

Bases: Module

(Convolution => [BatchNorm] => ReLU) * 2

MegaDNAMultiScaleHead

MegaDNAMultiScaleHead(
    embedding_dims=None,
    num_classes=2,
    task_type="binary",
    hidden_dims=None,
    dropout=0.2,
    **kwargs,
)

Bases: Module

A classification head tailored for the multi-scale embedding outputs of the MegaDNA model. It takes a list of embedding tensors, pools each tensor, and concatenates the results before passing them to an MLP for classification.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| embedding_dims | list \| None | A list of integers representing the dimensions of the input embeddings | None |
| num_classes | int | The number of output classes for classification | 2 |
| task_type | str | The type of task (e.g., 'binary' or 'multiclass') | 'binary' |
| hidden_dims | list \| None | A list of integers representing the sizes of hidden layers in the MLP | None |
| dropout | float | Dropout probability for regularization | 0.2 |
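
Example

A minimal sketch; the number of embedding scales and their shapes are illustrative assumptions:

import torch
from dnallm.models.model import MegaDNAMultiScaleHead

# Hypothetical multi-scale embeddings produced by MegaDNA (one tensor per scale)
embeddings = [torch.randn(4, 128, 512), torch.randn(4, 64, 256), torch.randn(4, 32, 128)]

head = MegaDNAMultiScaleHead(embedding_dims=[512, 256, 128], num_classes=2, task_type="binary")
logits = head(embeddings)  # assumed output shape: (4, num_classes)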

Functions

clear_model_cache

clear_model_cache(source='huggingface')

Remove all cached models for the given source.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| source | str | Source to clear the model cache from ('huggingface' or 'modelscope'), default 'huggingface' | 'huggingface' |
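
Example

A short usage sketch covering both supported cache sources:

from dnallm.models.model import clear_model_cache

# Remove models cached from the Hugging Face Hub (the default source)
clear_model_cache(source="huggingface")

# Or clear the ModelScope cache instead
clear_model_cache(source="modelscope")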

download_model

download_model(
    model_name, downloader, revision=None, max_try=10
)

Download a model with a retry mechanism for network issues.

In case of network issues, this function will attempt to download the model multiple times before giving up.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model_name | str | Name of the model to download | required |
| downloader | Any | Download function to use (e.g., snapshot_download) | required |
| revision | str \| None | Specific model revision to download, default None | None |
| max_try | int | Maximum number of download attempts, default 10 | 10 |

Returns:

| Type | Description |
| --- | --- |
| str | Path where the model files are stored |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If model download fails after all attempts |
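
Example

A sketch using huggingface_hub's snapshot_download as the downloader; the model id is a placeholder, not a real repository:

from huggingface_hub import snapshot_download
from dnallm.models.model import download_model

local_path = download_model(
    "some-org/some-dna-model",  # placeholder model id
    downloader=snapshot_download,
    max_try=5,
)
print(local_path)  # directory where the model files are stored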

is_flash_attention_capable

is_flash_attention_capable()

Check if Flash Attention has been installed.

Returns:

| Type | Description |
| --- | --- |
| bool | True if Flash Attention is installed and the device supports it, False otherwise |

is_fp8_capable

is_fp8_capable()

Check if the current CUDA device supports FP8 precision.

Returns:

| Type | Description |
| --- | --- |
| bool | True if the device supports FP8 (compute capability >= 9.0), False otherwise |
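
Example

A sketch of using the two capability checks to pick loading options; wiring these values into model loading is an assumption, not something this module prescribes:

from dnallm.models.model import is_flash_attention_capable, is_fp8_capable

attn_impl = "flash_attention_2" if is_flash_attention_capable() else "eager"
use_fp8 = is_fp8_capable()  # True only on GPUs with compute capability >= 9.0
print(attn_impl, use_fp8)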

load_model_and_tokenizer

load_model_and_tokenizer(
    model_name,
    task_config,
    source="local",
    use_mirror=False,
    revision=None,
)

Load model and tokenizer from either HuggingFace or ModelScope.

This function handles loading of various model types based on the task configuration, including sequence classification, token classification, masked language modeling, and causal language modeling.

Args:
    model_name: Model name or path
    task_config: Task configuration object containing task type and label information
    source: Source to load the model and tokenizer from ('local', 'huggingface', 'modelscope'), default 'local'
    use_mirror: Whether to use the HuggingFace mirror (hf-mirror.com), default False
    revision: Specific model revision to load, default None

Returns:
    Tuple containing (model, tokenizer)

Raises:
    ValueError: If model is not found locally or loading fails
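
Example

A sketch of loading a local model; the path is a placeholder and the construction of the task configuration object is not covered in this reference:

from dnallm.models.model import load_model_and_tokenizer

task_config = ...  # task configuration object (task type and label information), assumed to be built elsewhere
model, tokenizer = load_model_and_tokenizer(
    "path/to/local-dna-model",  # placeholder path
    task_config,
    source="local",
)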

load_preset_model

load_preset_model(model_name, task_config)

Load a preset model and tokenizer based on the task configuration.

This function loads models from the preset model registry, which contains pre-configured models for various DNA analysis tasks.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model_name | str | Name or path of the model | required |
| task_config | | Task configuration object containing task type and label information | required |

Returns:

| Type | Description |
| --- | --- |
| tuple[Any, Any] \| int | Tuple containing (model, tokenizer) if successful, 0 if model not found |

Note

If the model is not found in the preset models, the function will print a warning and return 0. Use the `load_model_and_tokenizer` function for custom model loading.
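
Example

A sketch of the documented fallback pattern: try the preset registry first and fall back to custom loading when 0 is returned; the model name is a placeholder:

from dnallm.models.model import load_preset_model, load_model_and_tokenizer

task_config = ...  # task configuration object, assumed to be built elsewhere
result = load_preset_model("some-preset-model", task_config)
if result == 0:
    # Not in the preset registry; load it as a custom model instead
    model, tokenizer = load_model_and_tokenizer("some-preset-model", task_config, source="huggingface")
else:
    model, tokenizer = result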

peft_forward_compatiable

peft_forward_compatiable(model)

Convert the base model's forward method to be compatible with Hugging Face (HF).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | Any | Base model | required |

Returns:

| Type | Description |
| --- | --- |
| Any | Model with a changed forward function |
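
Example

A sketch of patching a base model before wrapping it with a PEFT adapter; the LoRA settings are illustrative and the base model is assumed to be loaded already:

from peft import LoraConfig, get_peft_model
from dnallm.models.model import peft_forward_compatiable

base_model = ...  # an already-loaded base model
base_model = peft_forward_compatiable(base_model)  # make its forward HF-compatible
peft_model = get_peft_model(base_model, LoraConfig(r=8, lora_alpha=16))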