Model¶
dnallm.models.model ¶
DNA Model loading and management utilities.
This module provides functions for downloading, loading, and managing DNA language models from various sources including Hugging Face Hub, ModelScope, and local storage.
Classes¶
DNALLMforSequenceClassification ¶
DNALLMforSequenceClassification(config, custom_model=None)
Bases: PreTrainedModel
An automated wrapper that selects an appropriate pooling strategy based on the underlying model architecture and appends a customizable MLP head for sequence classification or regression tasks.
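The idea of choosing a pooling strategy per architecture can be illustrated with a small, dependency-free sketch. It is not the class's actual implementation: the `"encoder"`/`"decoder"` labels, the `POOLING_BY_KIND` mapping, and the helper names are hypothetical, and the real class works on model tensors rather than Python lists. In this sketch, an MLP head would then take the pooled vector as its input.

```python
# Illustrative sketch only: hypothetical names, plain lists instead of tensors.

def mean_pool(hidden, mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(hidden[0])
    kept = [vec for vec, m in zip(hidden, mask) if m]
    if not kept:
        return [0.0] * dim
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def last_token_pool(hidden, mask):
    """Return the vector of the last non-padded token (assumes right padding)."""
    last = max(i for i, m in enumerate(mask) if m)
    return hidden[last]


# Assumption: bidirectional encoders are mean-pooled, causal decoders use
# the last real token. The library may choose differently.
POOLING_BY_KIND = {"encoder": mean_pool, "decoder": last_token_pool}


def pool(hidden, mask, kind):
    """Pick the pooling function for the given architecture kind and apply it."""
    return POOLING_BY_KIND[kind](hidden, mask)


# Three token vectors of size 2; the third token is padding.
hidden = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

encoder_vec = pool(hidden, mask, "encoder")
decoder_vec = pool(hidden, mask, "decoder")
```

Both strategies ignore the padded position, which is why the attention mask is passed alongside the hidden states.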
Functions¶
clear_model_cache ¶
clear_model_cache(source='huggingface')
Remove all cached models for the given source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source` | `str` | Source whose model cache is cleared (`'huggingface'` or `'modelscope'`) | `'huggingface'` |
download_model ¶
download_model(
model_name, downloader, revision=None, max_try=10
)
Download a model, retrying on network failures.
If a download attempt fails because of a network issue, the function retries up to `max_try` times before giving up.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_name` | `str` | Name of the model to download | required |
| `downloader` | `Any` | Download function to use (e.g., `snapshot_download`) | required |
| `max_try` | `int` | Maximum number of download attempts | `10` |
Returns:
| Type | Description |
|---|---|
| `str` | Path where the model files are stored |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If model download fails after all attempts |
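The retry behaviour described for `download_model` can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: `download_with_retry`, the `delay` parameter, and the stub `flaky_downloader` are hypothetical, and the sketch assumes the downloader accepts the model name positionally plus a `revision` keyword.

```python
import time


def download_with_retry(model_name, downloader, revision=None, max_try=10, delay=0.0):
    """Call `downloader` until it succeeds or `max_try` attempts are used up."""
    last_error = None
    for attempt in range(1, max_try + 1):
        try:
            # Assumption: the downloader returns the local path of the files.
            return downloader(model_name, revision=revision)
        except Exception as exc:  # a real implementation may catch narrower errors
            last_error = exc
            if attempt < max_try:
                time.sleep(delay)  # optional pause between attempts
    raise ValueError(
        f"Failed to download {model_name!r} after {max_try} attempts"
    ) from last_error


# Hypothetical stand-in for a real download function: fails twice, then succeeds.
calls = {"n": 0}


def flaky_downloader(model_name, revision=None):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated network failure")
    return f"/tmp/models/{model_name}"


path = download_with_retry("demo-model", flaky_downloader, max_try=5)
```

Chaining the final `ValueError` to the last underlying exception keeps the original network error visible in the traceback.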
load_model_and_tokenizer ¶
load_model_and_tokenizer(
model_name,
task_config,
source="local",
use_mirror=False,
revision=None,
custom_tokenizer=None,
)
Load a model and tokenizer from local storage, HuggingFace, or ModelScope.
This function handles loading of various model types based on the task configuration, including sequence classification, token classification, masked language modeling, and causal language modeling.
Args:
    model_name: Model name or path
    task_config: Task configuration object containing task type and label information
    source: Source to load model and tokenizer from ('local', 'huggingface', 'modelscope'), default 'local'
    use_mirror: Whether to use the HuggingFace mirror (hf-mirror.com), default False
Returns:
    Tuple containing (model, tokenizer)
Raises:
    ValueError: If model is not found locally or loading fails
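How the `source` and `use_mirror` arguments might interact can be sketched without loading any model. The helper below is hypothetical (`plan_model_load`, `VALID_SOURCES`, and the returned dictionary are not part of the library); it only mirrors the documented contract: three accepted sources, a `ValueError` for a missing local path, and an optional hf-mirror.com endpoint that is assumed here to apply only when loading from HuggingFace.

```python
import os
import tempfile

VALID_SOURCES = ("local", "huggingface", "modelscope")
HF_MIRROR_ENDPOINT = "https://hf-mirror.com"


def plan_model_load(model_name, source="local", use_mirror=False):
    """Validate the arguments and describe where the model would be loaded from."""
    if source not in VALID_SOURCES:
        raise ValueError(f"Unknown source {source!r}; expected one of {VALID_SOURCES}")
    if source == "local" and not os.path.exists(model_name):
        raise ValueError(f"Model not found locally: {model_name}")
    # Assumption: the mirror only matters for the HuggingFace source.
    endpoint = HF_MIRROR_ENDPOINT if (source == "huggingface" and use_mirror) else None
    return {"model_name": model_name, "source": source, "endpoint": endpoint}


# A temporary directory stands in for a local model folder.
with tempfile.TemporaryDirectory() as local_dir:
    local_plan = plan_model_load(local_dir)

mirror_plan = plan_model_load("org/demo-model", source="huggingface", use_mirror=True)
```

Validating the source and the local path up front lets a caller fail fast before any network access or weight loading starts.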
load_preset_model ¶
load_preset_model(model_name, task_config)
Load a preset model and tokenizer based on the task configuration.
This function loads models from the preset model registry, which contains pre-configured models for various DNA analysis tasks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_name` | `str` | Name or path of the model | required |
| `task_config` | | Task configuration object containing task type and label information | required |
Returns:
| Type | Description |
|---|---|
| `tuple[Any, Any] \| int` | Tuple containing (model, tokenizer) if successful, 0 if model not found |
Note
If the model is not found in the preset models, the function prints a warning and returns 0. Use the `load_model_and_tokenizer` function for custom model loading.
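Because the function returns either a tuple or `0`, callers need to check the result before unpacking it. The pattern below is a sketch of one way to do that with a fallback loader. `load_with_fallback` and the two stub loaders are hypothetical and stand in for `load_preset_model` and `load_model_and_tokenizer`, whose real signatures take additional arguments.

```python
def load_with_fallback(model_name, task_config, preset_loader, custom_loader):
    """Try the preset registry first; fall back to a custom loader on a miss."""
    result = preset_loader(model_name, task_config)
    if isinstance(result, tuple):
        return result  # (model, tokenizer) from the preset registry
    # The preset loader signalled "not found" (documented as returning 0).
    return custom_loader(model_name, task_config)


# Hypothetical stubs that mimic the documented return contract.
def fake_preset_loader(model_name, task_config):
    if model_name == "known-preset":
        return ("preset-model", "preset-tokenizer")
    return 0


def fake_custom_loader(model_name, task_config):
    return ("custom-model", "custom-tokenizer")


hit = load_with_fallback("known-preset", None, fake_preset_loader, fake_custom_loader)
miss = load_with_fallback("my/local/model", None, fake_preset_loader, fake_custom_loader)
```

Checking for a tuple rather than comparing against `0` avoids accidentally unpacking an integer when the preset lookup misses.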
peft_forward_compatiable ¶
peft_forward_compatiable(model)
Convert the base model's forward method to be compatible with HF.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Any` | Base model | required |
Returns:
| Type | Description |
|---|---|
| `Any` | Model with changed forward function |
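The reference does not say how the forward function is changed. One plausible adaptation, shown here purely as an assumption, is to wrap `forward` so that extra keyword arguments passed by HF-style callers are dropped when the base model does not accept them. `make_forward_tolerant` and `TinyModel` are hypothetical names, and a plain Python class stands in for a real model.

```python
import inspect


def make_forward_tolerant(model):
    """Replace model.forward with a wrapper that ignores unsupported keyword arguments."""
    original_forward = model.forward
    params = inspect.signature(original_forward).parameters
    accepts_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )

    def forward(*args, **kwargs):
        if not accepts_var_kwargs:
            # Keep only the keyword arguments the original forward declares.
            kwargs = {k: v for k, v in kwargs.items() if k in params}
        return original_forward(*args, **kwargs)

    model.forward = forward  # instance-level override; the class is untouched
    return model


class TinyModel:
    """Stand-in for a base model whose forward has a narrow signature."""

    def forward(self, input_ids, attention_mask=None):
        return {"n_tokens": len(input_ids), "masked": attention_mask is not None}


model = make_forward_tolerant(TinyModel())
output = model.forward(
    input_ids=[1, 2, 3],
    attention_mask=[1, 1, 1],
    labels=[0],                 # would raise TypeError without the wrapper
    output_hidden_states=True,  # likewise dropped
)
```

The wrapper returns the same model object with a changed forward function, matching the documented return value in shape if not necessarily in behaviour.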