Task
dnallm.tasks.task
DNA Language Model Fine-tuning Task Definition Module.
This module defines various task types and related components supported by DNA language models during fine-tuning, including:
- TaskType: Task type enumeration
    - Binary classification (BINARY): e.g., promoter prediction, enhancer identification
    - Multi-class classification (MULTICLASS): e.g., protein family classification, functional region classification
    - Regression (REGRESSION): e.g., expression level prediction, binding strength prediction
    - Token classification (NER): Named Entity Recognition tasks
    - Generation and embedding tasks for different model architectures
- TaskConfig: Task configuration class
    - Configures task type, number of labels, label names, etc.
    - Provides threshold settings for binary classification tasks
- TaskHead: Task-specific prediction heads
    - Provides specialized neural network layers for different task types
    - Supports feature dimensionality reduction and dropout to prevent overfitting
    - Automatically selects output dimensions based on task type
- compute_metrics: Evaluation metric computation
    - Binary: accuracy, F1 score
    - Multi-class: accuracy, macro F1, weighted F1
    - Regression: mean squared error, R-squared value
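Usage example:

    from dnallm.tasks.task import TaskConfig, TaskType

    # Configure a two-class task with human-readable label names
    task_config = TaskConfig(
        task_type=TaskType.BINARY,
        num_labels=2,
        label_names=["negative", "positive"],
    )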
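The metrics listed for `compute_metrics` in the module overview can be illustrated in plain Python. The sketch below is not the module's implementation: the helper names (`accuracy`, `f1_for_label`, `macro_and_weighted_f1`, `mse_and_r2`) are hypothetical, and the code only shows what each reported number means under the usual textbook definitions.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1 score with `label` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    # Assumption: return 0.0 when the class never appears in either list.
    return 2 * tp / denom if denom else 0.0


def macro_and_weighted_f1(y_true, y_pred):
    """Return (macro F1, weighted F1) over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per label
    scores = {lab: f1_for_label(y_true, y_pred, lab) for lab in labels}
    macro = sum(scores.values()) / len(labels)
    weighted = sum(scores[lab] * support[lab] for lab in labels) / len(y_true)
    return macro, weighted


def mse_and_r2(y_true, y_pred):
    """Return (mean squared error, R-squared) for regression targets."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # Assumption: report 0.0 for R-squared when the targets are constant.
    r2 = 1 - ss_res / ss_tot if ss_tot else 0.0
    return ss_res / n, r2
```

Macro F1 averages the per-class scores equally, while weighted F1 scales each class by its number of true examples, so the two diverge when classes are imbalanced.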
Classes
TaskConfig
Bases: BaseModel
Configuration class for different fine-tuning tasks.
This class provides a structured way to configure task-specific parameters including task type, number of labels, label names, and classification thresholds.
Attributes:
| Name | Type | Description |
|---|---|---|
| `task_type` | `str` | Type of task to perform (must match regex pattern) |
| `num_labels` | `int` | Number of output labels/classes |
| `label_names` | `list \| None` | List of label names for classification tasks |
| `threshold` | `float` | Classification threshold for binary and multi-label tasks |
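To make the role of `threshold` concrete, here is a minimal sketch of how a classification threshold is commonly applied to predicted probabilities. The function names, the 0.5 default, and the `>=` comparison are assumptions of this example, not values confirmed for `TaskConfig`.

```python
def apply_threshold(probabilities, threshold=0.5):
    """Map positive-class probabilities to 0/1 labels.

    Assumption: a probability equal to the threshold counts as positive.
    """
    return [int(p >= threshold) for p in probabilities]


def apply_threshold_multilabel(rows, threshold=0.5):
    """Apply the same threshold independently to every label of every sample."""
    return [apply_threshold(row, threshold) for row in rows]
```

Raising the threshold trades recall for precision: with `threshold=0.7`, a probability of 0.6 is no longer labeled positive.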
TaskType
Bases: Enum
Enum for supported task types in DNALLM.
This enum defines the various tasks that can be performed with DNA language models, including classification, regression, and generation tasks.
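The `TaskConfig` attributes list `task_type` as a `str` that must match a regex pattern, while `TaskType` is an `Enum`. One way the two could line up is sketched below. `SketchTaskType`, its string values, and `is_valid_task_type` are hypothetical stand-ins: only the member names BINARY, MULTICLASS, REGRESSION, and NER come from the module overview, and the real enum values and pattern may differ.

```python
import re
from enum import Enum


class SketchTaskType(Enum):
    """Hypothetical stand-in; the string values are assumed, not DNALLM's."""

    BINARY = "binary"
    MULTICLASS = "multiclass"
    REGRESSION = "regression"
    NER = "ner"


# Build a pattern that accepts exactly the enum's string values.
TASK_TYPE_PATTERN = re.compile(
    "|".join(re.escape(member.value) for member in SketchTaskType)
)


def is_valid_task_type(name: str) -> bool:
    """Return True when `name` is one of the enum's string values."""
    return TASK_TYPE_PATTERN.fullmatch(name) is not None
```

Deriving the pattern from the enum keeps the two in sync: adding a member automatically widens the set of accepted strings.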