Task

dnallm.tasks.task

DNA Language Model Fine-tuning Task Definition Module.

This module defines various task types and related components supported by DNA language models during fine-tuning, including:

  1. TaskType: Task type enumeration
     - Binary classification (BINARY): e.g., promoter prediction, enhancer identification
     - Multi-class classification (MULTICLASS): e.g., protein family classification, functional region classification
     - Regression (REGRESSION): e.g., expression level prediction, binding strength prediction
     - Token classification (NER): Named Entity Recognition tasks
     - Generation and embedding tasks for different model architectures

  2. TaskConfig: Task configuration class
     - Configures task type, number of labels, label names, etc.
     - Provides threshold settings for binary classification tasks

  3. TaskHead: Task-specific prediction heads
     - Provides specialized neural network layers for different task types
     - Supports feature dimensionality reduction and dropout to prevent overfitting
     - Automatically selects output dimensions based on task type

  4. compute_metrics: Evaluation metric computation
     - Binary: accuracy, F1 score
     - Multi-class: accuracy, macro F1, weighted F1
     - Regression: mean squared error, R-squared value

Usage example:

    task_config = TaskConfig(
        task_type=TaskType.BINARY,
        num_labels=2,
        label_names=["negative", "positive"],
    )
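
The metrics listed for compute_metrics can be illustrated with a small pure-Python sketch. This is not the DNALLM implementation; the function names below are hypothetical and chosen only to show how accuracy, per-class and macro F1, mean squared error, and R-squared are conventionally defined.

```python
# Illustrative definitions only; none of these names come from dnallm.

def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0.0 when the class never appears.
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all observed classes."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; assumes the targets are not all identical."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

labels = [1, 0, 1, 1]
preds = [1, 0, 0, 1]
print(accuracy(labels, preds), f1_for_class(labels, preds, 1))
```

Weighted F1 follows the same pattern as macro F1, except that each class's score is weighted by its number of true instances before averaging.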

Classes

TaskConfig

Bases: BaseModel

Configuration class for different fine-tuning tasks.

This class provides a structured way to configure task-specific parameters including task type, number of labels, label names, and classification thresholds.

Attributes:

Name         Type         Description
task_type    str          Type of task to perform (must match regex pattern)
num_labels   int          Number of output labels/classes
label_names  list | None  List of label names for classification tasks
threshold    float        Classification threshold for binary and multi-label tasks
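
As a rough illustration of how a classification threshold is typically used, the sketch below turns per-label probabilities into 0/1 predictions. The helper name `apply_threshold`, the default of 0.5, and the `>=` comparison are assumptions made for this example; they are not taken from DNALLM.

```python
def apply_threshold(probs: list[float], threshold: float = 0.5) -> list[int]:
    """Map probabilities to 0/1 labels.

    Assumed convention: a probability at or above the threshold is positive.
    """
    return [1 if p >= threshold else 0 for p in probs]

# Raising the threshold makes positive predictions more conservative.
print(apply_threshold([0.1, 0.5, 0.9]))
print(apply_threshold([0.1, 0.5, 0.9], threshold=0.7))
```

For a binary task the list would hold one probability per sequence; for a multi-label task it would hold one probability per label.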

Functions
model_post_init
model_post_init(__context)

Initialize task configuration after model validation.

This method is called after Pydantic model validation and automatically sets appropriate default values based on task type.
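
The post-validation hook described above can be sketched with a standard-library stand-in. The example uses a dataclass and `__post_init__` in place of Pydantic's `BaseModel` and `model_post_init`, so that it runs without extra dependencies. The class name `TaskConfigSketch`, the optional `num_labels`, and the specific defaults (2 labels for binary, 1 for regression, generated `class_N` names) are all hypothetical; the actual defaults set by DNALLM are not stated here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TaskConfigSketch:
    """Hypothetical stand-in for TaskConfig, not the dnallm class."""
    task_type: str
    num_labels: Optional[int] = None
    label_names: Optional[list] = None
    threshold: float = 0.5

    def __post_init__(self):
        # Dataclass analogue of a Pydantic model_post_init hook:
        # runs once the fields are set and fills in missing values.
        assumed_defaults = {"binary": 2, "regression": 1}  # assumption
        if self.num_labels is None:
            if self.label_names:
                self.num_labels = len(self.label_names)
            elif self.task_type in assumed_defaults:
                self.num_labels = assumed_defaults[self.task_type]
            else:
                raise ValueError("num_labels or label_names is required")
        # Generate placeholder names for classification tasks only.
        if self.label_names is None and self.task_type != "regression":
            self.label_names = [f"class_{i}" for i in range(self.num_labels)]

cfg = TaskConfigSketch(task_type="binary")
print(cfg.num_labels, cfg.label_names)
```

Deriving defaults in a post-init hook keeps call sites short while still letting callers override any field explicitly.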

TaskType

Bases: Enum

Enum for supported task types in DNALLM.

This enum defines the various tasks that can be performed with DNA language models, including classification, regression, and generation tasks.
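
A minimal sketch of such an enum is shown below. The member names BINARY, MULTICLASS, REGRESSION, and NER come from the module description; the string values, the class name `TaskTypeSketch`, and the helper `is_classification` are assumptions for illustration, and the generation and embedding members are omitted because their names are not given here.

```python
from enum import Enum

class TaskTypeSketch(Enum):
    """Hypothetical stand-in for TaskType; string values are assumed."""
    BINARY = "binary"
    MULTICLASS = "multiclass"
    REGRESSION = "regression"
    NER = "ner"

def is_classification(task: TaskTypeSketch) -> bool:
    """Return True for the sequence-level classification task types."""
    return task in (TaskTypeSketch.BINARY, TaskTypeSketch.MULTICLASS)

# Enum members can be looked up from their string value.
task = TaskTypeSketch("binary")
print(task.name, is_classification(task))
```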