Pydantic AI with Ollama and MCP Server
This notebook demonstrates how to use Pydantic AI with Ollama models and MCP servers for DNA sequence analysis.
In [1]:
import nest_asyncio
import asyncio
import time
nest_asyncio.apply()
In [2]:
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.ollama import OllamaProvider
from pydantic_ai.mcp import MCPServerStreamableHTTP
ollama_model = OpenAIChatModel(
    model_name='qwen3:latest',
    provider=OllamaProvider(base_url='http://localhost:11434/v1'),
)
In [3]:
# Create MCP server connection
server = MCPServerStreamableHTTP('http://localhost:8000/mcp')
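Both the Ollama endpoint and the MCP server are expected to be running locally before the agent is used. A minimal sketch of a pre-flight check follows; the helper name `endpoint_is_listening` is hypothetical, and it only tests that something accepts TCP connections on the port, not that the service behind it is healthy.

```python
import socket
from urllib.parse import urlparse


def endpoint_is_listening(url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == 'https' else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# The two local endpoints used in this notebook.
for url in ('http://localhost:11434/v1', 'http://localhost:8000/mcp'):
    state = 'listening' if endpoint_is_listening(url) else 'not reachable'
    print(f'{url}: {state}')
```

If either endpoint is reported as not reachable, starting that service first avoids a less obvious connection error later, when the agent context is entered.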
In [4]:
# Create agent with MCP server tools and proper system prompt
agent_ollama = Agent(
    ollama_model,
    toolsets=[server],
    system_prompt='''You are a DNA analysis assistant with access to specialized DNA analysis tools via MCP server.
When analyzing a DNA sequence, you should:
1. First call _list_loaded_models to see what models are available
2. Then call _dna_multi_model_predict with the DNA sequence and appropriate model names
3. Interpret and explain the results in a comprehensive way, not just only list the results
Available tools should include:
- _list_loaded_models: Lists available DNA analysis models
- _dna_multi_model_predict: Predicts DNA sequence properties using multiple models
Always use the tools to provide accurate analysis. Based on the returned results, make reasonable inferences with comprehensive biological functions of this sequence.'''
)
In [5]:
# Analyze DNA sequence using MCP server with proper async context
async def analyze_dna_sequence():
    async with agent_ollama:  # This ensures proper MCP server connection
        result = await agent_ollama.run(
            'What is the function of following DNA sequence? Please analyze it thoroughly using all available models: AGAAAAAACATGACAAGAAATCGATAATAATACAAAAGCTATGATGGTGTGCAATGTCCGTGTGCATGCGTGCACGCATTGCAACCGGCCCAAATCAAGGCCCATCGATCAGTGAATACTCATGGGCCGGCGGCCCACCACCGCTTCATCTCCTCCTCCGACGACGGGAGCACCCCCGCCGCATCGCCACCGACGAGGAGGAGGCCATTGCCGGCGGCGCCCCCGGTGAGCCGCTGCACCACGTCCCTGA'
        )
        return result
# Run the analysis
result = await analyze_dna_sequence()
time.sleep(3)
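Because the sequence is pasted directly into the prompt, a stray character or line break would be passed to the tools unnoticed. The sketch below normalizes and validates a sequence before it is embedded in the question. The helper name `clean_dna_sequence` is hypothetical, and the default alphabet `ACGT` is an assumption; the models behind the MCP server may accept other symbols such as `N`.

```python
def clean_dna_sequence(raw: str, alphabet: str = 'ACGT') -> str:
    """Strip whitespace, upper-case the input, and reject unexpected symbols."""
    seq = ''.join(raw.split()).upper()
    if not seq:
        raise ValueError('empty sequence')
    invalid = sorted(set(seq) - set(alphabet))
    if invalid:
        raise ValueError(f'unexpected characters in sequence: {invalid}')
    return seq


# Whitespace and lower-case letters are normalized away.
print(clean_dna_sequence('agct\n AGCT'))  # → AGCTAGCT
```

The cleaned string can then be interpolated into the prompt passed to `agent_ollama.run(...)` in place of the hard-coded sequence.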
In [6]:
# Display the comprehensive analysis result
print("=== DNA Sequence Analysis Result ===")
print(result.output)
print("\n=== Usage Statistics ===")
print(result.usage())
=== DNA Sequence Analysis Result ===
The analyzed DNA sequence exhibits characteristics of a **highly conserved regulatory region** with **promoter activity** and **open chromatin structure**, suggesting it plays a critical role in gene regulation. Here's the biological interpretation:

---

### **1. Promoter Activity (Core Promoter)**
- **Model**: `promoter_model`
- **Prediction**: Labeled as **"Core promoter"** with 94% confidence.
- **Biological Significance**:
  - This sequence is likely a **transcription start site (TSS)** or a **core promoter region** critical for initiating gene transcription.
  - Core promoters typically include elements like the **TATA box**, **Nucleosome Remodeling Domain (NRD)**, or **Inr motifs**.
  - The high score suggests this region is actively involved in **RNA polymerase II recruitment** and **cis-regulatory control**.

---

### **2. Evolutionary Conservation**
- **Model**: `conservation_model`
- **Prediction**: Labeled as **"Conserved"** with 92% confidence.
- **Biological Significance**:
  - The sequence is **evolutionarily conserved** across species, implying it is functionally important.
  - Conservation here indicates **selective pressure** to maintain this region, likely due to its role in **developmental regulation** or **housekeeping gene expression**.
  - Such regions are often part of **cis-regulatory modules (CRMs)** or **enhancer regions** that control gene expression patterns.

---

### **3. Open Chromatin Structure**
- **Model**: `open_chromatin_model`
- **Prediction**: Labeled as **"Full open"** with 94.6% confidence.
- **Biological Significance**:
  - The sequence corresponds to an **open chromatin region** (e.g., **enhancer**, **promoter**, or **regulatory island**).
  - Open chromatin is typically **accessible to transcription factors (TFs)** and **epigenetic modifiers**, enabling **gene activation**.
  - This suggests the region is in an **active transcriptional state** and may interact with **distal regulatory elements** or **enhancer-promoter loops**.

---

### **Integrated Biological Function**
The sequence is a **highly conserved regulatory element** with three key features:
1. **Promoter activity** (core promoter).
2. **Evolutionary conservation** (functional constraint).
3. **Open chromatin** (active regulatory state).

This combination strongly suggests it is a **cis-regulatory module** involved in:
- **Gene regulation** (e.g., enhancer-promoter interaction).
- **Developmental control** (due to conservation and open chromatin).
- **Transcriptional activation** (via open chromatin and promoter elements).

Such regions are critical for **fine-tuning gene expression** in response to environmental signals or developmental cues. Further experiments (e.g., chromatin immunoprecipitation, reporter assays) would validate its functional role.

=== Usage Statistics ===
RunUsage(input_tokens=9299, output_tokens=1922, requests=3, tool_calls=2)
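The usage line reports `tool_calls=2` but not which tools ran. One way to check is to walk the run's message history. The sketch below assumes that `result.all_messages()` returns messages whose `parts` carry a `part_kind` of `'tool-call'` and a `tool_name`, as in Pydantic AI's message types. The helper name `tool_calls_in` is hypothetical, and the demo objects are stand-ins so the block runs without a live agent; they are not output from this notebook's run.

```python
from types import SimpleNamespace


def tool_calls_in(messages) -> list[str]:
    """Collect tool names from message parts tagged as tool calls."""
    names = []
    for message in messages:
        for part in getattr(message, 'parts', []):
            if getattr(part, 'part_kind', None) == 'tool-call':
                names.append(part.tool_name)
    return names


# Stand-in messages that mimic the assumed shape (NOT real run output).
demo_messages = [
    SimpleNamespace(parts=[SimpleNamespace(part_kind='user-prompt')]),
    SimpleNamespace(parts=[SimpleNamespace(part_kind='tool-call', tool_name='tool_a')]),
    SimpleNamespace(parts=[SimpleNamespace(part_kind='tool-return', tool_name='tool_a')]),
    SimpleNamespace(parts=[SimpleNamespace(part_kind='tool-call', tool_name='tool_b')]),
    SimpleNamespace(parts=[SimpleNamespace(part_kind='text')]),
]
print(tool_calls_in(demo_messages))  # → ['tool_a', 'tool_b']
```

In the notebook, the same helper would be called as `tool_calls_in(result.all_messages())` to confirm that the agent followed the order requested in the system prompt.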
In [7]:
# Alternative approach: Test individual tool calls to debug
async def test_individual_tools():
    async with agent_ollama:
        # First, let's try to list available tools
        print("=== Testing tool availability ===")
        # Try to get the agent to use the list models tool
        list_result = await agent_ollama.run(
            'Please list all the available DNA analysis models using the _list_loaded_models tool.'
        )
        print("List models result:")
        print(list_result.output)
        return list_result
# Run individual tool test
list_result = await test_individual_tools()
=== Testing tool availability ===
List models result:
Here are the three DNA analysis models currently loaded on the server:

1. **Promoter Prediction Model**
   - **Task Type**: Binary classification
   - **Labels**: "Not promoter" vs "Core promoter"
   - **Performance**: 85% accuracy, 82% F1 score
   - **Architecture**: DNABERT (plant-specific)
   - **Memory Usage**: ~351.7MB parameters, 369.3MB total
   - **Use Case**: Identifies promoter regions in plant genomes

2. **Conservation Prediction Model**
   - **Task Type**: Binary classification
   - **Labels**: "Not conserved" vs "Conserved"
   - **Performance**: 88% accuracy, 85% F1 score
   - **Architecture**: DNABERT (plant-specific)
   - **Memory Usage**: Same as promoter model
   - **Use Case**: Detects conserved genomic regions across species

3. **Open Chromatin State Model**
   - **Task Type**: Multiclass classification
   - **Labels**: "Not open", "Full open", "Partial open"
   - **Performance**: 82% accuracy, 79% F1 score
   - **Architecture**: DNAMamba (plant-specific)
   - **Memory Usage**: Slightly higher at 368.8MB parameters
   - **Use Case**: Characterizes chromatin accessibility states

All models are optimized for plant genome analysis and use BPE tokenization. The DNAMamba architecture shows promising performance for chromatin state prediction. Would you like to analyze a specific DNA sequence with any of these models?