BioDSA is an extensible library of data science agents designed for biomedical research. It provides a unified framework for building, deploying, and extending AI agents that can analyze biomedical data, search knowledge graphs, and synthesize scientific evidence.
The library is designed to be modular: you can use the existing research agents for your biomedical analysis tasks, or extend the base agent classes to develop new specialized agents for your specific needs.
DSWizard (Data Science Wizard) is a two-phase agent designed for reliable biomedical data analysis. It operates by first creating a detailed analysis plan in natural language, then converting that plan into executable Python code.
Papers: "BioDSA-1K: Benchmarking Data Science Agents for Biomedical Research" and "Making large language models reliable data science programming copilots for biomedical research"
Documentation: biodsa/agents/dswizard/README.md
Tutorial: tutorials/dswizard_agent.ipynb
Key Features:
- Two-phase planning and implementation approach
- Dataset exploration before committing to analysis strategy
- Structured analysis plans with quality control steps
- Sandboxed Python code execution
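
The plan-then-implement idea behind these features can be pictured with a short, self-contained sketch. Everything in it is hypothetical: `AnalysisPlan`, `draft_plan`, and `plan_to_code` are illustrative names, not part of the BioDSA API, and the plan is hard-coded where DSWizard would consult a language model after exploring the dataset.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisPlan:
    """Phase 1 output: a natural-language plan (hypothetical structure)."""
    question: str
    steps: list[str] = field(default_factory=list)


def draft_plan(question: str) -> AnalysisPlan:
    # Stand-in for the planning phase; a real agent would ask an LLM
    # after exploring the dataset.
    return AnalysisPlan(
        question=question,
        steps=[
            "Load the clinical and mutation tables",
            "Quality control: drop patients with missing survival time",
            "Split patients into TP53 mutant and wild-type groups",
            "Compare survival between the two groups",
        ],
    )


def plan_to_code(plan: AnalysisPlan) -> str:
    # Stand-in for the implementation phase: each plan step becomes a
    # commented placeholder that generated code would fill in.
    lines = [f"# Question: {plan.question}"]
    for number, step in enumerate(plan.steps, start=1):
        lines.append(f"# Step {number}: {step}")
        lines.append(f"print('step {number} done')")
    return "\n".join(lines)


plan = draft_plan("Does TP53 status affect survival?")
code = plan_to_code(plan)
print(code)
```

Keeping the plan as a separate object means it can be reviewed, or checked for a quality control step, before any code runs.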
from biodsa.agents import DSWizardAgent
agent = DSWizardAgent(model_name="gpt-5", api_type="openai", api_key="...")
agent.register_workspace("./biomedical_data/cBioPortal/datasets/acbc_mskcc_2015")
results = agent.go("Perform survival analysis comparing TP53 mutant vs wild-type patients")

DeepEvidence is a hierarchical multi-agent system for comprehensive biomedical literature research and evidence synthesis. It leverages deep knowledge graph exploration to systematically gather, analyze, and synthesize evidence from 17+ biomedical knowledge bases.
Paper: "DeepEvidence: Empowering Biomedical Discovery with Deep Knowledge Graph Research"
Documentation: biodsa/agents/deepevidence/README.md
Tutorial: tutorials/deepevidence_agent.ipynb
Key Features:
- Orchestrator + BFS/DFS subagent architecture for multi-scale search
- Integration with PubMed, ChEMBL, ClinicalTrials.gov, Gene Ontology, and more
- Persistent evidence graph with entity and relationship tracking
- Interactive HTML visualization of discovered knowledge
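
To make the BFS/DFS distinction concrete, here is a minimal sketch on a toy graph. It assumes nothing about DeepEvidence internals: the `GRAPH` dictionary is made-up example data, and `bfs` and `dfs` are illustrative helpers, not BioDSA functions. Breadth-first search surveys everything within a fixed number of hops, while depth-first search follows one chain of relations to its end.

```python
from collections import deque

# Toy knowledge graph: entity -> related entities (made-up example data).
GRAPH = {
    "EGFR": ["gefitinib", "lung cancer"],
    "gefitinib": ["T790M"],
    "lung cancer": ["MET"],
    "T790M": ["osimertinib"],
    "MET": [],
    "osimertinib": [],
}


def bfs(start: str, max_depth: int) -> list[tuple[str, str]]:
    """Breadth-first: collect edges level by level up to max_depth hops."""
    edges, seen = [], {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for neighbor in GRAPH.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                edges.append((node, neighbor))
                queue.append((neighbor, depth + 1))
    return edges


def dfs(start: str) -> list[str]:
    """Depth-first: follow one chain of relations as far as it goes."""
    path, seen, stack = [], set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        path.append(node)
        # Reverse so the first-listed neighbor is explored first.
        stack.extend(reversed(GRAPH.get(node, [])))
    return path


broad = bfs("EGFR", max_depth=1)
deep = dfs("EGFR")
print(broad)
print(deep)
```

The edge list returned by `bfs` is the kind of entity and relationship record an evidence graph could persist.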
from biodsa.agents import DeepEvidenceAgent
agent = DeepEvidenceAgent(model_name="gpt-5", api_type="openai", api_key="...")
results = agent.go(
    "What are the mechanisms of resistance to EGFR inhibitors in lung cancer?",
    knowledge_bases=["pubmed_papers", "gene", "disease", "drug"]
)
results.export_evidence_graph_html("evidence_graph.html")

BioDSA provides extensible base agent classes that you can use directly or extend for custom applications:
Tutorial: tutorials/coder_agent.ipynb
CoderAgent is a direct code generation agent that writes and executes Python/R code in a sandboxed environment.
from biodsa.agents import CoderAgent
agent = CoderAgent(model_name="gpt-5", api_type="openai", api_key="...")
agent.register_workspace("./data")
results = agent.go("Create a bar plot of sample distribution")

ReactAgent is a ReAct-style (Reasoning + Acting) agent that uses tool calling for iterative problem solving.
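
The reason-act-observe cycle can be sketched in a few lines. This is an illustration of the general ReAct pattern, not BioDSA's implementation: the `TOOLS` registry, the scripted `SCRIPT` decisions that stand in for a language model, and the row counts are all made up.

```python
from typing import Callable

# Hypothetical tool registry; a real agent would expose data and code tools.
TOOLS: dict[str, Callable[[str], str]] = {
    "count_rows": lambda table: str({"mutations": 120, "patients": 12}[table]),
}

# Scripted stand-in for the language model: each entry is one
# (thought, action, argument) decision, ending with a "finish" action.
SCRIPT = [
    ("I need the number of mutations.", "count_rows", "mutations"),
    ("I need the number of patients.", "count_rows", "patients"),
    ("I can now answer.", "finish", "10 mutations per patient on average"),
]


def react_loop(max_steps: int = 5) -> tuple[str, list[str]]:
    """Alternate reasoning and tool calls until the model finishes."""
    trace: list[str] = []
    for thought, action, argument in SCRIPT[:max_steps]:
        if action == "finish":
            return argument, trace
        # Act: call the chosen tool, then record the observation so the
        # next reasoning step can use it.
        observation = TOOLS[action](argument)
        trace.append(f"{thought} {action}({argument}) -> {observation}")
    return "step limit reached", trace


answer, trace = react_loop()
print(answer)
for line in trace:
    print(line)
```

The step limit keeps the loop bounded when the model never emits a finishing action.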
from biodsa.agents import ReactAgent
agent = ReactAgent(model_name="gpt-5", api_type="openai", api_key="...")
agent.register_workspace("./data")
results = agent.go("Analyze the mutation patterns in the dataset")

- Added DeepEvidenceAgent for comprehensive biomedical literature research and evidence synthesis
- Implemented hierarchical multi-agent architecture with Orchestrator + BFS/DFS subagents
- Integrated 17+ biomedical knowledge bases: PubMed, ChEMBL, ClinicalTrials.gov, Gene Ontology, HPO, KEGG, NCBI, OpenFDA, Open Genes, ProteinAtlas, PubChem, PubTator, Reactome, StringDB, UMLS, UniProt, and more
- Developed unified modality-wise search tools for cross-knowledge-graph entity bridging
- Added persistent evidence graph with entity/relationship tracking and interactive HTML visualization
- Created DeepEvidence benchmark suite: Cohort Optimization, Safety Criteria, Dose Design, Drug Discovery, Drug Repurposing, Endpoint Selection, Evidence Gap Analysis, and Sample Size Estimation
- Restructured agent architecture for improved modularity and extensibility
- Added base agent implementations: CoderAgent, ReactAgent, DSWizardAgent
- Introduced sandboxed execution environment with Docker integration
- Implemented comprehensive tool wrapper system for PubMed, Clinical Trials, and code execution
- Added memory graph system with BM25 indexing for enhanced context management
- Integrated benchmarks: BioDSA-1K, BioDSBench, HLE-Medicine, LabBench, SuperGPQA, TrialPanorama
- Enhanced ExecutionResults API with PDF report generation, artifact management, and JSON export
- Included example cBioPortal datasets for quick start and testing
git clone https://github.com/your-org/BioDSA.git
cd BioDSA

We recommend using Python 3.12. Install dependencies using pipenv:
pip install pipenv
pipenv install
pipenv shell

Create a .env file with your API credentials:
# OpenAI
OPENAI_API_KEY=your_openai_api_key
# Azure OpenAI
AZURE_OPENAI_API_KEY=your_azure_key
AZURE_OPENAI_ENDPOINT=your_azure_endpoint
# Anthropic Claude
ANTHROPIC_API_KEY=your_anthropic_key
# Google Gemini
GOOGLE_API_KEY=your_google_key

import os
from biodsa.agents import CoderAgent
agent = CoderAgent(
    model_name="gpt-5",
    api_type="openai",
    api_key=os.environ.get("OPENAI_API_KEY")
)
agent.register_workspace("./biomedical_data/cBioPortal/datasets/acbc_mskcc_2015")
results = agent.go("Create a bar plot showing the distribution of samples per table")
print(results)
results.download_artifacts(output_dir="output_artifacts")
results.to_pdf(output_dir="reports")

BioDSA supports two execution modes for agent-generated code: Docker sandbox (recommended) and local execution (fallback).
| Mode | Docker Required | Security | Artifacts | Use Case |
|---|---|---|---|---|
| Docker Sandbox | Yes | Isolated container | Full support | Production, untrusted code |
| Local Execution | No | No process isolation | Limited (some download features unavailable) | Quick testing, trusted code |
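
Choosing between the two modes in the table comes down to whether a Docker daemon is reachable. How BioDSA makes that check is not described here, so the following is only a sketch of one plausible approach; `choose_execution_mode` is a hypothetical helper, not a BioDSA function.

```python
import shutil
import subprocess


def choose_execution_mode(timeout_s: float = 5.0) -> str:
    """Return "docker" if a Docker daemon answers, otherwise "local"."""
    if shutil.which("docker") is None:
        return "local"  # Docker CLI is not installed
    try:
        # `docker info` exits non-zero when the daemon is not reachable.
        completed = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "local"
    return "docker" if completed.returncode == 0 else "local"


mode = choose_execution_mode()
print(f"Execution mode: {mode}")
```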
If Docker is not available, agents automatically fall back to local execution mode:
- Code runs directly in your Python process using exec()
- Variables persist across executions within the same session
- Matplotlib plots are captured automatically
- Limitations:
  - No process isolation: code has full access to your system
  - Generated files are saved to your current working directory
  - Some artifact download features (e.g., download_artifacts()) are unavailable
  - Not recommended for untrusted or LLM-generated code in production
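
The behavior described above, `exec()` in the current process with variables that persist between runs, can be reproduced with a small sketch. `LocalExecutor` is a hypothetical class written for illustration; it is not BioDSA's executor and omits plot capture.

```python
import contextlib
import io


class LocalExecutor:
    """Hypothetical in-process executor with a persistent namespace."""

    def __init__(self) -> None:
        self.namespace: dict = {}

    def run(self, code: str) -> str:
        # Capture anything the snippet prints and return it as text.
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            # No isolation: the snippet runs with this process's privileges.
            exec(code, self.namespace)
        return buffer.getvalue()


executor = LocalExecutor()
executor.run("counts = {'mutant': 3, 'wild_type': 7}")
output = executor.run("print(sum(counts.values()))")
print(output)
```

Because both snippets share one namespace dictionary, the second can read `counts` defined by the first. The same mechanism is why local execution gives generated code full access to the host.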
For secure, isolated code execution, set up the Docker sandbox:
- Install Docker Desktop or Docker Engine
- Ensure the Docker daemon is running

Build the sandbox image:
cd biodsa_env/python_sandbox
./build_sandbox.sh

This builds a Docker image, biodsa-sandbox-py:latest, with:
- Python 3.12
- Data science libraries (pandas, matplotlib, seaborn, scikit-learn, etc.)
- Statistical analysis tools (statsmodels, lifelines, etc.)
Monitor build progress:
tail -f biodsa_env/python_sandbox/build.log

Verify installation:
docker images | grep biodsa-sandbox-py

BioDSA includes comprehensive benchmark datasets for evaluating agent performance on biomedical data science tasks.
Full Documentation: benchmarks/README.md
Available Benchmarks:
| Benchmark | Description | Tasks |
|---|---|---|
| BioDSA-1K | Hypothesis validation from real biomedical studies | 1,029 |
| BioDSBench-Python | Python coding tasks for biomedical analysis | 128 |
| BioDSBench-R | R coding tasks for biomedical analysis | 165 |
| DeepEvidence | Evidence synthesis and knowledge graph research | Multiple task types |
| HLE-Biomedicine | Hard biomedical reasoning questions | 102 |
| HLE-Medicine | Hard medical reasoning questions | 30 |
| LabBench | Literature QA and database QA | 75 |
| SuperGPQA | Expert-level biology and medicine questions | 264 |
The biomedical_data/cBioPortal/ directory contains example datasets from cBioPortal:
biomedical_data/cBioPortal/datasets/acbc_mskcc_2015/
├── data_clinical_patient.csv
├── data_clinical_sample.csv
├── data_mutations.csv
├── data_cna.csv
├── data_sv.csv
├── data_gene_panel_matrix.csv
├── available_table_paths.json
└── LICENSE
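
Before handing a dataset directory like this to an agent, it can help to check what the tables contain. The sketch below counts data rows per CSV file using only the standard library. `rows_per_table` is an illustrative helper, and the demo builds a tiny stand-in directory with made-up columns so that it runs without the real files.

```python
import csv
import tempfile
from pathlib import Path


def rows_per_table(dataset_dir: Path) -> dict[str, int]:
    """Count data rows (excluding the header) in each CSV table."""
    counts = {}
    for csv_path in sorted(dataset_dir.glob("*.csv")):
        with csv_path.open(newline="") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the header row
            counts[csv_path.stem] = sum(1 for _ in reader)
    return counts


# Build a tiny stand-in dataset so the sketch runs without the real files.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "data_clinical_patient.csv").write_text(
        "PATIENT_ID,AGE\nP1,54\nP2,61\n"
    )
    (demo_dir / "data_mutations.csv").write_text(
        "PATIENT_ID,GENE\nP1,TP53\nP1,KRAS\nP2,TP53\n"
    )
    table_sizes = rows_per_table(demo_dir)

print(table_sizes)
```

Pointing `rows_per_table` at the example dataset directory instead of the temporary one would report the sizes of the real tables.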
Download more datasets:
- Portal: https://www.cbioportal.org/datasets
- DataHub: https://github.com/cBioPortal/datahub
agent = CoderAgent(
    model_name="gpt-5",
    api_type="openai",
    api_key=os.environ.get("OPENAI_API_KEY"),
    sandbox_image="biodsa-sandbox-py:latest",
    workdir="/workdir"
)

results = agent.go("Your task")
# Access components
print(f"Messages: {len(results.message_history)}")
print(f"Code executions: {len(results.code_execution_results)}")
print(f"Final response: {results.final_response}")
# Export
results.to_json("results.json")
results.to_pdf(output_dir="reports")
results.download_artifacts(output_dir="outputs")
# Resource monitoring
for execution in results.code_execution_results:
    print(f"Runtime: {execution.get('running_time')}s")
    print(f"Peak memory: {execution.get('peak_memory_mb')}MB")

agent.clear_workspace()  # Clear workspace and stop sandbox

Check the scripts/ directory for complete examples:
python scripts/run_coder_agent.py
python scripts/run_dswizard_agent.py
python scripts/run_deepevidence_agent.py
python scripts/run_react_agent.py

If you use BioDSA in your research, please cite our papers:
@article{wang2026deepevidence,
title={DeepEvidence: Empowering Biomedical Discovery with Deep Knowledge Graph Research},
author={Wang, Zifeng and Chen, Zheng and Yang, Ziwei and Wang, Xuan and Jin, Qiao and Peng, Yifan and Lu, Zhiyong and Sun, Jimeng},
journal={arXiv preprint arXiv:2601.11560},
year={2026}
}
@article{wang2025biodsa1k,
title={BioDSA-1K: Benchmarking Data Science Agents for Biomedical Research},
author={Wang, Zifeng and Danek, Benjamin and Sun, Jimeng},
journal={arXiv preprint arXiv:2505.16100},
year={2025}
}
@article{wang2026reliable,
title = {Making large language models reliable data science programming copilots for biomedical research},
author = {Wang, Zifeng and Danek, Benjamin and Yang, Ziwei and Chen, Zheng and Sun, Jimeng},
journal = {Nature Biomedical Engineering},
year = {2026},
doi = {10.1038/s41551-025-01587-2},
}
This project is licensed under the terms specified in the LICENSE file.