Physician · Clinical Data Scientist · Senior Statistical Programmer
Bridging healthcare and code — R, CDISC, OMOP CDM, and AI-driven pipelines for Real-World Evidence.
🌐 metahealthinfo.com · 💼 LinkedIn
I'm an MD who programs. I work at the seam between clinical trials, regulatory data standards, and applied AI — turning messy clinical source documents into submission-grade, reproducible R pipelines.
- 🧬 Clinical R / pharmaverse: moving teams from SAS to admiral, cards, Tplyr, metacore, xportr.
- 📐 CDISC standards: SDTM, ADaM, and the Analysis Results Standard (ARS); annotated TLF shells → ARS JSON → ARD.
- 🏥 OHDSI / OMOP CDM: phenotype curation, MIMIC-IV ETL, LLM-assisted concept-set benchmarking.
- 🤖 Clinical AI: LLM pipelines for spec generation, metadata enrichment, and phenotype discovery.
| Repo | What it is |
|---|---|
| pharmaverse-tutorials | 46 interactive learnr tutorials (712 live exercises) for the SAS→pharmaverse transition, built on real CDISCPILOT01 data. |
| arsbridge | R package: parse/validate/execute CDISC ARS specs into tidy ARD via {cards}, with multi-LLM metadata enrichment. |
| ars-learnr-tutorial | 7-chapter hands-on course: annotated TLF shells → ARM-TS JSON on pharmaverse datasets. |
| cards-tutorial | 10-chapter {cards}/{cardx} course — ARD, model tidying, ARS JSON mapping. |
| omop-phenotype-pipeline | Benchmarking LLM-assisted OMOP phenotype curation — concept- vs patient-level F1 on MIMIC-IV. |
| precise-X 🔒 (private) | Lead statistical programmer — built and validated the Cox PH + LASSO survival model predicting first severe COPD exacerbation within 5 years from UK primary-care records. Published in Thorax (2025). 📄 Read the paper |
R · pharmaverse (admiral, cards, Tplyr, metacore, xportr, teal) · SAS · SQL / PostgreSQL · Python · OMOP CDM · MIMIC-IV · CDISC SDTM / ADaM / ARS · Docker · LLM pipelines (Claude, Gemini)
Open to clinical-R, RWE, and health-AI collaboration. Reach out via any repo discussion or my site.
