End-to-end CDISC SDTM → ADaM → TLF submission package, replicating the FDA-reviewed 2007 CDISC Pilot Alzheimer's study (CDISCPILOT01). Built in parallel in R ({admiral} + pharmaverse) and SAS 9.4, with `PROC COMPARE` (R↔SAS equivalence) and Pinnacle 21 Community (ADaMIG 1.3) validation harnesses wired into the build.

- CDISC ADaM dataset generation - ADSL, ADAE, ADLB, ADTTE, ADQSADAS, all conformant to ADaMIG 1.3
- Dual implementation (R + SAS) with a `PROC COMPARE` harness for R↔SAS numerical equivalence (criterion=1e-9)
- Submission-quality TLFs - 4 tables, 2 listings, 2 figures (`{rtables}`, `{tern}`, `{rlistings}`, `survminer`, `ggplot2`)
- MMRM primary efficacy analysis with ICH E9(R1) estimand specification
- `define.xml` v2.0 generated from variable-level metadata Excel specs
- Reproducibility - `renv` lockfile, GitHub Actions CI, end-to-end in ~20 seconds
- Regulatory writing - mock Statistical Analysis Plan + Analysis Data Reviewer's Guide (PHUSE template)

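
The `criterion=1e-9` equivalence check can be illustrated outside SAS. The following is a minimal Python sketch, assuming a symmetric relative-difference test similar in spirit to `PROC COMPARE` with `METHOD=RELATIVE`; which comparison method `sas/compare_r_sas.sas` actually uses is not stated here, and the function names `values_equal` and `compare_columns` are hypothetical.

```python
import math

CRITERION = 1e-9  # tolerance quoted in this README


def values_equal(x, y, criterion=CRITERION):
    """Return True when x and y agree within a relative tolerance.

    Assumption: symmetric relative difference |x - y| / ((|x| + |y|) / 2).
    Two missing values (None or NaN) count as equal; one missing does not.
    """
    x_missing = x is None or (isinstance(x, float) and math.isnan(x))
    y_missing = y is None or (isinstance(y, float) and math.isnan(y))
    if x_missing or y_missing:
        return x_missing and y_missing
    if x == y:  # also covers 0 == 0, where the relative form is undefined
        return True
    scale = (abs(x) + abs(y)) / 2
    return abs(x - y) / scale <= criterion


def compare_columns(r_values, sas_values, criterion=CRITERION):
    """Return the row indices at which two equal-length columns disagree."""
    if len(r_values) != len(sas_values):
        raise ValueError("columns differ in length")
    return [
        i
        for i, (x, y) in enumerate(zip(r_values, sas_values))
        if not values_equal(x, y, criterion)
    ]
```

For example, `compare_columns([1.0, 2.0], [1.0, 2.0 + 1e-6])` flags row index 1, because a relative difference of roughly 5e-7 exceeds the criterion.
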

| Deliverable | Path | Format |
|---|---|---|
| 5 ADaM XPT v5 datasets | `data/xpt/*.xpt` | binary, committed |
| 4 submission tables | `outputs/tlfs/t_*.pdf` | PDF |
| 2 listings | `outputs/tlfs/l_*.pdf` | PDF |
| 2 figures | `outputs/tlfs/f_*.pdf` | PDF |
| Define-XML 2.0 | `outputs/define/define.xml` | XML |
| Statistical Analysis Plan | `docs/sap/sap.pdf` | PDF |
| Analysis Data Reviewer's Guide | `docs/adrg/adrg.pdf` | PDF |
| Pinnacle 21 validation harness | `python/pinnacle21_parser.py`, `outputs/validation/` | wired - run pending |
| `PROC COMPARE` harness | `sas/compare_r_sas.sas` | wired - run pending |


Pipeline data flow:

                   pharmaversesdtm              cdisc-org/sdtm-adam-pilot-project
                (DM, EX, DS, AE, LB)            (QS.xpt, gitignored in data/raw/)
                          │                                     │
                          └──────────────────┬──────────────────┘
                                             ▼
            R/01_adsl.R - R/05_adqsadas.R                 sas/adsl.sas
             (admiral pharmaverse build)            (SAS 9.4 parallel build)
                          │                                     │
                          ▼                                     ▼
                   data/adam/*.rds                    data/xpt/sas_adsl.xpt
                          │                                     │
                          ▼                                     │
                  R/06_export_xpt.R                             │
           (xportr labels/lengths/formats)                      │
                          │                                     │
                          ▼                                     │
                   data/xpt/*.xpt ◄──── PROC COMPARE ◄──────────┘
                          │           (criterion 1e-9)
        ┌─────────────────┼─────────────────┐
        ▼                 ▼                 ▼
     R/07-09       R/10_define_xml     Pinnacle 21
     TLF gen       define.xml v2.0     ADaMIG 1.3
        │                 │                 │
        ▼                 ▼                 ▼
    outputs/tlfs/  outputs/define/ outputs/validation/
                                            │
                                            ▼
                               python/pinnacle21_parser.py
                                            │
                                            ▼
                                   README badge / JSON

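
The last stage of the diagram turns Pinnacle 21 findings into a README badge and a JSON summary. Below is a minimal sketch of that summarising step, assuming the findings have already been read into dictionaries with a `severity` key. The record layout, the function name `summarise_issues`, and the badge colours are illustrative assumptions, not the behaviour of `python/pinnacle21_parser.py`.

```python
import json
from collections import Counter


def summarise_issues(issues):
    """Collapse validation findings into counts plus a badge colour.

    `issues` is an iterable of dicts with a "severity" key (hypothetical
    layout); severities are matched case-insensitively.
    """
    counts = Counter(
        str(issue.get("severity", "unknown")).lower() for issue in issues
    )
    errors = counts.get("error", 0) + counts.get("reject", 0)
    warnings = counts.get("warning", 0)
    if errors:
        colour = "red"
    elif warnings:
        colour = "yellow"
    else:
        colour = "brightgreen"
    return {
        "errors": errors,
        "warnings": warnings,
        "notices": counts.get("notice", 0),
        "badge_colour": colour,
    }


def to_json(issues):
    """Serialise the summary so a badge generator or CI step can read it."""
    return json.dumps(summarise_issues(issues), indent=2, sort_keys=True)
```

Keeping the summary a plain dictionary means the same structure can feed both the JSON file and the badge text without a second pass over the findings.
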

Quickstart (clean checkout, ~30 minutes including first-time package install):

    git clone https://github.com/kbd0011/cdisc-pilot-replication.git
    cd cdisc-pilot-replication
    # 1. Restore the exact R package versions pinned in renv.lock
    Rscript -e 'renv::restore()'
    # 2. Install Python validators
    pip install -r requirements.txt
    # 3. Download external SDTM (qs.xpt, ~33MB)
    make fetch-raw
    # 4. Run the whole pipeline
    make all

`make all` runs three phases:

    make pipeline   # R/00_setup → 10_define_xml (~20s, all tests green)
    make validate   # python validators + xpt round-trip + P21 parse
    make docs       # Quarto render SAP + ADRG to PDF

The SAS parallel build is run manually on SAS OnDemand for Academics - see `sas/README.md`.

| Dataset | Rows | Vars | XPT size |
|---|---|---|---|
| ADSL | 306 | 39 | 254 KB |
| ADAE | 1,191 (1,122 TE) | 31 | 1.3 MB |
| ADLB | 58,700 | 28 | 41 MB |
| ADTTE | 254 (92 events) | 17 | 154 KB |
| ADQSADAS | 12,241 | 26 | 8.2 MB |
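
A quick sanity check on the transport files is to confirm that each one starts with the SAS V5 transport library header. The sketch below assumes the files live under `data/xpt/`. The names `is_xpt_v5` and `scan_directory` are hypothetical, and the check covers only the first 80-byte record, not the dataset contents.

```python
from pathlib import Path

# First 80-byte record of a SAS V5 transport (XPORT) file.
XPT_V5_HEADER = (
    b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!"
    b"000000000000000000000000000000  "
)


def is_xpt_v5(path):
    """Return True if the file begins with the V5 library header record."""
    with open(path, "rb") as handle:
        return handle.read(len(XPT_V5_HEADER)) == XPT_V5_HEADER


def scan_directory(directory="data/xpt"):
    """Map each *.xpt file name in `directory` to its header-check result."""
    return {
        xpt.name: is_xpt_v5(xpt)
        for xpt in sorted(Path(directory).glob("*.xpt"))
    }
```

Calling `scan_directory()` from the repository root would return one `True`/`False` entry per file; any `False` suggests the file is not a V5 transport file or was truncated.
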

| Validation | Status |
|---|---|
| `xpt::xpt_validate()` on all 5 | OK |
| XPT vs spec round-trip (`python/xpt_metadata_check.py`) | OK - all datasets pass |
| `define.xml` well-formedness (`xmllint`, `validate_define.py`) | OK |
| testthat suite | 745/745 expectations pass (90 tests across 5 ADaM datasets + XPT export) |
| Pinnacle 21 ADaMIG 1.3 | Pending P21 run - see `outputs/validation/README.md` |
| SAS `PROC COMPARE` | Pending ODA run - see `sas/README.md` |

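
Well-formedness is the weakest of these checks: it asks only whether the XML parses, not whether it conforms to the Define-XML schema. Here is a minimal sketch using Python's standard library. `check_well_formed` is a hypothetical name, and this is not the repository's `validate_define.py`.

```python
import xml.etree.ElementTree as ET


def check_well_formed(source):
    """Try to parse `source` (a path or a binary file object) as XML.

    Returns (True, root_tag) on success, or (False, parser_message) when
    the document is not well-formed. No schema validation is attempted.
    """
    try:
        root = ET.parse(source).getroot()
    except ET.ParseError as err:
        return False, str(err)
    return True, root.tag
```

A call such as `check_well_formed("outputs/define/define.xml")` returns the root element's tag on success. Schema conformance still needs a separate XSD-aware tool, such as the `xmllint` step listed in the table.
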

- `docs/sap/sap.pdf` - Statistical Analysis Plan (MMRM, ICH E9(R1) estimand, KM/Cox, missing-data sensitivity)
- `docs/adrg/adrg.pdf` - Analysis Data Reviewer's Guide (PHUSE 2019-07-18 template)
- `sas/README.md` - SAS OnDemand for Academics workflow
- `outputs/validation/README.md` - Pinnacle 21 + XSD-validation workflow
- `data/README.md` - Source data + reproducibility notice
- `metadata/README.md` - Variable-level spec format


Repository layout:

    cdisc-pilot-replication/
    ├── R/                 pipeline scripts (00_setup → 10_define_xml + run_all)
    ├── sas/               parallel ADSL + PROC COMPARE
    ├── python/            define.xml / XPT / P21 validators
    ├── metadata/          variable-level Excel specs (141 vars across 5 datasets)
    ├── data/
    │   ├── raw/           external SDTM (qs.xpt) - gitignored
    │   ├── adam/          R-built ADaM .rds - gitignored
    │   └── xpt/           XPT v5 submission deliverables
    ├── outputs/
    │   ├── tlfs/          4 tables, 2 listings, 2 figures
    │   ├── validation/    P21 + PROC COMPARE + JSON summaries
    │   └── define/        define.xml + (user-supplied) define2-0-0.{xsd,xsl}
    ├── docs/
    │   ├── sap/sap.pdf
    │   └── adrg/adrg.pdf
    └── tests/             testthat (90 tests, 745 expectations)

MIT - see LICENSE.