excel-to-engine

Turn your Excel financial model into a live, queryable engine — and a clean JavaScript bundle your developer (or AI coding agent) can drop into a web app. You don't have to be a programmer to use it.

Who this is for

You built (or inherited) a financial model in Excel — an LBO, a fund waterfall, a real-estate pro forma, a 3-statement model, a venture portfolio. You want to:

explore scenarios without breaking the spreadsheet ("what's IRR if we exit at 12×?"), or
turn it into an interactive web app — e.g. a what-if explorer you can share with LPs or an IC.

This toolkit does both. It reads your .xlsx, reproduces every formula in JavaScript, and hands you (a) a question-answering CLI and (b) a self-contained JS engine with a documented input/output contract a developer can wire into an app in an afternoon.

The easiest way to use this is with an AI coding assistant (Claude Code, Cursor, Copilot, etc.). You talk; it runs the commands. See GETTING_STARTED.md for the copy-paste prompts. If you'd rather drive it yourself, the manual steps are below.

The 3-minute path (with an AI assistant)

Open this repo in your AI coding assistant and put your model file next to it.
Say:

"I'm not a programmer. I have a financial model at my-model.xlsx. Set this repo up and convert my model into an engine, then walk me through what it found."
The assistant runs the one-time setup, runs init, and shows you a plain-English summary. Sanity-check the numbers against your spreadsheet. If something looks off, just say so — there are built-in tools to correct it.
When you're happy, say:

"Great. Now give me the engine bundle and a guide so my developer can build a web app from it."

You'll get a folder containing the engine plus an INTEGRATION.md and a runnable example.mjs — everything a coding agent needs.

That's the whole journey: Excel → verified engine → web app, with the assistant doing the typing.

What you get

Input	Output
One `.xlsx` file	A JS engine (`engine.js` with a `run()` function), a machine-readable input/output contract (`named-inputs.json` / `named-outputs.json`), a scenario CLI (`ete`), and a tailored integration guide for developers.

The engine reproduces your spreadsheet's math. run() executes the actual transpiled formulas, so the numbers match Excel. A built-in check confirms it reproduces your base case.
A documented contract. Every lever (input) and answer (output) has a human name, a cell address, a base-case value, and a "what does this affect" map — so a developer never has to reverse-engineer your spreadsheet.
Works on models from 3 KB to 84 MB (2–82 sheets, up to 6M cells). Validated at 99.3% blind-eval accuracy across a suite of real models (149/150 questions, 15.5M cells).

For the developer / coding agent

If an analyst handed you a converted model folder, open chunked/INTEGRATION.md and run node chunked/example.mjs. That's the fast path. In short:

import { run } from './engine.js';

const base = run();                                   // base case (matches Excel)
const alt  = run({ "Assumptions!B4": 14 });           // override a lever, recompute everything
console.log(alt.values["IC Summary!B13"]);            // read an output by cell

named-outputs.json maps friendly names → cells (+ base-case values to spot-check against).
named-inputs.json lists the levers (+ which outputs each one affects).
It's plain ES modules, zero dependencies — runs in Node and the browser.

See GETTING_STARTED.md → "Build a web app" for a full wiring sketch.

Manual setup (if you're not using an assistant)

Prerequisites: Node.js 18+ and (one-time) the Rust toolchain to build the parser.

git clone https://github.com/ebootheee/excel-to-engine.git
cd excel-to-engine
npm install

# One-time: build the Excel parser (this is the only build step)
npm run build:parser          # = cargo build --release in pipelines/rust
#   No Rust? Install it from https://rustup.rs/ first.

# Check everything is ready (tells you exactly what's missing, if anything):
npm run check-env

Then convert a model and ask questions:

# Parse your model → engine + manifest + contract + integration guide, in one step
node cli/index.mjs init my-model.xlsx --output ./my-model/

# Look at what it found
node cli/index.mjs summary ./my-model/chunked/

# Explore scenarios
node cli/index.mjs scenario ./my-model/chunked/ --exit-multiple 16
node cli/index.mjs sensitivity ./my-model/chunked/ --vary exit-multiple:14-22:2 --metric grossIRR

Tip: npm link (or npx ete) lets you type ete … instead of node cli/index.mjs …. The examples below use the long form so they work without linking.

CLI Commands

`ete summary` — Model Overview

$ node cli/index.mjs summary ./my-model/chunked/

Model: Example Fund (pe_platform)
Period: 2024–2030 (6yr, annual) | Exit: 2030 @ 18.5x EBITDA

Revenue Segments                       Start        Exit        CAGR
  Real Estate NOI                     $45.2M      $52.1M        2.4%
  Technology Gross Profit              $8.3M      $22.7M       18.3%

Platform EBITDA             $41.4M → $59.0M  (CAGR: 6.1%)
Terminal Value              $1.1B

Returns                    Gross         Net
  MOIC                     2.85x       2.45x
  IRR                      28.4%       24.1%

Carry: $50.3M (3 tiers), 8% pref
Equity: 1 class (Series A), basis $270.0M

`ete verify` — Confirm the engine matches your spreadsheet

node cli/index.mjs verify ./my-model/chunked/
# → "✓ engine.run() reproduces the model's base case exactly. Safe to hand off."

Runs the generated engine with no overrides and checks every named output against its base-case value. The trust signal to run before handing the bundle to a developer (also available as ete init --verify).

`ete query` — Find Anything

node cli/index.mjs query ./my-model/chunked/ --search "headcount"     # search by label
node cli/index.mjs query ./my-model/chunked/ "Valuation!K54"          # look up a cell
node cli/index.mjs query ./my-model/chunked/ --name grossIRR          # look up by name

`ete scenario` — What-If Analysis

node cli/index.mjs scenario ./my-model/chunked/ --exit-multiple 16
node cli/index.mjs scenario ./my-model/chunked/ \
  --exit-multiple 14 --exit-year 2033 --revenue-adj techGP:-20% --cost-adj technology:+10%
node cli/index.mjs scenario ./my-model/chunked/ --exit-multiple 14 --save "bear"

Full parameter set:

Category	Parameters
Exit	`--exit-year`, `--exit-multiple`, `--revenue-multiple`
Revenue	`--revenue-adj seg:±%/$`, `--revenue-growth seg:rate`, `--remove-segment`, `--add-revenue`, `--override-arr`
Cost	`--cost-adj seg:±%/$`, `--line-item id:adj`, `--cost-ratio seg:ratio`, `--capitalize item:years`
Capital	`--leverage ltv`, `--equity-override`, `--distribution year:amount`
Valuation	`--sotp`, `--segment-multiple seg:n`, `--discount-rate`
Returns	`--pref-return rate`, `--hold-period years`
Scenarios	`--file scenario.json`, `--save name`, `--load name`, `--list`
Output	`--metric list`, `--format table\|json\|csv\|markdown`, `--attribution`

`ete sensitivity` — IRR/MOIC Surfaces

node cli/index.mjs sensitivity ./my-model/chunked/ --vary exit-multiple:14-22:2 --metric grossIRR,grossMOIC
node cli/index.mjs sensitivity ./my-model/chunked/ --vary exit-multiple:14-22:2 --vary exit-year:2028-2034:1 --metric grossIRR

`ete carry` — Waterfall GP Carry

node cli/index.mjs carry ./my-model/chunked/                              # from the manifest
node cli/index.mjs carry --peak 500e6 --moc 2.8 --life 4.7 --pref 0.08 --carry 0.20 --ownership 0.06

`ete compare`, `ete extract`, `ete explain`, `ete eval`, `ete manifest`

Side-by-side scenarios with attribution; time-series schedules (capital calls, distributions, debt); audit trails; exact formula evaluation; and manifest configuration. Run node cli/index.mjs --help for the full list, or see skill/SKILL.md.

How It Works

Excel (.xlsx)
  → Rust parser (calamine, 10–50x faster than SheetJS)
    → Per-sheet JS modules (formulas transpiled to JavaScript)  ──┐
    → Ground truth JSON (every cell value from Excel)             │
    → Model manifest (financial concepts → cells)                 ├─→  engine.js (run())
    → Contract maps (named-outputs / named-inputs / cell-types)  ─┘    + INTEGRATION.md + example.mjs
      → CLI scenario engine (instant what-ifs via a delta cascade)

ete init does all of this in one step and finishes by emitting the developer handoff bundle (INTEGRATION.md + example.mjs) into the output folder.

Two ways to compute a scenario

engine.run() executes the model's actual transpiled formulas — exact, the right choice for an app's production math.
The CLI's scenario/sensitivity use a fast first-order approximation (a "delta cascade") for instant analyst queries. Great for exploration; use run() when you need numbers that tie out to the penny.

Downstream contract artifacts

ete init emits a few small JSON files into chunked/ so other apps can wire up the engine by name, at build time — without running the engine to discover which cells hold the outputs (and without shipping the silent-NaN bug you get from guessing the wrong cell):

File	Shape	Use
`named-outputs.json`	`name → { cell, label, type, format, baseCaseValue, source, dependsOnNamedInputs }`	The contract for downstream apps. Look up `grossMOIC`, get its cell + base-case value, spot-check on import. If your observed value ≠ `baseCaseValue`, your cell map is stale — fail the build. Time-series outputs are `type:"schedule"` with `cellRange` + `perYear:[{year,value}]`; their scalar `baseCaseValue` is a life-to-date `sum` for flows and the `terminal` level for balances (`aggregation` says which) — `perYear` is authoritative.
`named-inputs.json`	`name → { cell, type, default, referencedBy, affectsOutputs }`	Drive `engine.run({ [cell]: value })` for what-ifs. `affectsOutputs` says which outputs to invalidate (don't regenerate the whole grid). Lists Excel defined-name cells read by ≥1 formula plus the model drivers `exitMultiple` / `exitYearSelector` / `hurdleRate` (`source:"manifest-driver"`, derived from the manifest + ground truth, so they emit even without the `.xlsx`).
`cell-types.json`	`cell → "number" \| "label" \| "boolean" \| "empty"`	Tell a label string from a numeric output, and a real `0` (present, `"number"`) from a never-computed cell (absent from this map).
`build-manifest.json`	`{ layoutVersion, engine:{ entry, export }, contentHash, complete, artifacts[] }`	The locked artifact layout + a stable `contentHash` over the identity artifacts (engine.js, sheets/, _ground-truth.json, manifest.json). Pin a build by its `contentHash`; it's stable across rebuilds of the same workbook and changes on drift, so you reconcile deliberately, not per version. `complete:false` / `missingRequired` flag an unrunnable build.
`dependency-graph.json` (debug)	`{ format:"cell-dependency-edges-v2", edges: cell → [cells/ranges it reads] }`	Cell-level forward edges — the raw material for the `dependsOnNamedInputs` / `affectsOutputs` closures above. Ranges are kept as compact tokens (`Sheet!A1:B10`), not expanded to interior cells: full expansion was 37 GB / ~7 min on the real models and OOM-killed the closure-baking step (#32); the compact form is ~0.5 GB. Written one edge per line (still valid JSON) so a >512 MiB graph can be read line-by-line without exceeding Node's max string length. Consumers expand a token lazily against the cells they care about. Removed from the default output once the closures are baked into the named maps; re-run `ete init --emit-debug` to keep it (plus the root `model-map.json`) for offline analysis or closure recomputation.

Names come from the workbook's defined-name table (the model owner's curated named cells) when present — these are authoritative and override heuristic detection. Regenerate without a re-parse: ete manifest maps ./my-model/chunked/ --excel model.xlsx.

Note: the defined-name inputs and defined-name enrichment of outputs require the source .xlsx. Without it (e.g. --reuse-parse against an already-parsed dir), outputs + cell-types + the manifest-driver inputs (exitMultiple / exitYearSelector / hurdleRate) still emit from the manifest and ground truth; only the defined-name inputs are skipped.

Default output stays small. ete init keeps only what consumers and the CLI actually read: the engine modules, _ground-truth.json (compact), _labels.json, the sheet-level _graph.json (3 KB — topo order + clusters, read by the per-sheet eval), the contract maps, the manifest, and build-manifest.json. The large intermediate/debug artifacts — the cell-level dependency-graph.json (~0.5 GB) and the root model-map.json (600+ MB on the biggest models) — are dropped after the closures are computed. The high-value data survives as the closures inside the named maps. Pass --emit-debug to retain dependency-graph.json.

Golden-master gate. eval/golden-master.mjs (run via npm run golden <chunkedDir>) is the post-build assert for these artifacts: with --assert-no-fallbacks it fails if any return/value output resolves through an unsupported-function (_fn) stub, and with --canonical <file> it diffs named-outputs.baseCaseValues against a canonical returns map to full float precision. CI runs it on a synthetic fixture (npm run test:golden); point it at a real build with ETE_GOLDEN_DIR + a gitignored canonical-returns.json to verify a regenerated model still reproduces the hand-port's gross/net MOIC & IRR exactly.

Lazy engine for large models (`--lazy-engine`)

The default engine.js statically imports every per-sheet module, so import('engine.js') pulls all of them into memory (hundreds of MB on the big PE models — dominated by a couple of monster sheets) before run() can be called. For a consumer that only needs to sample the model (the calibration- oracle use case), that load is the wall.

ete init --lazy-engine emits an engine that imports sheet modules on demand:

import engine from './my-model/chunked/engine.js';

// Load only what you need, then run() synchronously (same return shape as always).
await engine.load({ cells: ['Returns!D22', 'Returns!E22'] }); // loads just the
                                                               // dependency cone
const { values, meta } = engine.run({ 'Assumptions!B3': 18 });  // override + run

// Or do both in one call:
const r = await engine.runScoped({ 'Assumptions!B3': 18 }, { cells: ['Returns!D22'] });

load(options) — { sheets: [...] } and/or { cells: ['Sheet!A1', ...] } loads only those sheets plus their transitive dependency closure (whole circular clusters are pulled in as a unit). No options ⇒ load everything (still lazy, but complete). To scope to named outputs, map their names → cells via named-outputs.json, then pass cells.
run(inputs, options) — unchanged synchronous semantics; throws if called before anything is loaded. Sheets outside the loaded cone are simply skipped.
runScoped(inputs, options) — await load(options) then run(inputs, options).

The default build is unchanged — engine.js stays eager and run() stays synchronous, so existing integrations are untouched. --lazy-engine is purely opt-in. (Per-sheet modules are emitted either way; the flag only changes how engine.js loads them.)

The Delta Cascade

When you run a scenario, the CLI doesn't re-execute the full engine (which can take 10+ minutes on large models). Instead, it:

Reads base case values from ground truth (instant — JSON lookup)
Applies your adjustments to the annual P&L
Recomputes the chain: exit EBITDA → terminal value → exit equity → MOIC → IRR → carry
Uses lib/irr.mjs (Newton-Raphson) and lib/waterfall.mjs (American/European PE structures) for returns

This is a first-order approximation — accurate for linear sensitivities (revenue %, cost %, multiple changes, exit timing, leverage). For highly non-linear scenarios (MIP triggers, complex pref compounding), use the full chunked engine.

Project Structure

excel-to-engine/
├── cli/                         # The `ete` command
│   ├── index.mjs                # Entry point + arg parsing
│   ├── commands/                # init, summary, query, pnl, scenario, sensitivity, compare, manifest
│   ├── extractors/              # date-detector, annual-aggregator, segment-detector, waterfall-detector, line-item-resolver
│   └── solvers/                 # delta-cascade (financial math), scenario-engine (orchestrator)
├── skill/SKILL.md               # Claude Code skill (PE language → CLI translation)
├── pipelines/
│   ├── rust/                    # Excel → JS transpiler (8 Rust modules, ~60 Excel functions)
│   └── js-reasoning/            # Claude-driven pipeline for smaller models
├── eval/                        # Blind eval, per-sheet eval, golden-master gate, auto-iteration
├── lib/                         # Financial libraries (IRR, waterfall, calibration, sensitivity, manifest)
└── tests/cli/                   # 166 tests (34 integration + 132 use-case scenarios)

Libraries

Library	Purpose
`lib/manifest.mjs`	Manifest schema, auto-generation, validation, cell resolvers, label search
`lib/manifest-maps.mjs`	Downstream contract maps (named-outputs/inputs.json, cell-types.json)
`lib/build-manifest.mjs`	Locked artifact layout + content hash (build-manifest.json)
`lib/irr.mjs`	Newton-Raphson IRR with bisection fallback, XIRR for irregular dates
`lib/waterfall.mjs`	American + European PE waterfall structures
`lib/calibration.mjs`	Scale factor calibration against Excel targets
`lib/sensitivity.mjs`	Surface extraction, slope comparison, breakpoint detection
`lib/excel-parser.mjs`	Cell reading, sheet fingerprinting, year detection, field mapping

Accuracy

A fresh Claude API session with zero knowledge of the engine answers 25 randomized financial questions per model:

Model	Sheets	Cells	Blind Eval
Fund model A	2	5.7K	25/25 (100%)
Fund model B	7	96K	25/25 (100%)
Platform model A	51	1.8M	25/25 (100%)
Platform model B	60	1.8M	25/25 (100%)
Corporate model A	20	5.8M	25/25 (100%)
Corporate model B	21	6.1M	24/25 (96%)
Total		15.5M cells	149/150 (99.3%)

~60 Excel functions transpiled: SUM, IF, VLOOKUP, INDEX/MATCH, IRR, XIRR, NPV, PMT, SUMIFS, COUNTIFS, INDIRECT, OFFSET, and more.

Use with Claude Code (or any AI assistant)

The toolkit ships a skill (skill/SKILL.md) that translates natural language into CLI commands, so you can ask questions instead of memorizing flags:

"What happens to returns if tech grows at 40% instead of 30%?"
"Show me a sensitivity table for exit multiples and timing"
"Build me bear, base, and bull cases for the board deck"

The assistant handles model conversion, cell references, and the manifest behind the scenes. See GETTING_STARTED.md for the guided flow.

Project Structure

excel-to-engine/
├── cli/                 # The `ete` command (init, summary, query, pnl, scenario, …)
├── lib/                 # IRR, waterfall, manifest, contract maps, integration-doc, verify-engine
├── skill/SKILL.md       # AI-assistant skill (natural language → CLI)
├── pipelines/rust/      # Excel → JS transpiler (the parser you build once)
├── eval/                # Blind-eval accuracy harness (optional; needs an API key)
├── tests/               # CLI + onboarding + use-case suites
└── GETTING_STARTED.md   # The guided "walk me through it" companion

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 178 Commits
.github/workflows		.github/workflows
benchmarks		benchmarks
cli		cli
docs		docs
eval		eval
lib		lib
pipelines		pipelines
scripts		scripts
skill		skill
templates		templates
tests		tests
tools		tools
.gitattributes		.gitattributes
.gitignore		.gitignore
ANALYST_UX_REPORT.md		ANALYST_UX_REPORT.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
GETTING_STARTED.md		GETTING_STARTED.md
HANDOFF-analyst-ux.md		HANDOFF-analyst-ux.md
HANDOFF.md		HANDOFF.md
LICENSE		LICENSE
PLAN.md		PLAN.md
PLAN_V3.md		PLAN_V3.md
PLAN_V4.md		PLAN_V4.md
README.md		README.md
ROADMAP.md		ROADMAP.md
package-lock.json		package-lock.json
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

excel-to-engine

Who this is for

The 3-minute path (with an AI assistant)

What you get

For the developer / coding agent

Manual setup (if you're not using an assistant)

CLI Commands

`ete summary` — Model Overview

`ete verify` — Confirm the engine matches your spreadsheet

`ete query` — Find Anything

`ete scenario` — What-If Analysis

`ete sensitivity` — IRR/MOIC Surfaces

`ete carry` — Waterfall GP Carry

`ete compare`, `ete extract`, `ete explain`, `ete eval`, `ete manifest`

How It Works

Two ways to compute a scenario

Downstream contract artifacts

Lazy engine for large models (`--lazy-engine`)

The Delta Cascade

Project Structure

Libraries

Accuracy

Use with Claude Code (or any AI assistant)

Project Structure

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

excel-to-engine

Who this is for

The 3-minute path (with an AI assistant)

What you get

For the developer / coding agent

Manual setup (if you're not using an assistant)

CLI Commands

ete summary — Model Overview

ete verify — Confirm the engine matches your spreadsheet

ete query — Find Anything

ete scenario — What-If Analysis

ete sensitivity — IRR/MOIC Surfaces

ete carry — Waterfall GP Carry

ete compare, ete extract, ete explain, ete eval, ete manifest

How It Works

Two ways to compute a scenario

Downstream contract artifacts

Lazy engine for large models (--lazy-engine)

The Delta Cascade

Project Structure

Libraries

Accuracy

Use with Claude Code (or any AI assistant)

Project Structure

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`ete summary` — Model Overview

`ete verify` — Confirm the engine matches your spreadsheet

`ete query` — Find Anything

`ete scenario` — What-If Analysis

`ete sensitivity` — IRR/MOIC Surfaces

`ete carry` — Waterfall GP Carry

`ete compare`, `ete extract`, `ete explain`, `ete eval`, `ete manifest`

Lazy engine for large models (`--lazy-engine`)

Packages