Merged
14 changes: 9 additions & 5 deletions .github/workflows/pr-dashboard.yml
@@ -40,6 +40,14 @@ jobs:

echo "" >> /tmp/pr_dashboard.md

+# Compute summary counts BEFORE rendering the table.
+# Prior version referenced $TOTAL / $FAILING before they were assigned,
+# which crashed under `set -u` with "unbound variable" on every PR.
+TOTAL=$(echo "$PR_DATA" | jq -r 'length')
+FAILING=$(echo "$PR_DATA" | jq -r '[.[] | select((.statusCheckRollup // []) | any(.conclusion == "FAILURE" or .conclusion == "CANCELLED" or .conclusion == "TIMED_OUT"))] | length')
+READY=$(echo "$PR_DATA" | jq -r '[.[] | select((.statusCheckRollup // []) | all(.conclusion == "SUCCESS" or .conclusion == "SKIPPED" or .conclusion == null))] | length')
+PENDING=$(echo "$PR_DATA" | jq -r '[.[] | select((.statusCheckRollup // []) | any(.conclusion == null))] | length')
+
# Summary table
echo "## Summary" >> /tmp/pr_dashboard.md
echo "" >> /tmp/pr_dashboard.md
@@ -49,12 +57,8 @@ jobs:
echo "| PRs with Failing Checks | $FAILING |" >> /tmp/pr_dashboard.md
echo "| PRs with All Checks Green | $((TOTAL - FAILING)) |" >> /tmp/pr_dashboard.md

-READY=$(echo "$PR_DATA" | jq -r '[.[] | select((.statusCheckRollup // []) | all(.conclusion == "SUCCESS" or .conclusion == "SKIPPED" or .conclusion == null))] | length')
-FAILED=$(echo "$PR_DATA" | jq -r '[.[] | select((.statusCheckRollup // []) | any(.conclusion == "FAILURE" or .conclusion == "CANCELLED" or .conclusion == "TIMED_OUT"))] | length')
-PENDING=$(echo "$PR_DATA" | jq -r '[.[] | select((.statusCheckRollup // []) | any(.conclusion == null))] | length')
-
echo "| READY | $READY |" >> /tmp/pr_dashboard.md
-echo "| FAILING | $FAILED |" >> /tmp/pr_dashboard.md
+echo "| FAILING | $FAILING |" >> /tmp/pr_dashboard.md
echo "| PENDING | $PENDING |" >> /tmp/pr_dashboard.md

- name: Post PR Comment
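
The ordering bug described in the workflow comment can be reproduced in isolation. The following is a minimal sketch, assuming bash; `render_summary_row` is a hypothetical helper that does not exist in the workflow, and the literal `2` stands in for the jq-derived count.

```bash
#!/usr/bin/env bash
# Sketch of the ordering bug: under `set -u`, expanding a variable that was
# never assigned aborts the shell with an "unbound variable" error.
set -u

# Hypothetical helper (not part of the workflow): renders one summary row.
render_summary_row() {
  echo "| PRs with Failing Checks | $FAILING |"
}

# Old ordering: render before assigning. The subshell contains the abort so
# this demonstration script keeps running.
if ( unset FAILING; render_summary_row ) >/dev/null 2>&1; then
  OLD_ORDER="rendered"
else
  OLD_ORDER="aborted"
fi

# New ordering: assign first (2 stands in for the jq count), then render.
FAILING=2
NEW_ORDER_ROW=$(render_summary_row)

echo "old ordering: $OLD_ORDER"
echo "new ordering: $NEW_ORDER_ROW"
```

Moving the assignments above the first `echo` that expands them, as this diff does, is what keeps `set -u` from tripping.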
104 changes: 104 additions & 0 deletions BENCHMARKS.md
@@ -0,0 +1,104 @@
# BENCHMARKS.md -- Restrained Benchmark Posture

> **Policy:** publish only numbers we can reproduce from this repo,
> from a sealed spec or generated file. No "expected" or "projected"
> figures appear here. **When in doubt, omit the row.**

This document is a register of what is benchmarked (and what is **not**) in
this repository, kept conservative on purpose. It complements
[`COMPETITORS.md`](COMPETITORS.md), which states what we do not claim.

---

## 1. What exists in-repo today

### 1.1 Conformance vectors (correctness, not throughput)

| Vector file (under `conformance/`) | Purpose |
|--------------------------------------------|--------------------------------------------------|
| `FORMAT-SPEC-001.json` | GoldenFloat family registry (SSOT for the line). |
| `gf*_vectors.json` | Arithmetic conformance vectors for GF widths. |
| `ar_*.json` | CLARA-style assurance reasoning vectors. |
| `nn_*.json` | Neural architecture conformance vectors. |
| `sacred_physics*.json` | phi / Trinity identity conformance. |
| `gf_competitive_bench.json` | Skeleton benchmark file. **Most rows are placeholders.** |

Validation entry point: `./scripts/tri validate-conformance`. These vectors
test **correctness against the spec**, not silicon throughput.

### 1.2 Benchmark specs (under `specs/benchmarks/`)

- `bench_main.t27`
- `bench_nn.t27`
- `ternary_vs_binary.t27`

These specs define **measurement procedures**. The numbers they would
produce belong with the chip repos or with a future
`bench/results_*.json` set. **No silicon-level numbers from these specs
appear in this document.**

### 1.3 FPGA / Vivado scripts

- `fpga/vivado/build.tcl`, `fpga/vivado/build_gf16.tcl`
- testbenches: `gf16_add_tb.v`, `gf16_mul_tb.v`, `gf16_dot4_tb.v`,
`gf16_matmul4x4_tb.v`

These produce simulation and synthesis-level evidence (latch-free,
timing-reported), but **not** end-to-end accelerator throughput numbers.

### 1.4 Misc

- `bench/results_v02_real.json` -- legacy results file, not maintained as a
benchmark target for this line. Treat as historical.
- `benchmarks/phi_attractor_convergence.py` -- a research convergence
script, not a product benchmark.

---

## 2. What we deliberately do not publish here

The following figures do **not** appear in this document and must not be
attributed to it:

1. TOPS or TOPS/W for any TRI-NET chip until that chip reaches **SILICON**
per [`STATUS.md`](STATUS.md).
2. Latency / throughput vs. Hailo-8, Coral Edge TPU, Axelera Metis,
Qualcomm Cloud AI 100 Ultra, MediaTek Dimensity 9400+ -- see
[`COMPETITORS.md`](COMPETITORS.md) for why.
3. Accuracy parity with FP16 / BF16 on ImageNet, LLM perplexity, or any
other model-level benchmark, until a reproducible vector lands under
`conformance/`.
4. Any "expected" / "projected" / "target" figure. If a number is not
measured, it is not in this document.

---

## 3. How to add a benchmark (the only allowed way)

1. **Land the spec.** Add a `.t27` under `specs/benchmarks/` describing
the measurement.
2. **Land conformance vectors.** Add a `*_vectors.json` under
`conformance/`.
3. **Land a results file.** Add `bench/<name>_results.json` produced by
`./scripts/tri test` or an equivalent reproducible run, including the
commit hash of the run.
4. **Add a row to this document** pointing to the spec + vectors + results.

Rows that point at "expected" results MUST NOT be added.
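
The checklist above could be enforced mechanically before a row is added. The following is a sketch under stated assumptions, not an existing script in this repo: `check_benchmark` is a hypothetical function, the spec and vector file names are patterned on the examples in this document, and the `"commit"` key in the results file is an assumed convention for recording the commit hash.

```bash
#!/usr/bin/env bash
# Sketch: verify that a benchmark name has spec + vectors + results in place.
# File-name patterns and the "commit" key are assumptions, not repo facts.

check_benchmark() {
  local name=$1 root=${2:-.} status=0 path results
  for path in \
    "specs/benchmarks/$name.t27" \
    "conformance/${name}_vectors.json" \
    "bench/${name}_results.json"
  do
    if [ ! -f "$root/$path" ]; then
      echo "missing: $path"
      status=1
    fi
  done

  # Assumed convention: the results file stores the run's commit hash under
  # a "commit" key holding 7 to 40 lowercase hex characters.
  results="$root/bench/${name}_results.json"
  if [ -f "$results" ] && \
     ! grep -Eq '"commit"[[:space:]]*:[[:space:]]*"[0-9a-f]{7,40}"' "$results"; then
    echo "no commit hash in: bench/${name}_results.json"
    status=1
  fi
  return "$status"
}
```

Under these assumptions, `check_benchmark bench_nn` run from the repo root would succeed only when all three files exist and the results file names a commit.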

---

## 4. External references (for context only)

For research context cited elsewhere in this docs package:

- BitNet b1.58, 1.58-bit ternary LLM weights:
https://arxiv.org/abs/2402.17764
- Tiny Tapeout chip catalogue:
https://tinytapeout.com/chips/

These links are **not** sources of benchmark numbers for TRI-NET; they are
sources of **direction** for the research line.

---

**phi^2 + 1/phi^2 = 3 | TRINITY**
93 changes: 93 additions & 0 deletions CLARA_TRACEABILITY.md
@@ -0,0 +1,93 @@
# CLARA_TRACEABILITY.md -- Mapping to DARPA CLARA Public Goals

> **Scope:** this document maps **public-facing** goals of DARPA's CLARA
> program to **specific artefacts in this repository**, so that an
> external reviewer can trace claim -> file.
>
> It is **not** a claim of CLARA participation, award, or endorsement.
> Where the word "CLARA" appears below, it refers to the **publicly
> described program** at:
>
> - DARPA CLARA: https://www.darpa.mil/research/programs/clara

---

## 1. Why this document exists

The TRI-NET line is positioned in the **high-assurance** corner of AI
silicon (see [`COMPETITORS.md`](COMPETITORS.md)). DARPA CLARA is the most
visible public program articulating goals in this corner. Rather than make
loose "CLARA-aligned" statements, this document points each public CLARA
goal at a file or directory in this repo, on a single page, with honest
gaps marked.

Sources of CLARA goals: the public program page above. All wording of
"CLARA goal" rows is paraphrased from public material; treat the linked
page as authoritative.

---

## 2. Mapping table

The "goal" column paraphrases public language; the "artefact" column points
into this repo; the "level" column uses [`STATUS.md`](STATUS.md) readiness
levels where they apply, or `n/a` where the artefact is a document rather
than a build target.

| Public CLARA goal (paraphrased) | Artefact in this repo | Level |
|-----------------------------------------------------------------|-------------------------------------------------------------|-----------|
| Compositional AI assurance -- combining ML and AR components | `clara-bridge/` (4 hybrid patterns documented in README) | demo |
| Bounded reasoning / explainability over inference steps | `clara-bridge/` proof-trace work; `specs/ar/` AR specs | demo/SPEC |
| Polynomial-time complexity guarantees on the assurance path | Stated in `clara-bridge/README.md`; specs under `specs/ar/` | demo |
| Formal verification of components prior to composition | `coq/Kernel/`, `coq/Theorems/`, `coq/IGLA/`, `proofs/` | partial |
| Reproducible build pipeline auditable end-to-end | `.t27` -> `t27c` -> `gen/*` -> `.trinity/seals/` | GREEN |
| Open and inspectable artefacts | This repo + linked chip repos (see [`LINEUP.md`](LINEUP.md))| n/a |

Honest gaps:

- **No claim of submission acceptance** to CLARA. `clara-bridge/submission/`
and `clara-bridge/proposal/` are documents in this repo; their state in
any external program is **not** asserted here.
- **No claim of CLARA TA-level mapping** (Technical Area X.Y wording).
When the public program page details such structure, a follow-up PR can
refine this table.
- **Coq surface is "partial"**: see [`STATUS.md`](STATUS.md) section 2.4.

---

## 3. Reproducing the trace

Anyone with this repo can verify the mapping:

```bash
# Spec-to-gen reproducibility
./scripts/tri parse specs/numeric/gf16.t27
./scripts/tri gen-verilog specs/numeric/gf16.t27
./scripts/tri seal specs/numeric/gf16.t27 --verify

# Assurance bridge examples
ls clara-bridge/
cat clara-bridge/README.md

# Conformance gating
./scripts/tri validate-conformance
./scripts/tri validate-gen-headers
```

The intent is: a reviewer who has never used t27 should be able to land
on this file, click into each row's artefact, and reach a reproducible
build or proof from there.
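
As a first step before clicking through, a reviewer could confirm that every path named in the mapping table is present in their checkout. The following is a minimal sketch: `check_clara_artefacts` is a hypothetical function, not a `tri` subcommand, and the path list is copied by hand from section 2, so it would need updating whenever the table changes.

```bash
#!/usr/bin/env bash
# Sketch: report which artefact paths from the mapping table are absent.
# The function name is hypothetical; the paths are copied from section 2.

check_clara_artefacts() {
  local root=${1:-.} missing=0 path
  for path in \
    clara-bridge clara-bridge/README.md specs/ar \
    coq/Kernel coq/Theorems coq/IGLA proofs \
    gen .trinity/seals
  do
    if [ ! -e "$root/$path" ]; then
      echo "missing artefact: $path"
      missing=$((missing + 1))
    fi
  done
  echo "$missing path(s) missing"
  # Succeed only when nothing is missing.
  [ "$missing" -eq 0 ]
}
```

This checks existence only; it says nothing about whether an artefact meets the level stated in its row.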

---

## 4. What this document is not

- Not a CLARA program proposal (those live, if at all, under
`clara-bridge/submission/` and `clara-bridge/proposal/`).
- Not a CLARA endorsement or affiliation statement.
- Not a substitute for [`STATUS.md`](STATUS.md), which still governs
what level each component is at.

---

**phi^2 + 1/phi^2 = 3 | TRINITY**
133 changes: 133 additions & 0 deletions COMPETITORS.md
@@ -0,0 +1,133 @@
# COMPETITORS.md -- Honest Positioning

> **One-line positioning:** Commercial NPUs own the production TOPS / SDK /
> compliance corner. TRI-NET / t27 own the **inspectable open silicon and
> formal / assurance workflow** corner. These are different products; this
> document is written to keep us out of races we are not running.

This page describes **adjacent products** in the AI-accelerator space and
states, with as much restraint as possible, what TRI-NET / t27 is and is not
relative to each. **No throughput parity is claimed against any product on this page.**

External links are kept as primary sources. All claims attributed to a
vendor are sourced from the linked page; any other claim is attributed to
this repo.

---

## 1. Adjacent products (alphabetical)

### 1.1 Axelera Metis (AIPU)

- **Vendor page:** https://axelera.ai/ai-accelerators/aipu/metis
- **What they sell:** edge AI inference cards / modules with their own
AIPU silicon, Voyager SDK, model zoo.
- **What TRI-NET is not:** we do not provide an SDK at this scale or a model
zoo. Our compute volume target (`tt-trinity-gamma`, 32 PEs) is research-tier.
- **What TRI-NET differs in:** every Verilog block in our line comes from a
`.t27` spec under `specs/` with conformance vectors under `conformance/`.
The silicon submission target is the Tiny Tapeout shuttle, not a private
fab run.

### 1.2 Coral Edge TPU

- **Benchmarks page:** https://www.coral.ai/docs/edgetpu/benchmarks/
- **What they sell:** USB / M.2 / PCIe Edge TPU accelerators, post-training
INT8 quantised models, the Edge TPU Compiler.
- **What TRI-NET is not:** we do not ship a binary toolchain that takes a
TFLite file and produces a ready-to-run device image. Coral does.
- **What TRI-NET differs in:** the numeric format itself is open and
inspectable (see [`FORMAT_REGISTRY.md`](FORMAT_REGISTRY.md)); the path
from spec to RTL is reproducible and sealed.

### 1.3 Hailo-8

- **Vendor page:** https://hailo.ai/products/ai-accelerators/hailo-8-ai-accelerator/
- **What they sell:** edge AI processor IC with a dataflow architecture,
Hailo Dataflow Compiler, production deployments in automotive / industrial.
- **What TRI-NET is not:** we are not a production embedded inference
processor. We do not claim TOPS, mW/TOPS, or automotive compliance.
- **What TRI-NET differs in:** both our numeric kernel and our ISA are
spec-driven; we publish proofs (`coq/`) and seals (`.trinity/seals/`).
This is an **orthogonal** value proposition, not a substitute.

### 1.4 MediaTek Dimensity 9400+

- **Vendor page:** https://www.mediatek.com/products/smartphones/mediatek-dimensity-9400-plus
- **What they sell:** smartphone application SoC with an integrated NPU, in
shipping mobile devices.
- **What TRI-NET is not:** we are not an SoC and not a phone-class platform.
- **What TRI-NET differs in:** TRI-NET targets the **open-shuttle**
(Tiny Tapeout) economic regime, not high-volume mobile silicon.

### 1.5 Qualcomm Cloud AI 100 Ultra

- **Vendor PDF:** https://www.qualcomm.com/content/dam/qcomm-martech/dm-assets/documents/Prod-Brief-QCOM-Cloud-AI-100-Ultra.pdf
- **What they sell:** datacentre-class inference accelerator with a closed
SDK, drivers, and ecosystem.
- **What TRI-NET is not:** we are not a datacentre accelerator and never
will be on this codebase.
- **What TRI-NET differs in:** TRI-NET's compute volume is research-tier;
our differentiator is that the **whole spec chain** -- numeric format,
ISA, RTL -- is openly auditable.

### 1.6 BitNet b1.58 (research, not a product)

- **Paper:** https://arxiv.org/abs/2402.17764
- **What it is:** a research result showing that LLM weights can be
represented in ternary form (`{-1, 0, +1}`) with competitive accuracy at
~1.58 bits/weight.
- **Why we cite it:** it validates the **direction** TRI-NET pursues in the
large -- ternary inference is plausible at scale. It does **not** validate
any claim about t27 or the chip line; we cite it only as motivation for
the ternary numeric path documented in [`FORMAT_REGISTRY.md`](FORMAT_REGISTRY.md).
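
The ~1.58 bits/weight figure follows from counting states. A weight restricted to the three values `{-1, 0, +1}` carries at most

$$
\log_2 3 = \frac{\ln 3}{\ln 2} \approx 1.585 \ \text{bits}.
$$

This is an information-theoretic bound on the weight encoding, not a measurement of any t27 or TRI-NET artefact.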

### 1.7 Tiny Tapeout (open shuttle, not a competitor)

- **Catalogue:** https://tinytapeout.com/chips/
- **What it is:** an educational / open silicon shuttle program that lets
designers submit small digital designs as tiles on a shared die.
- **Relation:** Tiny Tapeout is the **submission channel** for the three
TRI-NET chip repos (`tt-trinity-phi`, `tt-trinity-euler`,
`tt-trinity-gamma`). It is part of our pipeline, not a competitor.

---

## 2. What we do not claim

To keep this document honest, the following claims are **explicitly out of
scope** for t27 and the TRI-NET line as of this writing:

1. No claim of **TOPS parity** or **TOPS/W parity** with any product listed above.
2. No claim of **SDK feature parity** with Hailo, Coral, Qualcomm, MediaTek,
or Axelera. We do not ship a vendor compiler for popular framework
formats (TFLite / ONNX / PyTorch Mobile).
3. No claim of **compliance certifications** (automotive, aerospace, medical).
4. No claim that GoldenFloat formats outperform FP8 / BF16 at any specific
model or task.
5. No claim about silicon performance until a chip repo demonstrates
`SILICON` level (see [`STATUS.md`](STATUS.md) definitions).

---

## 3. What we do claim (narrow, defensible)

1. **Spec-to-RTL reproducibility.** A `.t27` spec compiles to Verilog under
`gen/verilog/` (and to Zig / C software backends), with conformance
vectors under `conformance/`. See [`STATUS.md`](STATUS.md) for the levels.
2. **A single numeric SSOT** -- `conformance/FORMAT-SPEC-001.json` -- used
uniformly across the line. See [`FORMAT_REGISTRY.md`](FORMAT_REGISTRY.md).
3. **Open-shuttle silicon target.** The chip repos submit to Tiny Tapeout,
not a closed fab.
4. **Formal / assurance workflow** -- Coq proofs (`coq/`), seal-based
integrity (`.trinity/seals/`), and the `clara-bridge/` worked example
for DARPA CLARA-style compositional assurance
(see [`CLARA_TRACEABILITY.md`](CLARA_TRACEABILITY.md) for the public-goal
mapping).

These four claims, together, define the "open high-assurance ternary AI
silicon substrate" positioning.

---

**phi^2 + 1/phi^2 = 3 | TRINITY**