MOSES-THRML

Compile MOSES evolutionary program search to Boltzmann sampling on thermodynamic hardware — bridging metta-moses and Extropic/thrml.

Overview

MOSES-THRML compiles MOSES's evolutionary program learning into Boltzmann sampling that runs on Extropic's Thermodynamic Sampling Unit (TSU). You give it a fitness function and knob space; it runs parallel Metropolis-Hastings chains (equivalent to TSU Gibbs sampling) and returns the best program. 77 tests verify end-to-end correctness against metta-moses. Together with PLN-THRML, ECAN-THRML, QuantiMORK-THRML, and Geodesic-THRML, it completes the Hyperon AGI stack on thermodynamic hardware.

New to MOSES? (30-second primer)

MOSES (Meta-Optimizing Semantic Evolutionary Search) evolves short programs to fit a target dataset. Instead of training neural network weights, it searches over symbolic boolean expressions — flipping "knobs" to change program structure.

Component	Meaning	Analogy
Knob vector	Binary encoding of a program's structure	A genome in genetic algorithms
Fitness	How well the program matches target data	Accuracy on a test set
Exemplar	The best program found so far in a subpopulation	The fittest individual
Deme	A subpopulation exploring variations of one exemplar	An island in island-model GA
Simulated annealing	Accept random knob flips; sometimes accept worse to escape local optima	Shaking a puzzle box to unstick pieces

The search starts hot (accepting almost any change) and cools down (accepting only improvements), converging on the best program. This temperature schedule is exactly what TSU hardware does natively.

New to TSU / Boltzmann sampling? (30-second primer)

A Thermodynamic Sampling Unit (TSU) is Extropic's chip that uses physical thermal noise to sample from Boltzmann distributions. Instead of simulating randomness in software, the hardware is the random sampler.

TSU concept	Meaning	Analogy
pbit	A probabilistic bit — fluctuates between two states based on its energy (hardware uses Ising ±1; software often maps to 0/1)	A biased coin that flips itself
Energy E(x)	How "costly" a configuration x is	Height on a fitness landscape
Boltzmann distribution	P(x) ∝ exp(−E/T) — lower energy = higher probability	Water settling into valleys
Gibbs sampling	Update each pbit conditioned on its neighbors until equilibrium	Repeatedly re-rolling each die given the others
Temperature T	Controls exploration vs exploitation	High T = random walk; low T = greedy descent

The key insight: MOSES's Metropolis-Hastings acceptance criterion min(1, exp(ΔFitness/T)) and TSU's Gibbs update rule both converge to the same Boltzmann distribution P(x) ∝ exp(−E/T). The algorithms differ in mechanism (MH proposes-then-accepts; Gibbs updates each pbit conditionally), but they target the same stationary distribution — so TSU hardware can replace the software SA loop without changing the distribution being sampled.

See arXiv:2510.23972 for the TSU architecture.

Technical summary: Two formal bridges prove MOSES runs natively on TSU. (1) SA ↔ Gibbs: both MOSES's MH acceptance min(1, exp(Δfitness/T)) and TSU's Gibbs sampling target the same Boltzmann distribution with energy E = −fitness: P(x) ∝ exp(fitness/T). The mechanisms differ (MH propose-accept vs Gibbs conditional update) but the stationary distribution is identical. (2) Deme parallelism ↔ multi-chain: MOSES metapopulation's multiple demes = multiple parallel Markov chains on the same TSU chip (n_chains).

An optional third path compiles boolean programs (SDNF) to RBM weights via LBM Theorem 1 (external, non-Hyperon) — see Appendix: LBM compilation path.

Why this matters

Parallel program search

MOSES's core task is searching a combinatorial program space — trying knob flips and accepting or rejecting them. Each architecture parallelizes this differently:

	CPU (metta-moses)	GPU (estimated)	TSU (this project)
Parallelism	Sequential knob flips	Batched fitness evaluation	All chains sample simultaneously
Bottleneck	Program space size × evaluations	Memory bandwidth	Mixing time (energy barrier height)
Best fit	Small knob spaces, exact fitness	Large population batches	Knob space fits on-chip¹

On a TSU, the knob space compiles into pbit states updated via block Gibbs passes — physical thermal noise drives configurations toward the same Boltzmann equilibrium that MOSES's MH chains target, so the hardware natively samples from the correct distribution.

¹ Knob spaces exceeding a single TSU chip require multi-chip partitioning with communication overhead. Mixing time depends on energy landscape structure and can grow sharply for landscapes with tall barriers.

Energy efficiency

The TSU architecture paper (arXiv:2510.23972) reports ~10,000× lower energy per sample vs GPU baselines on DTM sampling (E_cell ≈ 2 femtojoules). MOSES program search is expected to benefit similarly but has not been independently benchmarked.

Installation

git clone --recurse-submodules https://github.com/xiaohanma-oss/MOSES-THRML.git
cd MOSES-THRML
pip install -e .                 # core only (numpy)
pip install -e ".[metta]"       # + MeTTa bridge (requires hyperon)
pip install -e ".[dev]"         # + pytest for running tests

The metta-moses submodule provides test baselines. If you cloned without --recurse-submodules, run git submodule update --init.

Quick start

Python API

from moses_thrml import (
    KnobSpace, make_xor_dataset, make_energy_fn,
    GibbsSchedule, run_thermodynamic_search, diagnose_convergence,
)
import numpy as np

# Define the problem
dataset = make_xor_dataset(3)

def xor_program(inputs, bits):
    active = np.where(bits == 1)[0]
    if len(active) == 0:
        return np.zeros(len(inputs), dtype=np.int8)
    return (np.sum(inputs[:, active], axis=1) % 2).astype(np.int8)

energy_fn = make_energy_fn(dataset, xor_program, lambda_=0.0)

# Run thermodynamic search (= MOSES SA on TSU)
schedule = GibbsSchedule(n_warmup=300, n_samples=1000, T_start=1.0, T_end=0.05)
result = run_thermodynamic_search(n_knobs=3, energy_fn=energy_fn,
                                   schedule=schedule, n_chains=30)

print(f"Best program: {result.best_bits}")   # [1 1 1]
print(f"Accuracy: {result.best_fitness:.2%}")  # 100.00%

# Convergence diagnostics
diag = diagnose_convergence(result)
print(f"R-hat: {diag['r_hat']:.3f}, ESS: {diag['ess']:.0f}")

MeTTa bridge (requires `hyperon`)

from hyperon import MeTTa
from moses_thrml.metta import register_all

metta = MeTTa()
register_all(metta)

# Search for the best XOR-3 program
results = metta.run('!(thrml-moses-search 3 30 300 1000 xor-3)')
# => (bits 1 1 1)

# Score a candidate program
results = metta.run('!(thrml-moses-score 1 1 1 xor-3)')
# => (fitness 1.000000)

Run tests

pytest tests/ -v                 # all 77 tests
pytest -m slow -v                # end-to-end + scalability tests

How it works

Represent — KnobSpace maps a program tree to a binary pbit vector (encode/decode/neighbors)
Score — Fitness function compiled to energy: E = −fitness (BehavioralScore + ComplexityPenalty)
Search — Parallel MH chains with annealing schedule = TSU Boltzmann sampling at β = 1/T
Select — Exemplar selection from best chains; deme management on CPU

CPU side                                TSU side
────────────────────────                ────────────────────────
Representation building                 run_thermodynamic_search()
  KnobSpace(exemplar)           →         parallel MH chains
                                          (= Boltzmann sampling)
Fitness compilation
  fitness → E = -fitness        →         energy landscape

Exemplar selection (deme.py)    ←         best_bits from chains

Mapping table

MOSES concept	MOSES-THRML construct	TSU hardware
Program (boolean expression)	Binary knob vector	pbit state
Fitness function	Energy E = −fitness	Energy landscape
SA accept criterion	Metropolis-Hastings `min(1, exp(−ΔE/T))`	Gibbs sampling update
Temperature schedule	`GibbsSchedule(T_start, T_end)`	Inverse temperature β = 1/T
Knob flip (mutation)	Bit flip proposal	pbit thermal fluctuation
Deme (subpopulation)	Parallel Markov chain	Independent chain on chip
SDNF formula (optional LBM path)	RBM weights (LBM Thm.1)	Bipartite EBM on DTCA
Metapopulation	Multi-chain orchestration	`n_chains` parameter
Exemplar selection	CPU-side argmin(E)	Readout → CPU

API reference

Core engine (`moses_thrml`)

Representation:

Function/Class	Module	Description
`KnobSpace`	`knobs.py`	Program tree ↔ binary pbit vector (encode/decode/neighbors)

Fitness:

Function/Class	Module	Description
`Dataset`	`fitness.py`	Named tuple: inputs + labels
`BehavioralScore`	`fitness.py`	Accuracy-based fitness component
`ComplexityPenalty`	`fitness.py`	Program size regularization
`CompositeEnergy`	`fitness.py`	E = −fitness (combines behavioral + complexity)
`make_energy_fn`	`fitness.py`	Build energy function from dataset + program evaluator
`make_xor_dataset`, `make_mux_dataset`	`fitness.py`	Canonical test problem datasets

EBM compilation (optional — external tool, not part of Hyperon ecosystem):

Function/Class	Module	Description
`SDNFClause`, `SDNFFormula`	`ebm.py`	SDNF representation for boolean programs
`RBMWeights`	`ebm.py`	RBM weight container
`sdnf_to_rbm_weights`	`ebm.py`	LBM Theorem 1: SDNF → RBM
`rbm_free_energy`, `rbm_to_energy_fn`	`ebm.py`	RBM weights → energy function (marginalise hidden units)
`xor_as_sdnf`	`ebm.py`	XOR-n as canonical SDNF formula

Search:

Function/Class	Module	Description
`GibbsSchedule`	`search.py`	Annealing schedule (T_start, T_end, warmup, samples)
`SamplingResult`	`search.py`	Container for chain histories + best solution
`run_thermodynamic_search`	`search.py`	Parallel MH chains = TSU Boltzmann sampling
`extract_best`	`search.py`	Extract best bits from sampling result
`diagnose_convergence`	`search.py`	R-hat, ESS convergence diagnostics

Deme management:

Function/Class	Module	Description
`Deme`	`deme.py`	Single deme: exemplar + knob space + search result
`Metapopulation`	`deme.py`	Multi-deme orchestration (exemplar selection on CPU)
`run_deme`	`deme.py`	Run thermodynamic search for one deme

MeTTa bridge (`moses_thrml.metta`)

Function	Purpose
`register_all(metta)`	Register `thrml-moses-search`, `thrml-moses-score`, `thrml-moses-represent` with a MeTTa runner

The thrml-moses-search operator runs thermodynamic search on a named problem; thrml-moses-score evaluates a candidate program — see Quick start for examples.

Results

77 tests, all passing.

Canonical test problems

Problem	Description	Optimal accuracy
`xor-3`	XOR parity of 3 inputs	100% (bits=[1,1,1])
`xor-4`	XOR parity of 4 inputs	100% (bits=[1,1,1,1])
`mux-2`	2-address multiplexer	100%

Test breakdown

tests/test_knobs.py    — 21 tests: encode/decode roundtrip, neighbourhood
tests/test_fitness.py  — 13 tests: E = -fitness monotonicity, problem datasets
tests/test_ebm.py      — 18 tests: LBM Theorem 1 soundness (satisfying < non-satisfying)
tests/test_search.py   — 16 tests: convergence, diagnostics, reproducibility
tests/test_deme.py     —  9 tests: deme management, metapopulation

Hyperon integration outlook

See PLN-THRML README for the full heterogeneous pipeline design (Control → Compile → Sample).

This project contributes the MOSES tier: evolutionary program search compiled to TSU Boltzmann sampling, handling program optimization alongside PLN's factor-graph inference and ECAN's LBM attention diffusion. All three workloads are TSU-native — co-location on one chip via time-multiplexing or spatial partitioning is an open question (depends on knob space size, chain count, and mixing time).

Cross-project comparison

	PLN-THRML	ECAN-THRML	MOSES-THRML
Compiles	PLN inference rules	ECAN attention diffusion	MOSES program search
TSU primitive	Factor graph Gibbs	Lattice-Boltzmann fluid collision+streaming	EBM Gibbs (SA equiv.)
Variables	TruthValue (s, c)	STI density field	knob vector (binary)
CPU handles	Rule structure (Q_logic)	Goal/boundary conditions	Representation building
TSU handles	Posterior sampling	Diffusion/advection	Deme-local knob search
Formal basis	Boltzmann factor graph	Navier-Stokes ↔ Lattice-Boltzmann	MH ↔ Gibbs

Project structure

moses_thrml/               Main package
  __init__.py              Public API re-exports
  knobs.py                 KnobSpace: program tree ↔ binary pbit vector
  fitness.py               BehavioralScore + ComplexityPenalty → CompositeEnergy
  ebm.py                   (optional) LBM Theorem 1: SDNF boolean programs → RBM weights
  search.py                run_thermodynamic_search: parallel MH chains = TSU Gibbs
  deme.py                  run_deme + Metapopulation: population management
  metta/                   MeTTa integration layer (optional, requires hyperon)
    __init__.py            Entry point — exports register_all()
    rules.py               Grounded ops: thrml-moses-search, thrml-moses-score
    dispatch.metta         MeTTa dispatch rules
vendor/
  metta-moses/             iCog Labs upstream (git submodule) — test baselines
tests/
  conftest.py              Shared fixtures and tolerance constants
  test_knobs.py            21 tests: encode/decode roundtrip, neighbourhood
  test_fitness.py          13 tests: E = -fitness monotonicity, problem datasets
  test_ebm.py              18 tests: LBM Theorem 1 soundness (optional path)
  test_search.py           16 tests: convergence, diagnostics, reproducibility
  test_deme.py             9 tests: deme management, metapopulation
docs/
  interactive_overview.html   Interactive visualization (open in browser)
  simple_example.html         Simple example visualization

Contributing

See CONTRIBUTING.md for development setup, testing, code conventions, and pull request guidelines.

Sister Projects

Five projects compiling Hyperon's cognitive architecture to thermodynamic hardware:

Project	What it compiles
PLN-THRML	Probabilistic inference → Boltzmann energy tables
ECAN-THRML	Attention diffusion → Lattice Boltzmann simulation
MOSES-THRML	Program evolution → Boltzmann sampling
QuantiMORK-THRML	Predictive coding → wavelet-sparse factor graphs
Geodesic-THRML	Unified geodesic scheduler for all above

Acknowledgements

metta-moses — iCog Labs
thrml — Extropic AI factor graph library

Appendix: LBM compilation path (optional)

An alternative compilation path uses Logical Boltzmann Machine (LBM) Theorem 1 (Tran, Mota, Garcez 2025) to compile boolean programs (SDNF) directly into RBM weights. This is an external tool (not part of the Hyperon ecosystem) and is not required by the main search pipeline.

from moses_thrml import xor_as_sdnf, sdnf_to_rbm_weights, rbm_to_energy_fn

# MOSES program (as SDNF) → RBM weights → energy function → thermodynamic search
formula = xor_as_sdnf(3)               # 4 clauses, 3 variables
weights = sdnf_to_rbm_weights(formula)  # LBM Theorem 1
energy_fn = rbm_to_energy_fn(weights)   # free energy (hidden units marginalised)

result = run_thermodynamic_search(n_knobs=3, energy_fn=energy_fn, n_chains=20)
assert formula.is_satisfied_by(result.best_bits)  # finds satisfying assignment

Implementation: ebm.py (280 lines, 18 tests in test_ebm.py).

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
.github/workflows		.github/workflows
moses_thrml		moses_thrml
tests		tests
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MOSES-THRML

Table of Contents

Overview

Why this matters

Parallel program search

Energy efficiency

Installation

Quick start

Python API

MeTTa bridge (requires `hyperon`)

Run tests

How it works

Mapping table

API reference

Core engine (`moses_thrml`)

MeTTa bridge (`moses_thrml.metta`)

Results

Canonical test problems

Test breakdown

Hyperon integration outlook

Cross-project comparison

Project structure

Contributing

Sister Projects

Acknowledgements

Appendix: LBM compilation path (optional)

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

MOSES-THRML

Table of Contents

Overview

Why this matters

Parallel program search

Energy efficiency

Installation

Quick start

Python API

MeTTa bridge (requires hyperon)

Run tests

How it works

Mapping table

API reference

Core engine (moses_thrml)

MeTTa bridge (moses_thrml.metta)

Results

Canonical test problems

Test breakdown

Hyperon integration outlook

Cross-project comparison

Project structure

Contributing

Sister Projects

Acknowledgements

Appendix: LBM compilation path (optional)

License

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

MeTTa bridge (requires `hyperon`)

Core engine (`moses_thrml`)

MeTTa bridge (`moses_thrml.metta`)

Packages