A fast neural network framework for the cosmological computation of Big-Bang Nucleosynthesis (BBN) primordial light-element abundances.
This repository contains the reference implementation accompanying the paper
"Accurate neural network emulator for primordial light element abundances"
(F. Zhang, H. Diao, B. Li, J. Meyers, P. R. Shapiro). It provides the training
and evaluation scripts for two emulator instances — one trained on data from
PArthENoPE v3.0 and one trained on data from
AlterBBN v2.2 — together with the
pre-trained weights used in the paper.
High-accuracy numerical BBN solvers are a well-known computational bottleneck
in cosmological inference. A single full-network call of PArthENoPE (26
nuclides, 100 reactions) can take on the order of
BBNet is a deep-learning emulator that learns the mapping from the four
physical input parameters to the two primordial abundance outputs, using
training data generated by full numerical BBN solvers. The architecture
(ResMLPWithAttn) is a multi-layer perceptron with residual connections and a
multi-head self-attention block. Once trained, it produces predictions in a
few milliseconds per evaluation.
The emulator is designed as a drop-in replacement for the BBN-prediction step
inside parameter-inference pipelines. It is not a BBN solver; it reproduces
the output of an existing solver (here PArthENoPE or AlterBBN) on the
parameter ranges spanned by the training set.
Scope note. This repository ships the BBN emulator and its evaluation scripts only. The modified versions of
PArthENoPE and AlterBBN that generate the training data, as well as any external MCMC driver, are external to this codebase.
BBNet/
├── train_bbn_parthenope.py # Training entry point for the PArthENoPE-trained emulator
├── train_bbn_alterbbn.py # Training entry point for the AlterBBN-trained emulator
├── mape_for_parthenope.py # Evaluation / error metrics on the PArthENoPE test set
├── mape_for_alterbbn_exp.py # Evaluation on the AlterBBN test set (incl. expert mode)
├── weights/ # Pre-trained model weights and normalisation scalers
└── LICENSE # MIT License
The repository currently contains:
- two training scripts, one per BBN backend (PArthENoPE, AlterBBN);
- two evaluation scripts that compute MPE / MAPE / RMSPE on the test set, with the AlterBBN evaluator additionally supporting the hierarchical expert inference mode described in the paper;
- the pre-trained checkpoints under weights/ referenced in the paper.
A more complete description of the paper-level framework — including the modified solvers, the training-data generation pipeline, and the integration into Bayesian inference — is given in the Paper-level framework section below.
For full details please consult the paper. The most important specifications relevant to using this code are:
Inputs and parameter ranges (uniform Latin-hypercube sampling, with a
log-uniform prior on
| Parameter | Range |
|---|---|
|
|
|
|
|
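As an illustration of the sampling scheme named above, a Latin-hypercube draw with one log-uniform dimension can be sketched in plain Python. The parameter names and ranges (`param_a`, `param_b`) are placeholders, not the values used in the paper, and the helper functions are hypothetical rather than part of this repository.

```python
import math
import random


def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims, one point per stratum in every dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of the n_samples equal-width strata.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)  # decouple the strata across dimensions
        columns.append(column)
    # Transpose so that each row is one sample.
    return [list(point) for point in zip(*columns)]


def scale_uniform(u, low, high):
    """Map u in [0, 1) linearly onto [low, high)."""
    return low + u * (high - low)


def scale_log_uniform(u, low, high):
    """Map u in [0, 1) onto [low, high) so that log10 of the result is uniform."""
    return 10.0 ** scale_uniform(u, math.log10(low), math.log10(high))


# Placeholder ranges (assumption): NOT the ranges used in the paper.
HYPOTHETICAL_RANGES = [
    ("param_a", 0.0, 1.0, scale_uniform),
    ("param_b", 1e-4, 1e-1, scale_log_uniform),
]


def sample_parameters(n_samples, seed=0):
    """Draw n_samples parameter dictionaries on a Latin hypercube."""
    rng = random.Random(seed)
    unit = latin_hypercube(n_samples, len(HYPOTHETICAL_RANGES), rng)
    return [
        {
            name: scale(u, low, high)
            for u, (name, low, high, scale) in zip(point, HYPOTHETICAL_RANGES)
        }
        for point in unit
    ]
```

Each dimension is stratified independently, so every one-dimensional projection of the sample covers its range evenly. A library routine such as `scipy.stats.qmc.LatinHypercube` may be preferable for production use.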
Outputs:
Training data: PArthENoPE is run in its
complete nuclear network configuration (26 nuclides, 100 reactions);
AlterBBN is run in RK2_halfstep mode (failsafe=7).
Architecture (ResMLPWithAttn): Linear projection to a 4096-dimensional
hidden representation followed by a GeLU activation, an 8-head self-attention
block with a residual connection,
Optimisation: AdamW with initial learning rate ReduceLROnPlateau scheduler (factor
Loss:
with
Expert mode (AlterBBN only). Because AlterBBN spans
several orders of magnitude, the AlterBBN emulator can optionally run in a
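two-stage expert mode: a base model produces a preliminary prediction, which
is used to route the input to one of two band-specific expert models. The
experts share the ResMLPWithAttn backbone but carry their own scaler
statistics. Expert mode is activated via the --exp flag in
mape_for_alterbbn_exp.py.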
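The two-stage idea can be pictured with a short, purely illustrative sketch. It assumes that routing compares one component of the base model's preliminary output against a single threshold. The routing quantity, the threshold, and all names below are hypothetical and are not taken from `mape_for_alterbbn_exp.py`.

```python
from typing import Callable, Sequence

# A "model" here is any callable mapping an input vector to an output vector.
Model = Callable[[Sequence[float]], Sequence[float]]


def predict_with_experts(
    x: Sequence[float],
    base: Model,
    expert_low: Model,
    expert_high: Model,
    threshold: float,
    route_index: int = 0,
) -> Sequence[float]:
    """Two-stage prediction: the base model decides which expert refines the answer.

    threshold and route_index are illustrative assumptions, not values from the paper.
    """
    preliminary = base(x)  # stage 1: coarse prediction over the full range
    if preliminary[route_index] < threshold:
        return expert_low(x)  # stage 2a: expert trained on the lower band
    return expert_high(x)  # stage 2b: expert trained on the upper band
```

In a real setup each of the three models would be wrapped together with its own scaler statistics, so that normalisation always matches the model that consumes the input.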
The code targets PyTorch on CPU or CUDA-capable GPU. A minimal environment is:
# Create and activate an environment (example)
conda create -n bbnet python=<TODO: python-version>
conda activate bbnet
# Install PyTorch matching your CUDA setup, see https://pytorch.org/get-started/locally/
pip install torch
# Additional dependencies used by the scripts
pip install numpy scipy scikit-learn pandas tqdm
The exact pinned versions used in the paper are not yet provided. A
requirements.txt / environment.yml will be added in a future release; in the meantime, please refer to the import statements at the top of each script.
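Until pinned versions are available, a quick check that the packages from the commands above are importable can save a failed run. The sketch below uses only the standard library. The module list is taken from the `pip install` lines in this README and may not match the scripts' actual imports exactly. Note that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# Import names for the packages installed above (assumption: taken from this
# README, not from the scripts). scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ["torch", "numpy", "scipy", "sklearn", "pandas", "tqdm"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```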
The repository itself does not require a build step: clone and run.
git clone https://github.com/Hdiao112/BBNet.git
cd BBNet
Two training scripts are provided, one per backend. Each script expects a training dataset generated by the corresponding (modified) BBN solver, containing the four physical inputs and the two abundance outputs.
Training data is not bundled with this repository. It can either be
regenerated using the modified PArthENoPE / AlterBBN codes referenced in
the paper, or — when made available — downloaded from <TODO: data-release-URL>.
# PArthENoPE-trained emulator
python train_bbn_parthenope.py \
--data <path-to-parthenope-training-data> \
--out <path-to-output-checkpoint-dir>
# AlterBBN-trained emulator (base model)
python train_bbn_alterbbn.py \
--data <path-to-alterbbn-training-data> \
--out <path-to-output-checkpoint-dir>
The exact CLI flags exposed by each training script are not documented here, to avoid misstatement. Please run
python train_bbn_parthenope.py --help and python train_bbn_alterbbn.py --help for the authoritative argument list.
Each training run produces a model checkpoint together with the input/output normalisation scalers used during training. Both files are required at inference time.
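To see why the scalers must travel with the checkpoint, consider a minimal sketch of a normalised prediction. It assumes plain per-column standardisation. The scalers actually stored by the training scripts may use a different transform, and `SimpleScaler` and `predict` are hypothetical names.

```python
import math


class SimpleScaler:
    """Per-column standardisation to zero mean and unit variance (illustrative)."""

    def __init__(self, rows):
        n = len(rows)
        columns = list(zip(*rows))
        self.mean = [sum(col) / n for col in columns]
        # Fall back to 1.0 for constant columns to avoid division by zero.
        self.std = [
            math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(columns, self.mean)
        ]

    def transform(self, row):
        return [(v - m) / s for v, m, s in zip(row, self.mean, self.std)]

    def inverse_transform(self, row):
        return [v * s + m for v, m, s in zip(row, self.mean, self.std)]


def predict(model, x_scaler, y_scaler, x):
    """Normalise the input, evaluate the model, then undo the output normalisation."""
    return y_scaler.inverse_transform(model(x_scaler.transform(x)))
```

A model trained on normalised data only ever sees inputs and targets in scaled units, so evaluating it with different statistics, or with none, silently shifts every prediction.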
For the AlterBBN expert mode, the two band-specific expert checkpoints are
trained on the same data filtered to the corresponding band.
Two evaluation scripts compute the percentage-error metrics reported in the paper — RMSPE, MAPE, and MPE — on the held-out test set:
# Evaluate the PArthENoPE-trained emulator
python mape_for_parthenope.py \
--weights <checkpoint-path> \
--data <path-to-parthenope-test-data>
# Evaluate the AlterBBN-trained emulator (base model only)
python mape_for_alterbbn_exp.py \
--weights <base-checkpoint-path> \
--data <path-to-alterbbn-test-data>
# Evaluate the AlterBBN-trained emulator with expert routing
python mape_for_alterbbn_exp.py \
--weights <base-checkpoint-path> \
--expert1 <expert1-checkpoint-path> \
--expert2 <expert2-checkpoint-path> \
--data <path-to-alterbbn-test-data> \
--exp
The exact flag names above are placeholders matching the script semantics described in the paper, not necessarily verbatim. Please consult
--help on each script for the authoritative interface.
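For orientation, the three metrics can be written out using their standard textbook definitions. This is a sketch, not the code in the evaluation scripts. The sign convention chosen here (prediction minus truth) is an assumption, and the scripts may differ.

```python
import math


def percentage_errors(y_true, y_pred):
    """Signed percentage error of each prediction relative to the true value."""
    return [100.0 * (p - t) / t for t, p in zip(y_true, y_pred)]


def mpe(y_true, y_pred):
    """Mean percentage error: signed, so it measures systematic bias."""
    errors = percentage_errors(y_true, y_pred)
    return sum(errors) / len(errors)


def mape(y_true, y_pred):
    """Mean absolute percentage error: typical size of the error."""
    errors = percentage_errors(y_true, y_pred)
    return sum(abs(e) for e in errors) / len(errors)


def rmspe(y_true, y_pred):
    """Root-mean-square percentage error: penalises large outliers more strongly."""
    errors = percentage_errors(y_true, y_pred)
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

For any data set these satisfy |MPE| ≤ MAPE ≤ RMSPE, which is a useful sanity check when reading tables of results.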
For reference, the percentage-error metrics reported in the paper on the held-out test sets are:
| Backend | Output | RMSPE (%) | MAPE (%) | MPE (%) |
|---|---|---|---|---|
| PArthENoPE | | 0.0158 | 0.0064 | |
| PArthENoPE | | 0.0503 | 0.0331 | |
| AlterBBN | | 0.0175 | 0.0055 | |
| AlterBBN | | 0.0799 | 0.0455 | |
The weights/ directory contains the checkpoints used to produce the figures
and tables in the paper. To use them directly for inference, point the
evaluation scripts at the relevant files in weights/:
python mape_for_parthenope.py \
--weights weights/<parthenope-checkpoint> \
--data <path-to-test-data>
Each checkpoint is paired with its scaler statistics, and the two must be
loaded together to reproduce the reported metrics. The exact filenames inside
weights/ are listed in that directory.
The pre-trained weights apply only within the parameter ranges given in Method summary. Predictions outside those ranges are extrapolations and have not been validated.
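One way to guard against accidental extrapolation is to check every input against the training ranges before calling the emulator. The sketch below is illustrative only. The bounds are placeholders rather than the ranges from the paper, and `checked_predict` is a hypothetical wrapper, not a function provided by this repository.

```python
# Placeholder bounds (assumption): NOT the training ranges from the paper.
HYPOTHETICAL_BOUNDS = {"param_a": (0.0, 1.0), "param_b": (1e-4, 1e-1)}


def out_of_range(params, bounds=HYPOTHETICAL_BOUNDS):
    """Return the sorted names of parameters that fall outside their training range."""
    return sorted(
        name
        for name, (low, high) in bounds.items()
        if not low <= params[name] <= high
    )


def checked_predict(emulator, params, bounds=HYPOTHETICAL_BOUNDS):
    """Call the emulator only if every parameter lies inside the training range."""
    bad = out_of_range(params, bounds)
    if bad:
        raise ValueError("outside training range: " + ", ".join(bad))
    return emulator(params)
```

Inside a sampler, an alternative to raising an exception is to return a log-probability of negative infinity for such points, which amounts to a hard prior boundary at the edges of the training set.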
The paper presents BBNet as a framework — a standardised pipeline of
data generation, training, and inference that can be repeated for new BBN
codes or new physical models. The full pipeline involves the following
components, only some of which are part of this repository:
- Modified BBN solvers. PArthENoPE v3.0 and AlterBBN v2.2, modified to accept $\Delta N_{\mathrm{eff}}$ and $\kappa_{10}$ as inputs and to use a consistent set of nuclear reaction rates for the dominant PNG / DPG / DDN / DDP channels. These solvers are released separately by the authors and are not contained in this repository.
- Training-data generation. $20{,}000$ Latin-hypercube samples per solver. The data files themselves are not bundled in this repository.
- Emulator training and evaluation. Provided here, in train_bbn_*.py and mape_for_*.py.
- Pre-trained weights. Provided here, under weights/.
- Bayesian inference / MCMC integration. Outside the scope of this repository. The paper notes that, because each BBNet evaluation costs a few milliseconds, the emulator can be plugged into existing MCMC pipelines without modifying the sampler itself.
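The last component can be pictured as follows: the sampler only needs a log-likelihood function, and the emulator sits inside it. The sketch assumes independent Gaussian errors on the observed abundances and drops the constant normalisation term. The `emulator` argument stands for any callable that returns predicted abundances, and all names are hypothetical.

```python
def gaussian_log_likelihood(predicted, observed, sigma):
    """Log-likelihood for independent Gaussian errors, up to an additive constant."""
    return -0.5 * sum(
        ((p - o) / s) ** 2 for p, o, s in zip(predicted, observed, sigma)
    )


def make_log_likelihood(emulator, observed, sigma):
    """Wrap an emulator into the theta -> log L function that a sampler expects."""

    def log_likelihood(theta):
        # One emulator call per likelihood evaluation replaces one solver run.
        return gaussian_log_likelihood(emulator(theta), observed, sigma)

    return log_likelihood
```

Because the wrapper has the same signature regardless of what produces the prediction, replacing a numerical solver with the emulator changes only the body of `log_likelihood`, not the sampler.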
The paper explicitly outlines several future directions. None of them are implemented in this repository at present:
- emulators for the primordial $^{3}\mathrm{He}/\mathrm{H}$ and $^{7}\mathrm{Li}/\mathrm{H}$ abundances, intended to address the lithium problem;
- treatment of nuclear reaction rates as additional free parameters;
- training instances for additional BSM scenarios beyond the dark-radiation + stiff-fluid extension considered here;
- ready-made bindings to standard inference packages.
Contributions in these directions are welcome — see Contributing.
If you use BBNet in academic work, please cite the accompanying paper:
@article{BBNet2025,
title = {Accurate neural network emulator for primordial light element abundances},
author = {Zhang, Fan and Diao, Hang and Li, Bohua and Meyers, Joel and Shapiro, Paul R.},
journal = {<TODO: journal>},
year = {<TODO: year>},
eprint = {<TODO: arXiv-id>},
archivePrefix = {arXiv},
doi = {<TODO: doi>}
}
If you also use the pre-trained weights or the training scripts directly, please cite this repository in addition to the paper.
This project is released under the MIT License.
Bug reports, fixes, and contributions extending the framework to additional
BBN codes or to additional primordial abundances (such as $^{3}\mathrm{He}/\mathrm{H}$ and $^{7}\mathrm{Li}/\mathrm{H}$) are welcome.
For questions about the code or the paper, please open a GitHub issue or
contact the corresponding author of the paper (B. Li,
bohuali@gxu.edu.cn).