sparse-training

Here are 19 public repositories matching this topic...

Always sparse. Never dense. But never say never. A Sparse Training repository for the Adaptive Sparse Connectivity concept and its algorithmic instantiation, i.e. Sparse Evolutionary Training, to boost Deep Learning scalability on various aspects (e.g. memory and computational time efficiency, representation and generalization power).

  • Updated Jul 21, 2021
  • Python

moe-engine is a research-grade infrastructure layer for training large Mixture-of-Experts language models at hyperscale. It is designed around one core constraint: at 10K+ GPUs, nodes die continuously. The system must keep training alive end-to-end — routing correctly, checkpointing durably, and resuming without operator intervention.

  • Updated Jul 1, 2026
  • Python

Staged Embarrassment Learning (SEL) is a bio-inspired framework for efficient Deep Learning. Inspired by a child’s rapid correction after a mistake, SEL uses dynamic gradient sparsity to focus compute on high-loss "embarrassing" samples. It achieves up to 99% FLOPs reduction, making it ideal for Edge AI.

  • Updated Apr 23, 2026
  • Jupyter Notebook
