FlashRT
Repositories
- FlashRT Public
FlashRT is a high-performance realtime inference engine for small-batch, latency-sensitive AI workloads. The flagship integration is production VLA control for Pi0, Pi0.5, GROOT N1.6, and Pi0-FAST. It also supports LLMs, e.g., qwen3.6-27B.
- FlashRT-HF-kernels Public
FlashRT-HF-kernels contains standalone FlashRT CUDA/CUTLASS kernels prepared for the Hugging Face kernels community, focusing on small-batch, low-latency inference paths for LLM, VLA, and physical AI workloads.