# streaming-llm

Here are 2 public repositories matching this topic...

FonaTech / Project_Chronos

⚡ Zero-Stall MoE Inference via Lookahead Prediction & Async DMA Prefetching. Optimized for SSD I/O with Hybrid MLA+Sliding Window Attention.

open-source artificial-intelligence lora high-throughput open-models mixture-of-experts llm generative-ai large-language-model streaming-llm predictive-inference sliding-window-attention io-latency-hiding async-dma ssd-offloading lookahead-routing mla-attention dual-layer-moe

Updated Apr 23, 2026
Python

SandyCompetent / exeter_academic_agent

Agentic AI assistant powered by Google Gemini 2.5, with streaming LLM output, multi-tool data routing, and cross-platform Flutter deployment.

android-application webapp gemini-api flutter-app prompt-engineering generative-ai streaming-llm llm-agents google-gemini agentic-ai llm-integration model-routing

Updated Mar 19, 2026
Dart
