Yu Cheng (Gloria72)

Hi, I am Yu Cheng

MS in Computer Engineering at NYU, BS in Data Science at Duke. I build practical AI systems with a focus on LLM infrastructure, inference performance, observability, and customer-facing AI workflows.

I like work that turns messy model-serving behavior into something measurable: benchmarks, traces, evals, small correctness tests, better docs, and tools that make AI systems easier to operate.

Current Focus

LLM serving infrastructure: SGLang, vLLM-style serving, llm-d, routing, KV-cache behavior, and OpenAI-compatible APIs.
Performance tooling: inference benchmarks, kernel correctness checks, latency reports, and lightweight observability.
Forward-deployed AI systems: support/onboarding agents, RAG workflows, citations, escalation paths, and eval loops.
Applied AI products: local-first assistants and domain workflows where reliability matters more than demo polish.

Selected Work

Project	What it shows	Stack
langgraph-support-agent	LangGraph support/onboarding agent for AI infrastructure repos, with local retrieval, evals, escalation, and a forward-deployed case study.	Python, LangGraph
sglang-observability-router	OpenAI-compatible routing proxy and benchmark harness for SGLang-style serving experiments.	Python, LLM serving
flashinfer-kernel-bench	Correctness-first microbenchmarks for attention and sampling code paths.	Python, NumPy, Torch
rust-candle-gateway	Rust inference gateway with bounded request handling, health checks, metrics, and a Candle-ready engine boundary.	Rust
fitsnap-coach	Local-first AI fitness coach with form checks, recovery scoring, trend charts, and an agent task workspace.	JavaScript

Open Source

SGLang PR #29205: parser fix and unit test for flat max_dynamic_patch image metadata in Jinja template content.
llm-d PR #1942: contributor guide for coding agents working inside an AI infrastructure repository.

How I Work

Start with a small reproducible path before optimizing.
Prefer tests, eval cases, and traceable outputs over broad claims.
Keep setup steps explicit because developer experience is part of the product.
Write docs and runbooks when they make a system easier for the next person to operate.

Background

Computer engineering, data science, medical imaging, visualization, and cloud data pipelines.
Comfortable with Python, JavaScript/TypeScript, Rust, C++/CUDA, SQL, Docker, PyTorch, FastAPI, Three.js, and ML/data tooling.
Bilingual: English and Chinese.
