Aravind0403/README.md

Aravind Sundaresan

Inference Systems Engineer · Distributed Serving · GPU Cluster Scheduling

7+ years engineering resilient distributed systems at Microsoft and Amazon. Currently designing ML-driven proxies and scheduling architectures to optimize LLM inference efficiency.


🔧 Open Source

  • vLLM Contributor · PR #41952 (Under Review): Fixed preemption ordering in PriorityRequestQueue to minimize KV cache recompute overhead.
  • Clairvoyant — A Go-based reverse proxy eliminating head-of-line (HOL) blocking in vLLM/SGLang via ML-driven Shortest-Job-First (SJF) scheduling. arXiv preprint in prep.
  • ACO Scheduler — Ant Colony Optimization GPU cluster scheduler featuring heterogeneous GPU/CPU/ARM64 affinity routing. Achieved P99 latency <10 ms and a 28% utilization gain, validated against Alibaba and Google Borg traces.
  • ServiceScope — An LLM-powered AST dependency mapper processing 190 files/sec with 0% inference failure via localized execution (zero external API calls); validated on Django (2,886 files).
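
To illustrate the head-of-line blocking that Clairvoyant targets, here is a minimal sketch comparing FIFO with shortest-job-first ordering on a single worker. It is not the project's code: the helper `mean_wait` and the predicted service times are hypothetical, and a real proxy would get its predictions from a model rather than a fixed list.

```python
from itertools import accumulate


def mean_wait(durations):
    """Mean time jobs wait before starting when served in the given order."""
    # A job starts once every job ahead of it has finished,
    # so its wait is the running sum of the earlier durations.
    starts = [0, *accumulate(durations)][:-1]
    return sum(starts) / len(durations)


# Hypothetical predicted service times in seconds (assumed values);
# one long request arrives ahead of three short ones.
predicted = [30.0, 1.0, 2.0, 1.0]

fifo_wait = mean_wait(predicted)         # serve in arrival order
sjf_wait = mean_wait(sorted(predicted))  # serve shortest predicted job first

print(f"FIFO mean wait: {fifo_wait:.2f}s")
print(f"SJF mean wait:  {sjf_wait:.2f}s")
```

With these assumed numbers the long request at the head of the FIFO queue delays every short request behind it, while sorting by predicted length lets the short requests finish first.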

🏆 Recognition

  • Google Prompt Wars 2026 (Hack2Skill · Hyderabad) — Rank 16 / 83 (Score: 90.64%)
    • AI Code Analysis: 100% Efficiency · 96.5% Problem Alignment · 92.5% Accessibility

🛠️ Languages

Python · Go · C++ · Java · Bash


💡 Interests

LLM inference internals · GPU cluster scheduling · Distributed systems · AI-powered developer tooling

📫 linkedin.com/in/aravindsundaresan

Pinned

  1. ACO_Adaptive_Compute_Orchestrator Public

    Predictive job scheduler for heterogeneous compute — ACO + LSTM spike prediction + intent-aware routing. <10ms latency, 95%+ SLA adherence, 202 tests

    Python

  2. ServiceScope-v2 Public

    AI-native blast-radius analysis for Python microservices — AST parsing + local LLM inference + dependency graph. No service mesh needed

    Python

  3. clairvoyant-scheduler Public

    Go sidecar proxy that eliminates Head-of-Line Blocking in LLM inference via ML-driven SJF scheduling — zero backend modification. Paper in preparation

    Python