Irfan Ali irfanalidv

Irfan Ali

Generative AI Engineer · LLMs · RAG · Agentic Pipelines

I build AI systems that work in production, not just demos. I have 7+ years of experience across data engineering, NLP, and LLM systems. I've shipped production AI at a Schneider Electric company, built the entire AI intelligence layer at a Hong Kong AI startup, and founded DataCortex IQ to deliver AI systems for clients globally.

What I actually work on

Most AI projects fail at the same places: retrieval breaks under real data, agents loop forever, pipelines that passed evals fail in production, and costs explode at scale. Those are the problems I solve.

My work sits at the intersection of LLM orchestration, agentic pipelines, and production data engineering — building the full system from ingestion to deployment, not just the model layer.

Recent systems I've shipped:

Voice AI platform with real-time STT/TTS, LLM reasoning, structured extraction, and post-call analytics (Reflecta)
Autonomous multi-channel lead intelligence system with agent-driven pipelines, multi-provider enrichment, and RAG-style web extraction (Kuration AI)
LLM-powered dealer assistant with a domain fine-tuned GPT model, multi-chain LangChain pipeline, and load calculation logic (Luminous Power Technologies / Schneider Electric)

PyPI packages

Libraries I maintain for AI infrastructure, retrieval, and data systems — used by developers in production.

Package	What it does
agentensemble	Multi-agent orchestration — ReAct, Swarm, Pipeline, Debate, WorkflowGraph patterns with routing, planning, RAG, and cost tracking
ragfallback	Stop RAG from failing silently — query rewriting, retrieval confidence scoring, fallback strategies, retry logic
ragnav	Navigation-first RAG for long documents: routes queries to the right pages, follows cross-references, and retrieves coherent evidence
scrapeflow-py	Production Playwright scraping — LLM extraction, hybrid selectors, session persistence, rate limiting, anti-detection
agentcare	Voice AI for healthcare — call intake, structured extraction, missing-data recovery, appointment orchestration, post-call analytics
askpandas	Natural language queries on CSV data using local LLMs — no API keys, no data leaves your machine
lingo-nlp-toolkit	Lightweight NLP toolkit bridging traditional pipelines and transformer-ready workflows
pyrochain	Agentic feature engineering — PyTorch + LangChain agents for multimodal feature extraction
toxic-comment-classifier	Deep learning toxicity detection — obscene language, threats, insults, identity hate with per-category scores

→ All packages on PyPI
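
To make the "stop RAG from failing silently" idea concrete, here is a minimal sketch of confidence-gated retrieval with query rewriting and an explicit failure result. It is not the ragfallback API: the toy corpus, the token-overlap confidence score, and the names `overlap_retrieve`, `rewrite_query`, and `retrieve_with_fallback` are all hypothetical, chosen only to illustrate the pattern.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical toy corpus; a real system would query a BM25 index or vector store.
DOCS = [
    "reset your password from the account settings page",
    "invoices are emailed on the first day of each month",
    "contact support to change the billing address",
]

# Hypothetical rewrite rules, for illustration only.
SYNONYMS = {"login credentials": "password", "recover": "reset"}
FILLER = {"how", "do", "i", "my", "the", "a", "is", "what"}


def overlap_retrieve(query: str) -> Tuple[str, float]:
    """Return the best document and a crude confidence in [0, 1].

    Confidence is the fraction of query tokens that appear in the document.
    """
    q_tokens = set(query.lower().split())
    best_doc, best_score = "", 0.0
    for doc in DOCS:
        score = len(q_tokens & set(doc.split())) / len(q_tokens) if q_tokens else 0.0
        if score > best_score:
            best_doc, best_score = doc, score
    return best_doc, best_score


def rewrite_query(query: str) -> str:
    """Map known synonyms to corpus vocabulary and drop filler words."""
    text = query.lower()
    for phrase, replacement in SYNONYMS.items():
        text = text.replace(phrase, replacement)
    return " ".join(t for t in text.split() if t not in FILLER)


@dataclass
class Result:
    answer: Optional[str]  # None means no candidate cleared the threshold
    confidence: float
    query_used: str
    attempts: int


def retrieve_with_fallback(
    query: str,
    rewriters: List[Callable[[str], str]],
    threshold: float = 0.5,
) -> Result:
    """Try the original query, then each rewrite, until confidence clears the threshold."""
    candidates = [query] + [rewrite(query) for rewrite in rewriters]
    best_conf, best_query = 0.0, query
    for attempt, candidate in enumerate(candidates, start=1):
        doc, score = overlap_retrieve(candidate)
        if score >= threshold:
            return Result(doc, score, candidate, attempt)
        if score > best_conf:
            best_conf, best_query = score, candidate
    # Fail loudly: return no answer instead of a low-confidence document.
    return Result(None, best_conf, best_query, len(candidates))


if __name__ == "__main__":
    print(retrieve_with_fallback("how do I recover my login credentials", [rewrite_query]))
    print(retrieve_with_fallback("what is the weather today", [rewrite_query]))
```

In this sketch the first query only succeeds after the rewrite, and the second returns `answer=None` so the caller can decline to answer rather than pass a weak match to the LLM.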

Stack

LLMs          GPT-4o · Claude · Gemini · Mistral · Ollama (local)
Orchestration LangChain · LangGraph · custom agent frameworks
RAG           hybrid BM25 + embeddings · reranking · fallback strategies
Backend       Python · FastAPI · async pipelines · queue-driven systems
Scraping      Playwright · Selenium · Firecrawl · ZenRows
Databases     MongoDB · PostgreSQL · vector stores
Infra         Docker · Azure · Azure ML · Azure DevOps · GCP
Data          Pandas · ETL pipelines · structured extraction · NLP

Where I've worked

Company	Role	What I built
Kuration AI · Hong Kong	Founding AI Engineer	Entire AI intelligence layer — agent pipelines, multi-provider enrichment, RAG-style web extraction, LLM orchestration
Luminous Power Technologies · Schneider Electric	Senior Manager — Data & Analytics, R&D	LLM dealer assistant (fine-tuned GPT), R&D intelligence dashboard, GenAI data platform on Azure
Lynk · India	Data Analytics & Automation	Analytics pipelines, NLP-powered expert matchmaking, decision-ready data workflows
brainsfeed · Hong Kong	Head of Data & Analytics	Built Infosphere from scratch — NLP enrichment platform with 15+ attribute extraction and natural-language search

Research

Multi-Aspect Temporal Topic Evolution with Neural-Symbolic Fusion and Information Extraction for Yelp Review Analysis — Indian Journal of Artificial Intelligence and Neural Networking (IJAINN), Oct 2025. DOI
Advanced Cross-Validation Framework for Mental Health AI: BERT and Neural Networks Achieve High Accuracy on MentalChat16K — IJAINN, Dec 2025. DOI

Currently

Pursuing M.Sc. Data Science & AI at IISER Tirupati (Institute of National Importance, Ministry of Education, Govt. of India) — GPA 8.0/10
Building Reflecta — continuity-first mental wellness platform with voice AI
Running DataCortex IQ — available for AI engineering contracts and consulting
Open to full-time Generative AI / Agentic AI Engineering roles (remote preferred)

Building at the intersection of LLMs, agentic systems, and production data engineering. India · Previously: Hong Kong · France · US
