Sanchit Pathak svpathak

Hi, I'm Sanchit Pathak 👋

🎓 MTech in Data Science, IIT Guwahati (CPI: 9.61) | BTech in Electrical Engineering, VJTI Mumbai
💼 ML Engineer / Data Scientist with ~2.5 years of ML/AI experience — NLP pipelines, LLM-based RAG systems, and cloud-scale analytics on Azure & GCP
🔬 ACM-published research: S-VQA at ICVGIP 2023
🏆 LeetCode Knight — Top ~5% globally
🚀 Passionate about LLMs, RAG, PEFT/LoRA fine-tuning, Agents, and NLP evaluation
🌱 Currently exploring: systematic ML evaluation, long-document QA, and agent orchestration patterns

Core areas: LLMs · RAG · PEFT/LoRA · NLP · Computer Vision · Transformers · ML Evaluation
Cloud: Azure (Databricks, AI Search, AI Foundry) · GCP (BigQuery, GCS)
Languages: Python · SQL · C/C++

Project	Description	Links
Diagnosing RAG Failure Modes	Systematic evaluation of RAG on long-document QA (QASPER dataset), with four stress-test experiments and a custom Evidence Coverage Score (ECS) metric.	Live Demo · GitHub
S-VQA	Sentence-based Visual Question Answering — TDIUC-SVQA dataset construction and multi-task multimodal modeling. ACM · ICVGIP 2023.	GitHub · Paper
Business Analytics Chatbot	Conversational analytics agent built with Google ADK + Gemini. Natural language queries returning concise answers and charts.	GitHub
PEFT LoRA Fine-tuning	Parameter-efficient fine-tuning of LLMs using LoRA — with a live interactive demo on HuggingFace Spaces.	Live Demo · GitHub
Movie Recommendation System	Collaborative filtering on the MovieLens dataset with recommendation-specific evaluation including Long Tail analysis.	GitHub
Article Bias Prediction (LSTM)	Multi-approach similarity-based political bias detection in news articles using LSTM.	GitHub
Image Caption Generator	Automatic image captioning model inspired by the Show and Tell architecture.	GitHub

S-VQA: Sentence-Based Visual Question Answering
ACM · ICVGIP 2023
📖 View Paper → | 💻 View Paper Summary →

IoT Based Real-Time Harmonic Monitoring System for Distributed Generation
IEEE · I2CT 2018
📖 View Paper →

🔍 Open to MLE / Applied Scientist / Senior Data Scientist roles at product-based companies
🛠️ Building: systematic RAG evaluation frameworks and agentic AI workflows
📬 Best way to reach me: LinkedIn