ML Engineer @ NoBroker, Bengaluru
Sneh Shah
I build production conversational AI systems: multi-channel agents, memory services, and ASR pipelines. My side projects sit at the intersection of RL post-training and clinical reasoning.
Experience
NoBroker (Convozen.ai)
Machine Learning Engineer
Jul 2025 – Present
Bengaluru, Karnataka
- Engineered and deployed an end-to-end Chatbot and Email Co-Pilot harness powering live multi-channel conversational AI in production.
- Architected MSOCA (Multi-Session + Omni-Channel Architecture) with a unified Agent-User-Context that shares state across Voice, Chat, and WhatsApp via single-active-focus arbitration.
- Migrated session state from MongoDB to Redis under a hot/cold pattern with write-behind flush, cutting per-turn I/O latency by 50–150 ms to meet voice TTFB budgets.
- Built a standalone Memory Management Service (FastAPI) with pluggable Redis, Mongo, and Vector backends implementing rolling-window summarization and RAG retrieval, reducing effective prompt size by 30–40%.
- Enabled multi-modal ingestion via Whisper STT and Vision LLMs, letting agents reason natively over voice-note, image, and video turns.
- Fine-tuned Whisper and multilingual Conformer ASR models and deployed them on NVIDIA Riva and Triton, delivering sub-second multilingual transcription at production scale.
- Engineered a "Piggyback" end-of-call extraction step that emits the post-call analysis JSON in the agent's final turn, eliminating a separate analysis LLM call and reducing analytics cost to the additional output tokens only.
LLM, RAG, Redis, Riva, Triton, TTS, ASR, WebSocket, GCS
Salesken.ai
ML Engineer 1
Apr 2024 – Jun 2025
Bengaluru, Karnataka
- Reduced monthly transcription costs by 20% by fine-tuning a Conformer ASR model to replace a third-party service.
- Built an LLM agent system to generate personalized client emails, improving the accuracy of the generated emails.
- Developed a RAG-based Knowledge Extractor with web scraping, Azure, and Kafka, boosting data retrieval efficiency by 25%.
- Engineered a Kafka Controller for idempotent message handling, cutting data duplication by 15%.
- Optimized and deployed a Whisper ASR system on Triton Inference Server, reducing latency by 25%.
ASR, LLM, RAG, Kafka, Triton, Azure
Salesken.ai
ML Intern
Jan 2024 – Mar 2024
Bengaluru, Karnataka
- Increased ASR transcription accuracy by 10% by adding speaker diarization and voice activity detection modules.
- Deployed Emotion Detection models on Triton Inference Server to enhance conversational AI capabilities.
ASR, Triton, PyTorch, Docker
Projects
Featured Project
Apr 2026
House M.D. Clinical Reasoning RL Environment
POMDP emergency-department simulator for training diagnostic AI agents. Post-trained Gemma-3-4B-IT via SFT + custom GRPO loop with a 5-rubric reward function.
- Built a POMDP emergency-department simulator with 15 conditions and 75 actions, deployed as a Hugging Face Space.
- Post-trained Gemma-3-4B-IT to act as a sequential diagnostician via SFT (Unsloth + TRL) followed by a custom GRPO loop.
- Designed a 5-rubric reward (accuracy, cost, anchoring, safety, format) with shaping that mitigates reward-hacking failure modes.
- Evaluated on a 45-patient holdout against random, greedy, base-model, and Gemini Flash baselines with full per-rubric breakdowns.
Reinforcement Learning, GRPO, SFT, Gemma-3, Unsloth, POMDP, HuggingFace
Skills
Languages
Python, R, Java, SQL, MATLAB
Libraries
PyTorch, TensorFlow, Transformers, TRL, Unsloth, Scikit-Learn, NumPy, Pandas, OpenCV, FastAPI, Pydantic
Tools & Infra
Azure, Docker, LangChain, Hugging Face, Weights & Biases, MongoDB, Redis, Postgres, Triton, Riva, Kubernetes, Kafka, Jenkins, CI/CD, Prometheus, Git
ML & DS
Reinforcement Learning (GRPO, SFT, RLHF), LLM Post-training, RAG, Multi-Agent Systems, ASR / TTS, Recommender Systems, Computer Vision, A/B Testing
Education
Christ (Deemed to be University)
Master of Science, Data Science
Bengaluru, Karnataka
Aug 2022 – Apr 2024
GPA: 3.63 / 4.0
Gujarat University
Bachelor of Science, Data Science
Ahmedabad, Gujarat
Jun 2019 – May 2022
GPA: 7.77 / 10