ML Engineer @ NoBroker, Bengaluru

Sneh Shah

I build production conversational AI systems: multi-channel agents, memory services, and ASR pipelines. My side projects sit at the intersection of RL post-training and clinical reasoning.



Experience

NoBroker (Convozen.ai)

Machine Learning Engineer

Jul 2025 – Present

Bengaluru, Karnataka

▸Engineered and deployed an end-to-end Chatbot and Email Co-Pilot harness powering live multi-channel conversational AI in production.
▸Architected MSOCA (Multi-Session + Omni-Channel Architecture) with a unified Agent-User-Context, sharing state across Voice, Chat, and WhatsApp via single-active-focus arbitration.
▸Migrated session state from MongoDB to Redis under a hot/cold pattern with write-behind flush, cutting per-turn I/O latency by 50–150 ms to meet voice TTFB budgets.
▸Built a standalone Memory Management Service (FastAPI) with pluggable Redis, Mongo, and Vector backends implementing rolling-window summarization and RAG retrieval, reducing effective prompt size by 30–40%.
▸Enabled multi-modal ingestion via Whisper STT and Vision LLMs, letting agents reason natively over voice-note, image, and video turns.
▸Fine-tuned Whisper and multilingual Conformer ASR and deployed on NVIDIA Riva and Triton, delivering sub-second multilingual transcription at production scale.
▸Engineered a "Piggyback" end-of-call extraction emitting post-call analysis JSON in the agent's final turn, eliminating a separate analysis LLM call and cutting analytics cost to output tokens only.

LLM, RAG, Redis, Riva, Triton, TTS, ASR, WebSocket, GCS
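
The hot/cold pattern with write-behind flush described above can be illustrated with a small sketch. This is not the production code: plain dictionaries stand in for Redis (hot tier) and MongoDB (cold tier), and the class name `WriteBehindSessionStore`, its methods, and the `flush_interval_s` parameter are all assumptions made for illustration.

```python
import time
from typing import Any, Dict, Optional, Set


class WriteBehindSessionStore:
    """Illustrative hot/cold store: every turn reads and writes the hot tier,
    and dirty sessions are copied to the cold tier in periodic batches."""

    def __init__(self, flush_interval_s: float = 5.0) -> None:
        self.hot: Dict[str, Dict[str, Any]] = {}   # stand-in for Redis
        self.cold: Dict[str, Dict[str, Any]] = {}  # stand-in for MongoDB
        self.dirty: Set[str] = set()               # sessions changed since last flush
        self.flush_interval_s = flush_interval_s
        self._last_flush = time.monotonic()

    def get(self, session_id: str) -> Optional[Dict[str, Any]]:
        state = self.hot.get(session_id)
        if state is None and session_id in self.cold:
            # Cold hit: promote the session back into the hot tier.
            state = dict(self.cold[session_id])
            self.hot[session_id] = state
        return state

    def put(self, session_id: str, state: Dict[str, Any]) -> None:
        # The per-turn write touches only the hot tier.
        self.hot[session_id] = dict(state)
        self.dirty.add(session_id)

    def maybe_flush(self, now: Optional[float] = None) -> int:
        """Flush only if the interval has elapsed; return sessions written."""
        now = time.monotonic() if now is None else now
        if now - self._last_flush < self.flush_interval_s:
            return 0
        return self.flush(now)

    def flush(self, now: Optional[float] = None) -> int:
        """Write every dirty session to the cold tier in one batch."""
        flushed = 0
        for session_id in list(self.dirty):
            self.cold[session_id] = dict(self.hot[session_id])
            flushed += 1
        self.dirty.clear()
        self._last_flush = time.monotonic() if now is None else now
        return flushed
```

In this sketch the per-turn path never waits on the cold tier; the trade-off is that state written since the last flush exists only in the hot tier until `flush` runs.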

Salesken.ai

ML Engineer 1

Apr 2024 – Jun 2025

Bengaluru, Karnataka

▸Reduced monthly transcription costs by 20% by fine-tuning a Conformer ASR model to replace a third-party service.
▸Built an LLM agent system to generate personalized client emails, improving the accuracy of the generated drafts.
▸Developed a RAG-based Knowledge Extractor with web scraping, Azure, and Kafka, boosting data retrieval efficiency by 25%.
▸Engineered a Kafka Controller for idempotent message handling, cutting data duplication by 15%.
▸Optimized and deployed a Whisper-ASR system on Triton Inference Server, reducing latency by 25%.

ASR, LLM, RAG, Kafka, Triton, Azure
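
Idempotent message handling, as mentioned above, usually comes down to recognizing a redelivered message and skipping it. The sketch below shows that idea only. It is not the Kafka Controller itself: it uses no Kafka client, the `IdempotentHandler` class and the `"id"` message field are hypothetical, and the in-memory set stands in for whatever durable store a real deployment would need.

```python
from typing import Callable, Dict, Set


class IdempotentHandler:
    """Wraps a message handler so that each message ID is processed at most once."""

    def __init__(self, handler: Callable[[Dict], None]) -> None:
        self._handler = handler
        # Assumption: an in-memory set; a real system would persist seen IDs.
        self._seen: Set[str] = set()

    def handle(self, message: Dict) -> bool:
        """Process the message once; return False if it is a duplicate."""
        message_id = message["id"]  # hypothetical unique-ID field
        if message_id in self._seen:
            return False
        self._handler(message)
        # Record the ID only after the handler succeeds, so a failure
        # mid-handler leaves the message eligible for a retry.
        self._seen.add(message_id)
        return True
```

Usage: wrap the consumer callback once, then call `handle` for every delivered message; redeliveries of the same ID return `False` without invoking the callback again.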

Salesken.ai

ML Intern

Jan 2024 – Mar 2024

Bengaluru, Karnataka

▸Increased ASR transcription accuracy by 10% by adding speaker diarization and voice activity detection modules.
▸Deployed Emotion Detection models on Triton Inference Server to enhance conversational AI capabilities.

ASR, Triton, PyTorch, Docker

Projects

Featured Project

House M.D. Clinical Reasoning RL Environment

Apr 2026

POMDP emergency-department simulator for training diagnostic AI agents. Post-trained Gemma-3-4B-IT via SFT + custom GRPO loop with a 5-rubric reward function.

▸Built a POMDP emergency-department simulator with 15 conditions and 75 actions, deployed as a Hugging Face Space.
▸Post-trained Gemma-3-4B-IT to act as a sequential diagnostician via SFT (Unsloth + TRL) followed by a custom GRPO loop.
▸Designed a 5-rubric reward (accuracy, cost, anchoring, safety, format) with shaping that mitigates reward-hacking failure modes.
▸Evaluated on a 45-patient holdout against random, greedy, base-model, and Gemini Flash baselines with full per-rubric breakdowns.

Reinforcement Learning, GRPO, SFT, Gemma-3, Unsloth, POMDP, Hugging Face
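
To make the reward design above concrete, here is a minimal sketch of how five rubric scores could be combined into one scalar and then turned into group-relative advantages, the normalization step GRPO is named for. The rubric names come from the project description; the weights, the function names, and the use of a population standard deviation are assumptions for illustration, not the project's actual values or shaping.

```python
from statistics import mean, pstdev
from typing import Dict, List

# Hypothetical weights: only the rubric names come from the project description.
RUBRIC_WEIGHTS: Dict[str, float] = {
    "accuracy": 0.4,
    "cost": 0.15,
    "anchoring": 0.15,
    "safety": 0.2,
    "format": 0.1,
}


def combined_reward(scores: Dict[str, float]) -> float:
    """Weighted sum of per-rubric scores, each assumed to lie in [0, 1]."""
    return sum(RUBRIC_WEIGHTS[name] * scores[name] for name in RUBRIC_WEIGHTS)


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of rollouts sampled for the same prompt.

    Each rollout's advantage is its reward minus the group mean, divided by
    the group standard deviation (eps avoids division by zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Because advantages are computed relative to the group, a rollout is rewarded for beating its siblings on the same patient rather than for hitting an absolute score.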


Conversational Voice AI

Apr – May 2024

End-to-end real-time voice-to-voice AI system with human-like interaction via a low-latency ASR → LLM → TTS pipeline over full-duplex WebSockets.

FastAPI, WebSockets, ASR, TTS, LLM, Groq, Python
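
The ASR → LLM → TTS pipeline described above can be sketched as three concurrent stages connected by queues. This is a simplified illustration, not the project's code: the stage functions are stubs that only tag strings, the WebSocket transport is omitted, and every name here (`asr_stage`, `llm_stage`, `tts_stage`, `run_pipeline`) is hypothetical.

```python
import asyncio
from typing import List


async def asr_stage(audio_in: asyncio.Queue, text_out: asyncio.Queue) -> None:
    # Stub ASR: a real stage would transcribe each audio chunk.
    while (chunk := await audio_in.get()) is not None:
        await text_out.put(f"transcript({chunk})")
    await text_out.put(None)  # propagate end-of-stream downstream


async def llm_stage(text_in: asyncio.Queue, reply_out: asyncio.Queue) -> None:
    # Stub LLM: a real stage would generate a reply for each transcript.
    while (text := await text_in.get()) is not None:
        await reply_out.put(f"reply({text})")
    await reply_out.put(None)


async def tts_stage(reply_in: asyncio.Queue, spoken: List[str]) -> None:
    # Stub TTS: a real stage would synthesize audio and send it to the client.
    while (reply := await reply_in.get()) is not None:
        spoken.append(f"audio({reply})")


async def run_pipeline(chunks: List[str]) -> List[str]:
    audio_q: asyncio.Queue = asyncio.Queue()
    text_q: asyncio.Queue = asyncio.Queue()
    reply_q: asyncio.Queue = asyncio.Queue()
    spoken: List[str] = []
    for chunk in chunks:
        audio_q.put_nowait(chunk)
    audio_q.put_nowait(None)  # None marks the end of the input stream
    # All three stages run concurrently, so later chunks can be transcribed
    # while earlier ones are still being answered or synthesized.
    await asyncio.gather(
        asr_stage(audio_q, text_q),
        llm_stage(text_q, reply_q),
        tts_stage(reply_q, spoken),
    )
    return spoken
```

Calling `asyncio.run(run_pipeline(["hi"]))` pushes one chunk through all three stubs in order.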

Neural Collaborative Filtering

Oct – Dec 2023

Movie recommendation system using Neural Collaborative Filtering, with a 20% accuracy improvement and an interactive Gradio demo.

PyTorch, Recommendation Systems, Gradio, NCF

Neural Style Transfer

Jul – Nov 2023

Neural Style Transfer using VGG-19 with custom loss functions. ~30% reduction in processing time via L-BFGS optimization; showcased on an interactive website.

PyTorch, Computer Vision, VGG-19, Optimization

Diabetic Retinopathy Detection

Jan – Jun 2022

CNN-powered retinal image classifier combining deep feature extraction with 5 ML/DL algorithms, achieving 92% accuracy and a 27% reduction in false negatives.

Python, TensorFlow, OpenCV, CNN, Computer Vision

Credit Card Fraud Detection

May – Nov 2021

Fraud detection system on 284,807 transactions using feature engineering and ensemble learning, achieving 95% accuracy with a random forest classifier.

Python, Scikit-Learn, Ensemble Learning, Feature Engineering

Weather Data Analysis

Aug – Oct 2020

Unsupervised analysis of 25 weather features using 3 clustering techniques, improving data accuracy by 15% through preprocessing pipelines and visualizations.

Python, Scikit-Learn, Unsupervised Learning, Data Visualization