ML Engineer @ NoBroker, Bengaluru

Sneh Shah


I build production conversational AI systems: multi-channel agents, memory services, and ASR pipelines. My side projects sit at the intersection of RL post-training and clinical reasoning.


Experience

NoBroker (Convozen.ai)

Machine Learning Engineer

Jul 2025 – Present

Bengaluru, Karnataka

  • Engineered and deployed an end-to-end Chatbot and Email Co-Pilot harness powering live multi-channel conversational AI in production.
  • Architected MSOCA (Multi-Session + Omni-Channel Architecture) with a unified Agent-User-Context, sharing state across Voice, Chat, and WhatsApp via single-active-focus arbitration.
  • Migrated session state from MongoDB to Redis under a hot/cold pattern with write-behind flush, cutting per-turn I/O latency by 50–150 ms to meet voice TTFB budgets.
  • Built a standalone Memory Management Service (FastAPI) with pluggable Redis, Mongo, and Vector backends implementing rolling-window summarization and RAG retrieval, reducing effective prompt size by 30–40%.
  • Enabled multi-modal ingestion via Whisper STT and Vision LLMs, letting agents reason natively over voice-note, image, and video turns.
  • Fine-tuned Whisper and multilingual Conformer ASR models and deployed them on NVIDIA Riva and Triton, delivering sub-second multilingual transcription at production scale.
  • Engineered a "Piggyback" end-of-call extraction step that emits post-call analysis JSON in the agent's final turn, eliminating a separate analysis LLM call and reducing analytics cost to output tokens only.
LLM · RAG · Redis · Riva · Triton · TTS · ASR · Websocket · GCS
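The hot/cold session pattern with write-behind flush mentioned above can be illustrated with a minimal sketch. This is not the production implementation: plain dictionaries stand in for Redis (hot tier) and MongoDB (cold tier), and the class and method names (`WriteBehindSessionStore`, `update`, `flush`, `evict`) are hypothetical. It only shows the idea that per-turn writes touch the hot tier alone, while persistence happens off the per-turn path.

```python
from typing import Any, Dict, Set


class WriteBehindSessionStore:
    """Hot/cold session store sketch (hypothetical names throughout).

    Reads and writes hit the hot tier; dirty sessions are persisted
    to the cold tier later, in a batch.
    """

    def __init__(self) -> None:
        self.hot: Dict[str, Dict[str, Any]] = {}   # stand-in for Redis
        self.cold: Dict[str, Dict[str, Any]] = {}  # stand-in for MongoDB
        self.dirty: Set[str] = set()               # sessions awaiting flush

    def get(self, session_id: str) -> Dict[str, Any]:
        # Serve from the hot tier; on a miss, hydrate from the cold tier.
        if session_id not in self.hot:
            self.hot[session_id] = dict(self.cold.get(session_id, {}))
        return self.hot[session_id]

    def update(self, session_id: str, **fields: Any) -> None:
        # Per-turn writes touch only the hot tier and mark the session dirty.
        self.get(session_id).update(fields)
        self.dirty.add(session_id)

    def flush(self) -> int:
        # Write-behind: persist every dirty session off the per-turn path.
        flushed = 0
        for session_id in list(self.dirty):
            self.cold[session_id] = dict(self.hot[session_id])
            flushed += 1
        self.dirty.clear()
        return flushed

    def evict(self, session_id: str) -> None:
        # Persist before eviction so no dirty state is lost.
        if session_id in self.dirty:
            self.cold[session_id] = dict(self.hot[session_id])
            self.dirty.discard(session_id)
        self.hot.pop(session_id, None)
```

In a real deployment `flush` would typically run on a timer or at end of session, and the trade-off is that state written since the last flush can be lost if the hot tier fails.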
Salesken.ai

ML Engineer 1

Apr 2024 – Jun 2025

Bengaluru, Karnataka

  • Reduced monthly transcription costs by 20% by fine-tuning a Conformer ASR model to replace a third-party service.
  • Built an LLM Agent system to generate personalized client emails, improving the accuracy of generated drafts.
  • Developed a RAG-based Knowledge Extractor with web scraping, Azure, and Kafka, boosting data retrieval efficiency by 25%.
  • Engineered a Kafka Controller for idempotent message handling, cutting data duplication by 15%.
  • Optimized and deployed a Whisper-ASR system on Triton Inference Server, reducing latency by 25%.
ASR · LLM · RAG · Kafka · Triton · Azure
Salesken.ai

ML Intern

Jan 2024 – Mar 2024

Bengaluru, Karnataka

  • Increased ASR transcription accuracy by 10% by adding speaker diarization and voice activity detection modules.
  • Deployed Emotion Detection models on Triton Inference Server to enhance conversational AI capabilities.
ASR · Triton · PyTorch · Docker

Projects

Featured Project

House M.D. Clinical Reasoning RL Environment

Apr 2026

POMDP emergency-department simulator for training diagnostic AI agents. Post-trained Gemma-3-4B-IT via SFT + custom GRPO loop with a 5-rubric reward function.

  • Built a POMDP emergency-department simulator with 15 conditions and 75 actions, deployed as a Hugging Face Space.
  • Post-trained Gemma-3-4B-IT to act as a sequential diagnostician via SFT (Unsloth + TRL) followed by a custom GRPO loop.
  • Designed a 5-rubric reward (accuracy, cost, anchoring, safety, format) with shaping that mitigates reward-hacking failure modes.
  • Evaluated on a 45-patient holdout against random, greedy, base-model, and Gemini Flash baselines with full per-rubric breakdowns.
Reinforcement Learning · GRPO · SFT · Gemma-3 · Unsloth · POMDP · HuggingFace
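As a rough illustration of how a multi-rubric reward can feed a GRPO update, the sketch below combines the five rubric scores into one scalar and then computes group-relative advantages. The rubric names come from the project description; the weights, the function names, and the assumption that each rubric is scored in [0, 1] are hypothetical, and the project's actual reward shaping is not shown here. Population standard deviation is used for simplicity.

```python
from statistics import mean, pstdev
from typing import Dict, List

# Hypothetical weights; only the rubric names are taken from the project.
RUBRIC_WEIGHTS: Dict[str, float] = {
    "accuracy": 0.4,
    "cost": 0.15,
    "anchoring": 0.15,
    "safety": 0.2,
    "format": 0.1,
}


def combined_reward(scores: Dict[str, float]) -> float:
    """Weighted sum of per-rubric scores (each assumed to lie in [0, 1])."""
    return sum(RUBRIC_WEIGHTS[name] * scores[name] for name in RUBRIC_WEIGHTS)


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """GRPO-style advantages: normalize each reward against its own group.

    All rewards come from rollouts sampled for the same prompt (here, the
    same simulated patient), so no learned value function is needed.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

When every rollout in a group earns the same reward, all advantages are zero and that group contributes no gradient signal, which is one reason a reward with several graded rubrics can be more informative than a binary correct/incorrect score.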

Conversational Voice AI

Apr – May 2024

End-to-end real-time voice-to-voice AI system with human-like interaction via a low-latency ASR → LLM → TTS pipeline over full-duplex WebSockets.

FastAPI · WebSockets · ASR · TTS · LLM · Groq · Python

Neural Collaborative Filtering

Oct – Dec 2023

Movie recommendation system using Neural Collaborative Filtering, with a 20% accuracy improvement and an interactive Gradio demo.

PyTorch · Recommendation Systems · Gradio · NCF

Neural Style Transfer

Jul – Nov 2023

Neural Style Transfer using VGG-19 with custom loss functions. ~30% reduction in processing time via L-BFGS optimization; showcased on an interactive website.

PyTorch · Computer Vision · VGG-19 · Optimization

Diabetic Retinopathy Detection

Jan – Jun 2022

CNN-powered retinal image classifier combining deep feature extraction with 5 ML/DL algorithms, achieving 92% accuracy and a 27% reduction in false negatives.

Python · TensorFlow · OpenCV · CNN · Computer Vision

Credit Card Fraud Detection

May – Nov 2021

Fraud detection system on 284,807 transactions using feature engineering and ensemble learning, achieving 95% accuracy with a random forest classifier.

Python · Scikit-Learn · Ensemble Learning · Feature Engineering

Weather Data Analysis

Aug – Oct 2020

Unsupervised analysis of 25 weather features using 3 clustering techniques, improving data accuracy by 15% through preprocessing pipelines and visualizations.

Python · Scikit-Learn · Unsupervised Learning · Data Visualization

Skills

Languages

Python · R · Java · SQL · MATLAB

Libraries

PyTorch · TensorFlow · Transformers · TRL · Unsloth · Scikit-Learn · NumPy · Pandas · OpenCV · FastAPI · Pydantic

Tools & Infra

Azure · Docker · LangChain · Hugging Face · Weights & Biases · MongoDB · Redis · Postgres · Triton · Riva · Kubernetes · Kafka · Jenkins · CI/CD · Prometheus · Git

ML & DS

Reinforcement Learning (GRPO, SFT, RLHF) · LLM Post-training · RAG · Multi-Agent Systems · ASR / TTS · Recommender Systems · Computer Vision · A/B Testing

Research

Education

Christ (Deemed to be University)

Master of Science in Data Science

Bengaluru, Karnataka

Aug 2022 – Apr 2024

GPA: 3.63 / 4.0

Gujarat University

Bachelor of Science in Data Science

Ahmedabad, Gujarat

Jun 2019 – May 2022

GPA: 7.77 / 10

Achievements

Certifications