India
2+ years of experience
I am a Software Engineer with an MLOps focus based in Noida, India, currently working at Nokia on production ML infrastructure for Vodafone Idea and Bharti Airtel. Over the past 2+ years I have shipped Kubernetes-orchestrated ML pipelines that cut deployment cycles by 67%, reduced MTTR by 3×, and prevented $200K+ in annual SLA penalties through proactive anomaly detection on live 4G/5G networks. OpsPulse Sentinel is a direct product of that experience. Having watched SRE teams manually correlate logs, deployment diffs, and service topology during incidents, I built the agent I wished existed on call: one that reasons across all three sources simultaneously and delivers a ready-to-run fix in under 60 seconds. My technical background spans Kubernetes, MLflow, LangChain, RAG systems, and LLM deployment. I hold Nokia's Associate AI/ML & GenAI certification, an AWS certification, and deeplearning.ai specializations in Neural Networks and NLP.

Enterprise SRE teams spend 60–90 minutes per incident manually correlating logs, deployment history, and service topology before identifying a root cause. OpsPulse Sentinel is built to cut that triage window to under a minute.

OpsPulse Sentinel is an autonomous Root Cause Analysis agent built for production SRE workflows. It ingests raw telemetry, deployment history, and cluster architecture, then runs a structured multi-stage pipeline to deliver a confidence-scored RCA report with a ready-to-run remediation command in under 60 seconds.

The agent uses a dual-model Gemini pipeline. Stage 1: Gemini Flash-Lite acts as a high-speed anomaly filter, stripping INFO noise and returning only ERRORs, WARNs, and suspicious patterns. Stage 2: the filtered anomalies are combined with ChromaDB vector memory of past incidents, a live cluster manifest, and the deployment log. Gemini Pro reasons across all four sources using temporal causation logic: if a deployment at time T precedes errors at T+N, it is surfaced as primary causal evidence.

The output is a Pydantic-validated JSON RCA containing the root cause, affected services, confidence score, evidence chain, and a specific remediation command. A policy guardrail blocks execution below 0.75 confidence and escalates to human review, so the agent never acts on incomplete reasoning.

The dual-model routing is not just cost optimization. Flash-Lite's low latency makes real-time anomaly filtering practical, while Pro's reasoning depth handles deployment context, telemetry, historical memory, and cluster topology in one coherent pass.

The human approval gate before execution is a deliberate design decision. Autonomous reasoning with human-controlled execution is the right model for production systems, and it means junior engineers can act on incidents that would previously have required a senior SRE.

Stack: Python · Streamlit · Gemini API (Flash-Lite + Pro) · ChromaDB · Pydantic
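
The structured report and the 0.75 confidence guardrail described above can be pictured with a minimal sketch. It is not the project's actual code: it uses a standard-library dataclass as a stand-in for the Pydantic model, and the field names, the `gate` function, and the two status strings are hypothetical names chosen for illustration. Only the list of report fields and the 0.75 threshold come from the description.

```python
from __future__ import annotations

from dataclasses import dataclass

# Threshold taken from the description; everything else here is assumed.
CONFIDENCE_THRESHOLD = 0.75


@dataclass(frozen=True)
class RCAReport:
    """Hypothetical stand-in for the Pydantic RCA model (field names assumed)."""

    root_cause: str
    affected_services: list[str]
    confidence: float
    evidence_chain: list[str]
    remediation_command: str

    def __post_init__(self) -> None:
        # Reject malformed model output before it reaches the guardrail.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence must be in [0, 1], got {self.confidence}")
        if not self.root_cause.strip():
            raise ValueError("root_cause must not be empty")


def gate(report: RCAReport, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Decide the next step for a report. This function never runs the command."""
    if report.confidence < threshold:
        # Below the threshold, execution is blocked and a human takes over.
        return "escalate_to_human_review"
    # At or above the threshold, the command is only proposed for approval.
    return "await_human_approval"
```

In this sketch both branches end with a person: a low-confidence report is escalated for review, and a high-confidence one still waits for explicit approval, which mirrors the "autonomous reasoning, human-controlled execution" split.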
19 May 2026

Modern enterprise infrastructure generates millions of log events daily. Current AIOps tools like Datadog detect anomalies statistically but cannot explain why failures happen based on system architecture. Senior engineers still spend up to 90 minutes manually performing root cause analysis during critical downtime.

OpsPulse Sentinel is an LLM-native agent powered by Gemini 3.1 that bridges this semantic gap. It runs a two-stage pipeline: Gemini Flash-Lite filters raw telemetry noise, then Gemini Pro reasons across filtered anomalies, deployment history, and cluster architecture simultaneously to identify the true root cause, not just the service with the most errors.

The agent demonstrated this by correctly identifying Redis-Cache as the root cause of a system-wide outage, even though PostgreSQL generated all the FATAL logs. It separated symptom from cause through architectural reasoning, which purely statistical anomaly detection is not designed to do.

Every diagnosis produces a structured JSON output with a confidence score, a chronological evidence chain, and an exact kubectl or helm command ready to execute. A human approval gate blocks automated remediation below a 75% confidence threshold, ensuring enterprise-grade governance.

Built with Gemini 3.1 Pro and Flash-Lite, ChromaDB vector memory for historical incident retrieval, Pydantic schema enforcement for structured output, and a Streamlit dashboard for human-in-the-loop control.
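
To make the filter-then-reason flow concrete, here is a rough, rule-based sketch of the two ideas involved: dropping low-severity noise, then checking which deployments shortly preceded the first surviving anomaly. It is only an approximation under stated assumptions. In the real system both stages are Gemini model calls; the log line format (`ISO_TIMESTAMP LEVEL service: message`), the function names, and the 30-minute look-back window are all hypothetical.

```python
from __future__ import annotations

from datetime import datetime, timedelta

# Levels treated as noise in this sketch (an assumption, not the model's rule).
NOISE_LEVELS = {"INFO", "DEBUG"}


def filter_anomalies(lines: list[str]) -> list[tuple[datetime, str, str, str]]:
    """Keep non-noise lines as (timestamp, level, service, message) tuples.

    Assumes each line looks like 'ISO_TIMESTAMP LEVEL service: message'.
    """
    kept = []
    for line in lines:
        parts = line.split(maxsplit=3)
        if len(parts) < 3:
            continue  # skip lines that do not match the assumed format
        timestamp, level, service = parts[0], parts[1], parts[2].rstrip(":")
        if level in NOISE_LEVELS:
            continue
        message = parts[3] if len(parts) > 3 else ""
        kept.append((datetime.fromisoformat(timestamp), level, service, message))
    return kept


def deployments_before_first_anomaly(
    deployments: list[tuple[datetime, str]],
    anomalies: list[tuple[datetime, str, str, str]],
    window: timedelta = timedelta(minutes=30),  # assumed look-back window
) -> list[tuple[datetime, str]]:
    """Return (time, service) deployments within `window` before the first anomaly.

    Results are ordered most recent first, since the latest change before the
    errors started is usually the first one worth inspecting.
    """
    if not anomalies:
        return []
    first_anomaly = min(a[0] for a in anomalies)
    candidates = [
        d for d in deployments
        if timedelta(0) <= first_anomaly - d[0] <= window
    ]
    return sorted(candidates, key=lambda d: d[0], reverse=True)
```

A deployment that appears in this candidate list is evidence to weigh, not a verdict; deciding whether it is the cause or a coincidence is the part left to the reasoning stage and the human reviewer.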
19 May 2026