
Secrets leak to paste sites every minute — API keys, database URLs, customer PII — and most of them sit there until Google indexes them and the abuse begins. LeakGuard is an autonomous agent that closes that window. How it works. A LangGraph pipeline runs six nodes end-to-end: Discovery uses Bright Data's SERP API (with the brd_json=1 parsing flag) to issue Google dorks against a hot-reloading watchlist; Extraction pulls raw paste content through Bright Data's Web Unlocker, bypassing the bot walls that block naive scrapers; a local regex Triage drops the obvious noise cheaply; an Analyst (Claude Sonnet, recall-tuned) flags anything that could be a leak; a Judge (Claude Sonnet, temperature 0, three-axis rubric) only escalates with a score ≥ 8; and Alert ships a redacted Slack notification with the audit reasoning attached. Why Bright Data is load-bearing. Paste-site discovery without SERP access is guessing; Web Unlocker is what makes xtraction actually work past Cloudflare and rate limits. Both zones are real and validated end-to-end. Safety. Credentials are redacted in two layers before any alert leaves the box. LangSmith tracing is off by default — it would otherwise ship the exact secrets the agent exists to catch to a third-party log store. A pre-commit detect-secrets hook guards the repo itself. What's built. Real pipeline (not stubs), per-node tests + smoke test, a Day-3 eval set of seeded pastes for regex tuning, a Streamlit dashboard reading the JSONL audit log, ADRs for the load-bearing decisions, and a synthetic mock server so demos don't burn the $250 SERP credit cap.
31 May 2026

V.I.C.T.O.R. (Voice-Intelligent Cardiovascular Triage & Onset Recognition) is a voice-first AI triage agent designed for Emergency Departments. It addresses a critical gap in emergency medicine: atypical cardiovascular disease presentations — such as abdominal pain, nausea, fatigue, and jaw pain — are systematically undertriaged compared to classic chest pain, particularly in women, elderly patients, and underserved populations. V.I.C.T.O.R. runs a 5-agent swarm on Llama 3.1 8B Instruct, LoRA fine-tuned on ~61,000 training examples derived from MIMIC-IV v3.1 and the MUSIC sudden cardiac death dataset. The agents include a Triage Leader, a Patient Voice advocate, a Concordance Analyst that cross-references patient speech content against voice biomarker signals, a Clinical Note Writer, and an Evidence Synthesiser. Together they re-evaluate triage acuity in real time and flag cases where standard ESI scoring may have missed a serious cardiovascular event. The system features two interfaces: a patient-facing kiosk where patients speak naturally to Victor or Jackie (two voice personas) in their preferred language, and a clinician-facing dashboard styled as an EMR module, showing concordance alerts, voice biomarker trends, re-triage recommendations, and AI-generated clinical notes. Deepgram Flux Multilingual handles speech-to-text with automatic language detection, while ElevenLabs Flash v2.5 provides natural spoken responses. All inference runs on an AMD MI300X GPU via vLLM, with the LoRA adapter fine-tuned using ROCm and PyTorch on the same AMD hardware — full AMD stack from training to production. Quantitative analysis of MIMIC-IV data confirmed the project's core premise: non-chest-pain CVD presentations receive higher mean acuity scores (undertriaged) than chest pain, with the critical caveat that MIMIC-IV only captures patients who eventually received a CVD diagnosis — truly missed cases are invisible, meaning real-world bias is likely worse than measured.
10 May 2026