
Patient wait time is one of the most underused assets in healthcare. Instead of leaving patients idle in a waiting room, the system turns those minutes into structured clinical value before the doctor even enters the room. Patients scan a QR ticket on arrival and complete a bilingual voice-and-photo intake on their own phone while tracking their queue position in real time. Behind the scenes, three AI agents work simultaneously. The Intake Agent captures structured OPQRST history from natural conversation. An independent Triage Agent continuously screens for red flags such as cardiac events, stroke, and sepsis while assigning a live priority score. At the same time, a Summarizer Agent prepares a physician-facing SOAP note using Featherless. Speechmatics transcribes the in-room consultation in real time, while the clinician dashboard brings live queue monitoring, vital-sign visualization, drug-interaction checks, and printable prescriptions together in a single workflow.

The operational value compounds with every visit. By delivering a complete chief complaint, structured history, and triage category before the consultation starts, the system recovers 3–5 minutes per patient without increasing staffing requirements, which directly improves clinic throughput. Independent red-flag triage adds another layer of safety by surfacing high-risk presentations that might otherwise be missed during busy intake flows, reducing liability and adverse-event risk. AI-drafted SOAP notes reduce the after-hours documentation burden, and automated reminders help lower no-show rates. Bilingual voice intake improves accessibility for diverse patient populations without requiring additional interpreter resources. Most importantly, every interaction automatically becomes structured, audit-ready data, so compliance reporting, throughput analytics, and quality metrics emerge from the workflow itself, turning intake friction into operational leverage.
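As a rough illustration of what "structured OPQRST history" could look like as data, the sketch below models the six standard OPQRST prompts (onset, provocation, quality, region, severity, time) as a record and reports which ones are still unanswered. The class name `OPQRSTHistory`, its field types, and the `missing()` helper are assumptions made for this example; the project's actual schema is not described above.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class OPQRSTHistory:
    """Hypothetical container for one complaint's OPQRST history."""

    onset: Optional[str] = None        # when and how the symptom started
    provocation: Optional[str] = None  # what makes it better or worse
    quality: Optional[str] = None      # character of the symptom (sharp, dull, ...)
    region: Optional[str] = None       # location and any radiation
    severity: Optional[int] = None     # patient-reported score, assumed 0-10
    time: Optional[str] = None         # duration and course over time

    def missing(self) -> List[str]:
        """Return the names of prompts that have not been answered yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def is_complete(self) -> bool:
        """True once every OPQRST prompt has an answer."""
        return not self.missing()


# Illustrative values only, not real patient data.
history = OPQRSTHistory(onset="2 hours ago, at rest", quality="pressure", severity=7)
print(history.missing())  # ['provocation', 'region', 'time']
```

An intake agent could use a list like `history.missing()` to decide which follow-up question to ask next, and hand the record to downstream agents only once `is_complete()` returns true.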
19 May 2026

Agentic Trauma Life Support (ATLS) is a multilingual, agentic trauma triage decision-support tool built for the AMD Developer Hackathon. The user uploads a chest X-ray and dictates vitals plus a brief clinical vignette; ATLS returns a structured ATLS primary-survey assessment (Airway, Breathing, Circulation, Disability, Exposure) and renders it as an SBAR-style markdown handoff in English or Bahasa Indonesia.

The entire system runs Qwen2.5-VL-72B-Instruct in full BF16 on one AMD Instinct MI300X. No tensor parallelism. No quantization. No CPU offload. At 2 bytes per parameter, a 72B model needs roughly 144 GB for weights alone; the MI300X's 192 GB of HBM3 fits that with margin, where H100 (80 GB) and H200 (141 GB) cannot. That makes the MI300X the only sub-$2 per GPU-hour option for the global-health, resource-limited deployment shape this product targets. The single-GPU constraint is not an optimization; it is the architectural argument for using AMD silicon at all.

The agentic pipeline runs three roles on the same vLLM endpoint. The Drafter writes a strictly typed TriageOutput JSON via vLLM's response_format=json_schema, with every ABCDE block populated, every imaging finding tagged with laterality and severity, and every recommended action carrying an urgency window. The Verifier re-sees the X-ray and the draft and emits notes plus path-walker patches, which are applied to a deep copy of the draft and re-validated. On case_02 (massive hemothorax) the Verifier caught a left-versus-right chest-tube laterality error in 13 seconds: a never-event prevented by the same model talking to itself in a different voice. The Renderer turns validated JSON into SBAR markdown, keeping schema enums in English so the JSON validates regardless of UI language. Real-CXR end-to-end median TTFT is 1981 ms, throughput is 19.9 to 21.5 tokens per second, and peak VRAM under a concurrent batch of four reaches 183.95 of 191.69 GiB (about 96 percent), right at the configured 0.95 budget.
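The Verifier step described above (path-walker patches applied to a deep copy, then re-validated) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: the patch shape (`path` plus `value`), the field names, the allowed laterality values, and the toy `validate` function are all hypothetical stand-ins for the real TriageOutput schema.

```python
import copy
from typing import Any, Dict, List, Union

PathStep = Union[str, int]

# Assumed enum values; the real schema is not shown in the description.
ALLOWED_SIDES = {"left", "right", "bilateral"}


def apply_patch(doc: Any, path: List[PathStep], value: Any) -> None:
    """Walk `path` through nested dicts/lists and overwrite the final slot."""
    node = doc
    for step in path[:-1]:
        node = node[step]  # raises KeyError/IndexError if the path is wrong
    last = path[-1]
    if isinstance(node, dict) and last not in node:
        raise KeyError(f"unknown field: {last!r}")  # never create new fields
    node[last] = value


def validate(doc: Dict[str, Any]) -> None:
    """Toy stand-in for schema validation: check laterality enums only."""
    for key in ("imaging_findings", "recommended_actions"):
        for item in doc[key]:
            if item["laterality"] not in ALLOWED_SIDES:
                raise ValueError(f"invalid laterality in {key}: {item['laterality']!r}")


def apply_verifier_patches(draft: Dict[str, Any], patches: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply patches to a deep copy so the original draft is never mutated."""
    patched = copy.deepcopy(draft)
    for patch in patches:
        apply_patch(patched, patch["path"], patch["value"])
    validate(patched)  # re-validate before anything is rendered
    return patched


# Invented example values, not the actual case_02 data.
draft = {
    "imaging_findings": [{"finding": "hemothorax", "laterality": "left"}],
    "recommended_actions": [{"action": "chest tube", "laterality": "right"}],
}
patches = [{"path": ["recommended_actions", 0, "laterality"], "value": "left"}]

patched = apply_verifier_patches(draft, patches)
print(patched["recommended_actions"][0]["laterality"])  # left
print(draft["recommended_actions"][0]["laterality"])    # right
```

Working on a deep copy means a rejected or malformed patch leaves the Drafter's output intact, so the pipeline can fall back to the unpatched draft if re-validation fails.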
10 May 2026