Agentic Trauma Life Support (ATLS) is a multilingual, agentic trauma triage decision-support tool built for the AMD Developer Hackathon. The user uploads a chest X-ray and dictates vitals plus a brief clinical vignette; ATLS returns a structured ATLS primary-survey assessment (Airway, Breathing, Circulation, Disability, Exposure) and renders it as an SBAR-style markdown handoff in English or Bahasa Indonesia.

The entire system runs Qwen2.5-VL-72B-Instruct in full BF16 on one AMD Instinct MI300X. No tensor parallelism. No quantization. No CPU offload. At 2 bytes per parameter, 72B parameters in BF16 need roughly 144 GB for weights alone. The MI300X's 192 GB of HBM3 fits that with margin, where H100 (80 GB) and H200 (141 GB) cannot. That makes the MI300X the only sub-$2 per GPU-hour option for the global-health, resource-limited deployment shape this product targets. The single-GPU constraint is not an optimization; it is the architectural argument for using AMD silicon at all.

The agentic pipeline runs three roles on the same vLLM endpoint. The Drafter writes a strictly-typed TriageOutput JSON via vLLM's response_format=json_schema, with every ABCDE block populated, every imaging finding tagged with laterality and severity, and every recommended action carrying an urgency window. The Verifier re-sees the X-ray and the draft and emits notes plus path-walker patches that are applied to a deep copy of the draft and re-validated. On case_02 (massive hemothorax) the Verifier caught a left-versus-right chest-tube laterality error in 13 seconds: a never-event prevented by the same model talking to itself in a different voice. The Renderer turns validated JSON into SBAR markdown, keeping schema enums in English so the JSON validates regardless of UI language.

On real chest X-rays, end-to-end median time to first token (TTFT) is 1981 ms and throughput is 19.9 to 21.5 tokens per second. Peak VRAM under a concurrent batch of four reaches 183.95 of 191.69 GiB (96 percent), in line with the configured 0.95 budget.
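The Verifier's patch step described above (path-walker patches applied to a deep copy of the draft, then re-validated) might look roughly like the following Python sketch. The patch shape (`path`, `value`), the field names, the laterality values, and the `validate` stand-in are all assumptions made for illustration; the project's actual TriageOutput schema and patch format are not given in this description.

```python
import copy

# Hypothetical enum values; the real schema enums are not given in the source.
LATERALITY = {"left", "right", "bilateral", "midline"}


def set_path(doc, path, value):
    """Walk a dotted path (list positions as integers) and set the final key."""
    keys = path.split(".")
    node = doc
    for key in keys[:-1]:
        node = node[int(key)] if isinstance(node, list) else node[key]
    last = keys[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        # Refuse to create fields the draft never had.
        if last not in node:
            raise KeyError(f"unknown field in patch path: {path}")
        node[last] = value


def validate(doc):
    """Stand-in for full schema validation: checks laterality enums only."""
    for action in doc["recommended_actions"]:
        if action["laterality"] not in LATERALITY:
            raise ValueError(f"invalid laterality: {action['laterality']}")


def apply_patches(draft, patches):
    """Apply patches to a deep copy, re-validate it, and return the copy."""
    patched = copy.deepcopy(draft)
    for patch in patches:
        set_path(patched, patch["path"], patch["value"])
    validate(patched)
    return patched


# Illustrative values only; the source does not say which side was correct.
draft = {
    "recommended_actions": [
        {"action": "chest tube", "laterality": "left", "urgency": "immediate"}
    ]
}
patches = [{"path": "recommended_actions.0.laterality", "value": "right"}]

fixed = apply_patches(draft, patches)
print(fixed["recommended_actions"][0]["laterality"])  # right
print(draft["recommended_actions"][0]["laterality"])  # left (draft untouched)
```

Working on a deep copy means a patch that fails validation never corrupts the original draft, so the pipeline can fall back to the unpatched output or ask the Verifier again.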
Category tags: "This is an extremely impressive and potentially life-saving project. The fact that it runs a 72B parameter vision-language model in full BF16 on a single AMD MI300X is a remarkable engineering achievement. The self-verification system where the Verifier caught a life-threatening laterality error (putting a chest tube on the wrong side) demonstrates the real-world value of multi-agent AI in healthcare. This addresses a critical global health need.

Application of Technology: 🚀🚀🚀🚀🚀 5 - Runs Qwen2.5-VL-72B-Instruct in full BF16 on single MI300X (no tensor parallelism, no quantization). The agentic pipeline with Drafter, Verifier, and Renderer roles is sophisticated. The 96% VRAM utilization (183.95/191.69 GiB) under batch-of-four load shows excellent optimization.

Presentation: 🚀🚀🚀🚀 4 - Clear technical explanation with specific metrics (1981ms TTFT, 19.9-21.5 tokens/sec). The 'never-event prevented' story about the chest tube laterality error is compelling and demonstrates real value.

Business Value: 🚀🚀🚀🚀🚀 5 - Addresses critical healthcare need in trauma triage. Could save lives in resource-limited settings globally. The single-GPU constraint makes it deployable where multi-GPU clusters aren't available. Important for global health.

Originality: 🚀🚀🚀🚀🚀 5 - Extremely original. The self-verification system where the same model checks its own work in a different voice is innovative. Demonstrates AMD's unique advantage (192GB HBM3) for this deployment shape."
Sanem Avcil