AsBuilt Lens: Autonomous Visual QA

Created by team AsBuilt Lens on May 09, 2026
Vision & Multimodal AI

Visual quality control in manufacturing is usually a nightmare because you have to train a custom vision model for every single part. I wanted to see if we could eliminate the training phase entirely by using large Vision-Language Models.

AsBuilt Lens is a dual-agent system running on the AMD MI300X. It takes a live camera feed (we wrote a custom OpenCV script that auto-captures a frame once the object is perfectly still) and sends the captured frame to the first agent, the "Visual Inspector". This agent gets a natural-language prompt (like "check if there are 4 resistors and 1 capacitor") and returns structured JSON with bounding boxes and counts. If everything passes, the pipeline stops there. If something fails, the code performs an autonomous handoff to the second agent, the "Quality Engineer". This agent looks at the failure, reasons about the root cause (e.g., is it a physical defect or a missing part?), and generates a corrective action plan.

Running this required a massive model (Qwen3-VL-32B). At FP16 its weights alone take roughly 64 GB (32 billion parameters at 2 bytes each), so on standard GPUs we would have needed heavy quantization. Thanks to the 192 GB of HBM3 on the MI300X, we ran it natively in FP16 using vLLM and ROCm. The reasoning speed and visual precision are honestly incredible.
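
To make the handoff between the two agents concrete, here is a minimal sketch of the control flow. It is not the project's actual code. The names `InspectionResult`, `parse_inspection`, and `run_pipeline`, the JSON keys `counts` and `boxes`, and the "exact count match" pass rule are all assumptions made for illustration. The two agents are passed in as plain callables; in the real system each would wrap a request to Qwen3-VL-32B served by vLLM, while the stubs below just return canned strings.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class InspectionResult:
    passed: bool
    counts: dict  # observed counts per label, e.g. {"resistor": 4}
    boxes: list   # bounding boxes as returned by the Inspector (schema assumed)


def parse_inspection(raw: str, expected: dict) -> InspectionResult:
    """Parse the Inspector's JSON reply and compare counts to the spec.

    Assumed reply shape: {"counts": {label: int}, "boxes": [...]}.
    """
    data = json.loads(raw)
    counts = {str(k): int(v) for k, v in data.get("counts", {}).items()}
    boxes = list(data.get("boxes", []))
    # Assumed pass rule: every expected label is seen exactly the expected number of times.
    passed = all(counts.get(label, 0) == n for label, n in expected.items())
    return InspectionResult(passed, counts, boxes)


def run_pipeline(
    image: bytes,
    prompt: str,
    expected: dict,
    inspector: Callable[[bytes, str], str],
    engineer: Callable[[bytes, str], str],
) -> Tuple[InspectionResult, Optional[str]]:
    """Run Agent 1; call Agent 2 only if the inspection fails."""
    result = parse_inspection(inspector(image, prompt), expected)
    if result.passed:
        return result, None  # nothing to diagnose, stop here
    # Hand the failure context to the second agent.
    failure = json.dumps({"expected": expected, "observed": result.counts})
    return result, engineer(image, failure)


if __name__ == "__main__":
    # Hypothetical stand-ins for the two model-backed agents.
    def fake_inspector(image: bytes, prompt: str) -> str:
        return json.dumps({"counts": {"resistor": 3, "capacitor": 1}, "boxes": []})

    def fake_engineer(image: bytes, failure: str) -> str:
        return "Corrective action plan for: " + failure

    spec = {"resistor": 4, "capacitor": 1}
    result, plan = run_pipeline(b"", "check if there are 4 resistors and 1 capacitor",
                                spec, fake_inspector, fake_engineer)
    print(result.passed, plan)
```

Passing the agents in as callables keeps the handoff logic testable without a GPU: the pass/fail decision is made in ordinary code from the parsed JSON, and the second model is called only on the failure path.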
