
RoboHazard Arena is a research drone simulation project that explores how small vision-language models (VLMs) perform when placed inside a live 3D environment. The goal is not to give the model step-by-step instructions but to give it a mission: reach the green target pad safely while avoiding obstacles. The drone can be flown manually, or control can be delegated to Qwen3-VL. At each decision step the model receives a camera view, telemetry, target information, obstacle information, and the list of available actions, then chooses a high-level movement such as move forward, strafe, ascend, descend, hover, or land. The project shows both the promise and the limitations of small VLMs for embodied control. Qwen3-VL-2B succeeds in some simple cases but fails or hallucinates in others, which suggests that more research, better simulation, and stronger evaluation are needed before small models can reliably control small vehicles in dynamic real-world environments.
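
The observe-then-act loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Observation` fields, the action identifiers, the prompt wording, and the `build_prompt` and `parse_action` helpers are all assumptions made for the example. The camera frame would be passed to the model as an image alongside this text and is left out here, as is the model call itself. Because the model can hallucinate, the parser falls back to hovering whenever the reply does not name exactly one known action.

```python
import json
from dataclasses import dataclass, asdict

# Action names follow the description above; the exact identifiers are hypothetical.
ACTIONS = ("move_forward", "strafe", "ascend", "descend", "hover", "land")


@dataclass
class Observation:
    """Text-side inputs for one decision step (field names are assumptions)."""
    telemetry: dict
    target: dict
    obstacles: list


def build_prompt(obs: Observation) -> str:
    """Serialize the mission, the observation, and the allowed actions into one prompt."""
    payload = asdict(obs)
    payload["available_actions"] = list(ACTIONS)
    return (
        "Mission: reach the green target pad safely while avoiding obstacles.\n"
        "Reply with exactly one action name.\n"
        + json.dumps(payload, sort_keys=True)
    )


def parse_action(reply: str, fallback: str = "hover") -> str:
    """Map free-form model text to a known action.

    Returns the fallback when the reply names no known action or more than one,
    so an ambiguous or hallucinated reply never becomes a movement command.
    """
    text = reply.strip().lower().replace(" ", "_")
    matches = [action for action in ACTIONS if action in text]
    return matches[0] if len(matches) == 1 else fallback
```

In this sketch, a reply such as "I will move forward now" maps to `move_forward`, while a reply that names an unknown action, or two actions at once, maps to `hover`. Defaulting to a hover is one conservative choice; a real controller might instead re-prompt the model or hand control back to the operator.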
10 May 2026