
ClutterBot is a proof-of-concept simulation platform that bridges natural language understanding and robotic task execution for household cleanup. Users issue commands like "pick up the phone and the toy train," which Gemini 3 Flash parses into structured task lists. The system generates the complete execution plan upfront, with Gemini deciding the order of the pick-and-place operations across the requested objects. The architecture combines a FastAPI backend hosted on Vultr (the central system of record), a Next.js frontend for real-time monitoring, and a MuJoCo physics simulation featuring a Franka FR3 manipulator in a room environment with everyday objects. The robot follows motions computed by inverse kinematics to move objects from scattered positions on a table to a collection bin, and each action is streamed over WebSocket for live visualization.

This prototype validates the feasibility of integrating large language models with robotic simulation pipelines, demonstrating how AI can translate high-level human intent into executable robot behaviors. The current implementation uses deterministic motion planning with hardcoded inverse kinematics rather than learned policies, but the framework establishes foundational patterns for future work on adaptive control, real hardware integration, and expanded object manipulation capabilities. Because Gemini generates the full task plan in a single API call, the plan-first approach makes the model's reasoning visible in the plan itself while keeping execution fast and deterministic, which makes it suitable for real-time interactive use.
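The plan-first flow described above can be illustrated with a minimal sketch. It assumes the model returns a JSON task list and that each task expands into one pick and one place action, serialized as a message that could be sent over a WebSocket. The JSON schema, the object names, and the names `expand_plan`, `to_messages`, `KNOWN_OBJECTS`, and `BIN` are all hypothetical and are not taken from the ClutterBot codebase; the model call itself is replaced by a fixed string.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical scene contents; the real object names are not given in the text.
KNOWN_OBJECTS = {"phone", "toy_train", "mug"}
BIN = "collection_bin"


@dataclass
class Action:
    step: int    # position in the overall execution plan
    kind: str    # "pick" or "place"
    obj: str     # object being manipulated
    target: str  # where the gripper goes: the object itself or the bin


def expand_plan(plan_json: str) -> list[Action]:
    """Turn an ordered task list into alternating pick/place actions."""
    plan = json.loads(plan_json)
    actions: list[Action] = []
    for task in plan["tasks"]:
        obj = task["object"]
        # Reject objects the simulation does not know about before executing.
        if obj not in KNOWN_OBJECTS:
            raise ValueError(f"unknown object in plan: {obj!r}")
        actions.append(Action(len(actions), "pick", obj, obj))
        actions.append(Action(len(actions), "place", obj, BIN))
    return actions


def to_messages(actions: list[Action]) -> list[str]:
    """Serialize each action as a JSON string, one message per action."""
    return [json.dumps({"type": "action", **asdict(a)}) for a in actions]


# Stand-in for the model's structured output (assumed format).
EXAMPLE_PLAN = '{"tasks": [{"object": "phone"}, {"object": "toy_train"}]}'

if __name__ == "__main__":
    for message in to_messages(expand_plan(EXAMPLE_PLAN)):
        print(message)
```

Validating the whole plan before any motion starts is one reason a plan-first design can stay deterministic: an unknown object fails fast instead of partway through execution.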
15 Feb 2026