Commander is a background voice agent for Windows that handles the routine and complex tasks knowledge workers face every day. Activate it with a double key-press, without leaving your current application, to get intelligence right where you need it. Dictation mode streams your speech directly to the cursor, letting you compose documents, emails and chat messages as fast as you can speak. Meeting mode records a continuous, timestamped transcript of any conversation and saves it to a plain-text file you can read, search and have summarised. Command mode, the heart of Commander, accepts natural-language instructions and translates them into desktop automations: opening applications, running shell commands, moving files, filling forms, and chaining multi-step workflows. Under the hood, Commander uses the Speechmatics real-time speech API for low-latency transcription, triggered by the hotkey. Desktop automation tasks are powered by Gemini Flash for fast, responsive execution, while Gemini Pro handles complex agentic orchestration, that is, planning and coordinating multi-step workflows. PyAutoGUI performs the desktop automation itself. A system-tray icon keeps the agent running in the background, so it's always there when you need it, and serves as a single configuration point for mode switching, microphone selection and general settings.
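The activation and model-routing ideas described above can be sketched roughly as follows. This is an illustrative outline only, not Commander's actual code: the class DoublePressDetector, the function choose_model_tier, the 0.4-second window and the "more than one step means Pro" rule are all assumptions made for the example. Timestamps are passed in explicitly so the logic can be exercised without a real keyboard hook.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DoublePressDetector:
    """Reports True when the same key is pressed twice within `window` seconds.

    Hypothetical helper; the 0.4 s default is an assumed threshold.
    """
    window: float = 0.4
    _last_key: Optional[str] = None
    _last_time: float = float("-inf")

    def on_press(self, key: str, now: float) -> bool:
        is_double = key == self._last_key and (now - self._last_time) <= self.window
        if is_double:
            # Reset so that a third rapid press starts a new pair.
            self._last_key = None
            self._last_time = float("-inf")
        else:
            self._last_key, self._last_time = key, now
        return is_double


def choose_model_tier(steps: List[str]) -> str:
    """Pick a model tier label for a parsed command.

    Assumed heuristic: a single step goes to the fast tier ("flash"),
    anything with several steps goes to the planning tier ("pro").
    """
    return "pro" if len(steps) > 1 else "flash"


if __name__ == "__main__":
    detector = DoublePressDetector()
    print(detector.on_press("ctrl", 0.0))   # first press
    print(detector.on_press("ctrl", 0.25))  # second press inside the window
    print(choose_model_tier(["open notepad", "type the meeting notes"]))
```

In a real agent the timestamps would come from a keyboard listener and the tier label would select which Gemini model receives the request; both are left out here to keep the sketch self-contained.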