
An AI agent that watches your desk through Apple's Continuity Camera Desk View. Point your iPhone at a textbook, sketch, or to-do list, hit the mic button, and ask DeskMind to solve, explain, summarize, or push items to Notion.

Built for a hackathon sponsored by Google (Gemini), Speechmatics, and other partners. Targeting the Speechmatics Awards (real-time voice STT + TTS).

Stack

Frontend: Vite + React + TypeScript, KaTeX for math, Tailwind CSS v4.

Backend: Node.js + Hono, SQLite via better-sqlite3.

Brain: Gemini Flash with self-escalation to Gemini Pro for hard problems.

Voice (STT + TTS): Speechmatics for both directions: real-time STT for push-to-talk commands (live transcript in the UI) and sub-150 ms streaming TTS for spoken responses. Single vendor, single API key.

Persistence: SQLite.

Integrations: Notion (pre-authorized workspace for the demo).

Demo URL: The backend runs on the demo Mac (Continuity Camera requires the laptop to be physically present anyway) and is exposed through a Cloudflare Tunnel for a public URL.
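The Flash-to-Pro self-escalation in the stack could work roughly as sketched below. This is a minimal illustration under stated assumptions, not DeskMind's actual code: the `ESCALATE` marker convention, the `ModelCall` type, and the function names are all hypothetical, and the two model callers are injected as plain functions so that nothing here depends on a specific Gemini SDK signature.

```typescript
// Hypothetical sketch: every name here (ModelCall, ESCALATE_MARKER,
// wantsEscalation, answerWithEscalation) is an assumption for illustration.

// A model call is any async function from prompt to reply text. In a real
// backend these would wrap the Gemini Flash and Gemini Pro clients.
type ModelCall = (prompt: string) => Promise<string>;

// Assumed convention: the fast model's system prompt tells it to reply with
// this marker when it judges the problem too hard to answer reliably.
const ESCALATE_MARKER = "ESCALATE";

// True when the fast model's reply begins with the escalation marker
// (case-insensitive, ignoring surrounding whitespace).
function wantsEscalation(reply: string): boolean {
  return reply.trim().toUpperCase().startsWith(ESCALATE_MARKER);
}

interface Answer {
  text: string;
  model: "flash" | "pro";
}

// Try the fast model first; call the stronger model only if the fast model
// itself asks for escalation.
async function answerWithEscalation(
  prompt: string,
  flash: ModelCall,
  pro: ModelCall,
): Promise<Answer> {
  const first = await flash(prompt);
  if (!wantsEscalation(first)) {
    return { text: first, model: "flash" };
  }
  const second = await pro(prompt);
  return { text: second, model: "pro" };
}
```

Letting the fast model decide keeps easy requests on the cheaper, lower-latency path, which matters for a voice loop; the trade-off is that a hard problem pays for two round trips.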
19 May 2026