1 year of experience
I'm here to learn and make an impact. I'm relentlessly working towards being a leader in innovation, technology and my community. My interests in artificial intelligence, embedded systems and bioinformatic tracking for wearables shape my core competencies. My goal is to merge my passion for robotics and automation with computational biotechnology, contributing to groundbreaking advancements in medical technologies.

Gofer AI VSS-G is a Google/Gemini-native video search and summarization system for enterprise robotics teams. Robotics organizations generate large amounts of human demonstration videos, robot teleoperation footage, simulation runs, QA tests, and failure clips, but most of that footage remains unstructured and difficult to search, audit, or reuse. Gofer turns that raw video into structured robotics intelligence. The system uses Gemini to analyze task videos, identify objects, segment actions into phases such as reach, grasp, lift, transport, and place, and detect outcomes such as successful execution, failed grasps, object slips, or unsafe interactions. Teams can then search their robotics memory with natural language queries like “show failed grasps,” “find all mug pickup examples,” or “summarize this robot run.” Gofer also includes an enterprise governance layer that logs model actions, policy decisions, risk scores, and audit metadata for every analysis or export. Finally, Gofer can export selected clips into a LeRobot-compatible preview format, helping robotics teams convert valuable video evidence into reusable training, evaluation, and simulation data.
19 May 2026
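
As a rough illustration of the structured output described above, here is a minimal sketch of how segmented clips might be stored and filtered. The phase and outcome labels come from the description; the `PhaseSegment` and `ClipRecord` schema, the `find_clips` helper, and the sample data are hypothetical, not Gofer's actual data model. A natural-language query such as "show failed grasps" is assumed to have already been mapped to these filters.

```python
from dataclasses import dataclass, field

# Phase names are taken from the project description; the schema is hypothetical.
PHASES = ("reach", "grasp", "lift", "transport", "place")


@dataclass
class PhaseSegment:
    phase: str                 # one of PHASES
    start_s: float             # segment start, seconds into the clip
    end_s: float               # segment end, seconds into the clip
    outcome: str = "success"   # e.g. "failed_grasp", "object_slip", "unsafe_interaction"


@dataclass
class ClipRecord:
    clip_id: str
    objects: list[str]
    segments: list[PhaseSegment] = field(default_factory=list)


def find_clips(records, *, phase=None, outcome=None, obj=None):
    """Return ids of clips with at least one segment matching every given filter."""
    hits = []
    for rec in records:
        if obj is not None and obj not in rec.objects:
            continue
        if any(
            (phase is None or seg.phase == phase)
            and (outcome is None or seg.outcome == outcome)
            for seg in rec.segments
        ):
            hits.append(rec.clip_id)
    return hits


# Made-up sample records, for illustration only.
records = [
    ClipRecord("run_001", ["mug"], [
        PhaseSegment("reach", 0.0, 1.2),
        PhaseSegment("grasp", 1.2, 2.0),
        PhaseSegment("lift", 2.0, 3.1),
    ]),
    ClipRecord("run_002", ["mug", "tray"], [
        PhaseSegment("reach", 0.0, 1.0),
        PhaseSegment("grasp", 1.0, 2.4, outcome="failed_grasp"),
    ]),
]

failed_grasps = find_clips(records, phase="grasp", outcome="failed_grasp")
mug_clips = find_clips(records, obj="mug")
```

Keeping each phase as its own timestamped record is one way to make both the search and the clip export operate on the same rows.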

Problem: AI agents are powerful, but they are blind to how work actually happens. They see prompts, files, and final outputs, not the real sequence of human decisions across windows.
Insight: The most valuable context for agents is not just documents. It is the timeline of human actions: what was seen, clicked, corrected, retried, and completed.
Solution: Gofer Agent Harness records multimodal workflows and converts them into structured, searchable agent memory.
Demo: We record a workflow, segment it into task steps, run multimodal understanding on AMD GPUs, and let an agent retrieve the workflow to answer questions or generate an SOP.
Business value: Every company has repetitive workflows trapped in screen recordings, calls, support sessions, and internal demos. Gofer turns that into reusable automation context.
Future: This becomes the data layer for robotics and embodied AI: human demonstrations become reusable context for agents and robots.
10 May 2026
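
The "segment it into task steps" part of the demo can be pictured with a small sketch. It assumes a recorded workflow is available as a list of timestamped UI events, and it starts a new step whenever the active window changes or the user pauses for longer than a threshold. The `Event` fields, the `segment_steps` helper, and the 5-second gap are illustrative assumptions, not the harness's actual logic.

```python
from dataclasses import dataclass


@dataclass
class Event:
    t: float       # seconds since the recording started
    window: str    # title of the active window
    action: str    # e.g. "click", "type", "retry"


def segment_steps(events, max_gap_s=5.0):
    """Group events into steps.

    A new step starts when the active window changes or when the pause
    since the previous event exceeds max_gap_s (a hypothetical heuristic).
    """
    steps = []
    for ev in sorted(events, key=lambda e: e.t):
        prev = steps[-1][-1] if steps else None
        if prev and ev.window == prev.window and ev.t - prev.t <= max_gap_s:
            steps[-1].append(ev)
        else:
            steps.append([ev])
    return steps


# Made-up timeline, for illustration only.
timeline = [
    Event(0.0, "Browser", "click"),
    Event(2.0, "Browser", "type"),
    Event(3.0, "Spreadsheet", "paste"),
    Event(12.0, "Spreadsheet", "correct"),
]

steps = segment_steps(timeline)
step_sizes = [len(step) for step in steps]
```

Each resulting step could then be passed to a multimodal model for a description and indexed for retrieval.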

Gofer AI is a human-to-robot learning platform that transforms everyday demonstration videos into robot-executable intelligence. Instead of hand-programming robot behavior or training it from scratch with conventional reinforcement learning, users can record a task using a phone or GoPro. Our system extracts keyframes using OpenCV, identifies task phases, objects, and human motion patterns using multimodal Gemini models, and converts demonstrations into structured semantic memory through a Video RAG architecture. These demonstrations are embedded, tagged, and stored for retrieval, allowing the system to reason over prior tasks and reuse knowledge. The extracted trajectories are converted into canonical action representations, then replayed and augmented in simulation using Isaac Lab and Real2Render2Real for scalable data generation. This pipeline enables behavior cloning, demo-initialized reinforcement learning, and diffusion-based policy training. The result is a robot-ready policy capable of sim-to-real transfer. Gofer AI bridges the gap between human intent and autonomous execution by turning video into intelligence.
15 Feb 2026
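
To make the keyframe step concrete, here is a minimal sketch of difference-based keyframe selection. The project uses OpenCV for this; the sketch below is a stdlib-only stand-in that treats each frame as a flat list of grayscale values, and the `select_keyframes` helper and its threshold are assumptions for illustration rather than the project's actual method.

```python
def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames, threshold=20.0):
    """Return indices of frames that differ from the last kept keyframe
    by more than `threshold` (a hypothetical tuning value)."""
    if not frames:
        return []
    keep = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[keep[-1]]) > threshold:
            keep.append(i)
    return keep


# Tiny made-up "video": 4-pixel grayscale frames.
frames = [
    [10, 10, 10, 10],
    [12, 11, 10, 9],    # nearly identical to frame 0
    [80, 90, 85, 95],   # large change: new keyframe
    [82, 91, 86, 94],   # nearly identical to frame 2
    [10, 10, 10, 10],   # large change again: new keyframe
]

keyframes = select_keyframes(frames)
```

Comparing against the last kept keyframe, rather than the immediately preceding frame, keeps slow drifts from being missed.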