The goal is to solve the data problem for training humanoids, i.e. to make humanoids easier to train, by turning live video into actionable world state for robots to train on.
In real time, the app shows detected objects, segmentation masks, and hand-object interactions, along with a VLM summary of each object's state and the possible robot actions. Finally, a JSON output is generated.
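As a rough illustration of what one entry in that final JSON might look like, here is a minimal Python sketch. The schema is an assumption, not the app's actual format: every class and field name (`WorldState`, `DetectedObject`, `HandObjectInteraction`, `mask_rle`, and so on) is hypothetical and chosen only to mirror the outputs listed above.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class DetectedObject:
    # All field names here are hypothetical, for illustration only.
    object_id: int
    label: str
    bbox_xyxy: List[float]  # [x1, y1, x2, y2] in pixels
    mask_rle: str           # segmentation mask; the encoding is an assumption
    state: str              # VLM summary of the object's state


@dataclass
class HandObjectInteraction:
    hand: str               # "left" or "right"
    object_id: int          # refers to DetectedObject.object_id
    in_contact: bool


@dataclass
class WorldState:
    frame_index: int
    timestamp_s: float
    objects: List[DetectedObject] = field(default_factory=list)
    interactions: List[HandObjectInteraction] = field(default_factory=list)
    possible_actions: List[str] = field(default_factory=list)


def to_json(state: WorldState) -> str:
    """Serialize one world-state record to a JSON string."""
    # asdict() recurses into nested dataclasses and lists of dataclasses.
    return json.dumps(asdict(state), indent=2)


# Made-up example values for a single video frame.
example = WorldState(
    frame_index=0,
    timestamp_s=0.0,
    objects=[
        DetectedObject(
            object_id=1,
            label="mug",
            bbox_xyxy=[120.0, 80.0, 220.0, 200.0],
            mask_rle="<placeholder>",
            state="upright, empty",
        )
    ],
    interactions=[HandObjectInteraction(hand="right", object_id=1, in_contact=True)],
    possible_actions=["pick up mug", "move mug to shelf"],
)

if __name__ == "__main__":
    print(to_json(example))
```

Keeping one record per frame (or per time window) would let a training pipeline read the file without re-running detection, segmentation, or the VLM; whether the app groups its output this way is not stated here.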