
ElevenClip AI is a multimodal AI video editing tool that helps creators turn long videos into short-form highlight clips. The system analyzes video, speech, and transcript context to identify engaging moments, then generates vertical clips with subtitles.

The project supports two editing modes. Normal mode selects strong segments and adds clean sentence-style subtitles for a standard short-form workflow. High-Retention Editing mode creates more dynamic clips by planning per-timestamp edits: face-first cropping, fast zooms, word-level emphasis for highlight phrases, sentence subtitles for normal speech, and lightweight visual effects.

A key design goal is to keep the creator in control. ElevenClip AI does not stop at generating a final video. After AI generation, users can review the selected clips, adjust start and end boundaries, remove middle sections, edit subtitle text and timing, change subtitle styling, re-render, and download the final result. This human-in-the-loop editing step makes the workflow more practical, because creators can fix AI mistakes and match the final clip to their own style before publishing.

The backend uses Whisper-style transcription for speech timing, Qwen2.5-VL for multimodal video understanding, FFmpeg for rendering, and AMD GPU Cloud for accelerated AI processing. The frontend is built with Next.js and provides an interactive editor for final review and polishing.

ElevenClip AI is designed for content creators, livestreamers, marketers, and educators who need to repurpose long videos into short clips faster while keeping creative control over the final edit.
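To make the review step more concrete, here is a minimal Python sketch of how boundary adjustments and removed middle sections could be resolved into the source ranges that a re-render keeps. It is an illustration under assumptions, not the project's actual code: the names `ClipEdit`, `kept_segments`, and `remap_time` are hypothetical, and times are assumed to be seconds in the source video.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ClipEdit:
    """Hypothetical record of a creator's edits to one AI-selected clip."""
    start: float  # adjusted clip start in the source video, in seconds
    end: float    # adjusted clip end in the source video, in seconds
    # Middle sections the creator removed, as (start, end) source times.
    cuts: list[tuple[float, float]] = field(default_factory=list)


def kept_segments(edit: ClipEdit) -> list[tuple[float, float]]:
    """Return the source-time ranges that remain after removing the cuts."""
    if edit.end <= edit.start:
        raise ValueError("clip end must be after clip start")
    segments: list[tuple[float, float]] = []
    cursor = edit.start
    for cut_start, cut_end in sorted(edit.cuts):
        # Clamp each cut to the clip boundaries.
        cut_start = max(cut_start, edit.start)
        cut_end = min(cut_end, edit.end)
        if cut_end <= cut_start:
            continue  # cut is empty or lies outside the clip
        if cut_start > cursor:
            segments.append((cursor, cut_start))
        cursor = max(cursor, cut_end)  # also handles overlapping cuts
    if cursor < edit.end:
        segments.append((cursor, edit.end))
    return segments


def remap_time(t: float, segments: list[tuple[float, float]]) -> float | None:
    """Map a source timestamp onto the edited clip timeline, or None if cut."""
    elapsed = 0.0
    for seg_start, seg_end in segments:
        if seg_start <= t <= seg_end:
            return elapsed + (t - seg_start)
        elapsed += seg_end - seg_start
    return None


if __name__ == "__main__":
    edit = ClipEdit(start=10.0, end=40.0, cuts=[(20.0, 25.0), (5.0, 12.0)])
    segments = kept_segments(edit)
    print(segments)
    print(remap_time(26.0, segments))
```

In a design like this, the kept ranges would feed the render step, and a helper such as `remap_time` would shift subtitle timestamps so that they stay aligned after a middle section is removed. Subtitles that fall entirely inside a removed section map to `None` and can be dropped.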
10 May 2026