The basic idea is that we scrape the site and use that data to generate a story about it. We then split the story into smaller chunks, each of which is rendered with the Stable Diffusion text-to-image API. Once we have the images, we match them to the audio and send everything to the clove backend, which assembles the video. The goal is to help people understand complex subjects with visual and audio aids; this is still very exploratory and likely only useful for specific use cases. One downside is that each video requires quite a few images, so I hit the API's maximum request limit, but I think I found ways to minimize that.
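The story-chunking step can be sketched roughly like this. This is a minimal illustration, not the project's actual code: the function name, the word-based chunking strategy, and the chunk size are all assumptions. Larger chunks mean fewer image-generation calls, which is one way to stay under the API request limit mentioned above.

```python
def chunk_story(story: str, max_words: int = 30) -> list[str]:
    """Split a story into word-bounded chunks, one per generated image.

    max_words is a tunable assumption: raising it reduces the number
    of chunks, and therefore the number of text-to-image API calls.
    """
    words = story.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

# Example: a 7-word story with max_words=3 yields three chunks.
chunks = chunk_story("one two three four five six seven", max_words=3)
print(chunks)
```

Each chunk would then be sent as a prompt to the text-to-image API, and the resulting images aligned with the corresponding audio segments.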
🔥 7-day hackathon 🦾 The first Stable Diffusion XL hackathon ✨ Even more models available! Use GPT-4, ChatGPT, Whisper, Cohere, AI21 and more! 🛠️ Incorporate Stable Diffusion 2.0 and Vercel software into your projects 🚀 Create your complete product MVP 🤝 Find co-founders and mentors on the lablab.ai platform