OpenAI Whisper Applications

Browse applications built on OpenAI Whisper technology. Explore PoC and MVP applications created by our community and discover innovative use cases for OpenAI Whisper technology.

Schrodinger ClarifaiLlama

We participated in an exciting 3-day hackathon by, combining Clarifai's industry-leading computer vision with Llama2's advanced natural language model developed by Meta.

Overview of "Schrödinger's ClarifaiLlama" app
For the hackathon, we built an AI-powered platform called "Schrödinger's ClarifaiLlama" that generates custom multimedia content on any topic by searching across indexed data.

Leveraging Clarifai's computer vision and Llama2's language capabilities
Our app showcases innovative ways to utilize Clarifai's deep learning for image and video analysis together with Llama2's ability to understand text and generate coherent content.

Ingesting and indexing multimedia data
The system ingests data from diverse sources like YouTube, PDFs, and images. Powerful vector search with Faiss indexes text, audio, and images for fast semantic retrieval.

Generating custom content from user queries
Users can query the system through a chat interface. Llama2 analyzes the queries and generates relevant ebooks or blog posts by pulling together content from the indexed multimedia data.

Transforming multimedia into cohesive content
Llama2's language mastery transforms disjointed multimedia information into smooth, cohesive ebooks and blog posts on the fly.

Benefits of combining multimedia search with natural language generation
By fusing robust semantic search across text, audio, and visuals with Llama2's content creation skills, our platform opens new possibilities for automated custom content generation.
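
The index-then-retrieve step described for this app can be illustrated with a toy sketch. This is not the team's implementation: it swaps Faiss and learned embeddings for plain word-count vectors and a brute-force cosine comparison, so it runs with only the Python standard library. The function names and sample snippets are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, documents: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Brute-force nearest-neighbour search; a vector index would replace this loop."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Hypothetical snippets standing in for indexed video, PDF, and image data.
documents = [
    "Transcript: the video explains how solar panels convert sunlight into electricity",
    "PDF excerpt: a history of the printing press in Europe",
    "Image caption: rooftop solar panels on a house",
]
results = search("how do solar panels work", documents)
for score, doc in results:
    print(f"{score:.2f}  {doc}")
```

In a pipeline like the one described, the retrieved snippets would then be passed to the language model as context for the generated ebook or blog post.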

Schrödinger's ClarifaiLlama Hackathon
Clarifai, LangChain, OpenAI, Llama 2, Whisper, Chroma

Echo Ai

Introducing Echo Ai: Revolutionise the way you approach meetings.
- Transcript Transformation with Autonomous Agents: Echo Ai redefines meeting transcripts. By employing cutting-edge autonomous agents, it doesn't just capture words; it deciphers the essence of discussions in real-time. Say goodbye to passive transcripts and welcome a dynamic understanding of your meetings.
- Organised Task Lists: No more post-meeting confusion about who's responsible for what. Echo Ai seamlessly organizes tasks discussed during the meeting, simplifying task delegation and tracking progress.
- Brief for Absentees: Team members who missed the meeting? Echo Ai has their back. It generates personalized overviews, ensuring that absentees remain informed without wading through lengthy transcripts.
- Actionable Follow-Up Suggestions: Keeping the momentum after a meeting is crucial. Echo Ai offers actionable follow-up suggestions, from crafting the perfect email to prioritizing tasks and scheduling follow-up meetings. Elevate your post-meeting efficiency.
- Curated Resources: Need to delve deeper into a discussed topic? Echo Ai provides curated resources, including links to insightful articles, best practices, and relevant case studies so that you always have the latest information when given the mic to present.
- Effortless Summaries: Condensing hours of dialogue into clear and concise summaries is now effortless. Echo Ai distils key takeaways, enabling you to grasp the heart of the meeting in a fraction of the time.



Competitive intelligence, sometimes referred to as corporate intelligence, refers to the ability to gather, analyze, and use information collected on competitors, customers, and other market factors that contribute to a business's competitive advantage. Competitive intelligence is important because it helps businesses understand their competitive environment and the opportunities and challenges it presents. Businesses analyze the information to create effective and efficient business practices. Utilizing autonomous agents for competitive intelligence offers a spectrum of strategic advantages. These intelligent agents, powered by advanced algorithms and machine learning, enable organizations to gather, process, and react to vast amounts of data from diverse sources in real time, using AI-based reasoning and without human intervention. By autonomously monitoring competitors' activities, product developments, market trends, and customer sentiments, these agents provide up-to-the-minute insights that facilitate rapid decision-making and agile strategy formulation. They minimize human bias and error while maximizing the depth and breadth of information collected, ensuring a comprehensive understanding of the competitive landscape. Furthermore, autonomous agents excel in scalability and efficiency, enabling businesses to monitor a wide range of competitors concurrently, identify emerging opportunities and threats, and allocate resources effectively. Ultimately, harnessing autonomous agents for competitive intelligence empowers organizations to proactively adapt to dynamic markets and gain a sustained competitive edge.



Introducing "VoiceStoryBoard," a groundbreaking application that leverages the power of artificial intelligence to revolutionize how stories are narrated and consumed. By utilizing cutting-edge AI voice cloning technology, our platform aims to create a dynamic and immersive storytelling experience. VoiceStoryBoard intelligently identifies characters in written scripts and assigns them unique, engaging voices from an extensive library. This allows listeners to experience stories with a level of depth and realism that text-to-speech systems cannot provide. But we don't stop there. Our platform uses contextual cues to adapt the narration style, ensuring the voice aligns with the mood and tone of the scene. Whether it's a climactic battle or a tender moment of dialogue, VoiceStoryBoard ensures that the voiceover complements the narrative perfectly. Our solution presents a substantial opportunity for businesses in the entertainment, education, and publishing sectors. It can be utilized to create engaging audiobooks, enhance video game narratives, assist language learning, and more. By transforming a traditionally static, single-voice narration into a dynamic, multi-voice experience, we aim to redefine how stories are told and consumed. With VoiceStoryBoard, we're not just reading stories—we're bringing them to life. As we continue to develop and expand our technology, we envision a world where everyone can experience their favorite narratives in a new, immersive way. Join us on this exciting journey and help shape the future of storytelling.

Character Mania


Soma is a groundbreaking solution for those tired of struggling with converting lengthy audio recordings into written text. With Soma, you can effortlessly convert audio to text and even translate it into multiple languages. But that's not all! Soma goes above and beyond by offering a unique summarization feature, condensing the audio's content for quick understanding, and a chat AI that allows users to ask questions about the audio content. Investing in Soma is an excellent opportunity due to its massive target market. Focusing on 1.35 billion English speakers and 480 million Arabic speakers worldwide, capturing just 5% of each group would mean 67.5 million potential English users and 24 million potential Arabic users. The demand for Soma's services is undeniably substantial. The business model revolves around a subscription method, featuring three plans: the Starter Plan (free), Premium Plan, and Ultimate Plan, each providing varying features and benefits. This straightforward approach allows users to access the app's capabilities with ease. Soma's success is further bolstered by its skilled team of four individuals, each possessing expertise in their respective fields. Their combined knowledge and dedication ensure that Soma will excel in the audio conversion and translation industry. By investing in Soma today, you become a part of an incredible journey to revolutionize audio processing. With a wide reach, an attractive business model, and a talented team, Soma is poised for remarkable achievements. Thank you for considering Soma, and we hope you join us in reshaping the future of audio conversion and translation. Have a fantastic day!
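
The market-size figures quoted above can be checked with a few lines of arithmetic. This is only a sanity check of the stated numbers (speaker counts and the 5% capture rate come from the description itself); the variable names are illustrative.

```python
from fractions import Fraction

# Speaker counts and capture rate as stated in the description.
speakers = {"English": 1_350_000_000, "Arabic": 480_000_000}
capture_rate = Fraction(5, 100)  # 5%, kept exact to avoid float rounding

potential_users = {lang: int(count * capture_rate) for lang, count in speakers.items()}

for lang, users in potential_users.items():
    print(f"{lang}: {users / 1e6:.1f} million potential users")
```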


BlaBlaLand - your personal AI companion

A platform-agnostic, AI-powered voice interface, enabling personalized digital character creation for immersive, fun, and transformative tech interaction. We want to address an emerging problem: the quest for new ways of communicating with technology, beyond the conventional keyboard input. Our goal is not only to promote the joy of discovery and product design but also to create barrier-free solutions for people, enabling users to interact with technologies such as artificial intelligence. We aim to create digital personalities and characters, ranging from fun little monsters, like our BlaBlaLand monster, to more or less familiar personalities. We see the value and importance of such digital personalities, especially in times of loneliness, as they always offer a listening ear and companionship. In addition, we have set ourselves the ambitious goal of allowing users to create their own characters. Our goal is to develop a solution that allows the generation of individual, AI-supported characters that can be integrated into various systems. These characters could serve as personalized voice assistants, with individual voices, personalities, and even areas of expertise. They could be implemented in any system with an internet connection, microphone, and speaker, from cars to home assistants to mobile apps. This solution would allow users to have a truly individual user experience. They could create a voice assistant that caters to their specific preferences and needs and keep this assistant consistent across different devices. Businesses could use such individualized characters to create a unique brand experience. For example, a car manufacturer could develop a special assistant for its cars that reflects the brand image. The potential use cases are wide-ranging, and with a subscription-based app or pay-per-custom-character pricing we see a high chance of monetizing the idea, especially with a little animated storyteller for children.

GPT-3.5, OpenAI, Whisper, Stable Diffusion, ElevenLabs

Retriever AI

Retriever AI is an innovative software solution that uses cutting-edge artificial intelligence technology to revolutionize the way users interact with their Windows operating systems. By leveraging the capabilities of OpenAI's Whisper Automatic Speech Recognition (ASR) system and ElevenLabs' advanced interaction, the application delivers a transformative user experience. Users can interact with their computers using natural spoken language, receive auditory feedback, and carry out tasks without the traditional visual interfaces. At its core, Retriever AI is powered by advanced machine learning algorithms that enable it to understand and respond to user commands effectively. With a simple "Start" command, users can invoke Retriever AI to assist them in navigating their system, opening applications, searching for files, and much more. It is like having a personal assistant dedicated to making your computer interactions more efficient and enjoyable. The software has a user-friendly interface that is easy to start and stop, and it is designed to be almost hands-free from the keyboard. It is built for visually impaired and blind users, and it is geared toward completing normal functions using natural language. In a digital world where efficiency and user experience are of utmost importance, Retriever AI serves as a valuable tool for enhancing productivity, simplifying tasks, and creating a more intuitive interaction between users and their Windows systems, even if you aren't visually impaired or blind. Whether you're a professional looking for a smarter way to navigate your workspace, a student aiming for better efficiency, or just a casual user hoping to get more out of your system, Retriever AI is designed to meet your needs.


Smart decision with AI and cognitive science

Strategic Thinking Systems (STS) lies at the convergence of AI, cognitive science, spatial, web3, and voice! It facilitates the organization and communication of thoughts in the context of important, strategic decisions. It puts users in charge of their content by allowing control over what is shared and with whom, providing innovative monetization opportunities. Steve Jobs famously said the computer was like a bicycle for the brain. We contend that AI is turning it into a powerful electric bike. What is needed now are safe and smooth paths for everyone to reach their respective destinations, engage and participate in this age of abundance, and realize their full potential. Our early prototype is ready for brave beta testers who are comfortable using a still-evolving platform. We are looking for passionate individuals and forward-looking organizations to submit use cases, provide content, and help steer the vision toward a tool that will work for them. Why is voice important to our mission? First, it's a question of accessibility and inclusion. Not everybody can read and write. Second, it's a matter of communication. During this hackathon, we've implemented the multilingual model from ElevenLabs, and we were delighted by the results when we tested it with content in English, French, Spanish, Polish, Dutch and German. Third, it's a requirement, a must-have for bringing collaborative ideation to the metaverse, where keyboards are cumbersome at best, but mostly impractical. We believe that a great voice interface, for output and input, will be a game changer for the space of spatial experiences. Fourth, we strongly believe that a well-designed and implemented voice interface will be the key to achieving and maintaining a state of flow, where your tools are neither impeding nor slowing down your thoughts.

Strategic Thinking Systems

Research assistant

This project revolves around the development of a research assistant using the Google Vertex AI PaLM 2 platform. The aim is to streamline the process of searching for and accessing academic papers from Google Scholar, providing researchers with a user-friendly and efficient tool. The research assistant is implemented as a Streamlit application, allowing users to input their search specifications and navigate through Google Scholar seamlessly. One of the key features of the research assistant is its automatic scraping functionality. Once the user provides their search criteria, the application scours Google Scholar across multiple pages, retrieving relevant papers. The scraped papers are then organized into a comprehensive dataframe, providing researchers with a structured overview of the available literature. Additionally, the application selects and provides downloadable PDF versions of the papers, making it convenient for users to access and read the full content. To further enhance the capabilities of the research assistant, it integrates with Google Vertex AI and LangChain. Google Vertex AI is a powerful machine learning platform that enables users to leverage advanced AI models and tools. By integrating with Vertex AI, the research assistant allows researchers to create a knowledge base from the downloaded papers, enabling them to extract insights and answer questions related to the content. LangChain, another crucial component, provides additional functionality for knowledge extraction. It offers a range of AI models and tools specifically designed for language processing and analysis. Integrating LangChain with the research assistant expands its capabilities, allowing researchers to delve deeper into the papers and extract valuable information.


Cohesive AI

Cohesive AI is focused on bringing cohesion back into organizations by integrating sources of data across the GTM/Engineering divide that most companies face. Masking the complexity of CRMs by transparently summarizing data from customer calls, engineering feature requests, and support tickets allows employees to focus on making customers successful and in turn driving increased revenue. All the data in a single source of truth without human intervention drives better product awareness for engineering, more accurate insights for sales leadership and ultimately brings all parts of the organization closer together. Cohesive AI starts at the Customer Story powered by a Monday AI Assistant interface, which uses Generative AI to create a customer story video: GPT-3.5 summarizes all of the customer activity transcripts and creates prompts for an image-generation model, which is used to create consistent background images that match the emotions and content of each phase of the customer story. Generative AI allows us to provide a holistic overview of a customer's story in an engaging way by combining data across various systems and producing an easy-to-watch 10-20 second video set against beautiful artwork. Cohesive AI currently leverages Whisper and Monday's AI Assistant interface to summarize and diarize recorded sales calls, automatically log the transcript into Monday and extract valuable insights such as relevant feature requests and potential ACV opportunities using GPT-3.5. Once the feature requests are identified, a Pinecone database loaded with all of the feature requests in Monday is leveraged to identify similar existing feature requests and automatically attach that customer as interested. Lastly, Cohesive AI provides a Monday AI Assistant interface for Product Management to easily engage with the field by notifying all relevant account teams of an interest to interview their customer.

Cohesive AI
Monday AI Assistant, ChatGPT, Whisper, GPT-3.5, Stable Diffusion


- AI-powered Personal Tutor: Vocava leverages state-of-the-art Large Language Models to act as a personalized language tutor. This AI can adjust its teaching strategy according to the user's fluency and interests, making each learning session tailored and efficient.
- Immersive Learning: Unlike traditional language apps that focus on vocabulary and grammar, Vocava focuses on creating immersive, context-based learning experiences. This mimics how people naturally acquire languages, making learning more intuitive and enjoyable.
- Language Translation and Conversation Practice: Vocava offers a translation module with added features like part-of-speech tagging and explanations. Moreover, users can engage in conversation with the AI tutor in the Chatterbox module, practicing their speaking and listening skills.
- Storytelling and Reading Comprehension: The Storytime module presents learners with stories in their target language and offers comprehension questions, reinforcing understanding in an entertaining way.
- Culture Corner: Vocava goes beyond language learning, offering insights into the culture and traditions of different regions. This helps users understand the context of the language and adds richness to the learning experience.
- Learning Through Games: Vocava's Arcade module presents a series of games that teach language in a fun and engaging manner. From Pictionary and MadLibs to Jeopardy, learning becomes a delightful activity rather than a tedious chore.
- Dynamic Vocabulary Learning: The Playground module allows learners to generate new vocabulary and phrases, save known phrases, and review them. All these phrases are embedded in a vector database for future reference.
- Analytics Dashboard: Vocava offers a comprehensive dashboard to track the learner's progress over time, making it easy to see improvements and identify areas for focus.
- Newsfeed: Users can access real-time content in their target language, practicing their skills with actual, relevant information.

The Irrelevant Elephant
Cohere, Chroma, OpenAI, DALL-E-2, Whisper, Anthropic Claude

WIM Whatd I Miss

Ask pointed questions about a given playlist and get back a summary, key points, and related timestamps generated via AI! 🤖 Could be a podcast series, a learning series, or something completely different! Can take in even very large/long series (tested on ~150 ~2-hour long podcasts)! This tool can take a YouTube transcript from one or more videos to be used to answer questions on a topic. The output will include a generated overall summary and generated key points from the video(s) by reading select parts of the transcript. The output will also include links to the relevant video, timestamped to the specific quote/snippet related to its respective key point. This tool can be useful to learners going through a video series playlist to review or identify where the series talks about a topic. It can also be used by educators in creating lessons from a series of videos. It can also be used for more casual enjoyment, such as reviewing what the hosts have said on a particular topic. This use case is especially relevant for podcasts, where hosts may revisit the same topic across multiple episodes. Although Anthropic's Claude model can take in 100k tokens, this still limits what the LLM can read. This project will attempt to read in all the selected transcripts for the available model, but if the transcript is too big for even the beefiest model, the tool will strategically select portions of the relevant transcripts based on the user's question.
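
The two mechanisms described above, picking transcript portions that fit the model's context limit and linking each key point to a timestamp, can be sketched as follows. This is a simplified illustration, not the project's code: relevance is scored by plain word overlap, the budget is counted in words rather than tokens, and the `Snippet` class, function names, and video IDs are all hypothetical. The `&t=<seconds>s` query parameter is the standard way to link to a moment in a YouTube video.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    video_id: str       # YouTube video ID (hypothetical values below)
    start_seconds: int  # where the snippet starts in the video
    text: str


def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def select_snippets(question: str, snippets: list[Snippet], max_words: int) -> list[Snippet]:
    """Greedily keep the most relevant snippets that fit in the budget.

    Word overlap stands in for a real relevance score, and a word count
    stands in for the model's token limit.
    """
    q = words(question)
    ranked = sorted(snippets, key=lambda s: len(q & words(s.text)), reverse=True)
    chosen, used = [], 0
    for snippet in ranked:
        cost = len(snippet.text.split())
        if not q & words(snippet.text):
            continue  # nothing in common with the question
        if used + cost <= max_words:
            chosen.append(snippet)
            used += cost
    return chosen


def timestamp_link(snippet: Snippet) -> str:
    """Link to the video, starting at the snippet's timestamp."""
    return f"https://www.youtube.com/watch?v={snippet.video_id}&t={snippet.start_seconds}s"


snippets = [
    Snippet("vid001", 95, "today we talk about sourdough starters and how to feed them"),
    Snippet("vid001", 1200, "a quick word from our sponsor"),
    Snippet("vid002", 30, "back to sourdough this week with a focus on hydration"),
]
selected = select_snippets("what did they say about sourdough", snippets, max_words=25)
for s in selected:
    print(timestamp_link(s), "-", s.text)
```

Only the selected snippets would be sent to the LLM for summarization, and each generated key point can carry the link of the snippet it came from.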

Anthropic Claude, BERT, Generative Agents, Whisper

Flow Genius

Flow Genius is an intuitive and user-friendly conversational bot creation platform designed to help businesses of all sizes build powerful chatbots without coding or technical knowledge. With Flow Genius, users can easily create custom chatbots that can handle customer queries, process transactions, and perform various other functions. The platform features a drag-and-drop interface, pre-built templates, and a variety of integrations with popular messaging platforms and business tools. Flow Genius is designed to be simple and easy to use, even for users without technical knowledge or coding experience. The platform's drag-and-drop interface and pre-built templates make creating custom chatbots that can handle a wide range of customer queries and tasks quick and easy. With Flow Genius, businesses can create chatbots that perform various functions, from handling customer queries and processing transactions to scheduling appointments, sending notifications, and more. This versatility makes the platform a valuable tool for businesses of all sizes and industries. Flow Genius integrates with popular messaging platforms and business tools, including Facebook Messenger, Slack, and Zapier. This makes it easy for businesses to connect with their customers on their preferred platforms and automate their workflows across multiple tools. Flow Genius provides users with detailed analytics and reporting tools to help them track their bot's performance, optimize their conversational strategies, and improve their customer engagement over time. This data-driven approach can help businesses save time, increase efficiency, and boost customer satisfaction.

The Astro Cats-trophes
OpenAI, AI21 Labs, Whisper, Redis, LangChain


Our app is a unique platform that offers both content creators and users an innovative way to generate and access various types of content. The app has two interfaces: Explorer and Creator, where visitors can access various types of content, including videos, articles, audios, and tweets while creators can upload, edit and use AI tools to generate content. Our app aims to solve the problem of time-consuming content creation and fragmented content discovery. By offering multiple types of content in a single platform, we aim to increase user engagement and retention while offering creators an opportunity to monetize their content. Market: The global content creation and discovery market is expected to reach $892.5 billion by 2027, with an annual growth rate of 16.8%. The increasing demand for video content, podcasts, and other forms of digital media presents a significant opportunity for our app to succeed in the market. Competitive analysis: Our app faces competition from established content creation and discovery platforms such as YouTube, Medium, and Spotify. However, our unique value proposition of offering multiple types of content in a single platform, along with AI generative tools for creators, sets us apart from competitors. Marketing strategy: Our app will be marketed primarily through social media, paid advertising, and partnerships with content creators and publishers. We will also offer referral programs to incentivize users to invite their friends and family to use the app. Revenue model: We plan to generate revenue through a freemium model, where the app is free to access for users, but creators pay for premium tools and features. We will also offer subscription plans for users to access premium content and an advertising model, where advertisers can display ads on the app.



ChattyRental is an innovative AI-powered software platform designed to revolutionize the room rental experience. By integrating cutting-edge AI technology into the booking process, ChattyRental enables room rental agencies to enhance their commercial systems and streamline their operations, resulting in significant cost savings and improved customer satisfaction. The platform's primary features include AI-driven conversational booking, personalized recommendations, and intelligent search capabilities, allowing customers to find and book rooms with ease. ChattyRental's mission is to provide a seamless and efficient booking experience, fostering loyalty among renters and driving growth for room rental agencies. Initially targeting agencies in Madrid, ChattyRental aims to expand its reach globally, offering its transformative solution to agencies worldwide. The platform's unique value proposition lies in its ability to optimize key processes, reducing sales department costs by up to 36%, while simultaneously delivering a more personalized and enjoyable experience for renters. ChattyRental's SaaS solution comprises three layers: the main dashboard, maintenance and issues, and the commercial dashboard. By focusing on value, innovation, collaboration, and transparency, ChattyRental aims to become the go-to platform for room rental agencies, providing tailored solutions to meet the evolving needs of their customers and empowering them to excel in the industry.


AI Alliance 4 Voice Analytics

Call centers handle an immense volume of customer interactions every day, and it's crucial for businesses to evaluate the quality of these interactions to maintain a high customer satisfaction rate. Traditionally, quality and assurance auditing has been a time-consuming and manual process, where human auditors listen to and evaluate customer calls. This approach is prone to human error, inconsistency, and scalability challenges. Voice Analytics with AI aims to revolutionize the quality and assurance auditing process of call centers by transcribing and analyzing audio recordings using GPT-3 and Whisper models. The proposed solution leverages Automatic Speech Recognition (ASR) and Large Language Models (LLM) to automate and streamline the quality and assurance auditing process. First, the system summarizes key information in call recordings, such as the operator's name, issues, solutions, and other relevant data points. After that, the solution conducts sentiment analysis to evaluate the tone and mood of the conversation using NLP and LLM. In addition, the LLM evaluates customer experience and satisfaction levels and provides scores for each. Last but not least, the model ends the report of each call with feedback and insights about the operator's performance and suggests areas for improvement. Overall, the proposed solution has the potential to transform the call center industry, providing businesses with valuable, accurate insights into their customer interactions and enabling them to take proactive steps to train their operators and improve their overall customer experience.

AI Alliance


Our project aims to improve the academic performance of undergraduate students in their first cycles, helping them to adapt to the academic environment and develop effective study skills. To achieve this, we have created an innovative platform that integrates artificial intelligence technologies, such as GPT-4 and Whisper, to offer a series of tools that facilitate the learning process. The tools offered by our platform include automatic transcription of lectures, which allows students to access the information presented in a more accessible way; summary generation, which helps them understand and retain essential information from lectures; and personalized learning paths, which guide students to the most appropriate resources and activities for their individual needs. In addition, our platform provides research and writing assistance, which facilitates the completion of high-quality academic papers. All of this helps to reduce the stress and frustration associated with adjusting to the university environment and improve study skills, which in turn translates into better academic performance. In the $5.76 billion grade app market, our solution has great potential to revolutionize the way students deal with academic challenges. Moreover, being aligned with Sustainable Development Goal 4, our project also contributes to improving the quality of education and promoting academic success in a broader context. We are currently seeking support and investment to incorporate GPT-4 and Whisper into our platform and optimize it, thus ensuring that our solution is at the forefront of artificial intelligence technology and has an even greater impact on the lives of university students.

Smart Notes Learn Better Fast

Clear Speak

The SaaS (Software as a Service) product is a powerful speech analysis tool that utilizes state-of-the-art speech-to-text technology to transcribe recorded audio and analyze it for various speech patterns. The tool identifies areas where the user may need to improve their speech, including reducing stuttering and improving clarity and coherence. Once the audio is transcribed, the tool provides users with in-depth statistics on their speech patterns, including word frequency, common mistakes, and areas that need improvement. These statistics are displayed in an easy-to-understand format that allows users to quickly identify their strengths and weaknesses. One of the most unique aspects of the tool is its ability to give real-time feedback on speech patterns. As users speak, the tool analyzes their speech in real-time and offers suggestions on how to improve their communication skills. This feedback is critical for users who want to improve their speech in real-life situations and gain confidence in their ability to communicate effectively. The tool also provides personalized suggestions for each user based on their speech patterns and areas for improvement. For example, if the tool identifies that a user tends to stutter on certain words or phrases, it will offer personalized exercises and techniques to help them reduce their stuttering and speak more fluently. Overall, the speech analysis SaaS product is an invaluable tool for anyone looking to improve their communication skills, reduce stuttering, and gain confidence in their ability to speak effectively. With its cutting-edge speech-to-text technology and real-time feedback, the tool offers a powerful solution for anyone looking to improve their speech and become a more effective communicator.
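
As a rough illustration of the transcript statistics mentioned above (word frequency and repeated words), here is a minimal sketch that works on text already produced by a speech-to-text step. It is not the product's method: the filler list is a hypothetical example, and counting immediate word repetitions is only a crude proxy for disfluency, not a clinical measure of stuttering.

```python
import re
from collections import Counter

# Hypothetical filler list; a real tool would likely use a richer, per-language set.
FILLERS = {"um", "uh", "er"}


def speech_stats(transcript: str) -> dict:
    """Compute simple speech-pattern statistics from a transcript string."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(tokens)
    # A word immediately followed by itself ("the the") is flagged as a repeat.
    repeats = [a for a, b in zip(tokens, tokens[1:]) if a == b]
    return {
        "total_words": len(tokens),
        "top_words": counts.most_common(3),
        "filler_count": sum(counts[w] for w in FILLERS),
        "immediate_repeats": repeats,
    }


stats = speech_stats("Um, I I think the the plan is, uh, the best plan")
for name, value in stats.items():
    print(f"{name}: {value}")
```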

Coders Legion
WhisperChatGPTOpenAI gymGPT-3

MyQuiz AI

Have you ever wanted to test your knowledge on a specific topic, but found traditional methods of studying and taking quizzes to be tedious and unengaging? Look no further, because we have the solution! Introducing MyQuiz.AI, a trivia game that utilizes the power of AI to generate questions tailored to your interests and abilities. With just your voice, you can embark on a fun and challenging quiz journey that will leave you wanting more. So sit back, relax, and get ready to put your knowledge to the test with MyQuiz.AI! Our team has developed a cutting-edge speech-based game that incorporates advanced technologies to deliver a highly engaging and personalized user experience. The system utilizes the Whisper API for speech recognition, Redis for data storage, and ChatGPT to generate questions and validate user answers. The quiz asks unique questions every time, tailored to the user's level of knowledge and abilities. The system's machine learning capabilities ensure that the difficulty of the questions is appropriate, challenging, and matched to the user's age. Speech recognition through the Whisper API provides an immersive and interactive experience, allowing users to state their age and choose a quiz category by voice. This feature also makes the quiz accessible to users with disabilities or those who prefer voice-based interactions. The Redis database stores questions, answers, and user responses. Overall, our speech-based quiz game represents a significant step forward in the field of educational technology. With its advanced algorithms and machine learning capabilities, the system offers a new and innovative way for users to learn and engage with the material. The quiz's personalized approach, speech-based interface, and advanced features make it a powerful educational tool that has the potential to revolutionize the way people learn and retain knowledge.
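One way the question loop could fit together is sketched below. Everything here is an assumption made for illustration: the age thresholds, the prompt wording, and the key format are hypothetical, a plain dictionary stands in for Redis, and `generate` stands in for the ChatGPT call.

```python
from typing import Callable, Dict, List


def difficulty_for_age(age: int) -> str:
    """Map the player's age to a difficulty label (hypothetical thresholds)."""
    if age < 10:
        return "easy"
    if age < 16:
        return "medium"
    return "hard"


def build_prompt(age: int, category: str, already_asked: List[str]) -> str:
    """Build the text prompt sent to the question generator."""
    level = difficulty_for_age(age)
    avoid = "; ".join(already_asked) if already_asked else "none"
    return (
        f"Write one {level} trivia question about {category} "
        f"for a player aged {age}. Do not repeat any of these: {avoid}."
    )


def next_question(
    user_id: str,
    age: int,
    category: str,
    generate: Callable[[str], str],   # stand-in for the ChatGPT call
    store: Dict[str, List[str]],      # stand-in for Redis
) -> str:
    """Generate a new question and remember it so it is not asked again."""
    key = f"quiz:{user_id}:asked"     # hypothetical key naming scheme
    asked = store.setdefault(key, [])
    question = generate(build_prompt(age, category, asked))
    asked.append(question)
    return question
```

Passing the already-asked questions back into the prompt is one simple way to keep questions unique per user; the age and category would come from the transcribed voice input.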

Space Cats
ChatGPTOpenAI gymRedisWhisperGPT-3

Smart Lecture

Our app is designed to address some common problems that students and learners face when trying to engage with lectures. Difficulty taking comprehensive notes: Many students struggle to capture all of the key points and details of a lecture while also actively listening and processing the information being presented. This can result in incomplete or inaccurate notes that make it harder to study and review later. Time-consuming manual transcription: In order to review lectures later, students may need to manually transcribe the audio recordings, which can be time-consuming and tedious. Limited ability to identify important information: Even with comprehensive notes or transcripts, it can be challenging to distill the most important information from a lecture, especially if there is a lot of extraneous detail or repetition. Our app aims to address these problems by automating the process of creating summaries, notes, and questions from lecture audio. By using Whisper to transcribe the audio to text and ChatGPT to generate a summary, notes, and questions, the app streamlines the process of reviewing lectures and helps learners more easily identify and retain key information. Here is a possible flow for the app: The user opens the app and selects the lecture they want to review. The app uses Whisper to transcribe the lecture audio to text. The text is passed to ChatGPT, which generates a summary, notes, and questions based on the content of the lecture. The user can review the summary, notes, and questions generated by the app, edit them as needed, and save them for future reference. Overall, this app has the potential to be a valuable tool for learners who want to optimize their engagement with lectures and maximize their retention of important information.
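The flow described above can be sketched as a small pipeline. This is only an outline under stated assumptions: `transcribe` and `complete` are hypothetical stand-ins for the speech-to-text and ChatGPT calls, and the prompt wording is invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LectureReview:
    transcript: str
    summary: str
    notes: str
    questions: str


def review_lecture(
    audio_path: str,
    transcribe: Callable[[str], str],  # stand-in for the speech-to-text step
    complete: Callable[[str], str],    # stand-in for the ChatGPT step
) -> LectureReview:
    """Transcribe a lecture, then derive a summary, notes, and questions."""
    transcript = transcribe(audio_path)

    def ask(instruction: str) -> str:
        # Each artifact is requested separately against the same transcript.
        return complete(f"{instruction}\n\nLecture transcript:\n{transcript}")

    return LectureReview(
        transcript=transcript,
        summary=ask("Summarize the lecture in one paragraph."),
        notes=ask("List the key points as study notes."),
        questions=ask("Write three review questions."),
    )
```

Keeping the two model calls injectable makes the flow easy to test with fakes before wiring in real services; the returned object is what the user would then review, edit, and save.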

The bad batch

Health BOt

"Health Solutions Uganda" is an innovative and user-friendly health application that uses artificial intelligence to provide Ugandan citizens with access to vital health information. The app is designed to address the challenges that many Ugandans face in accessing quality healthcare services, particularly in rural areas where health facilities are scarce. The application is built using state-of-the-art technology, including ChatGPT API for natural language processing and speech-to-text capabilities, Streamlit for the user interface, and the Reddit API to access relevant health information. These tools work together seamlessly to provide a comprehensive and user-friendly health platform that meets the unique needs of Ugandan citizens. Through the app, users can access reliable and up-to-date information on common illnesses, including symptoms, causes, and treatments. They can also receive personalised recommendations based on their symptoms and medical history, as well as find nearby health facilities and book appointments. The app can also provide educational resources on topics such as sexual health, maternal and child health, and HIV/AIDS. The app's user-friendly interface and speech-to-text capabilities make it accessible to all Ugandan citizens, regardless of their level of education or literacy. This is particularly important in rural areas where illiteracy rates are high. Additionally, the app's use of local languages such as Luganda and Runyakitara ensures that it is inclusive and accessible to all Ugandans. Overall, "Health Solutions Uganda" is a powerful tool that has the potential to revolutionise healthcare in Uganda by providing access to vital health information and services to all citizens, regardless of their location or socioeconomic status.


WeCare Caretaker Assistant

We have built a solution for agencies that provide caretaker services to parents looking for babysitters for their children. When users call the agency after business hours or when agents are not available, we route them to leave a voicemail with their babysitting requirements and contact number. With this solution, agents can focus on more complex tasks rather than manually retrieving voicemails, analysing them, and coming up with a resolution. When the caller dials the agency's phone number outside office hours, or during peak hours when no agent is available, we route the caller to the voicemail menu, where we ask them to leave a voicemail with their babysitting requirements, contact details, and so on. Once the voicemail is available, we extract it and convert the speech to text using OpenAI's Whisper API, which gives us the voicemail transcription. We then use a carefully engineered prompt to have the ChatGPT API return all the required information from the voicemail, such as intent, sentiment, and babysitting date and time, in JSON format. Using this information, we query the EmployeeSchedule table in the H2 database. Once we know which babysitters are available, we query RedisJSON for their employee profile information, such as name, contact details, date of birth, languages spoken, and image. We then build a PDF document using the iText library. This PDF, containing the available babysitters' information, is sent to the caller on WhatsApp. Finally, we send an SMS to the agency as an alert about the customer enquiry and ask them to get in touch with the customer.
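The middle of this pipeline (structured fields from the model, then a schedule lookup) might look roughly like the sketch below. It is an illustration under assumptions: an in-memory SQLite database stands in for H2, and the JSON field names and the columns of EmployeeSchedule are hypothetical, since the source does not give the schema.

```python
import json
import sqlite3

# Hypothetical field names; the source only says the reply includes intent,
# sentiment, and babysitting date and time in JSON format.
REQUIRED_FIELDS = {"intent", "sentiment", "date", "start_time", "end_time"}


def parse_voicemail_fields(model_reply: str) -> dict:
    """Parse the model's JSON reply and check that the expected fields exist."""
    data = json.loads(model_reply)
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"model reply is missing fields: {sorted(missing)}")
    return data


def available_sitters(conn: sqlite3.Connection, date: str, start: str, end: str) -> list:
    """Return IDs of unbooked employees whose shift covers the requested window.

    Times are zero-padded "HH:MM" strings, so string comparison orders them correctly.
    """
    rows = conn.execute(
        "SELECT employee_id FROM EmployeeSchedule "
        "WHERE work_date = ? AND start_time <= ? AND end_time >= ? AND booked = 0 "
        "ORDER BY employee_id",
        (date, start, end),
    ).fetchall()
    return [row[0] for row in rows]
```

Validating the model's reply before querying guards against malformed or partial JSON; the returned employee IDs would then be used to look up profiles for the PDF.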

ChatGPTWhisperDALL-E-2Cohere GenerateCohere ClassifyRedis


People with dyslexia often find it hard to read and write, primarily because it's "hard for them to mentally lock in", as one person with dyslexia described it. This makes studying a struggle. In most cases, however, people with this condition find it easier to read on electronic devices than on the physical document, although it still helps to have a guide when difficulties arise. But what if a person doesn't have access to a guide? This is where NotAlone comes in. Dyslexia can be an obstacle, but with LLMs and deep learning making people's lives easier, we can take a step toward making reading and writing more accessible to everyone, so that barriers like these don't dominate a person's will to learn and write. NotAlone is specifically designed to empower individuals with dyslexia by providing a seamless, learning-rich writing environment tailored to their unique needs. Our goal is to ensure that no one feels left behind in today's fast-paced world. People with dyslexia often prefer speaking over writing, so many rely on an ASR app. Inspired by this, we provide a Whisper-based STT feature to help them type by speaking. ChatGPT-based writing assistance is another essential feature of NotAlone. It provides personalized guidance and support by helping users overcome challenges in writing and reading: 1. Helping them write about anything they want 2. Grammar correction 3. Rewriting 4. Explaining a word or phrase 5. Summarizing a paragraph 6. Suggesting synonyms 7. Chat-based assistance. Some words remain hard to read even with this help, and this is where the text-to-speech service comes in: it can read a word or a paragraph aloud whenever users feel stuck. Beyond these core features, we provide an interface that allows users to adjust settings such as line height, word spacing, and background color, along with custom fonts designed specifically for dyslexic readers.



Introducing our innovative Streamlit application, which harnesses the power of OpenAI GPT-3 to generate multi-layer encryption and decryption codes for secure communication. This application is designed to help users easily encrypt and decrypt their messages using state-of-the-art encryption techniques, making it nearly impossible for unauthorized parties to access their sensitive information. To use this application, users can input their speech message through OpenAI Whisper, which transcribes the message accurately. The application then uses GPT-3 to generate a multi-layer encryption code, which can be customized by the user according to their specific requirements. Once the encryption code is generated, it is applied to the speech message, making it indecipherable to anyone without the decryption code. Users can choose from a variety of encryption algorithms and key lengths, and can also input their own unique encryption key for added security. The application also allows users to save and retrieve their encryption codes for future use, making it easy to communicate securely with their contacts. In addition to its powerful encryption capabilities, the application is also highly user-friendly, with a clean and intuitive interface that allows users to easily navigate and customize their encryption settings. With its cutting-edge technology and ease of use, this Streamlit application is the perfect solution for anyone looking to communicate securely and confidently in today's digital world.

team phoeniks
OpenAI gymWhisperGPT-3


Our app provides a fully digitalized package for our clients. We offer a range of services, including the creation of a logo, ads that can be used on social media platforms such as Facebook and Instagram, a website, and marketing videos. To enhance the quality of our videos, we use deepfake technology, which generates faces that are then placed onto the video to create a more engaging advertisement. To create the ads, we use two different technologies, DALL-E and GPT-3. DALL-E is used to generate images, while GPT-3 is used for text. The logo is also created using DALL-E for the image and GPT-3 for the text under the image. For the website, we will use DALL-E for images and GPT-3 to code the website itself. Additionally, we will be adding automation to our app to streamline the entire process. Impact: Our app offers a comprehensive range of services that can potentially have a significant impact on the market. The fields in which our app can be used include branding, digital marketing, web development, and video production. One potential way to use client data and image requests for further work is to analyze the data to identify trends and patterns in the types of images that clients are requesting. This can help us tailor our services to the specific needs and preferences of our clients. For example, if we notice that clients are frequently requesting certain types of images or logos, we could focus on developing more options in that style. Overall, our app has the potential to make a significant impact on the market and attract a wide range of clients.

RedisCodexWhisperDALL-E-2ChatGPTStable DiffusionGPT-3

Liquid LMS

The Problem: Traditional education has not changed much in the last century, and it fails to meet the diverse needs of students. One-size-fits-all teaching methods, outdated curricula, and limited access to resources often result in disengaged students who are unprepared for the workforce of tomorrow. The Solution: We propose a revolutionary approach to education that integrates AI and new technology. By leveraging the power of AI, we can create personalized learning experiences that cater to each student's unique needs, interests, and abilities. The Implementation: Our approach is built on three pillars: a. Adaptive Learning: Our AI-powered algorithms will analyze each student's performance data to create a customized learning path. This will help students learn at their own pace and achieve better learning outcomes. b. Immersive Learning: We will use virtual and augmented reality to create immersive learning experiences. This will enable students to explore complex concepts in a more engaging and interactive way. c. Collaborative Learning: We will facilitate collaborative learning by leveraging AI-powered tools that enable students to work together on projects and assignments in real-time. The Benefits: Our approach to education will offer several benefits, including: a. Improved Learning Outcomes: Personalized and engaging learning experiences will help students achieve better learning outcomes and prepare them for the workforce of tomorrow. b. Cost-Effective: Our AI-powered approach to education will be cost-effective as it will reduce the need for physical classrooms and expensive resources. c. Accessible: Our approach will be accessible to all students regardless of their location, socioeconomic status, or learning abilities. Our approach to education will revolutionize the way we teach and learn. By leveraging the power of AI and new technology, we can create personalized, engaging, and cost-effective learning experiences that prepare students for tomorrow.



MediFix is an AI-powered assistant that utilizes the latest technologies such as GPT-3.5, Whisper, and gTTS to provide users with valuable healthcare information. With its advanced capabilities, MediFix is able to analyze symptoms mentioned by users and provide them with preventive measures to help them stay healthy. One of the key features of MediFix is its ability to support both voice and text input. This means that users can either speak to the assistant or type their symptoms, making it accessible to a wide range of users. When users input their symptoms, MediFix uses GPT-3.5 to analyze the information and provide relevant information on the causes of the symptoms and possible preventive measures. The assistant is trained on a vast amount of medical data, allowing it to provide users with accurate and reliable information. In addition, MediFix uses Whisper to transcribe spoken input, and it aims to provide a personalized experience for each user. By understanding the user's context and history, MediFix is able to provide customized recommendations and preventive measures that are specific to their needs. Finally, gTTS is used to deliver the information to the user as speech, in a clear and easy-to-understand manner. This ensures that users are able to comprehend and follow the recommendations provided by MediFix. Overall, MediFix is a powerful healthcare assistant that leverages the latest AI technologies to provide users with accurate and personalized healthcare information. With its support for both voice and text input, MediFix is accessible to a wide range of users, making it an invaluable tool for anyone looking to take control of their health.



Our application is designed to help individuals, especially those with concentration or mental health issues, to learn effectively. Leveraging advanced technologies like GPT-3, Whisper, DALL-E-2, Python, and React Native on the front end, our application is unmatched in its ability to provide personalized learning experiences tailored to specific dysfunctions. With our app, users can access a variety of learning resources such as a to-do list, interactive exercises, and personalized quizzes. The app's intelligent algorithm also tracks the user's progress and offers personalized recommendations to help them learn more effectively. This approach ensures that users are engaged and motivated throughout their learning journey. One of the most distinctive aspects of our app is its ability to adapt to the specific needs of individual users. For example, if a user has a learning disability, the app will adjust the pace and difficulty level of the content to suit their needs. Similarly, for users with concentration issues, the app will provide techniques and exercises to help them stay focused. Our app is available on a freemium model for private use, while we also offer it for sale to schools, learning centers, and care facilities for people with disabilities. With these revenue streams, we aim to make our app accessible to everyone who needs it, regardless of their financial situation. In summary, our app is a game-changer for personalized learning, offering a unique and adaptive approach that is unmatched by any other application. With its ability to help those with mental health and learning difficulties, we believe our app has the potential to make a significant positive impact on the lives of millions of people. *Right now the app offers a to-do plan and helps you find important information in text files; more features are coming soon.*

Inclusive Solutions


During the hackathon, we fine-tuned GPT-3 and built a self-analysis tool that helps one objectively assess their problem and develop new ideas for solving it. It can be used by people who can't access mental health care because of high prices and stigma. It is based on CBT and should be highly effective in the following cases: 1. A person has a problem and doesn't know how to solve it. For example, "I can't keep up with deadlines," or "My parents are overprotective." 2. A person can't make a decision. "Should I move?", "Should I accept an offer from a new company?" etc. 3. A person can't sort out their thoughts. "I can't understand why I'm so uncomfortable being a dad," "Why have I become so irritable?" etc. 4. A person wants to improve their relationship. "I'm so jealous," "We fight all the time," "I'm not happy with my wife. I cheated, and I feel guilty". In therapy, people who are objective about their situation and able to set specific goals tend to achieve better results. This tool does exactly that. A typical session consists of three parts: 1. Analysis. This part includes questions that make the person analyze various aspects of the situation and draw an objective picture. The essence of this part is the transition from an emotional to a rational perception of reality. 2. Empathy. It consists of a comprehensive generalizing statement aimed at supporting the client emotionally. 3. Decision. It consists of questions that allow the person to analyze the availability of resources and ways to solve the problem. Questions force the person to move from emotions to concrete steps toward the goal.

Elomia Health

Phoenix Whisper

According to research by J. Birulés-Muntané and S. Soto-Faraco (10.1371/journal.pone.0158409), watching movies with subtitles can help us learn a new language more effectively. However, the traditional way of showing subtitles on YouTube or Netflix does not offer a good way to check the meaning of new vocabulary or to understand complex slang and abbreviations. We found that if we display dual subtitles (the original subtitle of the video and the translated one), the learning curve immediately improves. In research conducted in Japan, the authors concluded that the participants who viewed the episode with dual subtitles did significantly better ( After understanding both the problem and the solution, we decided to create a platform for learning new languages with dual active transcripts. When you enter a YouTube URL or upload an MP4 file in our web application, the app produces a web page where you can view the video with a transcript running next to it in two different languages. We have accomplished this goal and successfully integrated OpenAI Whisper, GPT, and Facebook's language model into the backend of the app. At first, we used Streamlit for the app, but it does not provide a transcript that automatically moves with the audio timeline, nor does it give us the ability to design the user interface, so we created our own full-stack application using Bootstrap, Flask, HTML, CSS, and JavaScript. Our business model is subscription-based and/or a one-time purchase based on usage. Our app isn't just for language learners. It can also be used by writers, singers, YouTubers, or anyone who would like their content to reach more people by adding different languages to their videos and audio. Due to the limitations of the free hosting plan, we could not deploy the app to the cloud for now, but we have a simple website where you can have a quick look at what we are creating (
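A dual transcript that follows the playback position could be modeled roughly as below. This is a sketch under assumptions, not the team's code: segments are taken to be dictionaries with `start`, `end`, and `text` keys (the shape of the segments returned by the open-source Whisper package), `translate` is a hypothetical stand-in for the translation model, and the function names are invented.

```python
from bisect import bisect_right
from typing import Callable, Dict, List, Optional


def build_dual_transcript(
    segments: List[Dict],
    translate: Callable[[str], str],  # stand-in for the translation model
) -> List[Dict]:
    """Pair each timed segment with its translation."""
    return [
        {
            "start": seg["start"],
            "end": seg["end"],
            "original": seg["text"],
            "translated": translate(seg["text"]),
        }
        for seg in segments
    ]


def active_index(dual: List[Dict], t: float) -> Optional[int]:
    """Return the index of the row being spoken at time t (seconds), if any.

    Assumes rows are sorted by start time and do not overlap.
    """
    starts = [row["start"] for row in dual]
    i = bisect_right(starts, t) - 1  # last row that starts at or before t
    if i >= 0 and t < dual[i]["end"]:
        return i
    return None  # t falls in a gap between segments or before the first one
```

On the front end, the player's current time would be passed to a lookup like `active_index` on each time update to highlight the matching row in both languages.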
