Yoruba Image Synthesis - Multi-Modal Fusion

Created by team HackAttack on May 15, 2024

This project addresses a barrier that Yoruba speakers face with image generation: current models work best with English prompts, and Yoruba language resources are limited. We took a two-track approach: data collection and model development.

Because Yoruba datasets are scarce, particularly for image generation prompts, we curated our own. We selected English sentences suitable as image generation prompts and translated them into Yoruba using a dictionary-based approach.

We then trained a custom translator model to translate Yoruba into English. This intermediate step lets a Yoruba prompt be passed to existing image generation models, which expect English input. The translator reached 85% accuracy on the test set. Images are generated through the SDXL API, so output quality matches what that model produces for English prompts.

The result is that users can generate images from prompts written in their native language. By collecting our own data and training our own model, we work around the scarcity of Yoruba resources.

Looking ahead, we plan to add further languages such as Fon and Dendi and to expand the dataset accordingly. Our ultimate goal is a model that generates images directly from Yoruba, Fon, and Dendi prompts, without translating into English first. Beyond meeting a need in the Yoruba-speaking community, the project lays groundwork for future work on multilingual image generation.
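The two tracks described above can be sketched as follows. This is a minimal illustration, not the team's code: the three-word lexicon, the function names, and the stub translator and image generator are all assumptions made for the example. In the real system the translator is a trained model and the generator is a call to the SDXL API. Word-by-word lookup also ignores Yoruba grammar and word order, so it only approximates a dictionary-based approach.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy lexicon; the project's actual dictionary is not given in the source.
EN_TO_YO: Dict[str, str] = {
    "dog": "ajá",
    "house": "ilé",
    "water": "omi",
}


def dictionary_translate(sentence: str, lexicon: Dict[str, str]) -> str:
    """Word-by-word lookup; words missing from the lexicon pass through unchanged."""
    return " ".join(lexicon.get(word, word) for word in sentence.lower().split())


def build_pairs(
    english_prompts: List[str], lexicon: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Build (yoruba, english) pairs to train a Yoruba-to-English translator."""
    return [(dictionary_translate(p, lexicon), p.lower()) for p in english_prompts]


def generate_from_yoruba(
    prompt_yo: str,
    translate: Callable[[str], str],
    generate_image: Callable[[str], str],
) -> str:
    """Translate the Yoruba prompt to English, then pass it to the image generator."""
    english_prompt = translate(prompt_yo)
    return generate_image(english_prompt)


# Stand-ins for illustration only: a reverse lookup instead of the trained
# translator, and a string placeholder instead of a real SDXL API call.
YO_TO_EN: Dict[str, str] = {yo: en for en, yo in EN_TO_YO.items()}


def stub_translate(text: str) -> str:
    return dictionary_translate(text, YO_TO_EN)


def stub_generate_image(prompt: str) -> str:
    return f"<image for: {prompt}>"


if __name__ == "__main__":
    print(build_pairs(["Dog house", "water"], EN_TO_YO))
    print(generate_from_yoruba("ajá ilé", stub_translate, stub_generate_image))
```

Passing the translator and the generator in as callables keeps the pipeline independent of any particular model or API client, so either stage can be swapped without touching the other.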

Category tags: Language and Translation


"great work. generating images is very simple idea that could have great future if implemented in the right business field. good luck"

Walaa Nasr Elghitany

Lablab Head Judge

"Great project guys, addresses an important challenge. I would love to see a better use of the multimodal model, but its still pretty good"

Shebagi Mitra

Technical Mentor
