OtwoX African RLHF Foundry addresses a critical gap: Africa has 2,000+ languages, yet very few are meaningfully represented in major AI training datasets. As a result, millions of speakers in Sudan, Uganda, and Chad have no practical way to contribute to modern AI systems.

Built on AMD Instinct MI300X (192 GB HBM3) during this hackathon, OtwoX Foundry is a production-ready annotation workspace designed to compete with Scale AI and Outlier.ai, but built specifically for African linguistic diversity.

Key Features:
- 10+ African languages, including Sudanese Arabic, Fur, Zaghawa, Nubian, Dinka, Luganda, and Acholi
- Voice Annotation: Native speakers record corrections, transcribed by Whisper ASR on ROCm
- Bilingual Interface: Arabic/English with RTL/LTR switching
- AI Generation: Qwen2.5-7B via vLLM on ROCm 7.2
- RLHF Dataset Builder: Structured JSONL with quality scores, categories, and audio paths
- Live GPU Telemetry: Real-time MI300X stats in the UI

Technical Stack:
- Hardware: AMD Instinct MI300X (192 GB HBM3)
- Runtime: ROCm 7.2 + Docker
- Inference: vLLM 0.17.1
- Model: Qwen/Qwen2.5-7B-Instruct
- Speech: OpenAI Whisper (Arabic)
- Frontend: Gradio 6.14

Strategic Vision:
OtwoX is building data infrastructure for African AI. Our goal is to become the primary data supplier for companies that need African-language training data. This project is submitted for the LINGUA Africa grant (Masakhane + Microsoft AI for Good + Gates Foundation, deadline June 15, 2026), which offers up to $250,000 in cash and $400,000 in compute credits.

OtwoX Foundry: Sovereign African-Language AI, powered by AMD.
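To make the RLHF Dataset Builder output more concrete, here is a minimal sketch of how one annotation could be serialized as a JSONL line carrying a quality score, a category, and an audio path. The field names, the 1 to 5 score range, and the helper names (`AnnotationRecord`, `to_jsonl_line`, `append_record`) are assumptions made for illustration; they are not taken from the project's actual schema.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class AnnotationRecord:
    # Hypothetical schema: every field name here is an illustrative assumption.
    prompt: str
    model_response: str
    corrected_response: str
    language: str
    category: str
    quality_score: int  # assumed to be on a 1-5 scale
    audio_path: Optional[str] = None  # path to the speaker's recording, if any

    def validate(self) -> None:
        # Reject records that would pollute the dataset.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not 1 <= self.quality_score <= 5:
            raise ValueError("quality_score must be between 1 and 5")


def to_jsonl_line(record: AnnotationRecord) -> str:
    """Serialize one validated record as a single JSON line (no newline)."""
    record.validate()
    # ensure_ascii=False keeps Arabic and other non-ASCII text readable on disk.
    return json.dumps(asdict(record), ensure_ascii=False)


def append_record(path: str, record: AnnotationRecord) -> None:
    """Append one record to a JSONL file, one JSON object per line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(to_jsonl_line(record) + "\n")
```

Calling `append_record("dataset.jsonl", record)` after each accepted annotation would grow the file one line at a time, so a partially written dataset stays readable line by line.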