Pakistan
1 year of experience
I am an AI Developer with a strong hands-on background in building intelligent systems and AI-powered applications. I am currently developing AI projects that solve real-world problems using machine learning, deep learning, and generative AI. My work covers end-to-end development, from data collection and model training to deployment and optimization. I have practical experience in building and fine-tuning Large Language Models (LLMs); developing computer vision and NLP-based applications; creating AI agents and automation tools; and working with frameworks such as LangChain, TensorFlow, PyTorch, and Hugging Face. I am passionate about using AI to create scalable, efficient, and impactful products, and I continuously experiment with the latest advances in the field. I am excited to bring this expertise to innovative AI initiatives.
DocuMind AI is a multi-modal, multi-agent document intelligence platform built on AMD MI300X GPU infrastructure via Fireworks AI's OpenAI-compatible inference API.

The Problem: Professionals spend hours manually extracting insights from complex documents containing both text and visual content (charts, diagrams, figures). Existing tools handle text OR images, not both together.

The Solution: DocuMind AI runs a sequential 4-agent CrewAI pipeline on AMD Instinct MI300X GPUs:
1. Vision Agent: uses Kimi K2.5 to analyze every image, chart, and diagram embedded in the document
2. Reader Agent: uses DeepSeek V3.1 to extract text, identify entities, classify document type, and extract key facts
3. Analyst Agent: synthesizes visual AND textual findings into cross-modal insights
4. Reporter Agent: produces a structured intelligence report with an Executive Summary, Key Entities, Categorized Insights, and Actionable Recommendations

AMD Technology Used: AMD Instinct MI300X GPUs via Fireworks AI with an OpenAI-compatible API, giving zero-friction integration with CrewAI and the Python AI ecosystem. The Kimi K2.5 vision-language model and the DeepSeek V3.1 open-source frontier text model both run on AMD hardware.

Why This Matters: Covers Track 1 (AI Agents) AND Track 3 (Vision/Multimodal). Real business value for researchers, analysts, and professionals. Live demo on Hugging Face Spaces with zero setup required for judges. MIT open-source with full documentation.
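The sequential four-agent flow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's CrewAI code: the `Document` type, the function names, the prompt strings, and the `"vision-model"` / `"text-model"` labels are all hypothetical, and the model call is injected as a stub so the sketch runs without any API key.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for an inference call: (model_label, prompt) -> reply.
ModelFn = Callable[[str, str], str]

# Report sections named in the project description.
REPORT_SECTIONS = [
    "Executive Summary",
    "Key Entities",
    "Categorized Insights",
    "Actionable Recommendations",
]


@dataclass
class Document:
    text: str
    images: List[str] = field(default_factory=list)  # placeholders for image data


def vision_agent(doc: Document, call_model: ModelFn) -> List[str]:
    # One vision call per embedded image, chart, or diagram.
    return [call_model("vision-model", f"Describe this figure: {img}") for img in doc.images]


def reader_agent(doc: Document, call_model: ModelFn) -> str:
    # Text extraction, entity identification, and key facts.
    return call_model("text-model", f"Extract entities and key facts: {doc.text}")


def analyst_agent(visual: List[str], textual: str, call_model: ModelFn) -> str:
    # Combine visual and textual findings into cross-modal insights.
    joined = "\n".join(visual)
    prompt = f"Combine visual findings:\n{joined}\nwith text findings:\n{textual}"
    return call_model("text-model", prompt)


def reporter_agent(insights: str, call_model: ModelFn) -> Dict[str, str]:
    # One call per report section, keyed by section title.
    return {
        section: call_model("text-model", f"Write the '{section}' section from: {insights}")
        for section in REPORT_SECTIONS
    }


def run_pipeline(doc: Document, call_model: ModelFn) -> Dict[str, str]:
    # Strictly sequential: each stage consumes the previous stage's output.
    visual = vision_agent(doc, call_model)
    textual = reader_agent(doc, call_model)
    insights = analyst_agent(visual, textual, call_model)
    return reporter_agent(insights, call_model)


def echo_model(model: str, prompt: str) -> str:
    # Stub model used only to make the sketch runnable offline.
    return f"[{model}] {prompt[:40]}"


if __name__ == "__main__":
    report = run_pipeline(Document("Q3 revenue rose.", ["bar chart"]), echo_model)
    for section, body in report.items():
        print(section, "->", body)
```

Injecting `call_model` keeps the stage ordering testable on its own; in a real deployment that callable would wrap whichever inference client the project uses.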
10 May 2026

AgentFlow is a real-time economic execution layer for AI agents built on the Arc L1 blockchain and Circle infrastructure. Five specialized AI agents (DataAnalyst, ContentWriter, CodeReviewer, Translator, and Orchestrator) autonomously bid, subcontract, and settle USDC micropayments on-chain with no human intervention. Each agent has its own Circle Developer Controlled Wallet pre-funded with 20 USDC. When a task is submitted, the Orchestrator agent routes it to the appropriate specialist, deducts a micropayment ($0.002–$0.008 USDC per task) from agent wallets, and settles it on the Arc blockchain in real time. The live dashboard displays every transaction hash, agent balance, and payment status as it happens. We demonstrated 55 confirmed on-chain transactions totaling $0.31 USDC, settled in approximately 30 seconds. Arc's near-zero gas fees make sub-cent machine-to-machine commerce economically viable; on Ethereum, a $1 gas fee would be more than 100 times the largest of these micropayments. Technical stack: FastAPI backend, Next.js 15 frontend, Circle DCW API, Arc EVM L1 testnet. The x402 payment protocol was also wired in for HTTP-native payment headers, enabling future pay-per-API-call monetization at internet scale.
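The routing and per-task deduction described above can be illustrated with an in-memory ledger. This is a sketch under assumptions, not the project's implementation: it makes no Circle or Arc calls, the keyword routing table and the per-agent fee values are hypothetical (chosen inside the stated $0.002–$0.008 range), and it assumes the Orchestrator's wallet pays the specialist, which the description does not spell out.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Dict, List, Tuple

STARTING_BALANCE = Decimal("20")  # each wallet is pre-funded with 20 USDC

# Hypothetical per-task fees, chosen within the stated 0.002-0.008 USDC range.
FEES: Dict[str, Decimal] = {
    "DataAnalyst": Decimal("0.008"),
    "CodeReviewer": Decimal("0.006"),
    "ContentWriter": Decimal("0.004"),
    "Translator": Decimal("0.002"),
}

# Hypothetical keyword-based routing table.
KEYWORDS: Dict[str, str] = {
    "analyze": "DataAnalyst",
    "review": "CodeReviewer",
    "write": "ContentWriter",
    "translate": "Translator",
}


def _initial_balances() -> Dict[str, Decimal]:
    return {name: STARTING_BALANCE for name in [*FEES, "Orchestrator"]}


@dataclass
class Ledger:
    balances: Dict[str, Decimal] = field(default_factory=_initial_balances)
    transfers: List[Tuple[str, str, Decimal]] = field(default_factory=list)

    def route(self, task: str) -> str:
        # Pick the first specialist whose keyword appears in the task text.
        lowered = task.lower()
        for keyword, agent in KEYWORDS.items():
            if keyword in lowered:
                return agent
        raise ValueError(f"no specialist for task: {task!r}")

    def settle(self, task: str) -> str:
        # Assumption: the Orchestrator pays the specialist for each routed task.
        agent = self.route(task)
        fee = FEES[agent]
        if self.balances["Orchestrator"] < fee:
            raise ValueError("insufficient Orchestrator balance")
        self.balances["Orchestrator"] -= fee
        self.balances[agent] += fee
        self.transfers.append(("Orchestrator", agent, fee))
        return agent


if __name__ == "__main__":
    ledger = Ledger()
    ledger.settle("Translate this memo")
    ledger.settle("Review this pull request")
    print(ledger.balances)
    print(ledger.transfers)
```

`Decimal` is used instead of `float` so that sub-cent amounts add up exactly. As a sanity check on the reported figures, $0.31 across 55 transactions averages about $0.0056 per transaction, which falls inside the stated per-task range.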
26 Apr 2026