Reinforcement Learning

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents should take actions in an environment to maximize cumulative reward. It is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.

General
Release date	1960

About Reinforcement Learning

What is reinforcement learning? A comprehensive Wikipedia article on reinforcement learning.

Reinforcement Learning Hackathon projects

Discover innovative solutions built with reinforcement learning, developed by our community members during our hackathons.

EduOrbit

Most students practice with generic question banks that don't adapt to what they actually struggle with. EduOrbit solves this by reading real textbook content and generating curriculum-aligned questions on demand. A student picks a class, subject, and chapter and instantly gets MCQ or written questions drawn from that exact chapter — in English or Bangla, with proper math/equation rendering. They take an exam (multiple choice, written, or even a photo of handwritten work), and the platform auto-grades it with an LLM and produces a per-topic breakdown of their weaknesses. The standout feature is the reinforcement-learning loop: every graded exam updates a per-chapter policy that learns which topics that chapter's students consistently fail. So even a brand-new student with zero history gets a well-targeted practice set from day one (solving the cold-start problem), and the system keeps improving as more exams are taken. Around this adaptive core, EduOrbit adds a full learning ecosystem: teacher-scheduled exams, student group exams, 1v1 "Battleground" competitions, school & platform analytics dashboards, public leaderboards, gamification (XP, levels, streaks, badges), a moderated community with peer doubt-solving, and a RAG-based study chatbot — all served from a single FastAPI backend.
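The per-chapter feedback loop described above can be illustrated with a minimal sketch. This is a bandit-style simplification under stated assumptions, not EduOrbit's actual algorithm: the class name `ChapterPolicy`, the topic names, and the Beta(1, 1)-style prior counts are all hypothetical.

```python
class ChapterPolicy:
    """Hypothetical per-chapter policy that tracks how often each topic is failed."""

    def __init__(self, topics, prior_fail=1.0, prior_pass=1.0):
        # Assumed prior: one pseudo-fail and one pseudo-pass per topic,
        # so unseen topics start at a 50% estimated failure rate.
        self.fails = {t: prior_fail for t in topics}
        self.passes = {t: prior_pass for t in topics}

    def update(self, graded):
        """Fold one graded exam into the counts; graded maps topic -> answered correctly?"""
        for topic, correct in graded.items():
            if correct:
                self.passes[topic] += 1
            else:
                self.fails[topic] += 1

    def failure_rate(self, topic):
        return self.fails[topic] / (self.fails[topic] + self.passes[topic])

    def practice_set(self, k):
        # Greedy choice: the k topics with the highest estimated failure rate.
        return sorted(self.fails, key=self.failure_rate, reverse=True)[:k]


policy = ChapterPolicy(["fractions", "decimals", "ratios"])
policy.update({"fractions": False, "decimals": True, "ratios": False})
policy.update({"fractions": False, "decimals": True, "ratios": True})
print(policy.practice_set(2))
```

Because the counts belong to the chapter rather than to an individual student, a new student with no history can still be served the topics that earlier students failed most often, which is the cold-start behavior the description refers to.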

Probably runs on Samsung S25

Hybrid Router is an intelligent AI inference system that dynamically selects the best execution path for each user query instead of relying on a single large language model. The system first analyzes the incoming prompt to identify the task type, estimate its complexity, and determine whether it can be solved deterministically or requires generative AI. For structured tasks such as mathematical calculations, JSON validation, regular expression verification, and date/time operations, the router invokes specialized deterministic tools that produce fast, accurate, and reproducible results without consuming LLM tokens. For more complex natural language and coding tasks, the router attempts local inference using OpenVINO-optimized models running on Intel hardware, reducing latency and API costs. If the local model is unlikely to provide a sufficiently reliable answer or the task exceeds its capabilities, the system automatically falls back to Fireworks AI models for high-quality remote inference. The routing decisions are driven by task classification, confidence estimation, and configurable thresholds, allowing the system to balance accuracy, response time, and operational cost. The architecture is modular, making it easy to add new tools, local models, or routing strategies in the future. The project also includes benchmarking and evaluation components that measure routing accuracy, latency, model utilization, and fallback frequency to continuously improve routing performance. By combining deterministic tools, local inference, and cloud-based language models into a single adaptive pipeline, Hybrid Router delivers efficient, scalable, and cost-aware AI inference while maintaining high response quality across a wide range of tasks.
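The routing order described above (deterministic tool first, then local model, then remote fallback) might look roughly like the following sketch. It uses only the Python standard library; the function names, the path labels, the `local_confidence` argument, and the 0.7 threshold are assumptions made for illustration, and the actual model calls are left out.

```python
import ast
import json
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def _eval_arithmetic(expr):
    """Evaluate + - * / over numeric literals only; raise ValueError otherwise."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("not plain arithmetic")
    try:
        return walk(ast.parse(expr, mode="eval"))
    except SyntaxError as exc:
        raise ValueError("not plain arithmetic") from exc


def route(prompt, local_confidence, threshold=0.7):
    """Return (path, result). local_confidence stands in for a real confidence estimator."""
    text = prompt.strip()
    try:
        return "tool:math", _eval_arithmetic(text)   # deterministic, no LLM tokens
    except (ValueError, ZeroDivisionError):
        pass
    if text.startswith(("{", "[")):                  # looks like a JSON validation task
        try:
            json.loads(text)
            return "tool:json", True
        except json.JSONDecodeError:
            return "tool:json", False
    if local_confidence >= threshold:
        return "local", None                         # hypothetical local model call goes here
    return "remote", None                            # hypothetical remote fallback goes here


print(route("2 + 3 * 4", local_confidence=0.0))
print(route("Summarize this article", local_confidence=0.2))
```

Keeping the threshold as a parameter mirrors the configurable thresholds mentioned in the description: raising it sends more traffic to the remote path, lowering it favors local inference.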

QueryWise AI – Natural Language SQL Copilot

QueryWise AI is an AI-powered analytics assistant that makes data exploration simple and accessible. Instead of writing SQL manually, users can upload CSV, TSV, or Parquet datasets and ask questions in plain English. The application automatically detects the dataset schema, generates SQL, executes it using DuckDB, and displays the results through an interactive dashboard. The goal of QueryWise AI is to reduce the learning curve for SQL and enable analysts, students, business users, and developers to explore data more efficiently. Traditional analytics tools require users to understand database structures and SQL syntax before extracting insights. QueryWise AI removes this barrier by translating natural language into executable SQL. The application features a modern React frontend and a FastAPI backend. After uploading a dataset, users can ask questions such as "How many orders are there?", "Show total sales by category", or "Who are the top customers?". The backend generates and validates SQL queries, executes them using DuckDB, and returns results along with execution statistics. Key features: natural language to SQL conversion; CSV, TSV, and Parquet dataset upload; automatic schema detection; FastAPI backend with DuckDB execution; interactive React dashboard; SQL validation and query history; offline intelligent fallback engine; Docker-ready deployment. QueryWise AI is built with a modular architecture, making it easy to extend with cloud databases, authentication, advanced AI models, and enterprise analytics. By combining natural language processing with fast local SQL execution, it provides a lightweight, privacy-friendly, and developer-friendly solution that simplifies structured data analysis for users of all skill levels.
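As an illustration of the automatic schema detection step, here is a minimal standard-library sketch that infers a column type for each CSV column. It is not QueryWise AI's implementation (DuckDB can also infer CSV column types itself); the function names and the three-type ladder INTEGER, DOUBLE, VARCHAR are assumptions chosen for the example.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest of INTEGER, DOUBLE, VARCHAR that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "VARCHAR"
    for sql_type, cast in (("INTEGER", int), ("DOUBLE", float)):
        try:
            for v in non_empty:
                cast(v)          # raises ValueError if the value does not fit
            return sql_type
        except ValueError:
            continue
    return "VARCHAR"


def detect_schema(csv_text):
    """Map each header name to an inferred type by scanning all rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {
        name: infer_type([(row[name] or "") for row in rows])
        for name in (reader.fieldnames or [])
    }


sample = "order_id,category,amount\n1,books,12.50\n2,toys,8\n"
print(detect_schema(sample))
```

A detected schema like this is what a text-to-SQL prompt would typically be grounded on, so that generated queries only reference columns that exist.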

Veta QA

Veta replaces brittle, scripted UI tests with an autonomous, AI-driven QA agent fleet that treats mobile applications exactly like a human tester would. Operating on an observe-decide-act loop, the agent evaluates real-time Android screen captures alongside compressed accessibility hierarchies to interact with applications via a strict, type-safe action vocabulary. The architecture relies on a multi-phase, multi-sub-agent pipeline consisting of a Planner, Executor, Verifier, and Reporter. The Planner establishes deep, multi-screen verification checkpoints; the Executor executes contextual UI steps; the Verifier checks outcomes against checkpoint evidence; and the Reporter generates comprehensive post-run analysis complete with remediation guidance and severity rankings. Built with high-throughput scalability in mind, the system orchestrates fleets of containerized Android instances (redroid) utilizing custom device identity profiles. Every step of the pipeline, spanning planning, execution, verification, and reporting, runs natively on AMD silicon with zero non-AMD fallbacks by default, harnessing Fireworks AI on AMD Instinct or self-hosted vLLM on ROCm for rapid, low-overhead inference.
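The observe-decide-act loop with a closed action vocabulary can be sketched as follows. The action classes (`Tap`, `TypeText`, `Done`), the `run_episode` function, and the callback signatures are hypothetical names for illustration; Veta's real vocabulary and agent interfaces are not given in the description.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Tap:
    element_id: str


@dataclass(frozen=True)
class TypeText:
    element_id: str
    text: str


@dataclass(frozen=True)
class Done:
    success: bool


# Assumed vocabulary: the agent may only emit one of these three action types.
Action = Union[Tap, TypeText, Done]


def run_episode(observe, decide, act, max_steps=20):
    """Run observe -> decide -> act until Done or the step budget is spent."""
    history = []
    for _ in range(max_steps):
        observation = observe()                  # e.g. screenshot plus accessibility tree
        action = decide(observation, history)    # e.g. a model call in a real system
        if not isinstance(action, (Tap, TypeText, Done)):
            raise TypeError(f"action outside vocabulary: {action!r}")
        history.append(action)
        if isinstance(action, Done):
            break
        act(action)                              # apply the action to the device
    return history
```

Rejecting anything outside the vocabulary at the loop boundary means a malformed model output fails loudly instead of being sent to the device.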

MESSI Motion Estimation for Soccer Skill Imitation

We started with a pretty simple question: can one recorded human soccer kick become one believable robot skill? So we built MESSI: Motion Estimation for Soccer Skill Imitation. It takes a SoccerKicks penalty clip, uses the dataset’s 3D pose annotations, retargets the motion onto a Unitree G1 humanoid, and then checks the result in MuJoCo with a real ball, goal, contact, and friction. The hard part is that you can’t just copy human joints onto a robot. The bodies are different, the joint limits are different, and a kick only matters if the ball actually moves the way it should. So we used the human motion as the starting point, refined the trajectory with AMD MI300X / ROCm, and treated MuJoCo as the final test instead of just trusting a proxy metric. We also built a Goalie Lab where you can move the keeper and run a fresh physics rollout to see if the learned kick still scores.
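To make the "you can't just copy human joints" point concrete, here is a tiny sketch of one piece of retargeting: clamping human joint angles into a robot's allowed ranges. The joint names and limit values are made up for illustration and are not the Unitree G1's real limits; the trajectory refinement and MuJoCo validation that MESSI describes are not shown.

```python
def retarget_frame(human_angles, joint_limits):
    """Clamp each human joint angle (radians) into the robot's allowed range.

    Joints the robot does not have are dropped; joints missing from the
    human pose default to 0.0 (an assumption of this sketch).
    """
    robot_angles = {}
    for joint, (low, high) in joint_limits.items():
        angle = human_angles.get(joint, 0.0)
        robot_angles[joint] = min(max(angle, low), high)
    return robot_angles


# Hypothetical limits, not taken from any robot specification.
limits = {"knee": (0.0, 2.0), "hip_pitch": (-1.5, 1.5)}
human = {"knee": 2.4, "hip_pitch": -0.3, "wrist": 0.5}
print(retarget_frame(human, limits))
```

Clamping alone keeps the pose feasible but can change how the foot meets the ball, which is why a physics rollout is a more trustworthy final check than a pose-similarity metric.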

ForgeAI

ForgeAI is a hardware-aware AI model optimization platform that automatically finds the fastest, most efficient version of a model for a specific GPU — starting with AMD MI300X. Instead of manually tuning models for each accelerator, ForgeAI runs a 7-phase optimization pipeline: architecture search finds the best candidate structures, knowledge distillation transfers accuracy from a teacher model, pruning removes redundant weights, quantization compresses from FP32 to INT8, benchmarking measures real performance on target hardware, Pareto analysis identifies optimal latency-accuracy tradeoffs, and Optuna hyperparameter tuning auto-optimizes across 6 parameters with 50 trials and early stopping. The platform consists of a FastAPI backend with 9 optimization modules, a Next.js 14 frontend, and WebSocket-based live progress streaming. Users upload a PyTorch checkpoint, select target hardware, set constraints (max latency, max memory, min accuracy), and watch the pipeline execute in real time. Results include a Pareto frontier chart, before/after performance comparison, and export to ONNX or TorchScript. ForgeAI targets the $100B+ AI inference market where hardware-specific optimization is still done manually. Unlike Neural Magic and OpenVINO (CPU-focused, tool-by-tool), ForgeAI is AMD-native, full-pipeline, and open source under Apache 2.0.
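The Pareto analysis phase can be illustrated with a short sketch that keeps only the candidates not dominated on latency and accuracy. The function name, the candidate names, and the numbers are invented for the example and are not ForgeAI benchmark results.

```python
def pareto_frontier(candidates):
    """Return non-dominated candidates, sorted by latency.

    candidates: list of (name, latency_ms, accuracy).
    Lower latency is better; higher accuracy is better.
    """
    frontier = []
    for name, latency, accuracy in candidates:
        dominated = any(
            other_lat <= latency and other_acc >= accuracy
            and (other_lat < latency or other_acc > accuracy)
            for _, other_lat, other_acc in candidates
        )
        if not dominated:
            frontier.append((name, latency, accuracy))
    return sorted(frontier, key=lambda c: c[1])


# Made-up measurements for illustration only.
models = [
    ("fp32", 10.0, 0.92),
    ("int8", 4.0, 0.90),
    ("pruned", 6.0, 0.89),
    ("distilled", 5.0, 0.91),
]
print([name for name, _, _ in pareto_frontier(models)])
```

Here "pruned" is dropped because "int8" is both faster and more accurate; the remaining candidates each represent a different latency-accuracy tradeoff, which is where user constraints such as a maximum latency would then pick a point.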