NCERT-Based Local LLM for Low-End Devices

Created by team The Budget Brawlers on May 07, 2026

AMD ROCm, AMD Developer Cloud, Llama 3.2

Fine-Tuning on AMD GPUs (Advanced / GPU-Intensive)

The digital divide in education means that advanced AI tutoring is often restricted to those with high-end hardware or fast internet connections. This project was built to lower that barrier. It is an optimized, edge-deployable reasoning model designed specifically for Indian students from Class 6 to 12.

At its core, the model is a fine-tuned version of Llama-3.2-3B. Instead of standard instruction tuning, it uses Group Relative Policy Optimization (GRPO), a reinforcement learning method, trained on the Parth Kadam NCERT dataset. This trains the model to generate explicit, step-by-step <reasoning> before delivering an <answer>, which is intended to improve factual accuracy on complex mathematical and scientific queries.

To reach mobile deployment, the training pipeline was run on AMD MI300X hardware using ROCm. After fine-tuning, the model was quantized into a 4-bit GGUF format using Unsloth. The result is a capable reasoning engine compressed into a 1.9GB footprint.

The final product runs fully offline, natively on budget smartphones with as little as 6GB of RAM. With strict memory management and context limits, the model brings curriculum-grounded educational reasoning directly to the edge, with the aim of minimizing hallucinations and giving students access to high-quality tutoring regardless of their hardware or network limitations.
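As a rough illustration of how a GRPO setup can encourage the <reasoning>/<answer> layout described above, the sketch below scores a group of sampled completions and converts the scores into group-relative advantages. It is not the team's training code: the function names, the reward values (1.0 for format, 2.0 for a matching answer), the exact-match answer check, and the use of the population standard deviation are all assumptions made for this example.

```python
import re
from statistics import mean, pstdev

# Assumed layout: exactly one <reasoning> block followed by one <answer> block.
FORMAT_RE = re.compile(
    r"^\s*<reasoning>(?P<reasoning>.+?)</reasoning>\s*"
    r"<answer>(?P<answer>.+?)</answer>\s*$",
    re.DOTALL,
)


def format_reward(completion: str) -> float:
    """Return 1.0 if the completion follows the tagged layout, else 0.0."""
    return 1.0 if FORMAT_RE.match(completion) else 0.0


def correctness_reward(completion: str, reference: str) -> float:
    """Return 2.0 if the tagged answer equals the reference string, else 0.0."""
    match = FORMAT_RE.match(completion)
    if match is None:
        return 0.0
    return 2.0 if match.group("answer").strip() == reference.strip() else 0.0


def total_reward(completion: str, reference: str) -> float:
    """Sum of the two hypothetical reward terms."""
    return format_reward(completion) + correctness_reward(completion, reference)


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own group.

    This is the group-relative step of GRPO: completions sampled for the
    same prompt are compared with each other instead of with a value model.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std, a simplifying assumption
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Three hypothetical completions sampled for the prompt "What is 2 + 3?"
    group = [
        "<reasoning>2 + 3 = 5</reasoning><answer>5</answer>",
        "<reasoning>2 + 3 = 6</reasoning><answer>6</answer>",
        "The answer is 5",
    ]
    rewards = [total_reward(c, "5") for c in group]
    for completion, adv in zip(group, group_relative_advantages(rewards)):
        print(f"{adv:+.3f}  {completion}")
```

Completions that are both well-formed and correct receive a positive advantage, while untagged output falls below the group mean, so the policy update favors the tagged, step-by-step style. A real pipeline would pass reward functions like these to a GRPO trainer and would likely use a more tolerant answer comparison than exact string matching.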

Explore more applications

Thymus

Thymus is a lightweight hybrid token-efficient router designed to maximize accuracy while minimizing token costs in multi‑task LLM pipelines. It dynamically routes user queries across local and remote models on LLM providers.

The Disappointer

HuggingFace Hub, LLaMA, AMD Developer Cloud

AI Classroom Edge Intelligence

A privacy-first classroom AI platform that routes sensitive work to local edge systems and eligible anonymized analysis to Fireworks AI, helping teachers make faster, safer instructional decisions even with unreliable internet.

AI Classroom Edge

AMD Developer Cloud, Qwen3, rest api, Github Copilot, Codex, ChatGPT

Taskly: Smart Multi-Model Task Router

Taskly classifies incoming tasks into 8 categories (including QA, math, code, and NLP) and routes each to the optimal Fireworks AI model with a tuned prompt, maximizing accuracy while minimizing token usage in a fully Dockerized pipeline.

RuntimeTerror

AMD Developer Cloud, AMD ROCm

router_007_v3

router_007_v2 is a Track 1 agent that records **zero billable tokens**: every answer is computed inside the container by a Qwen2.5-7B-Instruct model bundled in the Docker image and served in-process with llama-cpp-python.

roc_auc_half

Claude Code, ChatGPT, AMD ROCm, AMD Developer Cloud, Codex, Gemma, GPT-5, NVIDIA, Qwen3

Lexyprep

The agent conducts thorough legal research, drawing on my experience of winning legal cases. It understands your issue and advises on procedure.

Lexyprep - Do you have a case

AMD Developer Cloud, AMD ROCm, Codex, Gemma

Ayushmann Dubey
AI Engineer
