ROCm

ROCm (Radeon Open Compute) is AMD's open-source software platform for GPU-accelerated computing. It is the AMD equivalent of NVIDIA's CUDA and provides a complete stack for running AI, machine learning, and HPC workloads on AMD GPUs. ROCm supports major ML frameworks including PyTorch, TensorFlow, JAX, and ONNX Runtime, and includes the HIP (Heterogeneous-compute Interface for Portability) programming model for writing GPU code that runs on both AMD and NVIDIA hardware.

General
Author: AMD
Type: Open-source GPU Computing Platform
Documentation: ROCm Docs
Repository: github.com/ROCm
Installation: ROCm Installation Guide
Current Version: ROCm 7
License: MIT and Apache 2.0

Start building with ROCm

ROCm gives you a complete software stack to run AI training and inference workloads on AMD GPUs. It integrates directly with PyTorch, TensorFlow, and JAX so most standard pipelines run with minimal changes from a CUDA environment. Hugging Face Optimum-AMD and vLLM both support ROCm, making it straightforward to run transformer inference and fine-tuning jobs on AMD hardware. Check out the community-built AMD Use Cases and Applications to see what developers are running on ROCm today.
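As a rough illustration of the "minimal changes" point: ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` API and the `"cuda"` device string, and report the HIP version in `torch.version.hip` (which is `None` on CUDA builds). The sketch below uses a hypothetical helper, `describe_backend`, and a stand-in object in place of the real `torch` module so it runs without a GPU; neither is part of PyTorch or ROCm.

```python
from types import SimpleNamespace


def describe_backend(torch_module):
    """Return (device_string, backend_name) for a PyTorch-like module.

    Illustrative helper only; it is not part of PyTorch or ROCm.
    """
    if not torch_module.cuda.is_available():
        return "cpu", "cpu"
    # ROCm builds set torch.version.hip to a version string;
    # CUDA builds leave it as None.
    hip_version = getattr(torch_module.version, "hip", None)
    backend = "rocm" if hip_version else "cuda"
    # On both backends the device string stays "cuda".
    return "cuda", backend


# Stand-in for a ROCm build of PyTorch (assumed values, for demonstration).
fake_rocm_torch = SimpleNamespace(
    cuda=SimpleNamespace(is_available=lambda: True),
    version=SimpleNamespace(hip="6.3.0"),
)

print(describe_backend(fake_rocm_torch))  # ('cuda', 'rocm')
```

In a real environment you would pass the imported `torch` module instead of the stand-in. Because the device string does not change, code written as `model.to("cuda")` typically needs no edits to run on an AMD GPU.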

Documentation and Resources


Framework Support

  • PyTorch: Full support for training and inference, including integration with Hugging Face Accelerate and PEFT
  • TensorFlow: GPU-accelerated training and inference on AMD hardware
  • JAX: Supported via the ROCm JAX build
  • ONNX Runtime: Cross-framework model deployment on AMD GPUs
  • Hugging Face Optimum-AMD: Optimized inference and fine-tuning pipelines for transformer models
  • vLLM: High-throughput LLM serving with a ROCm backend

Libraries

  • hipBLAS: Portable BLAS interface layer that dispatches to rocBLAS on AMD GPUs
  • MIOpen: Deep learning primitives library for AMD GPUs
  • rocRAND: Random number generation for AMD hardware
  • hipSPARSE: Sparse matrix operations on AMD GPUs
  • rocBLAS: BLAS implementation optimized for AMD Instinct accelerators

AMD ROCm AI technology hackathon projects

Discover innovative solutions built with AMD ROCm AI technology, developed by our community members during our hackathons.

Boundary Forge

Boundary Forge is a model-agnostic AI safety pipeline that helps enterprises deploy LLMs with measurable confidence. Instead of relying on manual red-teaming or hoping a system prompt is enough, Boundary Forge automatically attacks a model, identifies where it behaves unsafely or inconsistently, and converts those discovered failures into runtime guardrails.

For this hackathon, we demonstrated Boundary Forge using Qwen 2.5-72B on AMD Developer Cloud with AMD MI300X. Qwen powered the adversarial red-team workflow and was also the model under test, allowing the system to expose real behavioral failure boundaries such as jailbreak attempts, policy drift, unsafe financial guidance, KYC bypass, fraud patterns, coercion signals, asset concealment, and inconsistent refusals.

The pipeline works in five stages: generate adversarial probes, run high-throughput model inference, mathematically detect boundary failures, compile those failures into semantic safety rules, and enforce them through middleware before risky prompts reach the LLM. This creates a practical enterprise safety layer that can block, flag, or ask for clarification in real time.

The important point is that Boundary Forge is not tied to one model. Qwen 2.5-72B was used to demonstrate the system, but the architecture can benchmark and harden other open-source or proprietary models as well. The goal is to improve models exactly where they fail and make model evaluation repeatable across different deployments.

In our AMD Cloud production run with Qwen 2.5-72B, Boundary Forge generated 1,009 unique adversarial probes, fired 4,036 total inferences, discovered 25 boundary failures, and compiled 15 semantic safety rules. The middleware intercepted 68% of known attacks and reduced the effective failure rate from 2.48% to 0.79%. Boundary Forge turns AI safety into an automated engineering workflow: attack, measure, learn, protect, and benchmark again.
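The reported rates can be reproduced with a short calculation. This is a sketch under two assumptions that the description does not state explicitly: that the failure rate is measured per unique probe (not per inference), and that the 68% interception figure applies to the 25 discovered failures. The function name `failure_rates` is illustrative.

```python
def failure_rates(probes, failures, intercept_rate):
    """Return (baseline_rate, residual_rate) as fractions of unique probes.

    Assumes failures are counted per unique probe and that the middleware
    intercepts `intercept_rate` of the discovered failures.
    """
    baseline_rate = failures / probes
    intercepted = round(failures * intercept_rate)
    residual_rate = (failures - intercepted) / probes
    return baseline_rate, residual_rate


baseline, residual = failure_rates(probes=1009, failures=25, intercept_rate=0.68)
print(f"baseline: {baseline:.2%}")  # baseline: 2.48%
print(f"residual: {residual:.2%}")  # residual: 0.79%
```

Under these assumptions, 17 of the 25 failures are intercepted and 8 remain, which matches the quoted drop from 2.48% to 0.79%.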

Thor v2 — RAG-Free Fitness Intelligence

Thor v2 is a domain-expert fitness AI built on a single fine-tuned Qwen3-8B model trained on 7,118 carefully constructed instruction-response pairs spanning exercise science, nutrition, programming, injury screening, and population-specific guidance. Unlike RAG-based fitness apps that retrieve documents at query time, Thor v2 encodes knowledge directly into model weights during supervised fine-tuning on AMD MI300X hardware using ROCm.

Evidence is referenced through compact citation keys, e.g. [CITE:NSCA_HYPERTROPHY_VOLUME], that the model emits inline. A lightweight citation resolver validates these keys against a locked registry and surfaces the source document on demand. If the model emits an unknown key, it is rejected at runtime, so a citation to a source outside the registry cannot reach the user.

The dataset covers 113 unique citation keys from 9 authoritative organisations (NSCA, ACSM, ISSN, NASM, HHS, USDA, NIH, CDC, and ExRx), with 80 exercise technique entries and 14 population profiles including senior, postpartum, teen, vegan, rehab return, and competitive athlete. Six conversational style variants (casual, research-nerd, anxious, skeptical, verbose, follow-up-first) are baked into training so the model adapts tone naturally without prompt engineering.

Training results: 100% JSON contract pass rate across all eval prompts. Coach gating behavior was confirmed: the model asks clarifying questions before prescribing when context is missing, rather than giving generic advice. All responses emit valid citation_keys, follow_up_questions, and safety_notes fields. Adapter size: under 350 MB on top of a frozen 8B base.

Built entirely on AMD MI300X (192GB HBM3, ROCm 6.3) using Hugging Face PEFT + TRL. One model. No retrieval. No vector database. The model knows. The resolver proves.
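The citation resolver idea can be sketched in a few lines. This is not the project's implementation: the registry contents, the source title, the second key in the demo, and the names `REGISTRY` and `resolve_citations` are all assumptions made for illustration. Only the `[CITE:...]` key format and the key `NSCA_HYPERTROPHY_VOLUME` come from the description.

```python
import re
from types import MappingProxyType

# Matches inline keys such as [CITE:NSCA_HYPERTROPHY_VOLUME].
CITE_PATTERN = re.compile(r"\[CITE:([A-Z0-9_]+)\]")

# Hypothetical read-only ("locked") registry; the title is a placeholder.
REGISTRY = MappingProxyType({
    "NSCA_HYPERTROPHY_VOLUME": "NSCA source on hypertrophy volume (placeholder)",
})


def resolve_citations(text, registry=REGISTRY):
    """Split the citation keys found in `text` into (resolved, rejected).

    `resolved` maps known keys to their registry entry; `rejected` lists
    keys that are not in the registry, in order of first appearance.
    """
    resolved, rejected = {}, []
    for key in CITE_PATTERN.findall(text):
        if key in registry:
            resolved[key] = registry[key]
        elif key not in rejected:
            rejected.append(key)
    return resolved, rejected


reply = (
    "Aim for 10+ weekly sets per muscle [CITE:NSCA_HYPERTROPHY_VOLUME]. "
    "Some claim more [CITE:MADE_UP_SOURCE]."
)
resolved, rejected = resolve_citations(reply)
print(sorted(resolved))  # ['NSCA_HYPERTROPHY_VOLUME']
print(rejected)          # ['MADE_UP_SOURCE']
```

A check like this guarantees only that every surfaced key exists in the registry; whether the cited source actually supports the surrounding claim is a separate question that key validation alone does not answer.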

TempoGraph: Local Multimodal Video Analysis

TempoGraph is a fully local, privacy-preserving multimodal video analysis system that turns raw video files into rich structured outputs (entities, behaviors, transcripts, timelines, and interactive knowledge graphs) without sending a single frame to the cloud. The pipeline runs in the following stages:

  • Stage 1 (Frame Selection): Motion-aware sampling with static, moving, and auto camera modes. For moving cameras it estimates homography to separate object motion from camera movement, then identifies keyframes where motion peaks exceed a configurable sigma threshold.
  • Stage 1.5 (Audio Transcription): Whisper.cpp running on Vulkan transcribes the full audio track into millisecond-accurate segments.
  • Stage 2 (YOLO Detection): YOLO26 runs on a second GPU over every sampled frame, outputting normalized bounding boxes, class names, track IDs, and confidence scores.
  • Stage 3 (Depth Estimation): Depth Anything V2 via Hugging Face Transformers adds per-detection mean depth to every bounding box, giving 3D spatial context to 2D detections.
  • Stage 4 (Frame Scoring): Picks which frames the VLM actually sees. In keyframes mode, only motion-peak frames are forwarded. In scored mode, FrameScorer ranks all YOLO-scanned frames using a weighted combination of motion delta, new YOLO class appearances, tracked object churn, and IoU drop between frames, then fills the VLM budget with the highest-signal frames. Keyframes are always pinned first regardless of mode.
  • Stage 5 (VLM Captioning): Qwen3.5-VL-9B is served by a custom llama.cpp build compiled for AMD ROCm/HIP, running on an AMD RX 9070 XT with a 100k-token context window. Frames are chunked and sent to the model alongside YOLO-derived annotations. Each chunk's summary seeds the next prompt for narrative continuity across the video.
  • Stage 6 (Aggregation): A final text-only LLM call synthesizes all per-chunk captions and the audio transcript into a structured JSON with entities, visual events, audio events, and multimodal correlations linking what was said to what was seen.
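The scored mode of Stage 4 can be pictured as a weighted sum followed by a budgeted selection. The sketch below is an assumption-laden illustration, not TempoGraph's FrameScorer: the weights, the `FrameSignals` fields, the function names, and the premise that each signal is already normalized to the range 0 to 1 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameSignals:
    """Per-frame signals, each assumed to be normalized to [0, 1]."""
    index: int
    motion_delta: float
    new_classes: float
    track_churn: float
    iou_drop: float
    is_keyframe: bool = False


# Hypothetical weights; the real values are not given in the description.
WEIGHTS = {
    "motion_delta": 0.4,
    "new_classes": 0.3,
    "track_churn": 0.2,
    "iou_drop": 0.1,
}


def score(frame, weights=WEIGHTS):
    """Weighted combination of the four signals."""
    return sum(w * getattr(frame, name) for name, w in weights.items())


def select_frames(frames, budget, weights=WEIGHTS):
    """Pin keyframes first, then fill the remaining budget by score.

    Returns the chosen frame indices in temporal order. If there are more
    keyframes than the budget allows, the earliest keyframes are kept.
    """
    keyframes = [f for f in frames if f.is_keyframe]
    others = sorted(
        (f for f in frames if not f.is_keyframe),
        key=lambda f: score(f, weights),
        reverse=True,
    )
    chosen = (keyframes + others)[:budget]
    return sorted(f.index for f in chosen)


frames = [
    FrameSignals(0, 0.1, 0.0, 0.0, 0.0),
    FrameSignals(1, 0.9, 1.0, 0.5, 0.2),
    FrameSignals(2, 0.2, 0.0, 0.1, 0.1, is_keyframe=True),
    FrameSignals(3, 0.5, 0.0, 0.2, 0.3),
]
print(select_frames(frames, budget=2))  # [1, 2]
```

Here frame 2 is selected despite a low score because it is a keyframe, and frame 1 takes the remaining slot as the highest-scoring non-keyframe.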