ExecuTorch

ExecuTorch is Meta's production-grade framework for on-device AI inference, allowing PyTorch models to run natively on mobile phones, wearables, embedded systems, and AI PCs without cloud connectivity. Unlike conversion-based pipelines, ExecuTorch exports models directly from torch.export to a .pte binary format, retaining PyTorch semantics through the entire deployment stack. It reached general availability (v1.0) in October 2025 and is already in production across Meta's Ray-Ban Smart Glasses, Meta Quest headsets, and billions of on-device AI feature interactions on Instagram, WhatsApp, and Messenger.

General
GA date: 18 Oct 2025 (v1.0)
Developer: Meta / PyTorch
Type: On-Device AI Inference Framework
License: BSD License
GitHub: pytorch/executorch
Documentation: docs.pytorch.org/executorch

Core Features

  • PyTorch-native export — models go from torch.export directly to .pte format with no ONNX or TFLite conversion step.
  • 50 KB base runtime — minimal core footprint suitable for microcontrollers and embedded targets.
  • Ahead-of-time compilation — models compiled offline to .pte binaries, reducing on-device startup overhead.
  • Single-line backend switching — swap hardware accelerators (CPU, NPU, GPU) without rewriting model code.
  • Quantization tooling — INT8, INT4 (per-block), QAT+LoRA (QLoRA), and SpinQuant quantization via integrated PyTorch tools.
  • Selective operator builds — include only operators the model uses, minimizing binary size.
  • Multimodal support — composable backbone for LLMs, vision-language models, image segmentation, depth estimation, OCR, ASR, and object detection.
  • Hugging Face Optimum-ExecuTorch — over 80% of the most-downloaded edge-friendly models on Hugging Face run on ExecuTorch out of the box.
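To make the "INT4 (per-block)" item in the list above more concrete, here is a minimal pure-Python sketch of symmetric per-block 4-bit quantization. It illustrates the general technique only and is not ExecuTorch's implementation; the function names and the block size of 32 are assumptions chosen for the example.

```python
from typing import List, Tuple


def quantize_int4_per_block(
    weights: List[float], block_size: int = 32
) -> Tuple[List[int], List[float]]:
    """Symmetric per-block quantization to the signed 4-bit range [-8, 7].

    Illustrative only: names and the default block size are assumptions.
    """
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        max_abs = max(abs(w) for w in block)
        # One scale per block; an all-zero block uses 1.0 to avoid dividing by zero.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        for w in block:
            q = round(w / scale)
            codes.append(max(-8, min(7, q)))  # clamp to the int4 range
    return codes, scales


def dequantize(codes: List[int], scales: List[float], block_size: int = 32) -> List[float]:
    """Rebuild approximate weights from 4-bit codes and per-block scales."""
    return [q * scales[i // block_size] for i, q in enumerate(codes)]


if __name__ == "__main__":
    w = [0.7, -0.35, 0.0, 0.1, 12.0, -3.0, 5.5, 0.25]
    codes, scales = quantize_int4_per_block(w, block_size=4)
    print(codes, scales)
    print(dequantize(codes, scales, block_size=4))
```

Because each block carries its own scale, one large weight only coarsens the resolution of its own block rather than the whole tensor, which is the usual motivation for per-block schemes.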

Supported Hardware Backends

Backend | Target | Status
XNNPACK + Arm KleidiAI | CPU (Android, iOS, Linux, AI PCs) | Stable
Apple Core ML | Apple silicon (iOS, macOS) | Stable
Qualcomm AI Engine / Hexagon NPU | Android (Qualcomm SoCs) | Stable
Arm Ethos-U NPU | Embedded / MCU | Stable
Vulkan GPU | Cross-platform GPU (Android, Linux) | Stable
Apple MPS (Metal Performance Shaders) | iOS / macOS GPU | Alpha
MediaTek NPU | Android (MediaTek SoCs) | Beta
Samsung Exynos NPU | Android (Samsung SoCs) | Alpha
Intel OpenVINO | AI PCs (Windows / Linux x86) | Alpha
CUDA | Linux / Windows GPU | Experimental

Hardware partners include Apple, Arm, Cadence, Intel, MediaTek, NXP Semiconductors, Qualcomm, and Samsung.
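The backend table above can be treated as data when deciding what to target. The sketch below encodes its rows and filters them by platform and minimum maturity. The platform tags, the `backends_for` helper, and the ordering Experimental < Alpha < Beta < Stable are assumptions made for illustration; none of this is an ExecuTorch API.

```python
from typing import List, NamedTuple, Tuple


class Backend(NamedTuple):
    name: str
    platforms: Tuple[str, ...]  # simplified tags derived from the Target column
    status: str


# Assumed maturity ordering, lowest to highest.
STATUS_RANK = {"Experimental": 0, "Alpha": 1, "Beta": 2, "Stable": 3}

BACKENDS = [
    Backend("XNNPACK + Arm KleidiAI", ("android", "ios", "linux", "ai-pc"), "Stable"),
    Backend("Apple Core ML", ("ios", "macos"), "Stable"),
    Backend("Qualcomm AI Engine / Hexagon NPU", ("android",), "Stable"),
    Backend("Arm Ethos-U NPU", ("embedded",), "Stable"),
    Backend("Vulkan GPU", ("android", "linux"), "Stable"),
    Backend("Apple MPS", ("ios", "macos"), "Alpha"),
    Backend("MediaTek NPU", ("android",), "Beta"),
    Backend("Samsung Exynos NPU", ("android",), "Alpha"),
    Backend("Intel OpenVINO", ("ai-pc", "windows", "linux"), "Alpha"),
    Backend("CUDA", ("linux", "windows"), "Experimental"),
]


def backends_for(platform: str, min_status: str = "Stable") -> List[str]:
    """Names of backends that list `platform` and meet the minimum status."""
    floor = STATUS_RANK[min_status]
    return [
        b.name
        for b in BACKENDS
        if platform in b.platforms and STATUS_RANK[b.status] >= floor
    ]


if __name__ == "__main__":
    print(backends_for("android"))
    print(backends_for("android", min_status="Alpha"))
```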


Performance (Llama 3.2 1B Quantized)

Device | Decode speed | Prefill speed
Samsung Galaxy S24+ | >40 tokens/s | >350 tokens/s
OnePlus 12 | 50.2 tokens/s | 260 tokens/s

Quantization reduces model size by ~52% (2.3 GiB to 1.1 GiB) and peak runtime memory by ~39%, with 2.5x average decode latency improvement over BF16 baseline.
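As a quick sanity check on the figures above, the sketch below recomputes the size reduction from the quoted 2.3 GiB and 1.1 GiB, and turns the throughput numbers into a rough end-to-end time. The 512-token prompt and 128-token reply are hypothetical workload sizes, and the simple "prefill time plus decode time" model ignores model loading, tokenization, and thermal throttling.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional reduction going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


def estimate_response_seconds(
    prompt_tokens: int, output_tokens: int, prefill_tps: float, decode_tps: float
) -> float:
    """Rough latency model: prompt processing time plus token generation time."""
    if prefill_tps <= 0 or decode_tps <= 0:
        raise ValueError("throughputs must be positive")
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


# Model size figures quoted in the text (GiB).
size_cut = relative_reduction(2.3, 1.1)

# Hypothetical workload: 512-token prompt, 128-token reply.
oneplus12_s = estimate_response_seconds(512, 128, prefill_tps=260, decode_tps=50.2)
# The Galaxy S24+ figures are lower bounds (">"), so this is an upper bound on time.
galaxy_s24_plus_max_s = estimate_response_seconds(512, 128, prefill_tps=350, decode_tps=40)

print(f"model size reduction: {size_cut:.1%}")            # 52.2%
print(f"OnePlus 12 estimate: {oneplus12_s:.2f} s")        # 4.52 s
print(f"Galaxy S24+ at most: {galaxy_s24_plus_max_s:.2f} s")  # 4.66 s
```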


Ecosystem and Integrations

  • Powers on-device AI in Meta Ray-Ban Smart Glasses (live translation, visual captions, menu translation) and Oakley Meta Vanguard glasses (athletic performance insights).
  • Runs scene understanding, depth estimation, hand tracking, and persistent room memory on Meta Quest 3 / Quest 3S.
  • Llama 3.2 1B and 3B models were co-developed with Qualcomm and MediaTek for optimized deployment on their mobile SoCs via ExecuTorch.
  • Backend available in Hugging Face Optimum-ExecuTorch for direct integration with the Hugging Face model hub.
  • Succeeds PyTorch Mobile for teams already in the PyTorch ecosystem, offering a significantly smaller runtime and better edge-hardware coverage.

Get started by cloning github.com/pytorch/executorch and following the quickstart guide, or install via pip install executorch for model export tooling.
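For orientation, the export flow described on this page (torch.export to a .pte file) might look roughly like the sketch below. It assumes both `torch` and `executorch` are installed; `export_to_pte` is a hypothetical helper name, and the imports are placed inside the function so the snippet can be loaded on machines without those packages. Check the official documentation for the current API before relying on it.

```python
def export_to_pte(model, example_inputs, out_path, use_xnnpack=True):
    """Export a PyTorch module to an ExecuTorch .pte file (hypothetical helper).

    `example_inputs` is a tuple of sample tensors used to trace the model.
    """
    # Heavy dependencies are imported lazily; both packages must be installed.
    import torch
    from executorch.exir import to_edge_transform_and_lower

    partitioners = None
    if use_xnnpack:
        # Delegate supported operators to the XNNPACK CPU backend.
        from executorch.backends.xnnpack.partition.xnnpack_partitioner import (
            XnnpackPartitioner,
        )
        partitioners = [XnnpackPartitioner()]

    # 1. Capture the model graph with torch.export.
    exported = torch.export.export(model.eval(), example_inputs)
    # 2. Lower to the edge dialect, optionally delegating to a backend.
    program = to_edge_transform_and_lower(exported, partitioner=partitioners).to_executorch()
    # 3. Serialize the program to a .pte binary.
    with open(out_path, "wb") as f:
        f.write(program.buffer)
    return out_path


# Example usage (not run here; requires torch and executorch):
#   import torch
#   export_to_pte(torch.nn.Linear(4, 2), (torch.randn(1, 4),), "linear.pte")
```

In this sketch, changing the target hardware comes down to passing a different partitioner, which is the sense in which backend switching stays out of the model code.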

Meta ExecuTorch Hackathon Projects

Discover solutions built with Meta ExecuTorch, developed by our community members during our hackathons.

EchoWalk: On-Device Guidance for Low-Vision Users

Imagine walking through an unfamiliar room with your eyes closed. You need to know what is ahead, what is around you, and how to reach the chair someone mentioned, without cloud latency or sending your camera feed anywhere. EchoWalk is built for that moment. On a Galaxy S25 Ultra, one shared camera pipeline feeds a central ModeManager that decides when to warn, when to describe, and when to search, all on the Snapdragon NPU via ExecuTorch and Qualcomm QNN.

Safety Radar runs continuously. Depth Anything V2 and YOLOv10 fuse on the Hexagon NPU: not just what is there, but how far and whether it is a trip hazard or a wall you can trail. Spatial audio and haptics place obstacles in space; a VoiceWarningEngine speaks when it matters. A live bounding-box overlay helps sighted helpers follow along in demos.

Scene Description is on demand: tap the preview, the Describe button, or long-press Volume Up. A short burst of frames runs through a Places365 classifier and pairs the room label with live YOLO directions: "You appear to be in a living room: couch on your left, TV ahead." Auto-describe announces stable scene changes hands-free. The full SmolVLM-500M stack is integrated and validated through handoff scripts; richer VLM captions are ready for the next aligned build.

Find Mode is voice-first. Long-press Volume Down, say "find the bottle," and the app maps your words to everyday object labels. It scans the room, guides you turn by turn, warns about obstacles in your path, highlights the target on screen, and remembers where it last saw it so the next search starts with a hint.

Accessibility is front and center: lock-screen access, screen-on at launch, spoken onboarding with a first orientation from live radar, eyes-free volume shortcuts, and double-tap to repeat your last description. No cloud. No upload. Your home never leaves your pocket.
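The depth-plus-detection fusion described for Safety Radar can be sketched as follows. This is an illustration under stated assumptions, not EchoWalk's code: detections are given as normalized boxes, the depth map is a small grid already expressed in meters, and the 2-meter warning threshold, the three-way left/ahead/right split, and all names are hypothetical.

```python
import math
from statistics import median
from typing import List, NamedTuple


class Detection(NamedTuple):
    label: str
    x1: float  # box corners, normalized to [0, 1]
    y1: float
    x2: float
    y2: float


def direction_of(det: Detection) -> str:
    """Coarse direction from the horizontal center of the box."""
    cx = (det.x1 + det.x2) / 2
    if cx < 1 / 3:
        return "left"
    if cx > 2 / 3:
        return "right"
    return "ahead"


def distance_of(det: Detection, depth: List[List[float]]) -> float:
    """Median depth (meters) of the grid cells covered by the box."""
    rows, cols = len(depth), len(depth[0])
    r0 = min(int(det.y1 * rows), rows - 1)
    c0 = min(int(det.x1 * cols), cols - 1)
    r1 = min(max(r0 + 1, math.ceil(det.y2 * rows)), rows)
    c1 = min(max(c0 + 1, math.ceil(det.x2 * cols)), cols)
    return median(depth[r][c] for r in range(r0, r1) for c in range(c0, c1))


def obstacle_warnings(
    detections: List[Detection], depth: List[List[float]], max_distance_m: float = 2.0
) -> List[str]:
    """Spoken-style warnings for nearby objects, nearest first."""
    nearby = []
    for det in detections:
        d = distance_of(det, depth)
        if d <= max_distance_m:
            nearby.append((d, f"{det.label} {direction_of(det)}, {d:.1f} m"))
    return [message for _, message in sorted(nearby)]
```

Using the median rather than the minimum depth inside a box makes the estimate less sensitive to a few stray pixels from the background or foreground.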

Beacon

When the river crests and the towers go dark, a hundred people end up stranded in a school gym with no signal and no way to call for help. A volunteer nurse faces a growing line of the sick and injured with no one to consult. A teacher manages sixty frightened kids alone. A family doesn't know if their water is safe to drink. Every one of them is holding a phone with a powerful on-device NPU, but cloud AI dies the instant the network does, and no single phone has the memory or compute to run a frontier-grade LLM by itself.

Beacon is built around this constraint from the start: the model is pre-sharded before disaster strikes, not after. Users opt in ahead of time, downloading a layer-wise slice of a large language model's weights onto their device, a contiguous block of transformer layers sized to that phone's available memory and NPU class. These shards sit dormant on the device, costing nothing until they're needed.

When the network goes down, phones nearby connect over a peer-to-peer hotspot network: one phone hosts, others join directly, with no router or internet infrastructure required. Beacon assembles an inference cluster from whichever pre-loaded layer shards happen to be present in the room, sequencing them in the correct layer order for a forward pass. The hotspot link only needs to negotiate which layers are available, route activations between phones in sequence, and reroute around a phone that drops out or runs out of battery. The heavy lifting, distribution, was done in advance, when everyone still had a connection.

The result is a cluster that can assemble in seconds during an emergency, because the only real-time job is discovery and coordination, not download. The nurse gets triage guidance. The teacher gets crisis-management support. The family gets a real answer about their water. The help didn't arrive; it was already pre-positioned in their pockets, just waiting to be switched on.
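The coordination step described above, ordering whatever layer shards are in the room into a complete forward pass and rerouting when a phone drops, can be sketched as a greedy interval cover. This is a hypothetical illustration, not Beacon's implementation: it assumes each phone reports an inclusive (first_layer, last_layer) range, that shards may overlap for redundancy, and that a phone can start from any layer inside its shard.

```python
from typing import Dict, List, Optional, Tuple

Shard = Tuple[int, int]          # inclusive (first_layer, last_layer)
Stage = Tuple[str, int, int]     # (phone, first_layer_run, last_layer_run)


def plan_pipeline(shards: Dict[str, Shard], num_layers: int) -> Optional[List[Stage]]:
    """Order phones so that layers 0..num_layers-1 are each run exactly once.

    Returns None when the phones present cannot cover every layer.
    """
    plan: List[Stage] = []
    next_layer = 0
    while next_layer < num_layers:
        # Phones that hold the layer we need next.
        candidates = [
            (last, phone)
            for phone, (first, last) in shards.items()
            if first <= next_layer <= last
        ]
        if not candidates:
            return None  # gap in coverage: the cluster is incomplete
        # Greedy choice: the phone that carries us furthest, to minimize hops.
        last, phone = max(candidates)
        last = min(last, num_layers - 1)
        plan.append((phone, next_layer, last))
        next_layer = last + 1
    return plan


def replan_without(shards: Dict[str, Shard], dropped: str, num_layers: int) -> Optional[List[Stage]]:
    """Reroute after a phone leaves the network or runs out of battery."""
    remaining = {phone: rng for phone, rng in shards.items() if phone != dropped}
    return plan_pipeline(remaining, num_layers)
```

For example, with shards A (0 to 7), B (8 to 15), C (4 to 11), and D (12 to 15) over 16 layers, the plan is A then B; if B drops out, the replanned route is A, then C for layers 8 to 11, then D.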

SnapOn: On-Device Context-Aware Multimodal AI

SnapOn is an Android-based, offline-first multimodal AI assistant that understands what the user says and what the user sees. By combining speech, vision, and on-device reasoning, SnapOn provides fast, privacy-preserving assistance without any cloud dependency. Rather than a general-purpose chatbot, SnapOn is designed for real-world situations, identifying people and objects, summarizing documents, recognizing products and labels, and answering spoken questions about the current scene.

The interaction is natural and hands-free. Hold the mic button, speak your question or say "remember this," and SnapOn captures the best camera frame, transcribes your voice using Whisper, and generates a grounded answer using SmolVLM-500M-Instruct running on the Snapdragon Hexagon NPU via ExecuTorch.

What makes SnapOn unique is its personal memory layer. Say "remember this is my medication Metformin" and SnapOn saves a visual fingerprint using CLIP embeddings alongside your exact words. Next time you point the camera at the same object or person, SnapOn recognizes it passively and surfaces your saved context automatically, no button press needed.

Use cases include identifying people and objects in view, summarizing documents and text in the scene, recognizing products, signs, and labels, answering spoken questions, and saving personal context for future reference. The stack includes SmolVLM-500M-Instruct, OpenAI CLIP ViT-B/32, Whisper-tiny, FAISS, SQLite, CameraX, AudioRecord, and Android TTS. On-device compilation targets SM8750 via ExecuTorch and Qualcomm QNN backend. Built for the ExecuTorch Hackathon with a strong emphasis on NPU utilization, real-world usability, and complete privacy.
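The memory layer described above amounts to storing an embedding next to the user's words and later retrieving the closest match. The sketch below shows that idea with a plain linear scan and cosine similarity in place of FAISS and SQLite; the `MemoryStore` class, the 0.85 similarity threshold, and the toy 3-dimensional vectors are assumptions for illustration, not SnapOn's code.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class MemoryStore:
    """Keeps (embedding, note) pairs and recalls the best match above a threshold."""

    def __init__(self, threshold: float = 0.85) -> None:
        self.threshold = threshold
        self._items: List[Tuple[List[float], str]] = []

    def remember(self, embedding: Sequence[float], note: str) -> None:
        self._items.append((list(embedding), note))

    def recall(self, embedding: Sequence[float]) -> Optional[str]:
        best_note, best_score = None, self.threshold
        for stored, note in self._items:
            score = cosine_similarity(stored, embedding)
            if score >= best_score:
                best_note, best_score = note, score
        return best_note


if __name__ == "__main__":
    store = MemoryStore()
    # Toy vectors stand in for real image embeddings.
    store.remember([1.0, 0.0, 0.0], "this is my medication Metformin")
    print(store.recall([0.9, 0.1, 0.0]))  # close to the stored vector
    print(store.recall([0.5, 0.5, 0.7]))  # nothing similar enough
```

The threshold controls the trade-off for passive recognition: set too low, unrelated objects trigger saved notes; set too high, the same object seen from a new angle is missed.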