Meta AudioCraft: Audio Processing and Generation Library

AudioCraft is a library from Meta for deep learning-powered audio generation and research.

Type: Library for audio processing and generation

What is AudioCraft?

AudioCraft is a single code base for generative audio. It bundles the text-conditioned generation models MusicGen and AudioGen with EnCodec, the neural audio codec they build on.

Key Features:

  • Audio Generation Models: MusicGen and AudioGen generate music and sound effects, respectively, from text prompts.
  • One Code Base for Generative Audio: A single code base covers music, sound effects, and compression, with models trained on raw audio signals.
  • Simplified Model Design: The design of MusicGen and AudioGen is simpler than that of previous generative models. Each uses a single autoregressive Language Model (LM) that operates on a compressed, discrete representation of audio (tokens), which lets it capture long-term dependencies in audio for high-quality generation.
  • EnCodec: A neural audio codec that works as both compressor and tokenizer, converting audio signals to discrete tokens and vice versa. It acts as the bridge between the raw waveform and the autoregressive language model.
  • Text-to-sound Generation: AudioGen generates environmental sounds from text descriptions.
  • Text-to-music Generation: MusicGen generates diverse melodies from textual cues.
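To make the "audio to discrete tokens and back" idea concrete, here is a minimal sketch that uses classic 8-bit mu-law quantization as a stand-in. This is not how EnCodec works internally (EnCodec is a learned neural codec); the function names `encode` and `decode` are illustrative only. The point is just that a waveform can be turned into a sequence of integers that a language model could operate on, and then mapped back to an approximate waveform.

```python
import math

MU = 255  # 8-bit mu-law: tokens take one of 256 integer values


def encode(samples):
    """Map float samples in [-1, 1] to integer tokens in [0, 255]."""
    tokens = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip out-of-range input
        # Logarithmic companding: more resolution for quiet samples.
        y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
        tokens.append(int(round((y + 1.0) / 2.0 * MU)))
    return tokens


def decode(tokens):
    """Map integer tokens back to approximate float samples."""
    samples = []
    for t in tokens:
        y = 2.0 * t / MU - 1.0
        x = math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
        samples.append(x)
    return samples


# A short 440 Hz sine snippet sampled at 8 kHz (toy input).
wave = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(16)]
tokens = encode(wave)      # discrete sequence a language model could consume
restored = decode(tokens)  # lossy reconstruction of the waveform
max_error = max(abs(a - b) for a, b in zip(wave, restored))
```

The round trip is lossy, just as a real codec is: `restored` is close to `wave` but not identical. A neural codec learns its token vocabulary from data instead of using a fixed formula like the one above.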

AudioCraft Resources

A curated list of libraries and technologies to help you build great projects with AudioCraft.

  • Installation Guide: Get started with AudioCraft using the detailed guide on GitHub.
  • Models Overview:
    • MusicGen - A top-tier controllable text-to-music model.
    • AudioGen - A groundbreaking text-to-sound model.
    • EnCodec - A high fidelity neural audio codec.
    • Multi Band Diffusion - An EnCodec compatible decoder using diffusion.
  • API Documentation: Delve deeper into the features, functionalities, and integrations with the detailed API Documentation.
  • Meta Intro Article: Understand the technology, its creation, and its capabilities with this Meta Intro Article.
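As a rough sketch of what text-to-music generation looks like once the library is installed, the snippet below wraps the MusicGen calls shown in the AudioCraft README. Treat it as an illustration under assumptions: it assumes the `audiocraft` package and its PyTorch dependencies are installed, that the `facebook/musicgen-small` checkpoint can be downloaded on first use, and the helper names `output_stem` and `generate_clips` are made up for this example. Check the API documentation for the options available in your installed version.

```python
import re


def output_stem(prompt, index):
    """Build a filesystem-friendly file stem from a text prompt (hypothetical helper)."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return f"{index:02d}_{slug[:40]}"


def generate_clips(prompts, duration_s=8.0, model_name="facebook/musicgen-small"):
    """Generate one clip per text prompt and write each to '<stem>.wav'."""
    # Imported lazily so output_stem() works without AudioCraft installed.
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    model = MusicGen.get_pretrained(model_name)       # downloads weights on first use
    model.set_generation_params(duration=duration_s)  # clip length in seconds
    wavs = model.generate(prompts)                    # tensor: [batch, channels, samples]

    stems = []
    for index, (prompt, wav) in enumerate(zip(prompts, wavs)):
        stem = output_stem(prompt, index)
        # audio_write appends the file extension and normalizes loudness.
        audio_write(stem, wav.cpu(), model.sample_rate, strategy="loudness")
        stems.append(stem)
    return stems
```

Calling `generate_clips(["lo-fi hip hop beat with warm piano"])` would then write a single WAV file to the working directory. Generation is much faster on a GPU, and larger checkpoints trade speed for quality.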

SonicVision: The Pinnacle of Interactive Storytelling and Sensory Immersion

In the ever-evolving landscape of gaming and interactive experiences, SonicVision stands as a groundbreaking innovation. Developed to be showcased at the AudioCraft Hack-a-Thon 2023, this platform aims to redefine the way users engage with digital worlds.

A Harmonious Blend of Art and Sound

At the core of SonicVision is a blend of generative music and dynamic art, woven into stories that users can not only experience but also shape. Imagine entering a fantastical world where every decision you make not only progresses the story but also influences the art and music that envelop you. With SonicVision, this is not just a possibility; it's the standard experience.

The Sonic Wonders of AudioCraft

A crucial component that drives the platform is AudioCraft, used here as an AI-driven music generation system that goes beyond mere background scores. SonicVision uses AudioCraft's AI models to generate music across genres and styles. Whether you're venturing into an enchanted forest or a post-apocalyptic city, AudioCraft provides the auditory atmosphere, complete with sound effects matched to each situation.

OpenAI: The Dungeon Master of Your Dreams

SonicVision's storytelling is powered by OpenAI's ChatGPT, which serves as the Dungeon Master of your interactive journey. It uses a tailored prompt layer that does more than guide the story: ChatGPT dynamically commands the visual and musical elements of the game, adding depth and interactivity to the digital storytelling.



Creating a Symphony of Financial Data: Transforming Cryptocurrency Price Action into Music

In the ever-evolving landscape of cryptocurrency, where markets surge and plummet within moments, enthusiasts and traders have long relied on charts and graphs to visualize these price dynamics. However, imagine a world where you not only witness these market fluctuations but also experience them as a unique musical composition. Welcome to "SoundCoin," an innovative project that merges cutting-edge technology, artificial intelligence, and creative expression to transform cryptocurrency price action into captivating music.

The Vision Behind SoundCoin

SoundCoin was born out of a vision to bridge the gap between the analytical and artistic realms of cryptocurrency trading. Conceived by a team of tech enthusiasts and financial analysts, this project aims to provide a novel way for users to interact with and understand market data. Beyond traditional candlestick charts and complex technical analysis, SoundCoin introduces a sensory experience that transcends numbers and charts, making cryptocurrency trading not just informative but also enjoyable.

The Impact of SoundCoin

SoundCoin transcends the conventional boundaries of financial analysis and creative expression. Here are some key aspects of its impact:

  • Education: Traders and enthusiasts gain a deeper understanding of market dynamics through auditory and visual means. The fusion of data and music provides a holistic perspective on price action.
  • Entertainment: SoundCoin introduces an element of fun and entertainment to cryptocurrency trading. Users can enjoy the creative and artistic aspects of market analysis.
  • Sharing Insights: The ability to export and share the created videos on platforms like YouTube extends the reach of financial insights. Users can use their unique compositions to convey their trading strategies and market observations.
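The basic idea of turning price action into music can be illustrated with a small sketch. The description above does not say how SoundCoin maps prices to sound, so everything here is an assumption for illustration: closing prices are scaled linearly onto a range of MIDI note numbers, and each note is then converted to a frequency using the standard equal-temperament formula with A4 = 440 Hz. The function names and the sample prices are hypothetical.

```python
def prices_to_midi_notes(prices, low_note=48, high_note=84):
    """Scale a price series linearly onto MIDI note numbers (hypothetical mapping)."""
    if not prices:
        return []
    lowest, highest = min(prices), max(prices)
    if highest == lowest:
        # A flat market maps to a single note in the middle of the range.
        return [(low_note + high_note) // 2] * len(prices)
    span = high_note - low_note
    return [
        low_note + round((price - lowest) / (highest - lowest) * span)
        for price in prices
    ]


def midi_to_hz(note):
    """Convert a MIDI note number to a frequency in Hz (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


# Made-up closing prices, for illustration only.
closes = [100.0, 150.0, 200.0, 125.0]
notes = prices_to_midi_notes(closes)
frequencies = [midi_to_hz(note) for note in notes]
```

Rising prices produce rising pitch and falling prices produce falling pitch. A fuller project could instead turn such a series into a text prompt or melody conditioning for a generative model, but that design is not described in the source.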