This project automatically generates audiobooks from text input. The system uses a large language model to detect the characters in the text and assign each one a distinct, appropriate voice. Text-to-audio-effect models then add background sounds and audio effects that match the context of the narrative. Combining natural-sounding speech synthesis with environmental sound design produces an engaging audiobook that adapts to different genres and writing styles and makes the storytelling more vivid for listeners.
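
The character-detection and voice-assignment step described above might look roughly like the sketch below. It is only an illustration under stated assumptions: a simple regular expression stands in for the large language model, the sound-effect step is left out, and every name here (`Segment`, `VOICE_POOL`, `split_segments`, `assign_voices`) is hypothetical rather than taken from the project.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str  # "narrator" or a character name
    text: str


# Hypothetical voice identifiers; a real system would map these to TTS voices.
VOICE_POOL = ["voice_a", "voice_b", "voice_c"]

# Stand-in for the LLM: matches only the pattern `"quote" said Name`.
DIALOGUE = re.compile(r'"([^"]+)"\s+said\s+([A-Z]\w*)[.!?]?')


def split_segments(text):
    """Split text into narrator and character segments, in reading order."""
    segments = []
    pos = 0
    for match in DIALOGUE.finditer(text):
        narration = text[pos:match.start()].strip()
        if narration:
            segments.append(Segment("narrator", narration))
        # Drop the comma that often closes a quote before "said".
        segments.append(Segment(match.group(2), match.group(1).rstrip(",")))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        segments.append(Segment("narrator", tail))
    return segments


def assign_voices(segments, pool=VOICE_POOL, narrator_voice="narrator_voice"):
    """Give each speaker a voice, cycling through the pool in order of appearance."""
    voices = {"narrator": narrator_voice}
    for seg in segments:
        if seg.speaker not in voices:
            voices[seg.speaker] = pool[(len(voices) - 1) % len(pool)]
    return voices


story = 'The door creaked. "Who is there?" said Anna. "Only me," said Ben.'
segments = split_segments(story)
voices = assign_voices(segments)
for seg in segments:
    print(voices[seg.speaker], "|", seg.speaker, "|", seg.text)
```

Keeping the speaker-to-voice mapping in one dictionary means a character keeps the same voice throughout the book, whichever component produces the segments.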
Category tags: