MOSS-Audio Captioning is a first of its kind! | Here's the repo: I modified the GUI to allow for batch captioning, YouTube videos, and file chunking.
I personally think this is a very cool app and truly something new. MOSS-Audio is a new open-source AI model designed to go far beyond basic speech transcription. It can listen to recordings, caption what is happening, detect sounds and events, analyze music, and even answer questions about the audio. Think of it a bit like Joy Caption, but for audio instead of images. Instead of only converting speech to text, it attempts to understand the entire sound environment. This makes it useful for po
StableDiffusion