
Announcing the AudioMOS Challenge 2025!

Homepage: https://sites.google.com/view/voicemos-challenge/audiomos-challenge-2025

We are enlarging the scope of the previous VoiceMOS challenge series to cover not only speech but also music and general audio.

Founded in 2022, the VoiceMOS Challenge (VMC) series aims to compare prediction techniques for human ratings of speech. To facilitate development in the automatic evaluation of audio generation systems, we have decided to broaden the scope and rename the series the AudioMOS Challenge.

Track 1: MOS prediction for text-to-music systems
This track is based on the MusicEval dataset, spanning 31 TTM systems, along with ratings collected from music experts. Evaluation was conducted across two axes: overall musical impression and alignment with the text prompt.

Track 2: Audiobox-aesthetics-style prediction for text-to-speech (TTS), text-to-audio (TTA) and text-to-music (TTM) samples
This track is based on the recently released Meta Audiobox Aesthetics, which proposed four new axes: production quality, production complexity, content enjoyment, and content usefulness.

Track 3: MOS prediction for speech in high sampling frequencies
For the training set, we provide samples at 16/24/48 kHz. During evaluation, participants are asked to predict scores for samples that reflect their ratings in a listening test containing samples at all of these sampling frequencies.

We are planning to submit a challenge proposal to ASRU 2025. The challenge will officially start on April 9th. Please pre-register if you are interested!
