https://www.linkedin.com/posts/alexander-polok-b5567284_dicow-diarization-conditioned-whisper-for-activity-7341058825732415488-UVez @Speech Technology

Create: 2025-06-18 Update: 2025-07-03 01:51:30

We are happy to announce that our DiCoW- and DiariZen-based system finished second 🥈 in the Challenge and Workshop on Multilingual Conversational Speech Language Model (MLC-SLM) at Interspeech 2025, organized by Nexdata.jp (formerly Datatang Co., Ltd., official account)!

https://www.nexdata.ai/competition/mlc-slm

📄 System description and additional analysis (including dataset inconsistencies) are now available on arXiv:
BUT System for the MLC-SLM Challenge:
👉 https://www.arxiv.org/abs/2506.13414

In addition, I’m very pleased to share that our journal paper:
"DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition"
👉 https://www.sciencedirect.com/science/article/abs/pii/S088523082500066X
has been accepted for publication in Computer Speech & Language (Elsevier)!

And last but not least, just yesterday I had the pleasure of presenting a tutorial and mini-challenge on fine-tuning DiCoW in data- and compute-constrained environments at this year's JSALT Summer School!
🎓 If you want to try it yourself: https://colab.research.google.com/github/Lakoc/JSALT_tutorial/blob/main/challenge.ipynb

https://huggingface.co/spaces/BUT-FIT/EMMA_leaderboard

🎥 Recording available here: https://www.youtube.com/watch?v=KqNKGjcsi9g&list=PLSeS0sl8xpTwz7h5iJSniiF89iUdZXNJ2&index=28