TG Telegram Group & Channel
Speech Technology | United States America (US)
Create: Update:

Interesting point. We shouldn't test speech LLMs on factual knowledge

https://www.youtube.com/watch?v=2d1MU280yQk

https://github.com/slp-rl/WhiStress

https://arxiv.org/abs/2505.19103

WHISTRESS: Enriching Transcriptions with Sentence Stress Detection

Iddo Yosha, Dorin Shteyman, Yossi Adi

Spoken language conveys meaning not only through words but also through intonation, emotion, and emphasis. Sentence stress, the emphasis placed on specific words within a sentence, is crucial for conveying speaker intent and has been extensively studied in linguistics. In this work, we introduce WHISTRESS, an alignment-free approach for enhancing transcription systems with sentence stress detection. To support this task, we propose TINYSTRESS-15K, a scalable, synthetic training data for the task of sentence stress detection which resulted from a fully automated dataset creation process. We train WHISTRESS on TINYSTRESS-15K and evaluate it against several competitive baselines. Our results show that WHISTRESS outperforms existing methods while requiring no additional input priors during training or inference. Notably, despite being trained on synthetic data, WHISTRESS demonstrates strong zero-shot generalization across diverse benchmarks. Project page: this https URL.

Interesting point. We shouldn't test speech LLMs on factual knowledge

https://www.youtube.com/watch?v=2d1MU280yQk

https://github.com/slp-rl/WhiStress

https://arxiv.org/abs/2505.19103

WHISTRESS: Enriching Transcriptions with Sentence Stress Detection

Iddo Yosha, Dorin Shteyman, Yossi Adi

Spoken language conveys meaning not only through words but also through intonation, emotion, and emphasis. Sentence stress, the emphasis placed on specific words within a sentence, is crucial for conveying speaker intent and has been extensively studied in linguistics. In this work, we introduce WHISTRESS, an alignment-free approach for enhancing transcription systems with sentence stress detection. To support this task, we propose TINYSTRESS-15K, a scalable, synthetic training data for the task of sentence stress detection which resulted from a fully automated dataset creation process. We train WHISTRESS on TINYSTRESS-15K and evaluate it against several competitive baselines. Our results show that WHISTRESS outperforms existing methods while requiring no additional input priors during training or inference. Notably, despite being trained on synthetic data, WHISTRESS demonstrates strong zero-shot generalization across diverse benchmarks. Project page: this https URL.


>>Click here to continue<<

Speech Technology






Share with your best friend
VIEW MORE

United States America Popular Telegram Group (US)