https://github.com/DataoceanAI/Dolphin
Dolphin is a multilingual, multitask ASR model developed through a collaboration between Dataocean AI and Tsinghua University. It supports 40 Eastern languages across East Asia, South Asia, Southeast Asia, and the Middle East, as well as 22 Chinese dialects. It was trained on over 210,000 hours of data, including both DataoceanAI's proprietary datasets and open-source datasets. The model can perform speech recognition, voice activity detection (VAD), segmentation, and language identification (LID).
Supported languages include Russian, Uzbek, Kazakh, and Tajik, among others. Full list:
https://github.com/DataoceanAI/Dolphin/blob/main/languages.md
