From: Nexdata Date: 2024-08-15
A major problem with the speech recognition datasets on the market is that most of them focus on English or other European languages. Because languages differ greatly from one another, AI vendors that want to build speech recognition for uncommon languages must model each language separately according to its characteristics. To ensure recognition quality, high-quality speech recognition datasets in each language are needed for model optimization. However, the scarcity of such datasets for uncommon languages has become a major bottleneck in speech recognition.
As the world's leading AI data service provider, Nexdata currently offers pre-labeled speech recognition datasets in more than 30 uncommon languages, which can meet the needs of speech recognition in most uncommon languages. Nexdata strictly abides by the relevant regulations, and all of the collected speech recognition data has been authorized by the speakers who were recorded.
Nexdata Uncommon Language Speech Recognition Datasets
760 Hours - Vietnamese Speech Recognition Dataset by Mobile Phone
1,751 native Vietnamese speakers with authentic accents participated in the recording. The recording script was designed by linguists and covers a wide range of topics, including generic, interactive, in-vehicle and home scenarios. The text is manually proofread with high accuracy. Recording devices are mainstream Android phones and iPhones.
292 Hours - Thai Speech Recognition Dataset by Mobile Phone_Reading
The Thai Speech Recognition Dataset (reading) was collected from 498 native Thai speakers and recorded in a quiet environment. The recordings are rich in content, covering multiple categories such as economics, entertainment, news, figures, and spoken language. Each speaker recorded around 400 sentences. The valid data volume is 292 hours. All texts are manually transcribed with high accuracy.
759 Hours - Hindi Speech Recognition Dataset by Mobile Phone
The dataset contains 759 hours of speech recorded by 1,425 native speakers from India with authentic accents. The recording script was designed by language experts and covers general, interactive, in-vehicle, home and other categories. The text is manually proofread with high accuracy. Recording devices are mainstream Android phones and iPhones. The Hindi Speech Recognition Dataset can be applied to speech recognition, machine translation, and voiceprint recognition.
134 Hours - Malay Speech Recognition Dataset by Mobile Phone_Reading
The 156 Speakers - Mobile Telephony Malay Speech Recognition Dataset_Reading was recorded by native Malay speakers in a quiet environment. The recordings are rich in content, covering multiple categories such as economy, entertainment, news, spoken language, numbers, and letters. Each speaker recorded around 450 sentences. The valid data volume is 134 hours. All texts are manually transcribed to ensure high accuracy.
639 Hours - Indonesian Speech Recognition Dataset by Mobile Phone
1,285 native Indonesian speakers with authentic accents participated in the recording. The recording script was designed by linguists and covers a wide range of topics, including generic, interactive, in-vehicle and home scenarios. The text is manually proofread with high accuracy. Recording devices are mainstream Android phones and iPhones. The Indonesian Speech Recognition Dataset can be applied to automatic speech recognition and machine translation scenarios.
If you would like more details about these speech recognition datasets or how to acquire them, please feel free to contact us: [email protected].