From: Nexdata Date: 2024-08-15
In the research and application of artificial intelligence, acquiring reliable and rich data has become a crucial part of developing efficient algorithms. To improve the accuracy and robustness of AI models, enterprises and researchers need varied datasets to train systems that can cope with the complicated scenarios of real applications. This makes the process of collecting and optimizing data crucial, because it directly affects the final performance of AI.
Studies have found that speech recognition systems built on adult speech databases do not recognize children's speech very well, and that the speech recognition engines currently on the market have an accuracy rate of less than 60% on children's speech.
Because children's voices and articulation give their speech and language different characteristics from those of adults, children's speech recognition is inherently difficult. In addition, children are not good at interacting with machines in a way that the machines can understand. Whether a friendlier interface or a smarter voice assistant is used, the recognition results remain unsatisfactory.
For children's speech recognition scenarios, Nexdata has developed children's speech recognition data covering multiple age groups. The data can be used to train children's speech recognition models, improve their adaptation to children's speech, and raise the recognition rate for child users.
200 Hours - American Children Speech Recognition Data By Mobile Phone
The data is recorded by 290 children from the U.S.A., with a balanced male-female ratio. The recorded content mainly comes from children's books and textbooks, which are in line with children's language usage habits. The recording environment is a relatively quiet indoor setting, and the text is manually transcribed with high accuracy.
393 Hours - Korean Children Speech Recognition Data by Mobile Phone
Audio data of Korean children captured by mobile phone, with a total duration of 393 hours. The 1,085 speakers are children aged 6 to 15; the recorded text contains language commonly used by children, such as essay stories and numbers. All sentences are manually transcribed with high accuracy.
50 Hours - American Children Speech Recognition Data by Microphone
The data is recorded by 219 American children who are native speakers. The recorded texts are mainly storybooks, children's songs, spoken expressions, etc. There are 350 sentences for each speaker. Each sentence contains 4.5 words on average and is repeated 2.1 times on average. The recording device is a hi-fi Blue Yeti microphone. The texts are manually transcribed.
55 Hours - British Children Speech Recognition Data by Microphone
The data is collected from 201 British children. The recorded texts are mainly children's textbooks and storybooks. The average sentence length is 4.68 words, and each sentence is repeated 6.6 times on average. The data is recorded with a high-fidelity microphone. The text is manually transcribed with high accuracy.
3,255 Hours - Chinese Children Speech Recognition Data by Mobile Phone
Audio data of Chinese children captured by mobile phone, with a total duration of 3,255 hours. The 9,780 speakers are children aged 6 to 12, with accents covering seven dialect areas; the recorded text contains language commonly used by children, such as essay stories and numbers, as well as their interactions in cars, at home, and with voice assistants, closely matching actual application scenarios. All sentences are manually transcribed with high accuracy.
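To compare the datasets at a glance, the hours and speaker counts quoted above can be turned into a rough average recording time per speaker. The short Python sketch below does only that arithmetic. The names `DATASETS` and `minutes_per_speaker` are illustrative assumptions, and the resulting averages are estimates derived from the nominal figures in this article, not statistics published by Nexdata.

```python
# Figures copied from the dataset descriptions above: (label, hours, speakers).
# The labels are shortened for display; the averages are rough estimates only.
DATASETS = [
    ("American children, mobile phone", 200, 290),
    ("Korean children, mobile phone", 393, 1085),
    ("American children, microphone", 50, 219),
    ("British children, microphone", 55, 201),
    ("Chinese children, mobile phone", 3255, 9780),
]


def minutes_per_speaker(hours, speakers):
    """Average recording time per speaker, in minutes."""
    return hours * 60 / speakers


for label, hours, speakers in DATASETS:
    avg = minutes_per_speaker(hours, speakers)
    print(f"{label}: {avg:.1f} minutes per speaker")

# Totals across all five datasets listed in this article.
total_hours = sum(hours for _, hours, _ in DATASETS)
total_speakers = sum(speakers for _, _, speakers in DATASETS)
print(f"Total: {total_hours} hours from {total_speakers} speakers")
```

A per-speaker average like this is only a coarse indicator: it says nothing about how evenly recording time is spread across ages, genders, or accents, which matters as much as the total when training a children's speech recognition model.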
Progress in the AI field cannot be separated from data. By improving the quality and diversity of datasets, we can better unleash the potential of artificial intelligence and promote its application in all walks of life. Only by continuously improving the data system can AI technology better respond to the fast-changing data requirements of the market. If you have data requirements, please contact Nexdata.ai at [email protected].