Leverage High-Quality Children Speech Data to Train AI Models

From: Nexdata Date: 08/15/2024

➤ Children speech data for voice assistants

In the research and application of artificial intelligence, acquiring reliable and rich data has become a crucial part of developing efficient algorithms. To improve the accuracy and robustness of AI models, enterprises and researchers need varied datasets to train systems that can cope with complicated real-world scenarios. This makes the process of collecting and optimizing data crucial, because it directly affects the final performance of AI.

Recently, researchers tested the speech recognition capability of several voice assistants on the market. They found that voice assistants, including Amazon Echo, Google Home and other devices, made recognition errors when interacting with children.

➤ Speech data of children

Unlike adult speech, children’s speech poses inherent technical difficulties because of its voice and pronunciation characteristics. More importantly, children are not good at interacting with a voice assistant in a way that machines can understand. Even with a friendlier interactive interface or a more intelligent voice assistant, the recognition results are not satisfactory.

High-quality children’s speech data is clearly important for training a smarter voice assistant. As a professional AI data services provider, Nexdata has accumulated 4,000 hours of high-quality children’s speech data to support the research and application of children’s voice-interactive products.

Chinese Children Speech Data

Audio data of Chinese children captured by mobile phone, with a total duration of 3,255 hours. The 9,780 speakers are children aged 6 to 12, with accents covering seven dialect areas. The recorded text contains language common to children, such as essays, stories, numbers, and interactions in cars, at home, and with voice assistants, closely matching actual application scenarios.

Chinese Children Speaking English Speech Data

Audio data of Chinese children reading English, covering ages from preschool (3–5 years old) to school age (6–12 years old) and reflecting children’s speech features. The content closely matches the real scenarios in which children speak English. It provides data support for children’s smart home products, automatic speech recognition, and oral assessment in intelligent education scenarios.

American Children Speech Data

This data is recorded by 219 American children who are native speakers. The recording texts are mainly storybooks, children’s songs, spoken expressions, etc., with 350 sentences for each speaker. Each sentence contains 4.5 words on average and is repeated 2.1 times on average.
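
As a rough sizing aid, the figures quoted above can be combined into an approximate utterance and word count. This is only a back-of-the-envelope sketch: it assumes every speaker recorded all 350 sentences, and it leaves out the repetition figure because the description does not say how repetitions are counted. The function name `estimate_corpus_size` is hypothetical and is not part of any Nexdata tooling.

```python
# Figures quoted in the American Children Speech Data description.
SPEAKERS = 219
SENTENCES_PER_SPEAKER = 350
AVG_WORDS_PER_SENTENCE = 4.5


def estimate_corpus_size(speakers, sentences_per_speaker, avg_words):
    """Return (utterances, words), assuming every speaker read every sentence."""
    utterances = speakers * sentences_per_speaker
    words = utterances * avg_words
    return utterances, words


utterances, words = estimate_corpus_size(
    SPEAKERS, SENTENCES_PER_SPEAKER, AVG_WORDS_PER_SENTENCE
)
print(f"Approximate utterances: {utterances}")
print(f"Approximate words: {words:.0f}")
```

Under these assumptions the dataset holds roughly 76,650 utterances and about 345,000 words; the actual totals may differ.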

British Children Speech Data

This data collects recordings from 201 British children. The recording texts are mainly children’s textbooks and storybooks. The average sentence length is 4.68 words, and each sentence is repeated 6.6 times on average. The data is recorded with a high-fidelity microphone.

If the above data cannot meet the needs of your current research, Nexdata also provides data customization services for specific groups of people, specific scenarios, and specific languages to meet customers’ diversified data needs.

If you need data services, please feel free to contact us: info@nexdata.ai

Data is the key to the success of artificial intelligence. We must strengthen data collection methods and data security to achieve more intelligent and efficient technical solutions. In a rapidly developing market, only by continuously innovating and optimizing artificial intelligence can we build a safer, more efficient and more intelligent society.