From: Nexdata Date: 2024-08-15
In modern artificial intelligence, the success of an algorithm depends on the quality of its data. As data plays an increasingly prominent role in artificial intelligence models, collecting high-quality data and making full use of it has become crucial. This article will help you better understand the core role of data in artificial intelligence programs.
As one of the most influential languages for global communication, English is used around the world. However, English speech recognition still faces significant accuracy challenges, owing to the complexity of mixed Chinese-English speech and to differences between languages.
Drawing on years of experience in intelligent speech technology, Nexdata has built up a large collection of British English speech recognition data. It covers children, teenagers, and adults of all ages, and its content spans multiple application scenarios.
199 Hours - British English Speech Data by Mobile Phone_Reading
The data set contains speech data from 346 British English speakers, all of whom are English locals. Each speaker contributes around 392 sentences, and the valid data totals 199 hours. The recording environment is quiet. The recorded content spans categories such as economics, news, entertainment, commonly used spoken language, letters, and figures.
349 People - British English Speech Data by Mobile Phone_Guiding
This data set contains speech data from 349 English speakers, all of whom are English locals. The recording environment is quiet. The recorded content covers fields such as car, home, and voice assistant, with about 50 sentences per person. The valid data totals 9.5 hours. All texts are manually transcribed with high accuracy.
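As a rough sanity check on the figures quoted for the Reading and Guiding sets, the short Python sketch below multiplies speakers by sentences per speaker and divides the valid hours by that total. The class and function names are hypothetical and introduced only for illustration. The sentence counts are the approximate values stated above, so the results are estimates rather than published statistics.

```python
from dataclasses import dataclass


@dataclass
class DatasetSpec:
    """Figures quoted in the article; the field names are illustrative."""
    name: str
    speakers: int
    sentences_per_speaker: int  # approximate ("around" / "about") in the source
    valid_hours: float

    def total_sentences(self) -> int:
        # Estimated sentence count: speakers x sentences per speaker.
        return self.speakers * self.sentences_per_speaker

    def avg_seconds_per_sentence(self) -> float:
        # Valid audio duration in seconds divided by the estimated sentence count.
        return self.valid_hours * 3600 / self.total_sentences()


SPECS = [
    DatasetSpec("Reading", speakers=346, sentences_per_speaker=392, valid_hours=199.0),
    DatasetSpec("Guiding", speakers=349, sentences_per_speaker=50, valid_hours=9.5),
]


def summarize(specs):
    """Map each data set name to (estimated sentences, average seconds per sentence)."""
    return {
        s.name: (s.total_sentences(), round(s.avg_seconds_per_sentence(), 2))
        for s in specs
    }


if __name__ == "__main__":
    for name, (count, seconds) in summarize(SPECS).items():
        print(f"{name}: ~{count} sentences, ~{seconds} s per sentence")
```

Under these assumptions, the Reading set averages a little over five seconds per sentence and the Guiding set about two seconds. That is consistent with read passages on one side and short voice-assistant style commands on the other.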
831 Hours - British English Speech Data by Mobile Phone
This data set contains 831 hours of mobile telephony British English speech data, recorded by 1651 native British speakers. The recorded content covers categories such as generic, interactive, in-car, and smart home. The texts are manually proofread to ensure a high accuracy rate. The database matches both the Android and iOS systems.
55 Hours - British Children Speech Data by Microphone
This data set collects speech from 201 British children. The recordings are mainly of children's textbooks and storybooks. The average sentence length is 4.68 words, and each sentence is repeated 6.6 times on average. The data is recorded with a high-fidelity microphone. The text is manually transcribed with high accuracy.
Standing at the forefront of the technology revolution, we are well aware of the power of data. In the future, continuous improvement of the data collection and annotation process will make AI systems more intelligent. All walks of life should actively embrace data-driven innovation to stay ahead in fierce market competition and bring more value to society.