Harnessing AI's Potential in Retail and E-Commerce

From: Nexdata Date: 14/08/2024

➤ Voice Assistants and Pronunciation Dictionaries

Data is the "fuel" that drives AI systems toward continuous progress, but building high-quality datasets isn't easy. Collecting, cleaning, and annotating data, and protecting privacy along the way, are all challenging. Researchers need to collect targeted data for the complex problems that arise in different fields so that the trained models are robust and generalize well. With rich datasets, AI systems can make intelligent decisions in more complex scenarios.

A voice assistant is an intelligent application that helps users solve problems through smart conversations and instant question-and-answer interactions. Common voice assistants in daily life include "Siri" and "Xiao Du." These voice assistants come equipped with corresponding pronunciation dictionaries, which list the words they can recognize together with how those words are pronounced.

➤ Pronunciation Dictionaries in Speech Recognition

A pronunciation dictionary stores the words a system knows and the pronunciation of each one, typically as a sequence of phonemes. It establishes a mapping between the acoustic modeling units (such as phonemes) and the language modeling units (words), connecting the acoustic model and the language model. This creates the search state space that the decoder uses for decoding.
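As a minimal sketch of this mapping, the Python snippet below models a pronunciation dictionary as a plain lookup table from words to phoneme sequences. The entries, the ARPAbet-style phoneme symbols, and the names `LEXICON` and `words_to_phonemes` are illustrative assumptions, not the format of any particular product or dataset.

```python
# Toy pronunciation dictionary (lexicon): word -> list of pronunciations.
# Each pronunciation is a list of phoneme symbols. The entries below are
# illustrative only and use ARPAbet-style symbols as an assumption.
LEXICON = {
    "hello": [["HH", "AH", "L", "OW"], ["HH", "EH", "L", "OW"]],
    "world": [["W", "ER", "L", "D"]],
}


def words_to_phonemes(words, lexicon=LEXICON):
    """Expand a word sequence into one phoneme sequence.

    Uses the first listed pronunciation of each word and raises KeyError
    for words that are missing from the lexicon.
    """
    phonemes = []
    for word in words:
        key = word.lower()
        if key not in lexicon:
            raise KeyError(f"out-of-vocabulary word: {word!r}")
        phonemes.extend(lexicon[key][0])
    return phonemes


print(words_to_phonemes(["Hello", "world"]))
```

A real decoder would keep all pronunciation variants in its search space rather than picking the first one; the sketch only shows the word-to-phoneme link that the dictionary provides.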

A sentence can be formed by combining several words, and the pronunciation dictionary allows us to obtain the phoneme sequence of the pronunciation of each word. The transition probabilities between adjacent words can be obtained through a language model, while the probability model of phonemes is mainly obtained through an acoustic model. This results in a probability model for a sentence.
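One common way to write this down, given here as a sketch of the standard formulation rather than a description of any specific system, uses assumed symbols: $X$ for the acoustic observations, $W = w_1, \dots, w_n$ for a candidate word sequence, and $Q$ for a phoneme sequence that the pronunciation dictionary allows for $W$. The language model is approximated by a bigram model to match the "adjacent words" description.

```latex
% Decoding: choose the word sequence that best explains the audio.
\hat{W} = \arg\max_{W} \; P(X \mid W)\, P(W)

% Language model: transition probabilities between adjacent words (bigram).
P(W) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1})

% The pronunciation dictionary links words to phonemes; the acoustic model scores P(X | Q).
P(X \mid W) = \sum_{Q \in \mathrm{Lex}(W)} P(X \mid Q)\, P(Q \mid W)
```

Here $\mathrm{Lex}(W)$ denotes the set of phoneme sequences the dictionary lists for $W$, and $w_0$ is a sentence-start symbol.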

In a speech recognition system, the broader the vocabulary covered by the pronunciation dictionary, the fewer out-of-vocabulary words the recognizer encounters, which generally improves recognition accuracy. When new vocabulary appears, the words and their phonetic transcriptions can be added to the pronunciation dictionary, continuously expanding its coverage. The three main factors for measuring the quality of a pronunciation dictionary are vocabulary size, the quality of phonetic labeling, and the accuracy of proofreading.
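The effect of expanding the dictionary can be illustrated with a short Python sketch that measures how many tokens in a text have no dictionary entry before and after a new word is added. The helper names `oov_rate` and `add_entry`, the sample tokens, and the phoneme symbols are hypothetical and chosen only for illustration.

```python
def oov_rate(tokens, lexicon):
    """Return the fraction of tokens that have no entry in the lexicon."""
    if not tokens:
        return 0.0
    missing = sum(1 for token in tokens if token.lower() not in lexicon)
    return missing / len(tokens)


def add_entry(lexicon, word, phonemes):
    """Add a pronunciation for a word, skipping exact duplicates."""
    pronunciations = lexicon.setdefault(word.lower(), [])
    if list(phonemes) not in pronunciations:
        pronunciations.append(list(phonemes))


# Illustrative lexicon and input; the phoneme symbols are assumptions.
lexicon = {
    "play": [["P", "L", "EY"]],
    "music": [["M", "Y", "UW", "Z", "IH", "K"]],
}
tokens = ["play", "podcast", "music", "podcast"]

before = oov_rate(tokens, lexicon)  # "podcast" is missing: 2 of 4 tokens
add_entry(lexicon, "podcast", ["P", "AA", "D", "K", "AE", "S", "T"])
after = oov_rate(tokens, lexicon)   # every token is now covered

print(before, after)
```

Coverage is only one of the three quality factors named above; a new entry with a wrong transcription would lower the out-of-vocabulary rate without helping recognition, which is why labeling and proofreading matter as well.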

Collecting, labeling, and cleaning a pronunciation dictionary requires professional linguistic oversight, so accurate dictionaries that cover a wide vocabulary are hard to come by. Without them, the performance of a speech recognition system can suffer.

➤ Cantonese, Japanese, English & Korean Pronunciation Dictionaries

Nexdata Pronunciation Dictionary Corpus

80,279 Cantonese Pronunciation Dictionary

This pronunciation dictionary collects words with dialect characteristics from the Cantonese-speaking regions of Guangdong. Each entry consists of three parts: the word, its pinyin, and its tones. The dictionary can serve as a pronunciation reference for recording personnel and support the research and development of speech recognition technology.

101,702 Japanese Pronunciation Dictionary

The data contains 101,702 entries. All words and pronunciations are produced by Japanese linguists. It can be used in the research and development of Japanese ASR technology.

500,113 English Pronunciation Dictionary

The data contains 500,113 entries. All words and pronunciations are produced by English linguists. It can be used in the research and development of English ASR technology.

444,202 Korean Pronunciation Dictionary

The data contains 444,202 entries. All words and pronunciations are produced by Korean linguists. It can be used in the research and development of Korean ASR technology.

In the data-driven era ahead, AI has vast room to develop, and data remains a core factor in unlocking its full potential. Richer datasets and more advanced annotation technology can help drive AI breakthroughs across industries. If you have data requirements, please contact Nexdata.ai at info@nexdata.ai.
