Artificial Intelligence: Transforming Cybersecurity and Data Protection

From: Nexdata Date: 2024-08-14

Table of Contents
Challenges in Indonesian speech recognition
Indonesian speech recognition data
Indonesian Conversational Speech Data

➤ Challenges in Indonesian speech recognition

Datasets play a vital role in building an intelligent future. From autonomous vehicles to smart security systems, high-quality datasets provide AI models with massive amounts of learning material, making them more adaptable to varied real-world scenarios. By continuously improving the efficiency of data collection and annotation, companies and researchers can accelerate the deployment of AI technology and help industries achieve digital transformation.

Indonesian is one of the most widely spoken languages globally, with over 270 million speakers spread across the archipelago. As technology becomes increasingly integrated into everyday life, it is crucial to enable Indonesian speakers to communicate with and command devices using their native language. However, developing a robust speech recognition system for Indonesian presents unique challenges due to its phonological complexity and rich morphological structure.

Training data is the backbone of any machine learning model, and speech recognition systems are no exception. High-quality training data plays a pivotal role in the accuracy and performance of these systems. In the case of Indonesian speech recognition, having a diverse and extensive dataset of spoken language is essential. This dataset should encompass a wide range of accents, dialects, and speaking styles to ensure the model's ability to adapt to variations in natural speech.

➤ Indonesian speech recognition data

Obtaining sufficient and accurate training data for Indonesian speech recognition is not without challenges. Firstly, the vast linguistic diversity across Indonesia means that the dataset must capture the nuances of various regional accents and linguistic variations. Secondly, privacy concerns and ethical considerations require developers to anonymize and secure the data while complying with data protection regulations.

Indonesian Speech Datasets

359 Hours-Indonesian Speech Data by Mobile Phone

Indonesian speech data (reading) was collected from 496 native Indonesian speakers and recorded in a quiet environment. The recordings are rich in content, covering multiple categories such as economics, entertainment, news, figures, letters, and oral content. Each speaker recorded around 400 sentences. The valid data volume is 360 hours. All texts are manually transcribed with high accuracy.

496 People – Indonesian Speech Data by Mobile Phone_Guiding

Indonesian speech data (guiding) was collected from 496 native Indonesian speakers and recorded in a quiet environment. The recordings are rich in content, covering multiple categories such as in-car scenes, smart home, and speech assistants. Each speaker recorded 50 sentences. The valid data volume is 10.5 hours. All texts are manually transcribed with high accuracy.
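As a rough sanity check on the figures quoted for the reading and guiding sets, the average sentence length can be estimated from the stated hours, speaker counts, and sentences per speaker. This is only a sketch: the function name is hypothetical, and the reading set's "around 400 sentences" is approximate, so its result is an estimate rather than a published statistic.

```python
def average_seconds_per_sentence(total_hours, speakers, sentences_per_speaker):
    """Estimate mean sentence duration from dataset-level totals."""
    total_sentences = speakers * sentences_per_speaker
    return total_hours * 3600 / total_sentences


# Figures taken from the dataset descriptions above.
reading = average_seconds_per_sentence(360, 496, 400)   # "around 400" sentences each
guiding = average_seconds_per_sentence(10.5, 496, 50)

print(f"reading: {reading:.1f} s per sentence")
print(f"guiding: {guiding:.1f} s per sentence")
```

Under these assumptions the reading set works out to roughly 6.5 seconds per sentence and the guiding set to roughly 1.5 seconds, which is consistent with read sentences versus short device commands.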

639 Hours - Indonesian Speech Data by Mobile Phone

1,285 native Indonesian speakers with authentic accents participated in the recording. The recording script was designed by linguists and covers a wide range of topics, including generic, interactive, on-board, and home scenarios. The text is manually proofread with high accuracy. The data matches mainstream Android and Apple phones. The dataset can be applied to automatic speech recognition and machine translation scenarios.

➤ Indonesian Conversational Speech Data

108 Hours - Indonesian Conversational Speech Data by Mobile Phone

The 108 Hours - Indonesian conversational speech data collected by phone involved 140 native speakers, with a balanced gender ratio. Speakers chose a few familiar topics from a given list and started conversations, which keeps the dialogues fluent and natural. The recording devices were various mobile phones. The audio format is 16 kHz, 16-bit, uncompressed WAV, and all the speech data was recorded in quiet indoor environments. All the speech audio was manually transcribed with the text content, the start and end time of each effective sentence, and speaker identification.
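Before training on audio like this, it can be useful to confirm that each file really matches the stated 16 kHz, 16-bit format. The sketch below uses Python's standard `wave` module, which reads uncompressed PCM WAV files. The helper names are hypothetical, and the synthetic one-second mono file stands in for a real recording; the dataset description does not state the channel count, so mono is an assumption here.

```python
import io
import wave


def describe_wav(fileobj):
    """Read basic format fields from a PCM WAV file or file-like object."""
    with wave.open(fileobj, "rb") as wav:
        return {
            "sample_rate_hz": wav.getframerate(),
            "bit_depth": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def matches_spec(info, sample_rate_hz=16000, bit_depth=16):
    """Check the fields against the format quoted in the description."""
    return info["sample_rate_hz"] == sample_rate_hz and info["bit_depth"] == bit_depth


def make_silent_wav(seconds=1, sample_rate_hz=16000):
    """Build an in-memory mono 16-bit WAV of silence (stand-in for a real file)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)              # mono is an assumption
        wav.setsampwidth(2)              # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * (sample_rate_hz * seconds))
    buf.seek(0)
    return buf


info = describe_wav(make_silent_wav())
print(info, matches_spec(info))
```

For real data, `describe_wav` can be given a file path instead of the in-memory buffer.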

89 Hours - Indonesian Conversational Speech Data by Telephone

The 89 Hours - Indonesian conversational speech data collected by telephone involved 124 native speakers, with a balanced gender ratio. Speakers chose a few familiar topics from a given list and started conversations, which keeps the dialogues fluent and natural. The recording devices were various mobile phones. The audio format is 8 kHz, 8-bit, u-law PCM, and all the speech data was recorded in quiet indoor environments. All the speech audio was manually transcribed with the text content, the start and end time of each effective sentence, and speaker identification.
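Most speech toolkits expect linear PCM, so 8-bit u-law audio like this is usually expanded to 16-bit samples first. The sketch below implements the standard G.711 u-law expansion in pure Python. The function names are hypothetical, and how the dataset's files are packaged (raw bytes or a container with headers) is not stated in the description, so the input here is assumed to be raw u-law bytes.

```python
ULAW_BIAS = 0x84  # bias constant used by G.711 u-law (132)


def ulaw_to_linear(byte):
    """Expand one 8-bit u-law code into a signed linear PCM sample."""
    u = ~byte & 0xFF                 # u-law codes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07       # 3-bit segment number
    mantissa = u & 0x0F              # 4-bit step within the segment
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude


def decode_ulaw(data):
    """Decode raw u-law bytes into a list of linear PCM samples."""
    return [ulaw_to_linear(b) for b in data]


# 0xFF encodes silence; 0x80 and 0x00 are the positive and negative extremes.
samples = decode_ulaw(bytes([0xFF, 0x80, 0x00]))
print(samples)
```

The decoded values fit in a signed 16-bit range, so they can be written out as 16-bit PCM for tools that do not read u-law directly.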

Depending on the application scenario, developers need to customize data collection and annotation. For example, autonomous driving needs fine-grained street-view annotation, and medical image analysis requires super-resolution professional images. As technology and the real world become more closely integrated, high-quality datasets will continue to play a vital role in the development of artificial intelligence.
