105 Hours - Italian(Italy) Gaming Real-world Casual Conversation and Monologue speech dataset

Italian

Spontaneous Dialogue

Gaming

The Italian (Italy) Gaming Real-world Casual Conversation and Monologue speech dataset covers spontaneous dialogue about popular and evergreen games, including player discussions of battle strategies, social interactions, and esports news, and mirrors real-world interactions. Recordings are transcribed and annotated with text content, speaker ID, gender, accent, offensive-expression labels, and other attributes. The data was collected from a large, geographically diverse pool of speakers to improve model performance on real-world, complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and use. All our datasets comply with GDPR, CCPA, and PIPL.

This is a paid dataset for commercial use, research, and more. Licensed, ready-made datasets help jump-start AI projects.

Recommended Datasets

200 Hours - Portuguese(Brazil) Financial Entities Real-world Casual Conversation and Monologue speech dataset

The Portuguese (Brazil) Financial Entities Real-world Casual Conversation and Monologue speech dataset covers a range of professional financial terminology, focusing primarily on macroeconomics and microeconomics, and mirrors real-world interactions. Recordings are transcribed and annotated with text content, speaker ID, gender, and other attributes. The data was collected from a large, geographically diverse pool of speakers to improve model performance on real-world, complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and use. All our datasets comply with GDPR, CCPA, and PIPL.

Brazilian Portuguese Spontaneous Dialogue Financial

203 Hours - German(Germany) Financial Entities Real-world Casual Conversation and Monologue speech dataset

The German (Germany) Financial Entities Real-world Casual Conversation and Monologue speech dataset covers a range of professional financial terminology, focusing primarily on macroeconomics and microeconomics, and mirrors real-world interactions. Recordings are transcribed and annotated with text content, speaker ID, gender, common entities, and other attributes. The data was collected from a large, geographically diverse pool of speakers to improve model performance on real-world, complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and use. All our datasets comply with GDPR, CCPA, and PIPL.

German Entity Spontaneous Dialogue Financial

300 Hours - English(India) Spontaneous Dialogue Smartphone speech dataset

The English (India) Spontaneous Dialogue Smartphone speech dataset was collected from dialogues on given topics. Recordings are transcribed and annotated with text content, timestamps, speaker ID, gender, and other attributes. The data was collected from a large, geographically diverse pool of speakers (390 native speakers) to improve model performance on real-world, complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and use. All our datasets comply with GDPR, CCPA, and PIPL.

India English Accent English Dialogue

206 Hours - English Financial Entities Real-world Casual Conversation and Monologue speech dataset

The English Financial Entities Real-world Casual Conversation and Monologue speech dataset covers a range of professional financial terminology, focusing primarily on macroeconomics and microeconomics, and mirrors real-world interactions. Recordings are transcribed and annotated with text content, speaker ID, gender, common entities, and other attributes. The data was collected from a large, geographically diverse pool of speakers to improve model performance on real-world, complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and use. All our datasets comply with GDPR, CCPA, and PIPL.

English Entity Spontaneous Dialogue Financial

198 Hours - Spanish Gaming Real-world Casual Conversation and Monologue speech dataset

The Spanish Gaming Real-world Casual Conversation and Monologue speech dataset covers spontaneous dialogue about popular and evergreen games, including player discussions of battle strategies, social interactions, and esports news, and mirrors real-world interactions. Recordings are transcribed and annotated with text content, speaker ID, gender, accent, offensive-expression labels, and other attributes. The data was collected from a large, geographically diverse pool of speakers to improve model performance on real-world, complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and use. All our datasets comply with GDPR, CCPA, and PIPL.

Spanish Spontaneous Dialogue Gaming

Korean Medical Speech Dataset – 203 Hours of Clinical Conversations

This Korean Medical Speech Dataset contains 203 hours of real-world audio, including casual conversations and monologues. It spans a wide range of healthcare-related content such as medical consultations, academic lectures, training sessions, and clinical discussions. The dataset includes detailed annotations: transcripts, speaker ID, gender, and tagged medical entities. It is designed for use in ASR, medical NLU, speech-based healthcare assistants, and AI model fine-tuning for domain-specific speech recognition. The recordings were collected from a geographically diverse speaker base and validated by multiple AI companies. All data complies with GDPR, CCPA, and PIPL regulations.

Korean medical speech dataset, healthcare audio data, medical voice dataset, Korean clinical conversation dataset, domain-specific ASR, medical transcription, Korean doctor-patient audio, Korean medical chatbot dataset

Korean Financial Speech Dataset – 215 Hours of Real-World Audio

This Korean Financial Speech Dataset contains 215 hours of real-world audio, including casual conversations and monologues. The content spans professional financial terminology in macroeconomics and microeconomics contexts, simulating authentic banking and financial service interactions. Each recording includes transcriptions, speaker metadata (ID, gender), and tagged financial entities. The dataset supports a wide range of AI applications such as automatic speech recognition (ASR), financial natural language understanding (NLU), voicebot development, and domain-specific language modeling. All data complies with GDPR, CCPA, and PIPL regulations, ensuring privacy and ethical usage.

Korean financial speech dataset, Korean ASR dataset, economics audio corpus, financial audio dataset, Korean business voice data, macroeconomic speech dataset, finance chatbot training data, domain-specific speech dataset, Korean language audio for AI

839 Hours - Romanian(Romania) Real-world Casual Conversation and Monologue speech dataset

The Romanian (Romania) Real-world Casual Conversation and Monologue speech dataset covers self-media, conversation, live broadcasts, variety shows, and other general domains, and mirrors real-world interactions. Recordings are transcribed and annotated with text content, speaker ID, gender, and other attributes. The data was collected from a large, geographically diverse pool of speakers to improve model performance on real-world, complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and use. All our datasets comply with GDPR, CCPA, and PIPL.

Romania Romanian Casual Conversation Monologue ASR