155 Hours - Lip Sync Multimodal Video Data

Lip Language Video Data
Lip Sync Data
Multimodal Video Data
Video Data

Speech and matching lip-movement video of 249 speakers, recorded simultaneously on multiple devices and aligned with high accuracy using a pulse signal. It can be used for research on multimodal learning algorithms in the speech and image fields.

Paid Datasets
This is a paid dataset for commercial use, research purposes, and more. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Format
Video: mp4 format, 1,280*720; Audio: wav format, 16 kHz, 16-bit, mono
Recording Environment
A quiet, sunny room used to simulate daytime outdoor driving scenes; signal-to-noise ratio 20~25 dB
Recording Scenes
Divided into main scenes and sub-scenes according to sunlight intensity
Recording Content
Short signals and spoken sentences
Speaker
249 Chinese speakers, gender-balanced
Recording Device
Camera, HD microphone, Audio board
Recording Angle
Video from 6 angles (front face, single side face, looking up, looking down, side face looking down, and side face looking up), with near-field and far-field audio recorded at the same time
Language
Mandarin
Application Scenario
Lip language recognition
Accuracy
Sentence accuracy is no lower than 95%
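
The audio format listed under Format can be checked on a delivered file before any further processing. The following is a minimal sketch, not part of the dataset tooling: it assumes the listed sampling rate means 16 kHz, and the function name `check_wav_format` and the constants are hypothetical names chosen for illustration. It uses only the Python standard library `wave` module.

```python
import wave

# Assumed values read from the Format entry (sampling rate taken as 16 kHz).
EXPECTED_SAMPLE_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV header and the expected format.

    An empty list means the header matches the assumed specification.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_SAMPLE_RATE_HZ:
            problems.append(f"sample rate {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"{wav.getnchannels()} channels")
    return problems
```

Calling `check_wav_format("some_clip.wav")` (a hypothetical file name) returns an empty list when the header matches. The check reads header fields only and says nothing about the audio content or the signal-to-noise ratio.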