Chinese-English Parallel Corpus Dataset (80,120,000 Sentence Pairs) – Translation & NLP

Chinese English parallel corpus
Chinese English translation dataset
Chinese English machine translation data
Chinese English bilingual corpus
Chinese English parallel dataset
Chinese English text dataset

This dataset contains about 80.12 million Chinese-English parallel sentence pairs covering domains such as travel, medicine, daily conversation, and TV scripts. It is stored in TXT format and has been cleaned, desensitized, and quality-checked. It can serve as a foundational dataset for machine translation, bilingual NLP tasks, and other text processing applications.

Paid Datasets
This is a paid dataset for commercial use, research, and other purposes. Licensed, ready-made datasets help jump-start AI projects.
Specifications
Storage format: TXT
Data content: Chinese-English parallel corpus data
Data size: 80.12 million Chinese-English sentence pairs
Language: Chinese, English
Application scenario: Machine translation
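
For readers planning a machine translation pipeline, the following is a minimal sketch of how sentence pairs might be read from a TXT file. The file layout is not described on this page, so the sketch assumes one pair per line, with the Chinese sentence first, a single tab, and then the English sentence. The function names `parse_pairs` and `load_pairs` are hypothetical and are not part of any tooling shipped with the dataset.

```python
from typing import Iterable, Iterator, Tuple


def parse_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (chinese, english) tuples from tab-separated lines.

    ASSUMPTION: one pair per line, Chinese first, one tab, then English.
    The real layout of the delivered files may differ.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip lines that do not match the assumed layout
        zh, en = parts[0].strip(), parts[1].strip()
        if zh and en:
            yield zh, en


def load_pairs(path: str) -> Iterator[Tuple[str, str]]:
    """Stream pairs from a UTF-8 text file without loading it all into memory."""
    with open(path, encoding="utf-8") as f:
        yield from parse_pairs(f)
```

Streaming with a generator keeps memory use flat, which matters for a corpus of this size. If the delivered files use a different delimiter or keep the two languages in separate files, only `parse_pairs` would need to change.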