OCR Datasets

Instantly enhance AI model performance with high-quality, off-the-shelf datasets.

Data Type

All (28)
General Scenario (9)
Handwriting (14)
Internet image (4)
Invoice (2)
Others (4)
Test paper (2)
Table (1)

Language

All (28)
Chinese (8)
English (5)
Hindi (2)
Japanese (5)
Korean (5)
Others (19)
Vietnamese (2)

573,264 Images Test-paper & Workbook & Answer-sheet Collection Data

The dataset includes 35,823 test papers, 457,970 workbooks, and 79,471 answer sheets, covering multiple question types, subjects, and grades. The images were collected with cellphones and scanners. The data can be used for tasks such as intelligent scoring and homework guidance. We strictly adhere to data protection regulations and privacy standards, safeguarding user privacy and legal rights throughout data collection, storage, and use. All of our datasets comply with GDPR, CCPA, and PIPL.
Test-paper & Workbook & Answer-sheet primary junior high all subjects multiple types of questions multiple subjects multiple types multiple grades intelligent scoring homework guidance collection data test-paper data answer sheet data

497 Images – English Invoice Data

The images were collected against solid-color backgrounds, and personal information has been desensitized. The dataset covers various types of invoices and can be used for tasks such as bill recognition and text recognition.
OCR bill annotation multiple types of bills

5,147 Images Japanese Handwriting OCR Data

The text is written on A4 paper, lined paper, quadrille paper, etc. The images were captured with cellphones at eye level. The content includes Japanese compositions, poetry, prose, news, stories, etc. Each image is annotated with line-level quadrilateral bounding boxes and text transcriptions. The dataset can be used for tasks such as Japanese handwriting OCR.
Japanese Handwriting OCR line-level annotation line-level text transcription

5,156 Images - Mathematical Formula Handwriting OCR Data

The writing surfaces include A4 paper, square paper, lined paper, whiteboards, etc. The data is diverse, covering multiple writing surfaces, multiple types of mathematical formulas, and multiple photographic angles. The images were captured from looking-up and eye-level angles. The dataset can be used for tasks such as mathematical formula handwriting OCR.
Mathematical formula Handwriting OCR A4 paper square paper lined paper white board looking up angle eye-level angle

1,000 People - German Handwriting OCR Data

The writers are Europeans who frequently write in German. The samples were captured with a scanner at eye level. The content includes addresses, company names, and personal names. The dataset can be used for tasks such as German handwriting OCR.
German Handwriting OCR Europeans scanner eye-level angle

1,000 People - Spanish Handwriting OCR Data

The writers are Europeans who frequently write in Spanish. The samples were captured with a scanner at eye level. The content includes addresses, company names, and personal names. The dataset can be used for tasks such as Spanish handwriting OCR.
Spanish Handwriting OCR Europeans scanner eye-level angle

1,000 People - French Handwriting OCR Data

The writers are Europeans who frequently write in French. The samples were captured with a scanner at eye level. The content includes addresses, company names, and personal names. The dataset can be used for tasks such as French handwriting OCR.
French Handwriting OCR Europeans scanner eye-level angle

14,511 Images English Handwriting OCR Data

The text is written on A4 paper, lined paper, English paper, etc. The images were captured with cellphones at eye level. The content includes English compositions, poetry, prose, news, stories, etc. Each image is annotated with line-level quadrilateral bounding boxes and text transcriptions. The dataset can be used for tasks such as English handwriting OCR.
English Handwriting OCR

5,711 Images Korean Handwriting OCR Data

The text is written on A4 paper, lined paper, quadrille paper, etc. The images were captured with cellphones at eye level. The content includes Korean compositions, poetry, prose, news, stories, etc. Each image is annotated with line-level quadrilateral bounding boxes and text transcriptions. The dataset can be used for tasks such as Korean handwriting OCR.
Korean Handwriting OCR line-level annotation line-level text transcription

Tailor Your Data Now

Why off-the-shelf Datasets

  • Copyright

    Clear Copyright and Ready to Check
  • Security

    Properly Authorized, Secure to Use
  • Professional

    Designed and Produced by AI Data Experts
  • Diversity

    Collected from a Variety of Real Scenes
  • Cost Effective

    More Cost-Efficient Than Tailored Data
  • Efficiency

    Ready to Go, Delivered in Seconds