1 Million Pairs Image Caption Data Of General Scenes

Text description

multi-modality

general scene data set

English caption

Chinese caption

This dataset contains 1 million image-description pairs. The images cover various categories, including landscapes, animals, flowers and trees, people, cars, sports, industry, and architecture, along with an aesthetic subset. The descriptions cover the overall scene of the image, the details within the scene, and the emotions conveyed by the image, and are provided in both English and Chinese.

This is a paid dataset for commercial use, research purposes, and more. Licensed ready-made datasets help jump-start AI projects.

Specifications

Data size: 1 million pairs of images and descriptions
Image type: covers landscapes, animals, flowers and trees, people, cars, sports, industry, and architecture
Data format: image format is .jpg, text format is .txt
Text length: in principle, the description should be no less than 200 Chinese characters
Main description content: overall scene of the picture, detailed description of the elements within the scene, and the emotions conveyed by the picture
Accuracy rate: the proportion of correctly labeled images is not less than 95%
Image resolution: no less than 2 million pixels, most of them are higher than 5 million pixels
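
The specifications above (format, text length, and resolution) can be spot-checked on delivery. The following is a minimal Python sketch, not the vendor's tooling. It assumes each .jpg image has its description in a .txt file with the same base name in the same folder; the listing does not state the actual file layout, so that pairing rule and all function names here are hypothetical.

```python
from pathlib import Path

# Thresholds taken from the specifications in this listing.
MIN_PIXELS = 2_000_000     # "no less than 2 million pixels"
MIN_CAPTION_CHARS = 200    # "no less than 200 Chinese characters" (stated "in principle")


def pair_images_with_captions(root):
    """Pair each .jpg with a same-named .txt file (assumed layout).

    Returns (pairs, unpaired): pairs maps the relative path without
    extension to (jpg_path, txt_path); unpaired lists images lacking a caption.
    """
    root = Path(root)
    pairs, unpaired = {}, []
    for jpg in sorted(root.rglob("*.jpg")):
        txt = jpg.with_suffix(".txt")
        if txt.is_file():
            key = jpg.relative_to(root).with_suffix("").as_posix()
            pairs[key] = (jpg, txt)
        else:
            unpaired.append(jpg)
    return pairs, unpaired


def count_cjk_chars(text):
    """Count characters in the basic CJK Unified Ideographs block."""
    return sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")


def meets_text_length(text, minimum=MIN_CAPTION_CHARS):
    """True if the description has at least `minimum` Chinese characters."""
    return count_cjk_chars(text) >= minimum


def meets_resolution(width, height, minimum=MIN_PIXELS):
    """True if width * height reaches the stated pixel floor."""
    return width * height >= minimum
```

Image dimensions are passed in as plain integers so the sketch stays dependency-free; in practice they would come from whatever image library is already in use. Because the text-length figure is given only "in principle", a failed length check is better treated as a flag for review than as a hard rejection.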

Recommended Dataset

7 Million Sets - High-Quality Video Caption Dataset

7 million high-quality videos, all genuine works released by photographers around the world. 6 million of them are described in English and 1 million in Chinese. They cover a variety of categories such as people, landscapes, and animals. The resolution is above 1080p.

multimodal video description caption LLM dataset

300 million pairs of high-quality image-caption dataset

300 million images, each corresponding to a description. All are genuine image works published by photographers. The vast majority of descriptions are in English, with very few in Chinese.

multimodal image description

700,000 Sets Image Caption Data Of General Scenes

700,000 sets of images and descriptions. The types of pictures include landscapes, animals, flowers and trees, people, cars, sports, industries, and buildings, along with an aesthetic subset. Each image has no fewer than two descriptions of one sentence each, except for a small number of images that have only one description. The description languages are English and Chinese.

Text description multi-modality general scene data set English caption Chinese caption

10,000 Image Caption Data of Diverse Scenes

10,000 image caption data entries covering diverse scenes, including natural scenes, urban street scenes, exhibitions, family environments, and other scenes. The images were shot with different brands of cameras across multiple time periods and shooting angles. The descriptions are in English and mainly describe the main scenes in the image, usually including the foreground and background.

multi-modality natural scene data set scene information data
