Format	JPG images with JSON label
License	CDLA-Permissive
Domain	Computer Vision
Number of Records	340,000 images
Size	102 GB
Origin	Images of research papers from PubMed and annotations from IBM Research Australia.
Dataset Version Update	Version 1 – August 07, 2019
Dataset Coverage	The dataset contains images of research papers from the medical domain.
Business Use Case	Document Understanding: The dataset can be used to train a model that extracts document elements such as tables, figures, and text. This can help businesses that handle large volumes of documents categorize the elements in those documents.

Feature	Description
images	JSON field containing a list of images and their metadata (size, ID, name)
annotations	Each object instance annotation contains a series of fields, including the category id and segmentation mask of the object.
annotations -> segmentations	Contains the polygon coordinates of the segmentation mask for the specific class instance (table, list, text, etc.).
annotations -> bbox	Contains the bounding box coordinates for the specific class instance (table, list, text, etc.).
annotations -> is_crowd	Indicates whether the class instance is a single object (is_crowd=0) or multiple objects (is_crowd=1). This dataset contains only single objects, so this field is always set to 0.
annotations -> category_id	The class label for the current class instance. It indicates what the current bbox/segmentation mask encloses (table, list, text, etc.).
categories	JSON field containing a list of classes and their metadata (ID, name). This dataset has 5 categories (with corresponding "ids"): text ("1"), title ("2"), list ("3"), table ("4"), figure ("5").
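
The fields above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: it assumes the label JSON follows a COCO-style layout in which each annotation carries an `image_id` that links it to an entry in `images`, and in which `bbox` is `[x, y, width, height]`. Neither detail is stated in the table above, so check them against the actual label file. The sample record, the file name `page_0001.jpg`, and the helper names `index_labels` and `count_classes` are made up for illustration.

```python
import json
from collections import Counter, defaultdict

# Tiny in-memory stand-in for a label file. All values are invented for
# illustration; "image_id" and the [x, y, width, height] bbox layout are
# assumptions based on the COCO convention, not taken from the dataset card.
SAMPLE_LABELS = json.loads("""
{
  "images": [
    {"id": 1, "file_name": "page_0001.jpg", "width": 612, "height": 792}
  ],
  "annotations": [
    {"image_id": 1, "category_id": 1, "bbox": [50, 60, 200, 40], "is_crowd": 0},
    {"image_id": 1, "category_id": 4, "bbox": [50, 120, 300, 150], "is_crowd": 0}
  ],
  "categories": [
    {"id": 1, "name": "text"},
    {"id": 2, "name": "title"},
    {"id": 3, "name": "list"},
    {"id": 4, "name": "table"},
    {"id": 5, "name": "figure"}
  ]
}
""")


def index_labels(labels):
    """Return (category id -> name, image id -> list of annotations)."""
    # int() tolerates ids stored either as numbers or as numeric strings.
    id_to_name = {int(c["id"]): c["name"] for c in labels["categories"]}
    by_image = defaultdict(list)
    for ann in labels["annotations"]:
        by_image[ann["image_id"]].append(ann)
    return id_to_name, dict(by_image)


def count_classes(labels):
    """Count how many annotated instances each class has."""
    id_to_name, _ = index_labels(labels)
    return Counter(
        id_to_name[int(ann["category_id"])] for ann in labels["annotations"]
    )


if __name__ == "__main__":
    id_to_name, by_image = index_labels(SAMPLE_LABELS)
    for image in SAMPLE_LABELS["images"]:
        for ann in by_image.get(image["id"], []):
            print(image["file_name"], id_to_name[int(ann["category_id"])], ann["bbox"])
    print(count_classes(SAMPLE_LABELS))
```

For a real label file, replace `SAMPLE_LABELS` with the result of `json.load` on the opened file. A full label file for 340,000 images is likely to be large, so loading it may need a substantial amount of memory.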