Standardized representation of the LIDC annotations using DICOM
- Subject Areas: Bioinformatics, Oncology, Radiology and Medical Imaging
- Keywords: data descriptor, cancer imaging, imaging informatics, DICOM, medical image computing, data sharing
- © 2019 Fedorov et al.
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2019. Standardized representation of the LIDC annotations using DICOM. PeerJ Preprints 7:e27378v2 https://doi.org/10.7287/peerj.preprints.27378v2
The Lung Image Database Consortium and Image Database Resource Initiative (LIDC) conducted a multi-site reader study that produced a comprehensive database of Computed Tomography (CT) scans for over 1000 subjects, annotated by multiple expert readers. The result is hosted in the LIDC-IDRI collection of The Cancer Imaging Archive (TCIA). The annotations that accompany the images of the collection are stored in a project-specific XML representation. This complicates their reuse, since no general-purpose tools are available to visualize or query those objects, and it makes harmonization with other, similar types of data non-trivial. To make the LIDC dataset more FAIR (Findable, Accessible, Interoperable, Reusable) for the research community, we prepared a standardized representation of the annotations using the Digital Imaging and Communications in Medicine (DICOM) standard. This manuscript is intended to serve as a companion to the dataset to facilitate its reuse.
The manuscript describes a public DICOM dataset of the annotations and measurements for lung Computed Tomography (CT) images collected by the LIDC-IDRI project. Compared to the initial version, this version provides more extensive usage instructions and is accompanied by a Jupyter Notebook that illustrates how to use the dataset. The underlying dataset has also been updated as follows:
* DICOM Segmentation objects no longer encode empty slices, which reduces object size
* the coded terms used to describe the nodule annotations now use fewer non-standard (99QIICR) codes
* the SegmentLabel attribute in the DICOM SEG objects now lists the nodule annotation name instead of "Nodule", to improve readability for the user
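
As a rough illustration of the SegmentLabel change, the following minimal Python sketch lists the segment labels of a DICOM SEG object. It is not taken from the accompanying Jupyter Notebook. The helper names `segment_labels` and `read_segment_labels` are hypothetical, the label text in the demo is made up, and reading a real file assumes the third-party pydicom package is installed.

```python
from types import SimpleNamespace


def segment_labels(ds):
    """Return {SegmentNumber: SegmentLabel} for a DICOM SEG dataset.

    `ds` can be any object exposing a `SegmentSequence` attribute,
    for example the dataset returned by pydicom.dcmread().
    """
    return {int(seg.SegmentNumber): str(seg.SegmentLabel)
            for seg in ds.SegmentSequence}


def read_segment_labels(path):
    """Read a SEG file from disk (assumes pydicom) and list its labels."""
    import pydicom  # imported lazily; only needed when reading real files

    # Pixel data is not needed to inspect the segment descriptions.
    ds = pydicom.dcmread(path, stop_before_pixels=True)
    return segment_labels(ds)


# Stand-in for a parsed SEG object; the label text is invented for illustration
# and does not reproduce the naming used in the actual dataset.
demo = SimpleNamespace(SegmentSequence=[
    SimpleNamespace(SegmentNumber=1, SegmentLabel="example annotation name"),
])
print(segment_labels(demo))
```

With a file downloaded from the collection, `read_segment_labels("path/to/seg.dcm")` (a placeholder path) would return the same kind of mapping, one entry per segment.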