TorchIO: A Python library for efficient loading, preprocessing, augmentation and patch-based sampling of medical images in deep learning

Fernando Pérez-García; Rachel Sparks; Sébastien Ourselin

doi:10.1016/j.cmpb.2021.106236

Corresponding author: Fernando Pérez-García

Surgical & Interventional Engineering

Research output: Contribution to journal › Article › peer-review

250 Citations (Scopus)

Abstract

Background and objective: Processing of medical images such as MRI or CT presents different challenges compared to RGB images typically used in computer vision. These include a lack of labels for large datasets, high computational costs, and the need for metadata to describe the physical properties of voxels. Data augmentation is used to artificially increase the size of the training datasets. Training with image subvolumes or patches decreases the need for computational power. Spatial metadata needs to be carefully taken into account in order to ensure a correct alignment and orientation of volumes.

Methods: We present TorchIO, an open-source Python library to enable efficient loading, preprocessing, augmentation and patch-based sampling of medical images for deep learning. TorchIO follows the style of PyTorch and integrates standard medical image processing libraries to efficiently process images during training of neural networks. TorchIO transforms can be easily composed, reproduced, traced and extended. Most transforms can be inverted, making the library suitable for test-time augmentation and estimation of aleatoric uncertainty in the context of segmentation. We provide multiple generic preprocessing and augmentation operations as well as simulation of MRI-specific artifacts.

Results: Source code, comprehensive tutorials and extensive documentation for TorchIO can be found at http://torchio.rtfd.io/. The package can be installed from the Python Package Index (PyPI) by running pip install torchio. It includes a command-line interface which allows users to apply transforms to image files without using Python. Additionally, we provide a graphical user interface within a TorchIO extension in 3D Slicer to visualize the effects of transforms.

Conclusion: TorchIO was developed to help researchers standardize medical image processing pipelines and allow them to focus on the deep learning experiments. It encourages good open-science practices, as it supports experiment reproducibility and is version-controlled so that the software can be cited precisely. Due to its modularity, the library is compatible with other frameworks for deep learning with medical images.
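
The abstract's point that training on patches lowers computational cost can be illustrated with a minimal sketch. It uses only the Python standard library and does not reproduce TorchIO's own API; the function names, the volume shape and the patch size below are hypothetical values chosen for illustration.

```python
import random
from typing import Iterator, Tuple

Shape = Tuple[int, int, int]


def sample_patch_corners(
    volume_shape: Shape,
    patch_size: Shape,
    num_patches: int,
    seed: int = 0,
) -> Iterator[Shape]:
    """Yield the starting (i, j, k) index of uniformly sampled patches."""
    if any(p > v for p, v in zip(patch_size, volume_shape)):
        raise ValueError("patch_size must fit inside volume_shape")
    rng = random.Random(seed)
    for _ in range(num_patches):
        # Each corner is drawn so that the whole patch stays inside the volume.
        yield tuple(rng.randint(0, v - p) for v, p in zip(volume_shape, patch_size))


def voxel_count(shape: Shape) -> int:
    """Return the number of voxels in a 3D array of the given shape."""
    i, j, k = shape
    return i * j * k


volume_shape = (256, 256, 160)  # hypothetical MRI volume, in voxels
patch_size = (64, 64, 64)       # hypothetical training patch, in voxels

corners = list(sample_patch_corners(volume_shape, patch_size, num_patches=4))
# Fraction of the full volume that one patch occupies in memory.
fraction = voxel_count(patch_size) / voxel_count(volume_shape)
```

Under these assumed sizes a single patch holds 1/40 of the voxels of the full volume, which is why patch-based training fits on hardware that cannot hold whole volumes in a batch.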

Original language	English
Article number	106236
Journal	Computer Methods and Programs in Biomedicine
Volume	208
DOIs	https://doi.org/10.1016/j.cmpb.2021.106236
Publication status	Published - Sept 2021

Keywords

Data augmentation
Deep learning
Medical image computing
Preprocessing

Cite this

@article{24d0d87cd00e4a2a839f6785e2c06ebf,
  title = "TorchIO: A Python library for efficient loading, preprocessing, augmentation and patch-based sampling of medical images in deep learning",
  abstract = "Background and objective: Processing of medical images such as MRI or CT presents different challenges compared to RGB images typically used in computer vision. These include a lack of labels for large datasets, high computational costs, and the need for metadata to describe the physical properties of voxels. Data augmentation is used to artificially increase the size of the training datasets. Training with image subvolumes or patches decreases the need for computational power. Spatial metadata needs to be carefully taken into account in order to ensure a correct alignment and orientation of volumes. Methods: We present TorchIO, an open-source Python library to enable efficient loading, preprocessing, augmentation and patch-based sampling of medical images for deep learning. TorchIO follows the style of PyTorch and integrates standard medical image processing libraries to efficiently process images during training of neural networks. TorchIO transforms can be easily composed, reproduced, traced and extended. Most transforms can be inverted, making the library suitable for test-time augmentation and estimation of aleatoric uncertainty in the context of segmentation. We provide multiple generic preprocessing and augmentation operations as well as simulation of MRI-specific artifacts. Results: Source code, comprehensive tutorials and extensive documentation for TorchIO can be found at http://torchio.rtfd.io/. The package can be installed from the Python Package Index (PyPI) by running pip install torchio. It includes a command-line interface which allows users to apply transforms to image files without using Python. Additionally, we provide a graphical user interface within a TorchIO extension in 3D Slicer to visualize the effects of transforms. Conclusion: TorchIO was developed to help researchers standardize medical image processing pipelines and allow them to focus on the deep learning experiments. It encourages good open-science practices, as it supports experiment reproducibility and is version-controlled so that the software can be cited precisely. Due to its modularity, the library is compatible with other frameworks for deep learning with medical images.",
  keywords = "Data augmentation, Deep learning, Medical image computing, Preprocessing",
  author = "Fernando P{\'e}rez-Garc{\'i}a and Rachel Sparks and S{\'e}bastien Ourselin",
  note = "Funding Information: This work is supported by the Engineering and Physical Sciences Research Council (EPSRC) [EP/R512400/1]. This work is additionally supported by the EPSRC-funded UCL Centre for Doctoral Training in Intelligent, Integrated Imaging in Healthcare (i4health) [EP/S021930/1] and the Wellcome / EPSRC Centre for Interventional and Surgical Sciences (WEISS, UCL) [203145Z/16/Z]. This publication represents, in part, independent research commissioned by the Wellcome Innovator Award [218380/Z/19/Z/]. The views expressed in this publication are those of the authors and not necessarily those of the Wellcome Trust. Publisher Copyright: {\textcopyright} 2021 The Author(s) Copyright: Copyright 2021 Elsevier B.V., All rights reserved.",
  year = "2021",
  month = sep,
  doi = "10.1016/j.cmpb.2021.106236",
  language = "English",
  volume = "208",
  journal = "Computer Methods and Programs in Biomedicine",
  issn = "0169-2607",
  publisher = "Elsevier Ireland Ltd",
}

TY - JOUR

T1 - TorchIO

T2 - A Python library for efficient loading, preprocessing, augmentation and patch-based sampling of medical images in deep learning

AU - Pérez-García, Fernando

AU - Sparks, Rachel

AU - Ourselin, Sébastien

N1 - Funding Information: This work is supported by the Engineering and Physical Sciences Research Council (EPSRC) [EP/R512400/1]. This work is additionally supported by the EPSRC-funded UCL Centre for Doctoral Training in Intelligent, Integrated Imaging in Healthcare (i4health) [EP/S021930/1] and the Wellcome / EPSRC Centre for Interventional and Surgical Sciences (WEISS, UCL) [203145Z/16/Z]. This publication represents, in part, independent research commissioned by the Wellcome Innovator Award [218380/Z/19/Z/]. The views expressed in this publication are those of the authors and not necessarily those of the Wellcome Trust. Publisher Copyright: © 2021 The Author(s) Copyright: Copyright 2021 Elsevier B.V., All rights reserved.

PY - 2021/9

Y1 - 2021/9

AB - Background and objective: Processing of medical images such as MRI or CT presents different challenges compared to RGB images typically used in computer vision. These include a lack of labels for large datasets, high computational costs, and the need for metadata to describe the physical properties of voxels. Data augmentation is used to artificially increase the size of the training datasets. Training with image subvolumes or patches decreases the need for computational power. Spatial metadata needs to be carefully taken into account in order to ensure a correct alignment and orientation of volumes. Methods: We present TorchIO, an open-source Python library to enable efficient loading, preprocessing, augmentation and patch-based sampling of medical images for deep learning. TorchIO follows the style of PyTorch and integrates standard medical image processing libraries to efficiently process images during training of neural networks. TorchIO transforms can be easily composed, reproduced, traced and extended. Most transforms can be inverted, making the library suitable for test-time augmentation and estimation of aleatoric uncertainty in the context of segmentation. We provide multiple generic preprocessing and augmentation operations as well as simulation of MRI-specific artifacts. Results: Source code, comprehensive tutorials and extensive documentation for TorchIO can be found at http://torchio.rtfd.io/. The package can be installed from the Python Package Index (PyPI) by running pip install torchio. It includes a command-line interface which allows users to apply transforms to image files without using Python. Additionally, we provide a graphical user interface within a TorchIO extension in 3D Slicer to visualize the effects of transforms. Conclusion: TorchIO was developed to help researchers standardize medical image processing pipelines and allow them to focus on the deep learning experiments. It encourages good open-science practices, as it supports experiment reproducibility and is version-controlled so that the software can be cited precisely. Due to its modularity, the library is compatible with other frameworks for deep learning with medical images.

KW - Data augmentation

KW - Deep learning

KW - Medical image computing

KW - Preprocessing

UR - http://www.scopus.com/inward/record.url?scp=85110998688&partnerID=8YFLogxK

U2 - 10.1016/j.cmpb.2021.106236

DO - 10.1016/j.cmpb.2021.106236

M3 - Article

AN - SCOPUS:85110998688

SN - 0169-2607

VL - 208

JO - Computer Methods and Programs in Biomedicine

JF - Computer Methods and Programs in Biomedicine

M1 - 106236

ER -
