TY - CHAP
T1 - Transfer Learning of Deep Spatiotemporal Networks to Model Arbitrarily Long Videos of Seizures
AU - Pérez-García, Fernando
AU - Scott, Catherine
AU - Sparks, Rachel
AU - Diehl, Beate
AU - Ourselin, Sébastien
N1 - Funding Information:
This work is supported by the Engineering and Physical Sciences Research Council (EPSRC) [EP/R512400/1]. This work is additionally supported by the EPSRC-funded UCL Centre for Doctoral Training in Intelligent, Integrated Imaging in Healthcare (i4health) [EP/S021930/1] and the Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS, UCL) [203145Z/16/Z]. The data acquisition was supported by the National Institute of Neurological Disorders and Stroke [U01-NS090407]. This publication represents, in part, independent research commissioned by the Wellcome Innovator Award [218380/Z/19/Z]. The views expressed in this publication are those of the authors and not necessarily those of the Wellcome Trust. The weights for the 2D and 3D models were downloaded from TorchVision and https://github.com/moabitcoin/ig65m-pytorch, respectively.
Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Detailed analysis of seizure semiology, the symptoms and signs which occur during a seizure, is critical for management of epilepsy patients. Inter-rater reliability using qualitative visual analysis is often poor for semiological features. Therefore, automatic and quantitative analysis of video-recorded seizures is needed for objective assessment. We present GESTURES, a novel architecture combining convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to learn deep representations of arbitrarily long videos of epileptic seizures. We use a spatiotemporal CNN (STCNN) pre-trained on large human action recognition (HAR) datasets to extract features from short snippets (≈ 0.5 s) sampled from seizure videos. We then train an RNN to learn seizure-level representations from the sequence of features. We curated a dataset of seizure videos from 68 patients and evaluated GESTURES on its ability to classify seizures into focal onset seizures (FOSs) (N = 106) vs. focal to bilateral tonic-clonic seizures (TCSs) (N = 77), obtaining an accuracy of 98.9% using bidirectional long short-term memory (BLSTM) units. We demonstrate that an STCNN trained on a HAR dataset can be used in combination with an RNN to accurately represent arbitrarily long videos of seizures. GESTURES can provide accurate seizure classification by modeling sequences of semiologies. The code, models and features dataset are available at https://github.com/fepegar/gestures-miccai-2021.
KW - Epilepsy video-telemetry
KW - Temporal segment networks
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85116450010&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-87240-3_32
DO - 10.1007/978-3-030-87240-3_32
M3 - Conference paper
AN - SCOPUS:85116450010
SN - 9783030872397
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 334
EP - 344
BT - Medical Image Computing and Computer Assisted Intervention – MICCAI 2021 - 24th International Conference, Proceedings
A2 - de Bruijne, Marleen
A2 - Cattin, Philippe C.
A2 - Cotin, Stéphane
A2 - Padoy, Nicolas
A2 - Speidel, Stefanie
A2 - Zheng, Yefeng
A2 - Essert, Caroline
PB - Springer Science and Business Media Deutschland GmbH
T2 - 24th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2021
Y2 - 27 September 2021 through 1 October 2021
ER -