Abstract
Background: Identifying the complete repertoire of genes that drive cancer in
individual patients is crucial for precision oncology. Most established methods identify
driver genes that are recurrently altered across patient cohorts. However, mapping
these genes back to patients leaves a sizeable fraction with few or no drivers,
hindering our understanding of cancer mechanisms and limiting the choice of
therapeutic interventions.
Results: We present sysSVM2, a machine learning tool that integrates cancer
genetic alterations with gene systems-level properties to predict drivers in individual
patients. Using simulated pan-cancer data, we optimise sysSVM2 for application to
any cancer type. We benchmark its performance on real cancer data and validate its
applicability to a rare cancer type with few known driver genes. We show that drivers
predicted by sysSVM2 have a low false-positive rate, are stable and disrupt well-known cancer-related pathways.
Conclusions: sysSVM2 can be used to identify driver alterations in patients lacking
sufficient canonical drivers or belonging to rare cancer types for which assembling a
large enough cohort is challenging, furthering the goals of precision oncology. As
resources for the community, we provide the code to implement sysSVM2 and the pre-trained models in all TCGA cancer types (https://github.com/ciccalab/sysSVM2).
Original language | English |
---|---|
Publisher | bioRxiv |
DOIs | |
Publication status | Published - 9 Dec 2020 |