TY - JOUR
T1 - Using parenclitic networks on phaeochromocytoma and paraganglioma tumours provides novel insights on global DNA methylation
AU - Brempou, Dimitria
AU - Montibus, Bertille
AU - Izatt, Louise
AU - Andoniadou, Cynthia
AU - Oakey, Rebecca
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/12/2
Y1 - 2024/12/2
AB - Despite the prevalence of sequencing data in biomedical research, the methylome remains underrepresented. Given the importance of DNA methylation in gene regulation and disease, it is crucial to address the need for reliable differential methylation methods. This work presents a novel, transferable approach for extracting information from DNA methylation data. Our agnostic, graph-based pipeline overcomes the limitations of commonly used differential methylation techniques and addresses the “small n, big k” problem. Phaeochromocytoma and paraganglioma (PPGL) tumours with known genetic aetiologies experience extreme hypermethylation genome-wide. To highlight the effectiveness of our method in candidate discovery, we present the first phenotypic classifier of PPGLs based on DNA methylation, achieving 0.7 ROC-AUC. Each sample is represented by an optimised parenclitic network, a graph representing the deviation of the sample’s DNA methylation from the expected non-aggressive patterns. By extracting meaningful topological features, the dimensionality and, hence, the risk of overfitting are reduced, and the samples can be classified effectively. By using an explainable classification method, in this case logistic regression, the key CG loci influencing the decision can be identified. Our work provides insights into the molecular signature of aggressive PPGLs and we propose candidates for further research. Our optimised parenclitic network implementation improves the potential utility of DNA methylation data and offers an effective and complete pipeline for studying such datasets.
UR - http://www.scopus.com/inward/record.url?scp=85211080285&partnerID=8YFLogxK
U2 - 10.1038/s41598-024-81486-9
DO - 10.1038/s41598-024-81486-9
M3 - Article
SN - 2045-2322
VL - 14
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 29958
ER -