Please use this identifier to cite or link to this item: https://ninho.inca.gov.br/jspui/handle/123456789/12761
Title: A data science approach for the identification of molecular signatures of aggressive cancers
Authors: Silva, Adriano Barbosa
Magalhães, Milena
Silva, Gilberto Ferreira da
Silva, Fabricio Alves Barbosa da
Carneiro, Flávia Raquel Gonçalves
Carels, Nicolas
Affiliations: Center for Medical Statistics, Informatics and Intelligent Systems, Institute for Artificial Intelligence, Medical University of Vienna
Centre for Translational Bioinformatics, William Harvey Research Institute, Queen Mary University of London
ITTM S.A.—Information Technology for Translational Medicine
Plataforma de Modelagem de Sistemas Biológicos, Center for Technology Development in Health (CDTS), Oswaldo Cruz Foundation (FIOCRUZ)
Laboratório de Modelagem Computacional de Sistemas Biológicos, Scientific Computing Program, Oswaldo Cruz Foundation (FIOCRUZ)
Center for Technology Development in Health (CDTS), Oswaldo Cruz Foundation (FIOCRUZ)
Laboratório Interdisciplinar de Pesquisas Médicas, Instituto Oswaldo Cruz, Oswaldo Cruz Foundation (FIOCRUZ)
Program of Immunology and Tumor Biology, Brazilian National Cancer Institute (INCA)
Keywords: Neoplasias
Neoplasms
Tipagem Molecular
Molecular Typing
Tipificación Molecular
Aprendizado de Máquina
Machine Learning
Aprendizaje Automático
Issue Date: 2022
Publisher: Cancers
Citation: SILVA, Adriano Barbosa; MAGALHÃES, Milena; SILVA, Gilberto Ferreira da; SILVA, Fabricio Alves Barbosa da; CARNEIRO, Flávia Raquel Gonçalves; CARELS, Nicolas. A data science approach for the identification of molecular signatures of aggressive cancers. Cancers, Switzerland, v. 14, n. 9, p. 2325, May 2022. DOI: 10.3390/cancers14092325.
Abstract: The main hallmarks of cancer include sustaining proliferative signaling and resisting cell death. We analyzed the genes of the WNT pathway and seven cross-linked pathways that may explain the differences in aggressiveness among cancer types. We divided six cancer types (liver, lung, stomach, kidney, prostate, and thyroid) into classes of high (H) and low (L) aggressiveness based on TCGA data and on the correlation between Shannon entropy and 5-year overall survival (OS). Then, we used principal component analysis (PCA), a random forest classifier (RFC), and protein-protein interactions (PPI) to find the genes that correlated with aggressiveness. Using PCA, we found GRB2, CTNNB1, SKP1, CSNK2A1, PRKDC, HDAC1, YWHAZ, YWHAB, and PSMD2. The RFC analysis yielded a different list that shared only PSMD2 with the PCA list: CAD, PSMD14, APH1A, PSMD2, SHC1, TMEFF2, PSMD11, H2AFZ, PSMB5, and NOTCH1. The two methods use different algorithmic approaches and serve different purposes, which explains the discrepancy between the two gene lists. The key genes of aggressiveness found by PCA were those that maximized the separation of the H and L classes along the third principal component, which represented 19% of the total variance. By contrast, RFC classified whether the RNA-seq profile of a tumor sample was of the H or L type. Interestingly, PPIs showed that the genes of the PCA and RFC lists were connected neighbors in the PPI signaling network of WNT and cross-linked pathways.
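Illustrative note: the abstract relates Shannon entropy to 5-year overall survival across cancer types. This record does not say how the entropy was computed or which correlation measure was used, so the following is only a minimal sketch under stated assumptions: entropy is taken over a plain discrete distribution given as counts, and the relationship is measured with a Pearson coefficient. The function names `shannon_entropy` and `pearson_r` are hypothetical and are not taken from the paper.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy, in bits, of a discrete distribution given as raw counts.

    Assumption: the distribution is a simple list of non-negative counts;
    the paper's actual input to the entropy calculation is not stated here.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    # Zero counts contribute nothing to the entropy, so they are skipped.
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

In use, one would compute one entropy value per cancer type and pass the list of entropies together with the matching list of 5-year OS rates to `pearson_r`; a strongly negative coefficient would indicate that higher entropy goes with lower survival.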
Description: v. 14, n. 9, 2022, p. 2325
URI: https://ninho.inca.gov.br/jspui/handle/123456789/12761
ISSN: 2072-6694
Appears in Collections: Artigos de Periódicos da Pesquisa Experimental e Translacional


