A novel similarity score based on gene ranks to reveal genetic relationships among diseases

Bioinformatics and Genomics


Introduction

Materials and Methods

MAG and SimSIP

Assessment of significance

Results

Simulation study

Parameter setting

Type I error

Power comparison

TCGA data analysis

Significant cancer pairs in TCGA

Genetic similarity among 18 types of cancer

  1. The vertex HNSC has the largest degree, 15, which shows that HNSC is significantly similar to 15 of the other 17 cancers, the exceptions being KICH and KIRC. The degrees of vertices such as BLCA and UCEC reach 14; READ, ESCA, LUSC, and HNSC reach 13; and LUAD and LIHC reach 12. There are close intrinsic genetic relationships among the 18 cancers. The seven higher-degree cancers ESCA, LUSC, HNSC, BLCA, BRCA, STAD, and UCEC (shown as vertices with deeper color and larger size) lie closer to each other and tend to form a pivotal hub of the disease network of 18 cancer types.

  2. Cancers originating from the same organ or tissue tend to co-cluster; for example, the cancer pairs READ and COAD, and KIRP and KIRC, show the strongest similarity. In addition, cancers arising in anatomically proximate organs also tend to group together, such as LIHC and CHOL, which have a highly significant relationship. These findings support the view that tumors arising in physically close organs share similar sources of endoderm development or exposure to a common cancer-causing environmental factor (Sell & Dunsford, 1989). Further, among the three types of kidney tumors, KICH, KIRP, and KIRC, the similarity between KIRP and KIRC is more significant than that between KICH and KIRC or between KICH and KIRP, which may be explained by the fact that KIRC and KIRP are cancers of the proximal tubule segments, whereas KICH is a cancer of the distal tubule segments (Chen et al., 2016; Davis et al., 2014; Lee, Chou & Knepper, 2015). Compared with other cancer pairs from the same tissue, the similarity between the two types of lung tumor, LUSC and LUAD, is less significant, which may reflect their different cells of origin: LUSC originates from squamous epithelial cells in the respiratory tract and alveoli, whereas LUAD originates from a large number of glandular or alveolar cells (Li et al., 2015; Mainardi et al., 2014; Sutherland et al., 2014).

  3. There are significant similarities among gastrointestinal tumors (READ, COAD, STAD, and ESCA), which is consistent with the results of integrative clustering across data types in the miRNA, mRNA, and RPPA platforms (Hoadley et al., 2018). In the disease network (Fig. 4), squamous cell carcinomas (BLCA, ESCA, HNSC, and LUSC) are also co-clustered, and the similarity score of the cancer pair ESCA and LUSC ranks among the top three. Hoadley et al. (2018) reported a similar finding based on a multi-platform dataset (including miRNA, mRNA, RPPA, aneuploidy, and DNA methylation data) in TCGA, and Abrams et al. (2018) also suggested that, regardless of the tissue type of the squamous cell carcinoma, potential similarities could be detected among the transcription factor expression profiles of BLCA, ESCA, HNSC, and LUSC. In addition, the gynecologic tumors UCEC and BRCA are also close to each other in the network, which is consistent with the results of previous studies (Hoadley et al., 2018).

  4. Finally, similar to previous studies (Abrams et al., 2018; Hoadley et al., 2018), three types of cancer, PRAD, THCA, and GBM, are relatively independent in the network: their relationships with each other and with the other cancers are relatively weak. In addition, unlike the squamous cancers, which cluster together, the adenocarcinomas (PRAD, COAD, LUAD, STAD, and READ) appear to be scattered around the edge of the network.
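The vertex degrees quoted in item 1 are simply counts of significant cancer pairs per cancer. The following minimal Python sketch shows that bookkeeping. It is an illustration only, not the authors' code: the function name `vertex_degrees` is hypothetical, and the edge list contains only the HNSC edges that can be reconstructed from item 1 (HNSC is linked to every other cancer except KICH and KIRC), because the full list of significant pairs is not reproduced in this section.

```python
from collections import defaultdict

# The 18 cancer types named in this section (TCGA abbreviations).
CANCERS = [
    "BLCA", "BRCA", "CHOL", "COAD", "ESCA", "GBM", "HNSC", "KICH", "KIRC",
    "KIRP", "LIHC", "LUAD", "LUSC", "PRAD", "READ", "STAD", "THCA", "UCEC",
]


def vertex_degrees(edges):
    """Return {vertex: degree} for an undirected graph given as (a, b) pairs."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # a cancer is not paired with itself
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {vertex: len(adjacent) for vertex, adjacent in neighbors.items()}


# Assumption: only HNSC's edges, as described in item 1. Edges between the
# other cancers are omitted, so only HNSC's degree is meaningful here.
hnsc_edges = [
    ("HNSC", cancer)
    for cancer in CANCERS
    if cancer not in {"HNSC", "KICH", "KIRC"}
]

degrees = vertex_degrees(hnsc_edges)
print(degrees["HNSC"])  # 15
```

With the complete list of significant pairs, the same function would return the degree of every vertex in the network.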

Associated genes with colorectal cancer

Associated signaling pathway with colorectal cancer

Discussion

Conclusions

Supplemental Information

The function of MAG.

DOI: 10.7717/peerj.10576/supp-3

The function of SimSIP.

DOI: 10.7717/peerj.10576/supp-4

Additional Information and Declarations

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Dongmei Luo conceived and designed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.

Chengdong Zhang conceived and designed the experiments, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.

Liwan Fu analyzed the data, prepared figures and/or tables, and approved the final draft.

Yuening Zhang performed the experiments, prepared figures and/or tables, and approved the final draft.

Yue-Qing Hu conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

Gene expression datasets (whose gene expression profiles were determined experimentally using the Illumina HiSeq 2000 RNA Sequencing platform) for 33 types of cancer are available at the UCSC Xena functional genome browser from the TCGA hub (https://tcga.xenahubs.net).

The names of datasets are TCGA Acute Myeloid Leukemia (LAML), TCGA Adrenocortical Cancer (ACC), TCGA Bile Duct Cancer (CHOL), TCGA Bladder Cancer (BLCA), TCGA Breast Cancer (BRCA), TCGA Cervical Cancer (CESC), TCGA Colon Cancer (COAD), TCGA Endometrioid Cancer (UCEC), TCGA Esophageal Cancer (ESCA), TCGA Glioblastoma (GBM), TCGA Head and Neck Cancer (HNSC), TCGA Kidney Chromophobe (KICH), TCGA Kidney Clear Cell Carcinoma (KIRC), TCGA Kidney Papillary Cell Carcinoma (KIRP), TCGA Large B-cell Lymphoma (DLBC), TCGA Liver Cancer (LIHC), TCGA Lower Grade Glioma (LGG), TCGA Lung Adenocarcinoma (LUAD), TCGA Lung Squamous Cell Carcinoma (LUSC), TCGA Melanoma (SKCM), TCGA Mesothelioma (MESO), TCGA Ocular melanomas (UVM), TCGA Ovarian Cancer (OV), TCGA Pancreatic Cancer (PAAD), TCGA Pheochromocytoma & Paraganglioma (PCPG), TCGA Prostate Cancer (PRAD), TCGA Rectal Cancer (READ), TCGA Sarcoma (SARC), TCGA Stomach Cancer (STAD), TCGA Testicular Cancer (TGCT), TCGA Thymoma (THYM), TCGA Thyroid Cancer (THCA), and TCGA Uterine Carcinosarcoma (UCS).

Funding

This work was supported by the National Natural Science Foundation of China (grants nos. 11971117 and 11571082) and the Natural Science Foundation of Anhui Province of China (grant no. 1808085MG220). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
