Identiﬁcation of diﬀerentially expressed methylated genes in melanoma versus nevi using bioinformatics methods

Background. Melanoma is a highly invasive malignant skin tumor. While melanoma may share some similarities with that of melanocytic nevi, there also exist a number of distinct diﬀerences between these conditions. An analysis of these diﬀerences may provide a means to more eﬀectively evaluate the etiology and pathogenesis of melanoma. In particular, diﬀerences in aberrant methylation expression may prove to represent a critical distinction. Methods. Data from gene expression datasets (GSE3189 and GSE46517) and gene methylation datasets (GSE86355 and GSE120878) were downloaded from the GEO database. GEO2R was used to obtain diﬀerentially expressed genes (DEGs) and diﬀerentially methylation genes (DMGs). Function and pathway enrichment of selected genes were performed using the DAVID database. A protein-protein interaction (PPI) network was constructed by STRING while its visualization was achieved with use of cytoscape. Primary melanoma samples from TCGA were used to identify signiﬁcant survival genes. Results. Totally, 199 genes in the hypermethylation-low expression group while 136 genes in hypomethylation-high expression group were identiﬁed. The former were enriched in the biological processes of transcription regulation, RNA metabolism and regulation of cell proliferation. The later were highly involved in cell cycle regulation. 13 genes were screened out after survival analysis and included: ISG20, DTL, TRPV2, PLOD3, KIF3C, DLGAP4, PI4K2A,


Background
Melanoma is a highly invasive malignant skin tumor.While melanoma may share some similarities with that of melanocytic nevi, there also exist a number of distinct differences between these conditions.An analysis of these differences may provide a means to more effectively evaluate the etiology and pathogenesis of melanoma.In particular, differences in aberrant methylation expression may prove to represent a critical distinction.

Methods
Data from gene expression datasets (GSE3189 and GSE46517) and gene methylation datasets (GSE86355 and GSE120878) were downloaded from the GEO database.GEO2R was used to obtain differentially expressed genes (DEGs) and differentially methylation genes (DMGs).
Function and pathway enrichment of selected genes were performed using the DAVID database.
A protein-protein interaction (PPI) network was constructed by STRING while its visualization was achieved with use of cytoscape.Primary melanoma samples from TCGA were used to identify significant survival genes.

Results
Totally, 199 genes in the hypermethylation-low expression group while 136 genes in hypomethylation-high expression group were identified.The former were enriched in the biological processes of transcription regulation, RNA metabolism and regulation of cell proliferation.The later were highly involved in cell cycle regulation.13 genes were screened out after survival analysis and included: ISG20,DTL,TRPV2,PLOD3,KIF3C,DLGAP4,PI4K2A,WIPI1,SHANK2,SLC16A10,GSTA4O, LFML2A and TMEM47.

Conclusion Introduction
Melanoma is a highly invasive malignant skin tumor derived from melanocytes whose incidence has been showing yearly increases.It can be induced by many factors, of which ultraviolent light is considered to be the most common environmental factor, along with a strong heredity component [1] .In contrast to melanoma, nevus is a pigmented lesion formed by transformed melanocytes and is a common benign tumor in the skin.While most melanomas are primary tumors, approximately 33% of all primary melanomas arise from nevi [2] .In this way, although melanoma and melanocytic nevi may share some common biological characteristics, of perhaps greater significance, are their differences, which represents a rich area of investigation.
Most past research on melanoma has mainly focused on genomic variations, among which is the mutated BRAF gene, which is highly related to melanoma [3] .This work has resulted in substantial contributions for the diagnosis and treatment of melanoma.Current efforts and attention have been directed to the study of cancer epigenetics, a promising research field in discovering tumor developing mechanisms and biomarkers.Epigenetics refers to the effects of genetic alterations other than DNA variations on gene expression and function, including DNA methylation, non-coding RNAs and histone remodeling [4] .CpG dinucleotides, which tend to gather into clusters called CpG islands [5] , represent one such example.Methylation of these CpG islands is associated with the silencing of many genes, with DNA methylation affecting gene expressions that play important roles in the development of melanoma, such as LINE-1, TERT, Manuscript to be reviewed MGMT, KIT, TNF and MITF [6] .However, the mechanism and pathways involved in DNA methylation are still not well understood.

Abstract
Recently, microarray has been increasingly used in the exploration of the genetics and epigenetics of melanoma, with the results of these assays providing new perspectives about the etiology, treatment and prognosis of melanoma.Many differential genes and key pathways have been unearthed through bioinformatics analysis.As for epigenetics, other than methylation, LncRNA and microRNA, represent the most significant topics of investigation.However, to date, in the bioinformatics analysis of melanoma, no examples exist regarding combinations of aberrant methylation and gene expression in melanoma versus nevi.We have explored four data sets in the GEO database, two of which show differences in gene expression in melanoma and nevi samples (GSE3189, GSE46517), the other two show DNA methylation differences between them (GSE86355, GSE120899).In this report, we used bioinformatic tools to analyze related functions and pathways of differential genes, as well as their mutual interactions.Our goal was to identify novel melanoma-related genes and their corresponding DNA methylations, providing new insights into the occurrence and development of melanoma.

Microarry data and data processing
Four data sets from the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) of The National Center for Biotechnology Information (NCBI) were selected.GSE3189 [7] and GSE46517 [8] (platform: GPL96 Affymetrix Human Genome U133A Array) relate to gene expression, and the DNA methylation microarrays are GSE86355 [9] and GSE120878 [10] , respectively (platform: GPL13534 Illumina Infinium HumanMethylation450 BeadChip array) (Table 1).Totally, 45 melanoma and 18 nevi samples were included in GSE3189 while 31 PeerJ reviewing PDF | (2019:09:41508:3:1:NEW 1 May 2020) Manuscript to be reviewed melanoma and 9 nevi samples were included in GSE46517.An additional 33 melanoma and 14 nevi samples were enrolled in GSE86355 while 89 melanoma and 73 nevi were enrolled in GSE120878 (detailed information is contained in Table 1).GEO2R, an online analysis tool built in the GEO website, was used to analyze raw data in order to identify DEGs and DMGs.P value < 0.05 and |t| > 2 were set as cut-off values.Then, the online tool, Bioinformatics & Evolutionary Genomics, was used to calculate overlap genes.Nevi samples were set as the control group.Overlap of hypermethylated genes and down-regulated genes were considered as the hypermethylation-low expression group, while hypomethylated genes and up-regulated genes that intersected were considered as the hypomethylation-high expression group.The above results were illustrated within the Venn chart on the website.

GO term and KEGG pathway analysis of DEGs and DMGs
The DAVID knowledgebase (https://david.ncifcrf.gov/),an online gene functional annotation tool, was used to analyze the function and pathway enrichment of obtained DEGs [10] .The Fisher exact test P-value was calculated and a P-value < 0.05 was regarded as being statistically significant.

PPI network and hub gene analysis
For analyses of interactions among proteins of interest, the STRING platform, an online tool for the structural and functional analysis of protein interactions [12] was used , 0.4 was regarded as the cut-off criterion and the active interaction sources were Textmining, Experiments, Databases, Co-expression, Neighborhood, Gene Fusion and Co-occurrence.Then Cytoscape software 3.6.1 (https://cytoscape.org)and built-in app Molecular Complex Detection (MCODE) was used to identify core modules in these proteins.Manuscript to be reviewed Data on gene expressions from melanoma patients were obtained from the TCGA data portal (https://tcga-data.nci.nih.gov/tcga/),information on RNA expression as well as survival data were downloaded.Only primary melanoma samples were enrolled.A univariate Cox model was used to illustrate the overall survival of differentially expression genes, p<0.05 was the cut off value.

DEGs and DMGs in melanoma
The four data sets were separately analyzed using GEO2R and data with a P-value < 0.05 and |t| > 2 were then selected for further study.Among them, the hypermethylation-low expression group had 199 genes while the hypomethylation-high expression group had 136 genes (Fig. 1).

GO functional enrichment and KEGG pathway analysis
The GO functional enrichment was analyzed with use of DAVID, and included biological process (BP), cellular component (CC) and molecular function (MF).The top ten of each functional group for both hypermethylation-low and hypomethylation-high expression genes are listed in Fig. 2 and Fig. 3.In the hypermethylation-low expression group, BP was mainly involved in the regulation of transcription and RNA metabolism, regulation of cell proliferation, gene expression, cell biosynthesis, nitrogen-containing compounds and macromolecular metabolism.CC enrichment indicated that the differentially expressed genes were mainly distributed in various parts of the plasma membrane including the basilal plasma membrane, apical plasma membrane and the membrane raft, while cell junctions including anchoring junction, adherens junction and cell-substrate junctions were also involved.As for MF, transcription regulation was the main component, with other functions like protein dimerization, protein specific binding and sequence specific DNA binding also being involved (Fig. 2).Manuscript to be reviewed For hypomethylation-high expression genes, biological process enrichment indicated that the cell cycle phase was highly regulated, with immune responses, nuclear division, post-Golgi vesicle-mediated transport and organelle fission also being involved.These genes were mostly distributed in the cytoplasm including the spindle pole, endosome, cytosol, vacuole and organelle lumen.Molecular function showed the presence of binding with ATP, adenyl nucleotides and nucleosides (Fig. 3).The same tool was used for analyzing the KEGG pathway.But there were few significant pathways those genes enriched, the only one has significance is in the hypermethylation-low expression group , pathway of basal cell carcinoma involved.

3.PPI network construction and module analysis.
STRING database was used to construct a protein-protein interaction network, and with use of cytoscape, software key modules and hub genes were constructed.The 136 hypomethylationhigh expression genes identified contained a notable module (Fig. 4), function of genes in this module is more likely to regulate cell cycle (Table 2), while genes in the hypermethylation-low expression group failed to show any significant modules (Fig. 5).

Survival analysis of differentially expressed genes
Information of primary melanoma samples from TCGA was used to screen out significant survival genes, then inserting with the differentially expressed genes above.Finally, 8 genes were selected in the hypomethylation-high expression group.They were ISG20, DTL, TRPV2, PLOD3, KIF3C, DLGAP4, PI4K2A and WIPI1.While 5 genes for hypermethylation-low expression group, including SHANK2, SLC16A10, GSTA4O, LFML2A and TMEM47.
Interestingly, except DTL, significant survival genes belong to hypomethylation-high expression group were positively related to survival time.In contrast , In the hypermethylation-low expression group, they may had a negative effect to survival (Fig 6) Manuscript to be reviewed

Discussion
With continuous development of microarray and high-throughput sequencing technology, thousands of genes can be analyzed simultaneously, providing new insights for the diagnosis and treatment of melanoma.DNA methylation is a common epigenetic variation of melanoma and plays an important role in the development of this condition.In this study, we selected two mRNA expression data sets and two methylation data sets from the GEO database.After calculating the DEGs and DMGs, we combined hyper-methylation and low expression data and hypo-methylation and high expression in order to identify some candidate genes associated with melanoma pathogenesis and development.Then, GO and KEGG pathway enrichment analyses were performed based on these differentially expressed genes, and constructed protein-protein interactions were used to locate hub genes.Finally, we evaluated their effects on tumor survival.
In hypermethylation-low expression genes, transcription regulation was the most enriched function followed by nitrogen and macromolecular compound metabolism, regulation of cell metabolism and biosynthesis and protein binding.It seems clear that melanoma development is strongly associated with gene transcription, and a number of transcription factors are involved in its regulation.Among them, Microphthalmia-associated transcription factor (MITF) is the most common.Forced expression of MITF has been shown to be sufficient for induction of ectopic melanocytes in zebrafish [13] and the expression of at least some melanocyte differentiation genes in cultured mouse cells [14] .In addition to MITF, other transcription factors such as SOX10, YY1 and TFAP2A [15] are involved with regulating the proliferation, differentiation and even invasion and metastasis of melanocytes.For the maintenance of melanoma, macromolecular substances such as sugar and protein are involved.For example, Heparan sulfate proteoglycans (HSPGs) participates in signal transduction by regulating ligands [16] .Proto-oncogene-driven metabolic Manuscript to be reviewed recombination is an adaptation of low energy in the tumor microenvironment, which serves to accelerate the proliferation of tumor cells [17] .The metabolism of macromolecular substances also provides energy for tumorigenesis.As most of these genes are concentrated on the plasma membrane in cells, component analysis, we suspect that many of these molecules are involved in cell signaling as membrane receptors.
In hypomethylation-high expression genes, cell cycle regulation, immune responses, and binding with ATP, represent the main processes exerted by these genes.As most of these genes are distributed intracellularly, and involve spindles, this is consistent with the metabolic processes of melanoma.Spindles are the organelles involved in mitosis, a process which requires ATP to provide energy.Regulation of the cell cycle is an important component of melanoma development, including such processes as cyclin-dependent kinase abnormalities and G1-S transition dysregulation [18] .We performed a protein-protein interaction analysis on these genes and two significant modules were identified within the hypomethylation-high expression genes, of which, one module exhibited enrichment for several gene ontology terms which is mainly involved in the regulation of mitosis and cell cycle; and, our current results indicate that regulation of the cell cycle plays an important role during the development of melanomas.
The role of hypermethylation and demethylation genes in tumors is being further explored, but their function in the progressive ,metastasis process or protection effect is still not so clear.
Studies have found that hypermethylation may be shared in primary disease and metastasis [19] , while loss of DNA methylation (hypomethylation) in cancer can lead to genomic instability, which can also occur in promoters of proto-oncogenes, leading to their activation and contribution to the progression of the malignancy [20,21] .In this study, we screened out genes Manuscript to be reviewed significant to survival of melanoma patients, majority of them were verified to associated with tumors.ISG20 has interferon-related antiviral effects, while a study found that over-expression of ISG20 promotes metastasis and angiogenesis of liver cancer cells [22] .DTL is involved with the regulation of cell cycles and playing a role in the regulation of DNA damage, so we suggest hat it regulate cycle of melanocytes so that to regulate its progression.TRPV2 is an ion channel, it's activation mediated melanoma cell death [23] , which may play a negative role in melanoma development.PLOD3 was found to be highly expressed in lung cancer and gliomas [24] , which is associated with poor prognosis of them.KIF3C is found to be highly expressed in breast cancer, KIF3C and SHANK2 are both related to axons transmitting neural signals [25] , and poor prognosis of tumor, but its role in melanoma remains to be further studied.DLGAP4 is involved in the transmission of neuronal signals and has not been excavated for its role in tumors, but its epigenetic changes and dysfunction have been found to be related to early-onset cerebellar ataxia [26] .PKR / PI4K2A lysosome network is associated with poor prognosis in breast cancer patients [27] .WIPI1 is an indicator of autophagosome formation, it also plays a distinct role in controlling the transcription of melanogenic enzymes and melanosome maturation.SLC16A10 can have function of a net efflux pathway for aromatic amino acids in the basosolateral epithelial cells.
Abel et al. found hat GSTA4 is a novel susceptible gene for non melanoma skin cancer [28] , although the relationship between GSTA4 and melanoma was not analyzed, but it still was suggested expression is associated with a poor prognosis for skin tumors.TMEM47 plays a role in regulating the localization of a subset of tight junction proteins, associated actomyosin structures, cell morphology, and participates in developmental transitions from adherens to tight junctions [29] .But the role of OLFML2A has been seldom studied.While some of these genes Manuscript to be reviewed have been previously studied, for many of them their specific role in melanoma remains unclear.
Therefore, studies directed at investigating their roles in melanoma may provide new insights into the diagnosis and treatment of this condition.

Figure 1 Identification
Figure 1

Figure6.
Figure6.Survival analysis of genes significant to survival of melanoma patients.

Table 2 (on next page)
Top 3 GO terms and KEGG pathway involved in module 1 of hypomethylation-high expression genes .