miRDRN—miRNA disease regulatory network: a tool for exploring disease and tissue-specific microRNA regulatory networks

Background MicroRNA (miRNA) regulates cellular processes by acting on specific target genes, and cellular processes proceed through multiple interactions, often organized into pathways, among genes and gene products. Hundreds of miRNAs and their target genes have been identified, as have many miRNA-disease associations. These, together with huge amounts of data on gene annotation, biological pathways, and protein–protein interactions, are available in public databases. Here, using such data, we built a database and web service platform, miRNA disease regulatory network (miRDRN), for users to construct disease and tissue-specific miRNA-protein regulatory networks, with which they may explore disease related molecular and pathway associations, or find new ones, and possibly discover new modes of drug action.
Methods Data on disease-miRNA association, miRNA-target association and validation, gene-tissue association, gene-tumor association, biological pathways, human protein interaction, gene ID, gene ontology, gene annotation, and product were collected from publicly available databases and integrated. A large set of miRNA target-specific regulatory sub-pathways (RSPs) having the form (T, G1, G2) was built from the integrated data and stored, where T is a miRNA-associated target gene and G1 (G2) is a gene/protein interacting with T (G1). Each sequence (T, G1, G2) was assigned a p-value weighted by the participation of the three genes in molecular interactions and reaction pathways.
Results A web service platform, miRDRN (http://mirdrn.ncu.edu.tw/mirdrn/), was built. The database part of miRDRN currently stores 6,973,875 p-valued RSPs associated with 116 diseases in 78 tissue types, built from 207 disease-associated miRNAs regulating 389 genes. miRDRN also provides facilities for the user to construct disease and tissue-specific miRNA regulatory networks from the RSPs it stores, and to download and/or visualize parts or all of the product.
Users may use miRDRN to explore a single disease, or a disease-pair to gain insights on comorbidity. As demonstrations, miRDRN was applied: to explore the single disease colorectal cancer (CRC), in which 26 novel potential CRC target genes were identified; to study the comorbidity of the disease-pair Alzheimer’s disease-Type 2 diabetes, in which 18 novel potential comorbid genes were identified; and to explore possible causes that may shed light on recent failures of late-phase trials of anti-AD, BACE1 inhibitor drugs, in which genes downstream to BACE1 whose suppression may affect signal transduction were identified.


INTRODUCTION
Protein-protein interactions (PPIs) are critical to almost all biological processes, and a good knowledge of the network of interacting proteins is crucial to understanding cellular mechanisms (Rual et al., 2005). Recent advances in biotechnology, such as high-throughput yeast two-hybrid screening, have allowed scientists to build maps of proteome-wide PPI, or interactome. Conventionally, a PPI map is a static network, in which each node represents a protein and an edge connecting two proteins indicates that there is experimental evidence showing that, under certain circumstances, the two proteins would interact. In reality, a PPI network (PPIN) should be viewed as a dynamic entity: it is an interaction network that is intrinsically controlled by regulatory mechanisms and changes with time and space (Liang & Li, 2007), as determined by the physiological condition of the cell in which the proteins reside. If there is a PPIN that includes all possible PPIs, then under a specific physiological condition only a specific sub-network of the PPIN is realized.
MicroRNAs (miRNAs) are small (∼22 nucleotides) noncoding regulatory RNA molecules in plants, animals, and some viruses. In a process known as RNA interference, a miRNA regulates gene expression by destabilizing and/or disrupting the translation of fully or partially complementary mRNA (Bartel, 2009; Landgraf et al., 2007). In this way a miRNA regulates the formation of all PPINs to which its target is connected, and by extension all biological processes (BP) with which those PPINs are involved. As well as acting as a tumor suppressor gene (TSG), a miRNA may also act as an oncogene, say, by targeting a TSG (Zhang, Dahlberg & Tam, 2007). The function of a specific biological process, or its malfunction, such as that associated with a disease, typically involves a complex composed of a set of miRNA-regulated proteins, together with their interacting protein partners. The study of such miRNA-protein complexes should be an integral part of understanding BP (Hsu, Juan & Huang, 2008) as well as diseases.
An understanding of the molecular and physio-pathological mechanisms of diseases is crucial for the design of disease preventive and therapeutic strategies. The combination of experimental and computational methods has led to the discovery of disease-related genes (Botstein & Risch, 2003; Kann, 2010). An example is the causal relation connecting malfunction-causing mutations in the enzyme phenylalanine hydroxylase to the metabolic disorder phenylketonuria (Scriver & Waters, 1999). Many human diseases cannot be attributed to single-gene malfunctions but arise from complex interactions among multiple genetic variants (Hirschhorn & Daly, 2005). How a disease is caused and how it can be treated can be better studied on the basis of a body of knowledge including all associated genes and biological pathways involving those genes.
Diseases are usually defined by a set of phenotypes that are associated with various pathological processes and their mutual interactions. Some relations between phenotypes of different diseases may be understood on the basis of common underlying molecular processes (Barabási, Gulbahce & Loscalzo, 2011), such as when there are genes associated with both diseases. It has been shown that genes associated with the same disorder encode proteins that have a strong tendency to interact with each other (Goh et al., 2007). More specifically, one may consider two diseases to be related if their metabolic reactions within a cell share common enzymes (Lee et al., 2008). Networks of PPIs have also been studied in the context of disease interactions (Ideker & Sharan, 2008; Lim et al., 2006).
Here, we report on a web service platform, miRNA disease regulatory network (miRDRN) (http://mirdrn.ncu.edu.tw/mirdrn/). The platform contains two parts: a database that stores a newly constructed set of 6,973,875 p-valued target-specific regulatory sub-pathways (RSPs) associated with 116 diseases in 78 tissue types, built from 207 disease-associated miRNAs regulating 389 genes; and a novel web-based tool that, using the RSPs stored in miRDRN and information from miRNA-related databases, facilitates the construction and visualization of disease and tissue-specific miRNA-protein regulatory networks for user-specified single diseases and, for comorbidity studies, disease-pairs. We demonstrate three applications of miRDRN: to explore the molecular and network properties of the single disease colorectal neoplasm; to study the comorbidity of the disease-pair Alzheimer's disease-Type 2 diabetes (AD-T2D); and, by using miRDRN to construct a miRNA regulatory sub-network centered on the gene BACE1, to look for insights that may explain why several anti-AD, BACE1 inhibiting drugs that failed recent late-phase trials worsened conditions of treatment groups. We believe findings from miRDRN, even if exploratory in nature, may potentially lead to the identification of new drug targets and new understanding of modes of drug action.

Data integration
miRDRN integrated data from several existing databases on disease-miRNA association, miRNA-target gene association, gene ontology, biological pathway, and PPI. For disease-specific cases, disease-associated miRNAs and targets were obtained from the human microRNA and disease associations database (HMDD) (Li et al., 2014) (v2.0, http://www.cuilab.cn/hmdd). For non-disease specific cases, miRNA/siRNA and targets were obtained from TarBase (Vlachos et al., 2015) (v7.0, http://carolina.imis.athenainnovation.gr/diana_tools/web/index.php?r=tarbasev8%2Findex/). In disease-specific cases, the optional filter requiring that a miRNA-target association be assay validated used TarBase data; the filter excludes miRNA-target pairs appearing in HMDD but not in TarBase (if and when this happens). Gene-tissue associations were taken from NCBI-Entrez (NCBI Resource Coordinators et al., 2018); RSP associations with known pathways from the

Construction of miRNA-associated target-specific regulatory sub-pathways
Consider a linked sequence (M, T, G1, G2) (Fig. 1), where M is a miRNA, T is its regulatory target gene, G1 is a gene whose encoded protein (p1) interacts (according to PPI data) with the protein (pT) encoded by T, and G2 is a gene whose encoded protein (p2) interacts with p1. In what follows, when there is little risk of misunderstanding, the same symbol will be used to represent a gene or the protein it encodes.

Figure 1 Regulatory sub-pathways. In the linked sequence (M, T, G1, G2), called a miRNA-specific regulatory sub-pathway (MRSP), M is a miRNA, T is its regulatory target gene, G1 is a protein interacting (according to PPI data) with T, and G2 is a protein interacting with G1. In the text the sequence (T, G1, G2) is called a target-specific regulatory sub-pathway, or simply, regulatory sub-pathway (RSP). DOI: 10.7717/peerj.7309/fig-1
We call the sequence (T, G1, G2) a target-specific RSP, or simply an RSP, and (M, T, G1, G2) a miRNA-specific RSP (MRSP). Given a target gene T, we use PPI data from BioGRID to collect all RSPs by extending from T two levels of interaction.
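The two-level extension from a target gene can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy interaction list, and the choices to ignore self-interactions and to disallow stepping back from G1 to T are all assumptions made for illustration.

```python
from collections import defaultdict

def build_adjacency(ppi_pairs):
    """Build an undirected adjacency map from (gene_a, gene_b) PPI pairs."""
    adj = defaultdict(set)
    for a, b in ppi_pairs:
        if a != b:  # assumption: self-interactions are ignored in this sketch
            adj[a].add(b)
            adj[b].add(a)
    return adj

def enumerate_rsps(target, adj):
    """Return every (T, G1, G2) where G1 interacts with T and G2 with G1."""
    rsps = []
    for g1 in sorted(adj.get(target, ())):
        for g2 in sorted(adj.get(g1, ())):
            if g2 != target:  # assumption: a sub-pathway does not return to T
                rsps.append((target, g1, g2))
    return rsps

# Hypothetical toy interaction data; real input would be BioGRID PPI records.
toy_ppi = [("T", "A"), ("T", "B"), ("A", "C"), ("A", "B")]
rsps = enumerate_rsps("T", build_adjacency(toy_ppi))
for rsp in rsps:
    print(rsp)
```

Sorting the neighbours only makes the output order reproducible; it has no biological meaning.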
Jaccard score of a regulatory sub-pathway
Jaccard similarity coefficients (Ng, Liu & Lee, 2009) were used to score the RSPs, based on the assumption that there is a tendency for two directly interacting proteins to participate in the same set of BP or share the same set of molecular functions (MF). Given two sets S1 and S2 (in the current application, a set will be either a list of BP or a list of MF, both according to GO), the Jaccard (similarity) coefficient (JC) of S1 and S2 is defined as
$$\mathrm{JC}(S_1, S_2) = \frac{|S_1 \cap S_2|}{|S_1 \cup S_2|},$$
where ∪ is the union (of two sets), ∩ is the intersection, and |Z| is the cardinality of Z. JC, which ranges from zero to one, is a quantitative measure of the similarity between two sets. For example, when S1 and S2 are identical JC is one, and when they have no element in common JC is zero.
Let (T, G1, G2) be an RSP as defined in the previous section and denote by [G] the set of BP (or pathways) (Kanehisa et al., 2017; Ashburner et al., 2000) that involve the gene G. We define the Jaccard score, or JS, of the RSP from the coefficients JC_X([T], [G1]) and JC_X([G1], [G2]), where X may be BP or MF. If the pair [T] and [G1] do not share a common term, then the corresponding JC has a zero value; similarly for the pair [G1] and [G2]. In either case the RSP is considered to be not viable and is discarded. In other words, miRDRN excludes any RSP with a zero JC score.
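As a small worked illustration of the Jaccard coefficient and of the rule that an RSP is discarded when either adjacent pair shares no term, consider the following Python sketch. The gene names and term sets are hypothetical, and returning zero for two empty sets is an assumption of this sketch. It does not combine the two coefficients into a single score.

```python
def jaccard(s1, s2):
    """Jaccard coefficient |S1 ∩ S2| / |S1 ∪ S2| of two collections of terms."""
    s1, s2 = set(s1), set(s2)
    union = s1 | s2
    if not union:  # assumption: two empty annotation sets score zero
        return 0.0
    return len(s1 & s2) / len(union)

def rsp_is_viable(terms, rsp):
    """An RSP (T, G1, G2) is kept only if both adjacent pairs share a term."""
    t, g1, g2 = rsp
    return jaccard(terms[t], terms[g1]) > 0 and jaccard(terms[g1], terms[g2]) > 0

# Hypothetical GO biological-process annotations for four genes.
toy_terms = {
    "T": {"bp1", "bp2"},
    "G1": {"bp2", "bp3"},
    "G2": {"bp3"},
    "G3": {"bp9"},
}

print(jaccard(toy_terms["T"], toy_terms["G1"]))      # one shared term out of three
print(rsp_is_viable(toy_terms, ("T", "G1", "G2")))   # both pairs overlap
print(rsp_is_viable(toy_terms, ("T", "G1", "G3")))   # G1 and G3 share nothing
```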

p-Value of a regulatory sub-pathway
A p-value for an RSP (T, G1, G2) was assigned as follows. Let the total number of BP (or MF, as the case may be) terms be N, and the numbers of terms in [T], [G1], [G2], [T] ∩ [G1], and [G1] ∩ [G2] be x, y, z, n1, and n2, respectively; the p-values P1 and P2, for (T, G1) and (G1, G2), respectively, are computed from these counts. The p-value for the RSP was set to be the greater of P1 and P2.
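One common way to turn such overlap counts into a p-value is a hypergeometric upper-tail test. The Python sketch below uses that test purely as an assumption; the text here does not confirm that it is the formula the authors used. The function names are hypothetical. The only step taken directly from the text is the last one, which assigns the RSP the greater of P1 and P2.

```python
from math import comb

def overlap_p_value(N, x, y, n):
    """Assumed hypergeometric tail: probability that a random set of y terms,
    drawn from N terms of which x are annotated to the first gene, shares at
    least n terms with the first gene's set."""
    total = comb(N, y)
    tail = sum(comb(x, i) * comb(N - x, y - i) for i in range(n, min(x, y) + 1))
    return tail / total

def rsp_p_value(N, x, y, z, n1, n2):
    """p-value of an RSP (T, G1, G2): the greater of P1 and P2."""
    p1 = overlap_p_value(N, x, y, n1)  # pair (T, G1)
    p2 = overlap_p_value(N, y, z, n2)  # pair (G1, G2)
    return max(p1, p2)

# Toy numbers: 10 terms in total, three per gene.
print(overlap_p_value(10, 3, 3, 3))
print(rsp_p_value(10, 3, 3, 3, 3, 3))
```

Taking the greater of the two values means an RSP is only as significant as its weaker link.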
Assembly and storage of target-specific regulatory sub-pathways
A union set of miRNA-associated target genes was collected from HMDD and TarBase, and for every target a complete set of RSPs, with BP- and MF-type JC scores and p-values assigned, was assembled. The entire set of RSPs for all targets was stored in miRDRN.

Construction of disease-specific miRNA regulatory network
A user-initiated construction of a disease-specific miRNA regulatory network (RRN) proceeds as follows.
Step 1. Select a disease.
Step 2. Collect from HMDD all miRNAs (M's) and target genes associated with the disease.
Step 3. For each M and each of its targets retrieve from miRDRN storage all target-specific RSPs, thus forming a set of MRSPs. The union of the sets of MRSPs over all M's is the set of disease-specific MRSPs for the selected disease.
Step 4. Construct the disease-specific RRN from the set of disease-specific MRSPs by linking all unlinked pairs of genes/proteins that interact according to BioGRID (Fig. 2).
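Steps 1 to 4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionaries standing in for HMDD, the stored RSPs, and BioGRID are hypothetical toy data, and the edge representation (a tagged tuple) is chosen for this sketch rather than taken from miRDRN.

```python
def build_rrn(disease, disease_to_mirna_targets, rsp_store, ppi_pairs):
    """Assemble disease-specific MRSPs and the resulting network edges.

    disease_to_mirna_targets: disease -> list of (miRNA, target) pairs (HMDD-like)
    rsp_store: target -> list of stored (T, G1, G2) RSPs
    ppi_pairs: iterable of (gene_a, gene_b) interactions (BioGRID-like)
    """
    # Steps 2 and 3: prefix each stored RSP with its regulating miRNA.
    mrsps = set()
    for mirna, target in disease_to_mirna_targets.get(disease, []):
        for t, g1, g2 in rsp_store.get(target, []):
            mrsps.add((mirna, t, g1, g2))

    edges, genes = set(), set()
    for m, t, g1, g2 in mrsps:
        edges.add(("regulates", m, t))
        edges.add(("ppi",) + tuple(sorted((t, g1))))
        edges.add(("ppi",) + tuple(sorted((g1, g2))))
        genes.update((t, g1, g2))

    # Step 4: link remaining gene pairs in the network that interact.
    for a, b in ppi_pairs:
        if a != b and a in genes and b in genes:
            edges.add(("ppi",) + tuple(sorted((a, b))))
    return mrsps, edges

# Hypothetical toy inputs.
hmdd_like = {"D": [("miR-x", "T")]}
stored_rsps = {"T": [("T", "A", "B")]}
ppi = [("T", "A"), ("A", "B"), ("T", "B"), ("B", "Z")]
mrsps, edges = build_rrn("D", hmdd_like, stored_rsps, ppi)
print(sorted(edges))
```

In the toy data the T-B interaction is not part of any MRSP, so it enters the network only through Step 4; the B-Z interaction is left out because Z belongs to no MRSP.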

RESULTS
miRNA disease regulatory network (miRDRN): a database and web service platform
We built miRDRN (http://mirdrn.ncu.edu.tw/mirdrn/), a web-based service that allows the user to construct a disease and tissue-specific, p-valued, miRNA-protein  Table 2).

Comparison of miRDRN with other miRNA-related databases
A number of databases and/or web service platforms on miRNA-related topics are publicly available (Table 3). Aside from HMDD and TarBase, on which miRDRN was built (Table 1), PhenomiR (Ruepp, Kowarsch & Theis, 2012) is a database on disease-miRNA association, miRwayDB (Das, Saha & Chakravorty, 2018) is a database on disease-miRNA-target and target-KEGG term association, and miRPathDB (Backes et al., 2017) is a database on miRNA-pathway association. New and unique as a database, miRDRN stores the 6,973,875 p-valued target-specific RSPs it has assembled (Table 2). As a web service platform, miRDRN is a tool that facilitates the construction and visualization of disease-specific RRNs using these RSPs in combination with resources from HMDD, TarBase, and several other databases (Table 1).
Brief description of usage of miRDRN
miRDRN is reasonably user friendly; its many features are easily discovered by user exploration. Here, we give a brief description of its main features. Users may use miRDRN to explore a single disease, or the comorbidity of a disease-pair. In the course of either type of study, all relevant miRNAs, genes, and RSPs are made accessible to the user in tabulated form, and RRNs in the form of interactive maps, both of which may be downloaded by the user. Often a map is too large for practical visualization, and in such a case the user may use options such as setting a p-value cut-off, or requiring a specific gene to be present in the map, or both, to obtain a partial RRN.
The entrance interface of miRDRN (http://mirdrn.ncu.edu.tw/mirdrn/) asks the user to select "Single Search" to explore a single disease (or miRNA/siRNA) or "Comorbidity Search" to explore the comorbidity of a disease-pair (Fig. 3). The user is then asked to specify the disease or disease-pair to be explored, the tissue/tumor types, and the p-value threshold for RSP evaluation, and to select (or not) several optional filters on targets and on RSPs. The filter on miRNA targets allows the user to admit only targets positively validated by the seven direct experimental methods: HITS-CLIP, PAR-CLIP, IMPACT-Seq, CLASH, Luciferase Reporter Assay, 3LIFE, and Genetic Testing (Vlachos et al., 2015); filters on RSPs allow the user to select only those RSPs in which some or all of the proteins are cancer related (Fig. 4). The user may then click on "Query" to start the computation. Tabulated results of disease-associated miRNAs and their target genes (Fig. 5), a multi-page list of all RSPs (Fig. 6) and, in the case of Comorbidity Search, a list of all comorbid genes (Fig. 7) will then automatically appear.
After the first, automatic iteration, the user may reduce the size of the RSP-list by using the "Gene filter" and "Show top ... sub-pathways" options (Fig. 6). The next interface (Fig. 8), in ready mode on first appearance, waits for the user to select one of three network layouts: "Tree," "Circle," or "Radial." After "Go" is clicked, the platform displays an interactive map showing the RRN built from RSPs selected by the user-specified options (Fig. 8). When the mouse is placed on a node (a miRNA or a gene) on the map, a small pop-up window opens to show the name of the node/gene, the number of other nodes it is linked to, and annotation on the node from the GO, OMIM, KEGG, and GenBank databases.
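The effect of the p-value cut-off, the "Gene filter", and the "Show top ... sub-pathways" options on an RSP list can be mimicked offline with a short Python sketch. The record layout, the function name, and the interpretation of "top" as "smallest p-value" are assumptions made for illustration; they are not taken from the miRDRN implementation.

```python
def select_rsps(rsps, p_cutoff=None, gene=None, top_n=None):
    """Filter a list of RSP records and return them ordered by p-value.

    rsps: list of dicts with keys "rsp" (a (T, G1, G2) tuple) and "p".
    p_cutoff: keep records with p <= p_cutoff (assumed inclusive).
    gene: keep records whose sub-pathway contains this gene.
    top_n: keep only the top_n records with the smallest p-values (assumed).
    """
    selected = [r for r in rsps if p_cutoff is None or r["p"] <= p_cutoff]
    if gene is not None:
        selected = [r for r in selected if gene in r["rsp"]]
    selected.sort(key=lambda r: r["p"])
    return selected if top_n is None else selected[:top_n]

# Hypothetical records; gene names and p-values are placeholders.
toy_rsps = [
    {"rsp": ("T1", "A", "B"), "p": 0.004},
    {"rsp": ("T1", "A", "C"), "p": 0.020},
    {"rsp": ("T2", "B", "D"), "p": 0.001},
    {"rsp": ("T2", "E", "F"), "p": 0.003},
]

partial = select_rsps(toy_rsps, p_cutoff=0.005, gene="B", top_n=70)
for record in partial:
    print(record["rsp"], record["p"])
```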

DISCUSSION
Here, we demonstrate the utility of miRDRN by presenting three applications.
The "Gene filter" option (Fig. 6) allows the user to focus on a specific gene in RRN construction. As an example, TNK2, a key drug target for the treatment of metastatic CRC (Qi & Ding, 2018), was selected as the filter, together with the "Show top 70 RSPs" option. The result was a nine-node sub-RRN: the target gene AXL (OCG), regulated by three miRNAs (hsa-mir-199b, hsa-mir-34a, and hsa-mir-199a) and linked (by PPI) to TNK2, which is itself linked to four other genes, MAGI3, HSP90AB2P, MERTK (OCG), and KAT8 (Fig. 11).
Table 5 Statistics and gene information in network-1, the largest connected sub-network of the CRC-specific miRNA regulatory network.
Case 3. A sub-RRN centered on the AD-associated gene BACE1
In recent years a number of anti-AD drugs designed on the basis of the amyloid-beta (Aβ) hypothesis of AD, which holds that Aβ aggregate in the brain is the main causative factor, failed late-phase trials; these include Semagacestat, Verubecestat, and Atabecestat (Timmers et al., 2016). In all three cases treatment groups scored worse than the control group on the Alzheimer's disease cooperative study activities of daily living inventory (ADCS-ADL) functional measure and reported more anxiety, depression, and sleep problems than controls. In a "Single Search" application on AD (tissue, brain; p-value threshold, 0.005), we had miRDRN construct a partial RRN (Gene filter, BACE1; Show top 70 sub-pathways; Network layout, Radial) centered on BACE1, which is a regulatory target of hsa-mir-195. The result shows the genes PSEN1, NCSTN, RANBP9, PLSCR1, MMP2, and FURIN to be immediately downstream to BACE1 in the RRN (Fig. 12). PSEN1 and NCSTN encode proteins that are, respectively, catalytic and essential subunits of the γ-secretase complex; suppression of these genes is presumably the purpose of BACE1 inhibition. On the other hand, RANBP9 encodes a protein that facilitates the progression of mitosis in developing neuroepithelial cells (Chang et al., 2010); PLSCR1 encodes a protein that acts in the control of intracellular calcium homeostasis and has a central role in signal transduction (Tufail et al., 2017); MMP2 encodes a protein that promotes neural progenitor cell migration (Rojiani et al., 2010). Suppression of these genes (by BACE1 inhibition) may therefore adversely affect signal transduction and the nervous system, and could be part of the reason why Semagacestat, Verubecestat, and Atabecestat worsened the ADCS-ADL functional measure of treatment groups.
Table 8 Known, literature supported, and potential novel AD-T2D comorbid genes.

CONCLUSION
This work describes miRDRN (http://mirdrn.ncu.edu.tw/mirdrn/), composed of a new database on target-specific RSPs and a web service platform that allows the user to use the stored RSPs to construct disease and tissue-specific RRNs, which may help the user to explore disease related molecular and pathway associations, or find new ones. As demonstration, miRDRN was applied to study the single disease CRC, where 34 potential target genes were identified, 26 of which have literature support; to study the comorbidity of the disease-pair AD-T2D, where 20 potential novel AD-T2D comorbid genes were identified, 17 of which have literature support; and to construct a partial miRNA regulatory sub-network centered on the AD-associated gene BACE1, which in turn suggests a possible explanation why, in late-phase trials that ended in failure, several γ/β-secretase inhibiting anti-AD drugs worsened the functional measure of treatment groups. We believe that findings from miRDRN, even if exploratory in nature, may potentially lead to the identification of new drug targets and new understanding of modes of drug action.

AD: Alzheimer's disease
BioGRID: biological general repository for interaction datasets
CRG: cancer related gene
GO: gene ontology database
HMDD: human microRNA and disease associations database
KEGG: Kyoto encyclopedia of genes and genomes
miRDRN: miRNA disease regulatory network database and web service platform
MRSP: miRNA-specific regulatory sub-pathway
OCG: oncogene
PPI: protein-protein interaction
PPIN: PPI network
RSP: target-specific regulatory sub-pathway
RRN: disease-specific miRNA regulatory network
T2D: Type 2 diabetes
TarBase: database on miRNA:mRNA interactions
TSG: tumor suppressor gene