Development of a circHIPK3-based ceRNA network and identification of mRNA signature in breast cancer patients harboring BRCA mutation

Qi-xin Lian; Yang Song; Lili Han; Zunxian Wang; Yinhui Song

doi:10.7717/peerj.15572

Development of a circHIPK3-based ceRNA network and identification of mRNA signature in breast cancer patients harboring BRCA mutation

Qi-xin Lian¹, Yang Song², Lili Han³, Zunxian Wang⁴, Yinhui Song ⁵

1Oncology Department, First Affiliated Hospital of Jiamusi University, Jiamusi, China

2Pathology Department, First Affiliated Hospital of Jiamusi University, Jiamusi, China

3General Surgery, The Seventh Medical Center of the General Hospital of the Chinese People’s Liberation, Jiamusi, China

4Chemoradiotherapy, First Affiliated Hospital of Jiamusi University, Jiamusi, China

5Department of Breast Surgery, The First Hospital of Qiqihar, The Affiliate Qiqihar Hospital of Southern Medical University, Qiqihar, China

DOI: 10.7717/peerj.15572

Published: 2023-07-05
Accepted: 2023-05-25
Received: 2023-04-25

Academic Editor: Peixin Dong

Subject Areas: Bioinformatics, Gynecology and Obstetrics, Immunology
Keywords: Breast cancer, BRCA, Competing endogenous RNA, Circular RNA homeodomain-interacting protein kinase 3, Messenger RNA, Prognosis, Risk model

Copyright: © 2023 Lian et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

Cite this article: Lian Q, Song Y, Han L, Wang Z, Song Y. 2023. Development of a circHIPK3-based ceRNA network and identification of mRNA signature in breast cancer patients harboring BRCA mutation. PeerJ 11:e15572 https://doi.org/10.7717/peerj.15572

The authors have chosen to make the review history of this article public.

Abstract

Background

Exploring the regulatory network of competing endogenous RNAs (ceRNAs) as hallmarks for breast cancer development has great significance and could provide therapeutic targets. An mRNA signature predictive of prognosis and therapy response in BRCA carriers was developed according to circular RNA homeodomain-interacting protein kinase 3 (circHIPK3)-based ceRNA network.

Method

We constructed a circHIPK3-based ceRNA network based on GSE173766 dataset and identified potential mRNAs that were associated with BRCA mutation patients within this ceRNA network. A total of 11 prognostic mRNAs and a risk model were identified and developed by univariate Cox regression analysis and the LASSO regression analysis as well as stepAIC method. Genomic landscape was treated by mutect2 and fisher. Immune characteristics was analyzed by ESTIMATE, MCP-counter. TIDE analysis was conducted to predict immunotherapy. The clinical treatment outcomes of BRCA mutation patients were assessed using a nomogram. The proliferation, migration and invasion in breast cancer cell lines were examined using CCK8 assay and transwell assay.

Result

We found 241 mRNAs within the circHIPK3-based ceRNA network. An 11 mRNA-based signature was identified for prognostic model construction. High risk patients exhibited dismal prognosis, low response to immunotherapy, less immune cell infiltration and tumor mutation burden (TMB). High-risk patients were sensitive to six anti-tumor drugs, while low-risk patient were sensitive to 47 drugs. The risk score was the most effective on evaluating patients’ survival. The robustness and good prediction performance were validated in The Cancer Genome Atlas (TCGA) dataset and immunotherapy datasets, respectively. In addition, circHIPK3 mRNA level was upregulated, and promoted cell viability, migration and invasion in breast cancer cell lines.

Conclusion

The current study could improve the understanding of mRNAs in relation to BRCA mutation and pave the way to develop mRNA-based therapeutic targets for breast cancer patients with BRCA mutation.

Introduction

Breast cancer ranks the first in the incidence of female cancers and occupies the fifth place of cancer-related mortality, accounting for more than 2 million new cases and 685,000 deaths (Sung et al., 2021). For breast cancer, heritage as a critical risk factor has been reported that genetic proportion of familial clustering breast cancer is approximately 73% (Wendt & Margolin, 2019). Certain genomic mutations such as BRCA2 and BRCA1 participate in breast cancer development (Yoshimura, Imoto & Iwata, 2022). Patients harboring germline BRCA mutation are initially diagnosed as advanced breast cancer and it is associated with more unfavorable outcomes (Ho et al., 2021). Although in healthy individuals carrying BRCA1/2, bilateral mastectomy can reduce breast cancer risk, the overall survival for most BRCA1/2 carriers remain unclear (Peleg Hasson, Menes & Sonnenblick, 2020). Additionally, radiotherapy and platinum-based chemotherapy need personalized application in breast cancer patients with BRCA1/2 (Peleg Hasson, Menes & Sonnenblick, 2020). Current markers do not accurately define the personalized treatment of breast cancer and the emergence of drug resistance during breast cancer drug therapy. Therefore, it is necessary to develop novel biomarkers in the clinical management of BRCA carriers.

As a class of noncoding RNA, circular RNA (circRNA) is involved in gene regulation. Accumulating evidences have demonstrated that circRNAs are implicated in pathological conditions and contribute to the pathogenesis of diverse diseases like cancers, neurological diseases, immune diseases, cardiovascular diseases (Han, Chao & Yao, 2018). CircHIPK3 derives from exon 2 from the HIPK3 gene, which exerts oncogenic or tumor suppressive effects through serving as microRNA (miRNA) sponges (Xie et al., 2020). CircHIPK3 could promote tumor growth of glioma, prostate cancer, epithelial ovarian cancer, BCRA and colorectal cancer (Cai et al., 2019; Chen et al., 2020; Jin et al., 2018; Liu et al., 2018; Zeng et al., 2018), while it suppresses bladder cancer progression (Li et al., 2017). Targeting circHIPK3 is a promising therapeutic approach to the management of cancer. However, the role of circHIPK3-related genes in breast cancer carrying BRCA1/2 has not been fully elucidated. Hypothesis of ceRNAs represents a large-scale regulatory network that explains the communications of long noncoding RNAs (lncRNAs), circRNAs, messenger RNAs (mRNAs), and pseudogene RNAs with miRNAs mediated by miRNAs response elements. Dysfunction of ceRNA network contributes to development of cancers, while it may provide opportunities for cancer treatments (Salmena et al., 2011).

In this study, we constructed a circRNA-miRNA-mRNA ceRNA network on the basis of circHIPK3 and identified potential mRNAs that were associated with BRCA mutation patients. A total of 11 mRNAs were filtered by univariate Cox regression analysis and LASSO regression analysis for prognostic model development. Clinicopathologic features, immune characteristics, genomic landscape, and pathway characteristics were analyzed between the risk groups to understand the potential mechanism of an 11 mRNA-based signature in BRCA. The responsiveness to chemotherapy and immunotherapy was further predicted to offer a direction for personalized therapies in BRCA mutation patients.

Methods

Data collection and pre-processing

The clinical follow-up and expression data of breast cancer patients with BRAC1/2 susceptibility genes BRCA were collected from TCGA database (https://portal.gdc.cancer.gov/), which included RNA-Seq expression profiles and single nucleotide variants (SNVs) processed by mutect2. Samples living longer than 30 days were retained after eliminating those without survival time or follow-up information. We converted ensemble gene IDs to gene symbol IDs, and the gene with multiple gene symbols was expressed as median value.

We downloaded GSE173766 dataset containing circRNA and mRNA sequencing data from GEO (https://www.ncbi.nlm.nih.gov/geo/) database. Moreover, we obtained the gene expression profiles of BRCA mutation samples in GSE103091, GSE16446, GSE20685, GSE20713, GSE42568, and GSE48390 from GEO database. RMA function in “affy” package (Gautier et al., 2004) was used to process each GSE dataset, followed by removal of batch effects using removeBatchEffect function embedded in “limma” package (Ritchie et al., 2015) was utilized to remove. We converted the probe to gene symbol, and removed normal tissue samples. Patients lacking overall survival (OS) and status, follow-up information were excluded, while samples survived longer than 30 days were kept. The batch effect was presented using t-distributed stochastic neighbor embedding (TSNE) (Fig. S1) based on the expression profiles of GSE dataset before and after the removal.

Construction of circHIPK3-based ceRNA network

We obtained hsa_circ_0021603 corresponding gene HIPK3 from the circRNA data in the GSE173766 dataset, and then calculated the correlation of hsa_ circ_ 0021603 with mRNA using Pearson. Positive correlated genes were selected under the threshold of Corr >0.8 and p < 0.05. Subsequently, we predicted hsa_ circ_ 0021603 target miRNAs on the circBank website (http://www.circbank.cn/index.html). Meanwhile, we predicted target genes of miRNAs using miRtarbase, mircode, microT, TargetMiner, TargetScan, miRDB, miRanda, miRmap, starbase, PITA, RNA22, and PicTar. We reserved miRNA-mRNA in at least two databases, and these mRNAs were significantly positively correlated with hsa_circ_0021603. A ceRNA network was constructed using mRNAs, circRNA, and miRNAs.

Functional enrichment analysis

GO analysis on biological process (BP), molecular function (MF), and cellular component (CC) categories as well as KEGG analyses using “WebGestaltR” package (Liao et al., 2019) in R were conducted for functional enrichment analysis.

Developing and validating an mRNA prognostic model

In the ceRNA network, 241 mRNAs were used for additional study. We partitioned the GSE dataset at random into Train and Test datasets with a 1:1 ratio. The Train dataset was used to identify prognostic genes with P 0.05 using univariate Cox regression analysis, followed by LASSO regression analysis to refine the model using “glmnet” package (Hastie, Qian & Tay, 2021). Finally, key prognostic genes were determined by stepwise multivariate regression analysis with stepAIC.

The risk score was calculated in accordance with the formula as follows:

Risk score = ∑βi*ExPi, where i indicated the gene expression levels of prognosis related genes, and β represented Cox regression coefficient. Through surv_cutpoint function in “survminer” package (Kassambara et al., 2017), the optimal cutoff was determined and patients in Train dataset were grouped into low-risk and high-risk groups. Kaplan–Meier curves for each risk group were plotted and the significance of differences was defined by log-rank test. Besides, we conducted receiver operating characteristic (ROC) analysis using “timeROC” package (Blanche, 2015) and evaluated the areas under the ROC curve (AUCs) for 1-year, 3-years, 5-years and 7-years. To validate the robustness of predictive risk model, we calculated the risk score and predicted survival probability in TCGA cohort.

Analysis of clinicopathologic features and SNV between risk groups

We analyzed the differences in human epidermal growth factor receptor-2 (Her2), progesterone receptor (PR), N Stage, estrogen receptor (ER), T Stage between high-risk groups using Fisher’s exact test. Differences in TMB in the TCGA dataset were further investigated. In mutation dataset from TCGA dataset, we screened 8,707 genes with mutation frequency > 3 and screened significant somatic mutation between risk groups using Fisher’s exact test with P < 0.05. Association between risk score and TMB was assessed with Pearson correlation analysis. Additionally, risk score distribution in different clinicopathologic features (age, ER, PR, and Her2, T Stage, N Stage, Stage, M stage). Wilcox tests were used for the analysis of differences between two groups, and the Kruskal-Wallis test assessed group differences.

Gene Set Enrichment Analysis (GSEA) analysis between risk groups

Thereafter, we analyzed biological processes between risk groups through GSEA function in “clusterProfiler” package in TCGA dataset and GSE datasets using HALLMARK pathways in h.all.v7.4.symbols.gmt. Candidates were considered as significant enriched pathways under false discovery rate (FDR) < 0.25 and P < 0.05.

Immune characteristics and pathway characteristics differences between risk groups

The scores of 28 immune cells from previous study (Charoentong et al., 2017) were calculated using single sample GSEA (ssGSEA) method between risk groups in GSE datasets. The immune score was evaluated using ESTIMATE algorithm. Additionally, MCP-counter was employed to assess immune score of 10 immune cells. From previous study (Danilova et al., 2019), we obtained several immune checkpoint genes and compared their distribution in risk groups. In order to study the pathways characteristics of risk groups, we scored the pathways in h.all.v7.4.symbols.gmt and human gene signatures in previous published research (Mariathasan et al., 2018) using ssGSEA and analyzed their correlation with risk score in TCGA dataset.

Response to chemotherapy and immunotherapy of risk groups

The tumor immune dysfunction exclusion (TIDE) algorithm (http://tide.dfci.harvard.edu/) evaluates the M2 subtype of tumor-associated macrophages (exclusion), myeloid-derived suppressor cells, tumor-associated fibroblasts that limit the infiltration of T cells in tumors. Furthermore, two different mechanisms of tumor immune escape, including the dysfunction score of tumor-infiltrating cytotoxic T lymphocytes (dysfunction) and rejection score of CTLs by immunosuppressive factors is also studied with TIDE. A high TIDE score denotes a low rate of ICI therapeutic response. Response prediction to immune checkpoint inhibition (ICI) therapy was performed using TIDE. Additionally, we used the “pRRophetic” package (Geeleher, Cox & Huang, 2014) to calculate the half-maximal inhibitory concentration (IC50) in risk groups both in TCGA and GSE datasets.

Performance of predictive model in immunotherapy datasets

IMvigor210 (Balar et al., 2017), GSE135222 (Kim, Choi & Jung, 2020), and GSE78220 (Hugo et al., 2016) are immunotherapy datasets. Risk score in the above three immunotherapy datasets were calculated and then survival chances were predicted using Kaplan–Meier curves with AUCs. Meanwhile, we used Fisher’s exact test to compare the clinical response distribution amongst risk groups, including partial response (PR), complete response (CR), progressive disease (PD), and stable disease (SD). Risk score distribution in different clinical response groups was compared using Kruskal-Wallis test or wilcox.tests.

Construction of nomogram

In TCGA dataset, a decision tree based on stage, age, T stage, risk type, N stage, M stage was produced to determine risk subtypes and we generated Kaplan–Meier curves of risk subtypes. The significant differences were determined by log-rank test. Univariate and multivariate Cox regression analysis were further used to select predictor of prognosis. Clinical outcomes were evaluated by a nomogram constructed using risk type and clinicopathological features (stage and age, T stage, M stage, N stage). The accuracy and reliability of predictive risk model were analyzed with calibration curves and decision curve analysis (DCA).

Cell culture and transient transfection

MCF-10A, MDA-MB-231, MCF-7 cells were commercially bought from ATCC (Manassas, USA). The cells were cultured in Gibco DEME F-12 medium (Thermo Fisher, Scientific, Waltham, MA, USA). The negative control (NC) and circHIPK3 siRNA (Invitrogen, Carlsbad, CA, USA) were transfected into the cells utilizing Lipofectamine 2000 (Invitrogen, Carlsbad, CA, USA). The target sequences for circHIPK3 siRNAs were ACCATATGTTTATCAAACTCAGT (circHIPK3-si).

Quantitative real-time polymerase chain reaction (qRT-PCR)

TRIzol reagent (Sigma-Aldrich, USA) was applied to extract total RNA from breast cancer cells. The HiScript II Q RT SuperMix (Vazyme, Beijing, China) was used to reverse-transcribe 500 ng of RNA into cDNA. The SYBR Green Master Mix (Thermo Fisher, Scientific, Waltham, MA, USA) was used for qRT-PCR in an ABI 7500 Fast System. For a total of 45 cycles, the PCR amplification settings were at 94 °C for 10 min (min), at 94 °C for 10 s (s), and at 60 °C for 45 s. The internal reference was GAPDH. The primer sequences of GADPH were F: 5′-AATGGGCAGCCGTTAGGAAA-3′and R: 5′-GCCCAATACGACCAAATCAGAG-3′. The primer sequences of circHIPK3 were F: 5′-TTCAACATGTCTACAATCTCGGT-3′and R: 5′-ACCATTCACATAGGTCCGT -3′.

Cell viability

The Cell Counting Kit-8 assay (Beyotime, Jiangsu, China) was performed according to the manufacturer’s protocol for cell viability detection. Different treatment groups of cells were cultured at a density of 1 ×10³ cells per well in 96-well plates. CCK-8 solution was applied at indicated time points. After incubating the cells at 37 °C for 2 h, the OD 450 values of each well were detected using a microplate reader.

Transwell assay

Transwell assays for migration and invasion of MDA-MB-231 and MCF-7 were performed. Briefly, in a nutshell, Matrigel (BD Biosciences, Franklin Lakes, NJ, USA) was coated or left uncoated in chambers before cells (5 104) were added (for migration). In the upper layer, serum-free medium was supplied, and in the lower layer, full DMEM medium. Following a 24-hour incubation period, migrating or invading cells were fixed by 4% paraformaldehyde and stained with 0.1% crystalline violet, counted under a microscope.

Statistical analysis

R software (version 4.0.4) was used in all the statistical analyses, with a two-tailed P < 0.05 being considered as statistical significance.

Results

Construction of a circHIPK3-based ceRNA network

With the threshold of Corr > 0.8 and P < 0.05, we screened 1,358 mRNAs that were positively correlated with hsa_ circ_ 0021603. Subsequently, we predicted 45 hsa_ circ_ 0021603 target miRNAs. Through predicting target genes of 45 miRNAs, we reserved miRNA-mRNA in at least two databases, and these mRNAs were significantly positively correlated with hsa_circ_0021603. Finally, a ceRNA network containing 1 circRNA, 40 miRNAs, and 241 mRNAs was constructed (Fig. 1A). Based on functional enrichment analysis of 241 mRNAs, we found that these mRNAs were closely related with breast cancer, ErbB signaling pathway, Ras signaling pathway, and pathways in cancer (Figs. 1B–1E). The findings suggested that circHIPK3 competing with related mRNA for miRNAs, which may play a regulatory role in the occurrence of cancer.

Development and verification of mRNA prognostic model

According to the ceRNA network, 241 mRNAs that were closely related to breast cancer were used for screening mRNAs for prognostic model construction. Through univariate Cox regression analysis, 40 mRNAs associated with prognosis were screened in the Train dataset (Fig. S2A). LASSO Cox regression analysis was employ to further refine the model. Independent variable coefficients that are progressively heading to zero rose as lambda was steadily increased (Fig. S2B). Figure S2C displays the confidence interval under each lambda from 10-fold cross-validation. 21 mRNAs were found when lambda = 0.0182. With stepwise multivariate regression analysis and stepAIC, a total of 11 mRNAs were identified, including four risk mRNAs and seven protective mRNAs (Fig. S2D).

The risk score was calculated of each patient based on 11 mRNAs (Fig. 2A) as follows: riskScore = 0.652*ANGPTL2−0.289*CD1E−0.282*COLEC12−0.582*FAM117B+0.314* JAG2+0.186*L1CAM−0.405*MSL2+0.718*PSME4−0.42*SH2B3−0.31*TTC39C−0.308* ZBTB20. Survival analysis showed high risk patients exhibited poor prognosis compared with low risk patients (P < 0.0001) with 1-year AUC of 0.67, 3-year AUC of 0.75, 5-year AUC of 0.77 and 7-year AUC of 0.77 in the Train dataset (Fig. 2B). Similar results were also found in the Test dataset and GSE datasets, respectively (Figs. 2C–2D). We further validated the robustness of this model in TCGA dataset. Figure 2E revealed that high risk patients had lower OS (P < 0.0001), progression-free interval (PFI), disease-free interval (DFI) (P = 0.0034), disease specific survival (DSS) (P < 0.0001) (P < 0.0001) than that of low risk patients.

Figure 2: Development and verification of mRNA prognostic model.
(A) Multivariate Cox analysis for model genes. (B) ROC curves for 1-year, 3-years, 5-years and 7-years and Kaplan–Meier curves of risk score in the Train dataset. (C) ROC curves for 1-year, 3-years, 5-years and 7-years and Kaplan–Meier curves of risk score in the Test dataset. (D) ROC curves for 1-year, 3-years, 5-years and 7-years and Kaplan–Meier curves of risk score in GSE datasets. (E) Validation of predictive efficiency in TCGA dataset.

Download full-size image

DOI: 10.7717/peerj.15572/fig-2

GSEA analysis between risk groups

In GSE and TCGA datasets, EPITHELIAL_MESENCHYMAL_TRANSITION, GLYCOLYSIS, HYPOXIA were significantly enriched in high risk group, while ESTROGEN_RESPONSE, BILE_ACID_METABOLISM, FATTY_ACID_METABOLISM, were significantly enriched in low risk group (Figs. 3A–3B).

Analysis of clinicopathologic features and SNV between risk groups

Thereafter, we assessed the distribution of risk score in N stage, T stage, ER, PR, and Her2, stage. Patients with T3 stage, N2 stage, stage, ER negative, PR negative and Her2 positive status had higher risk score (Fig. 4A). As shown in Fig. 4B, we found that high risk patients exhibited higher proportion of T2 stage, N2 stage, positive ER, negative PR, and positive Her2. We further evaluated the differences in genomic changes between risk groups. High risk patients possessed elevated TMB (P < 0.0001) (Fig. 4C) and had a positive correlation between risk score and TMB (P = 4.18e−09, R = 0.191) (Fig. 4D). In mutation dataset from TCGA dataset, we screened 612 significant mutated genes and displayed the top 20 in Fig. 4E. TP53 (33.73%), PIK3CA (33.3%), CDH1 (14.45%) were the most frequently mutated genes.

Immune characteristics between risk groups

Furthermore, we evaluated the immune landscape in high and low risk groups in GSE datasets. We found that effector memory CD8 T cell, Type 1 T helper cell, activated B cell, effector memory CD4 T cell, activated CD4 T cell, eosinophil, activated CD4 T cell, natural killer cell, CD56 bright natural killer cell were significantly increased in low risk patients (Fig. 5A). Besides, low patients had higher adaptive score and innate score (Fig. 5B), StromalScore, ImmuneScore, and ESTIMATEScore were higher than that of high risk patients (Fig. 5C). Figure 5D deciphered that NK cells, cytotoxic lymphocytes and T cells were distinctly infiltrated in tumors of low patients. Several immune checkpoints such as BTLA, CD200, CD200R1, CD274, CD28, ICOS, IDO2, LAG3 and PDCD1LG2 were overexpressed in low risk patients (Fig. 5E).

Figure 5: Immune characteristics analysis between risk groups in GSE datasets.
(A) Box plots of 28 immune cells in high and low risk groups. (B) Differences of adaptive score and innate score between risk groups. (C) Comparison of immune scores predicted by ESTIMATE between risk groups. (D) MCP-counter score in high and low risk groups. (E) Box plots of immune checkpoints in high and low risk groups. ns represents P > 0.05; ^∗P < 0.05, ^∗∗P < 0.01, ^∗∗∗P < 0.001, and ^∗∗∗∗P < 0.0001.

Download full-size image

DOI: 10.7717/peerj.15572/fig-5

Relationship of risk score and pathway characteristics

Meanwhile, we compared the distribution of pathways in TCGA dataset and found that there were 29 significant altered pathways in low and high risk groups (Fig. 6A). Association of risk score with 29 significant altered pathways was further analyzed (Fig. 6B). GLYCOLYSIS, EPITHELIAL_MESENCHYMAL_TRANSITION, and HYPOXIA were dramatically positively correlated with the Risk score. Additionally, risk score was in a positive relationship to DDR, cell cycle, DNA replication, homologous recombination, mismatch Repair (Fig. 6C).

Figure 6: Pathway characteristics analysis and prediction of response to immunotherapy/chemotherapy of risk groups.
(A) Differences of pathway scores between risk groups in TCGA dataset. (B) Correlation analysis of risk score and pathway score in TCGA dataset. (C) Correlation analysis of risk score and human gene signatures in TCGA dataset. (D) Distribution of TIDE score between risk groups in TCGA dataset. (E) Distribution of TIDE score between risk groups in GSE datasets. (F) Visualization of distribution of 43 anti-tumor drugs between risk groups in TCGA dataset. (G) Visualization of distribution of 53 anti-tumor drugs between risk groups in GSE dataset. ns represents P > 0.05; ^∗P < 0.05, ^∗∗P < 0.01, ^∗∗∗P < 0.001, and ^∗∗∗∗P < 0.0001.

Download full-size image

DOI: 10.7717/peerj.15572/fig-6

Prediction of response to immunotherapy and chemotherapy of risk groups

Through TIDE algorithm, we observed low risk patients exhibited lower TIDE score, indicating a high response to ICI therapy, while high risk patients had higher exclusion and MDSC both in TCGA and GSE datasets (Figs. 6D–6E). Additionally, low risk patient were sensitive to 14 drugs and high risk patients were more sensitive to 29 anti-tumor drugs in TCGA (Fig. 6F). Low risk patient were sensitive to 47 drugs and high risk patients were sensitive to six anti-tumor drugs in GSE (Fig. 6G).

Performance of predictive model in immunotherapy datasets

The performance of risk model was verified in three immunotherapy datasets. High risk patients had unfavorable prognosis in IMvigor210 (P < 0.0001) and GSE78220 (P = 0.0031) than that of low risk patients with good prediction performance (Figs. 7A–7C). The proportion of PD or PD/SD was significantly increased in high risk group. Besides, patients with PD or PD/SD also exhibited higher risk score (Figs. 7D–7E).

Construction of nomogram

Based on the decision tree, we determined six risk subtypes (Fig. 8A). Figure 8B displayed the different survival probability among six risk subtypes, where patients with subtype C3 were high risk patients while patients with subtype C1 and C2 were low risk patients (Figs. 8C–8D). Figures 8E–8F revealed that risk type was the most important factor related to prognosis. Furthermore, we constructed a nomogram and found that risk score had the greatest impact on survival (Fig. 8G). To evaluate the accuracy of this model, calibration curves for 1, 3, and 5-years were generated (Fig. 8H). The observed curve was highly consistent with predicted curve. Figure 8I displayed nomogram and risk score had the most powerful survival prediction ability.

The cell viability, migration and invasion of breast cancer cells were promoted by circHIPK3

We detected the expression of circHIPK3 in MCF-7 cells by qRT-PCR, and we found that the mRNA expression of circHIPK3 was increased in MCF-7 cells in comparison to normal MCF10A cells (Fig. 9A). We subsequently examined the cell viability of MCF-7 and MDA-MB-231 after circHIPK3 inhibition, and the results showed that after circHIPK3 inhibition, the cell viability of these two kinds of breast cancer cell lines viability was significantly suppressed after circHIPK3 inhibition (Figs. 9B, 9C). We then test the invasion and migration of MCF-7 (Figs. 9D–9F) and MDA-MB-23 (Figs. 9G–9I) cells after circHIPK3 inhibition, and we observed that the invasion and migration ability of the two cell lines was significantly diminished after circHIPK3 inhibition.

Figure 9: CircHIPK3 promoted breast cancer cells viability, invasion and migration.
(A) circHIPK3 was higher expression in breast cancer cells MCF-7. (B–C) si-circHIPK3 suppressed the cell viability of MCF-7 and MDA-MB-231 cells. (D–F) si-circHIPK3 suppressed the cell invasion and migration of MCF-7 cells. (G–I) si-circHIPK3 suppressed the cell invasion and migration of MDA-MB-231 cells. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001.

Download full-size image

DOI: 10.7717/peerj.15572/fig-9

Discussion

It has been shown that the ceRNAs have regulatory roles in post-transcription of cancer cells. Recently, a circRNA-based ceRNA regulatory network has been established to classify breast cancer patients and facilitate the diagnosis, therapy and prognosis (Zhong et al., 2022). In breast cancer, miRNA signature has been developed to predict neoadjuvant chemotherapy responsiveness (Xing et al., 2021). In this study, we constructed a circHIPK3-based circRNA-miRNA-mRNA ceRNA network, and then established an 11 mRNA-based signature, which could predict the prognosis and anti-tumor response in BRCA carriers.

Investigation of the crucial roles of novel mRNAs in circRNA-miRNA-mRNA regulatory network may decipher the mechanism of mRNAs in breast cancer and has the potential to predict response to anti-tumor therapies. Small extracellular vesicles that contains ANGPTL2 derived from vascular endothelial cells has been reported to promote the progression of leukemia (Huang et al., 2021a). A recent study has revealed that miR-378a-3p enhances osteolysis and then promotes release of ANGPTL2, which leads to tumor progression (Wang et al., 2022). CD1E is one type of CD1 molecule that is expressed in dendritic cells, and it has been identified to be associated with diffuse large B-cell lymphoma patients’ long-term survival (Gao et al., 2020). LncRNA ENST00000582120 involves in epithelial-mesenchymal transition (EMT) through the regulation of COLEC12 (Zhang et al., 2017). COLEC12 acts as an innate immune-related risk mRNA that is related to immunosuppression in hepatocellular carcinoma (HCC) (Huang et al., 2021b). FAM117B is highly associated with bone formation and contribute to the metastasis of breast cancer to bone (Sui et al., 2022). MiRNA-876-3p exerts inhibitory effects on pancreatic adenocarcinoma cell growth and metastasis through regulating JAG2 (Yang et al., 2018). L1CAM is a promotor for tumor invasion, whereas inhibition of L1CAM can suppress breast cancer aggressiveness and attenuate cisplatin resistance via modulating AKT signaling (Li et al., 2022; Zhang, Shen & Guo, 2022b). Additionally, miRNA-296-3p could inhibit HCC cell proliferation and invasion targeting MSL2 and functions as a tumor suppressor (Li, An & Gao, 2021). PSME4 enhances aggressiveness of HCC via activating mTOR signaling (Ge et al., 2022). It has been documented that SH2B3 contributes to glioblastoma progression by transducing IL-6/gp130/STAT3 signaling (Cai et al., 2021). TTC39C has been screened as an alternative splicing-related mRNA that is implicated in tumor microenvironment and immune infiltration of BRCA mutation patients (Zhang et al., 2022a). LncRNA SNHG8 promotes breast cancer progression and sponged miR-634, leading to the upregulation of ZBTB20 (Xu et al., 2021). The 11 mRNA-based signature regulated by circHIPK3 might be closely related to breast cancer and might be promising therapeutic targets for BRCA carriers.

TP53, PIK3CA, and CDH1 were the most common genomic alterations for patients with breast cancer. CDH1 and PIK3CA mutation relate to metastasis in breast cancer patients (Aftimos et al., 2021). Additionally, the transfer of luminal A/B to Her2-enriched has a strong association with TP53 and/or PIK3CA mutation (Aftimos et al., 2021). TP53 mutation is significantly observed in Her2-, Her2+, and triple-negative breast cancer patients, and it is related to the dismal prognosis (Meric-Bernstam et al., 2018). The co-mutation of CDH1 with PIK3CA and ERBB2 contributes to endocrine resistance in breast cancer patients with invasive lobular carcinoma (Davis et al., 2022). Hence, evaluation of genomic alterations may reveal the pathogenesis, and help histologic classification and personalized treatment for patients with breast cancer. In the present study, we found high risk patients exhibited higher proportion of T2 stage, N2 stage, positive ER, negative PR, and positive Her2 and possessed highly mutated TP53, PIK3CA, and CDH1. The established risk score may provide a supplement for classification of breast patients and guide precision treatment.

As a crucial microenvironmental factor, hypoxia facilitates tumor cells survival and promotes tumor growth and metastasis. Tumor cells prefer glycolysis as the source of ATP under hypoxia (Paredes, Williams & Martin, 2021). Glycolysis products metabolic intermediates that is essential for tumor cell growth and adhesion. Intriguingly, compelling evidence has confirmed that hypoxia-driven activation of HIF-1α enhances EMT, which involves in tumor metastasis, immune escape and drug resistance (Saxena, Jolly & Balamurugan, 2020). It has been found that HYPOXIA, GLYCOLYSIS, and EPITHELIAL_MESENCHYMAL_TRANSITION were significantly enriched in the high risk group, which might explain the poor prognosis of high risk patients. Moreover, we also found less immune cell infiltration in tumors of high patients whereas more infiltration of NK cells, cytotoxic lymphocytes and T cells in low risk patients. This is an important reason for unfavorable prognosis of high risk patients.

Although we have constructed a circHIPK3-based ceRNA network, the regulatory mechanism of the important circRNA, miRNAs and mRNAs within the ceRNA network need further exploit. The predictive value of this mRNA signature has been validated in TCGA dataset, while larger samples should be enrolled in further prospective studies to verify our retrospective data.

Conclusion

In conclusion, we constructed a circHIPK3-based ceRNA regulatory network and identified a novel 11-mRNA-based signature in BRCA carriers. The current study might help understanding the underlying mechanism of mRNA signature and offer potential targets for prognosis and personalized treatment in breast cancer patients with BRCA mutation.

Supplemental Information

Removal of batch effect among 6 GSE datasets A-B, TSNE diagram of GSE datasets before and after removing batch effect

DOI: 10.7717/peerj.15572/supp-1

Download

Screening of prognostic mRNAs for risk model construction

(A) 40 mRNAs associated with prognosis were screened in Train dataset through univariate Cox regression analysis. (B) 21 mRNAs were found when lambda = 0.0182. (C) The confidence interval under each lambda. (D) 11 mRNAs were identified after stepwise multivariate regression analysis and stepAIC, including 4 risk mRNAs and 7 protective mRNAs.

DOI: 10.7717/peerj.15572/supp-2

Download

[1] Aftimos P, Oliveira M, Irrthum A, Fumagalli D, Sotiriou C, Gal-Yam EN, Robson ME, Ndozeng J, Di Leo A, Ciruelos EM. 2021. Genomic and transcriptomic analyses of breast cancer primaries and matched metastases in AURORA, the Breast International Group (BIG) molecular screening initiative genomics and transcriptomics of metastatic breast cancer. Cancer Discovery 11:2796-2811

[2] Balar AV, Galsky MD, Rosenberg JE, Powles T, Petrylak DP, Bellmunt J, Loriot Y, Necchi A, Hoffman-Censits J, Perez-Gracia JL. 2017. Atezolizumab as first-line treatment in cisplatin-ineligible patients with locally advanced and metastatic urothelial carcinoma: a single-arm, multicentre, phase 2 trial. The Lancet 389:67-76

[3] Blanche P. 2015. TimeROC: time-dependent ROC curve and AUC for censored survival data. R package version 2 software

[4] Cai C, Zhi Y, Wang K, Zhang P, Ji Z, Xie C, Sun F. 2019. CircHIPK3 overexpression accelerates the proliferation and invasion of prostate cancer cells through regulating miRNA-338-3p. OncoTargets and Therapy 12:3363-3372

[5] Cai S, Lu J-X, Wang Y-P, Shi C-J, Yuan T, Wang X-P. 2021. SH2B3, transcribed by STAT1, promotes glioblastoma progression through transducing IL-6/gp130 signaling to activate STAT3 signaling. Frontiers in Cell and Developmental Biology 9:606527

[6] Charoentong P, Finotello F, Angelova M, Mayer C, Efremova M, Rieder D, Hackl H, Trajanoski Z. 2017. Pan-cancer immunogenomic analyses reveal genotype-immunophenotype relationships and predictors of response to checkpoint blockade. Cell Reports 18:248-262

[7] Chen ZG, Zhao HJ, Lin L, Liu JB, Bai JZ, Wang GS. 2020. Circular RNA CirCHIPK3 promotes cell proliferation and invasion of breast cancer by sponging miR-193a/HMGB1/PI3K/AKT axis. Thoracic Cancer 11:2660-2671

[8] Danilova L, Ho WJ, Zhu Q, Vithayathil T, De Jesus-Acosta A, Azad NS, Laheru DA, Fertig EJ, Anders R, Jaffee EM. 2019. Programmed cell death ligand-1 (PD-L1) and CD8 expression profiling identify an immunologic subtype of pancreatic ductal adenocarcinomas with favorable survival. Cancer Immunology Research 7:886-895

[9] Davis AA, Gerratana L, Clifton K, Medford AJ, Velimirovic M, Hensing WL, Bucheit L, Shah AN, D’Amico P, Reduzzi C. 2022. Circulating tumour DNA characterisation of invasive lobular carcinoma in patients with metastatic breast cancer. EBioMedicine 86:104316

[10] Gao Q, Li Z, Meng L, Ma J, Xi Y, Wang T. 2020. Transcriptome profiling reveals an integrated mRNA—lncRNA signature with predictive value for long-term survival in diffuse large B-cell lymphoma. Aging 12:23275-23295

[11] Gautier L, Cope L, Bolstad BM, Irizarry RA. 2004. affy—analysis of affymetrix GeneChip data at the probe level. Bioinformatics 20:307-315

[12] Ge S, Huang H, Huang W, Ji R, Chen J, Wu S, Wang L, Huang T, Sheng Y, Yan H. 2022. PSME4 activates mTOR signaling and promotes the malignant progression of hepatocellular carcinoma. International Journal of General Medicine 15:885-895

[13] Geeleher P, Cox N, Huang RS. 2014. pRRophetic: an R package for prediction of clinical chemotherapeutic response from tumor gene expression levels. PLOS ONE 9:e107468

[14] Han B, Chao J, Yao H. 2018. Circular RNA and its mechanisms in disease: from the bench to the clinic. Pharmacology & Therapeutics 187:31-44

[15] Hastie T, Qian J, Tay K. 2021. An introduction to glmnet. CRAN R Repository

[16] Ho PJ, Khng AJ, Loh HW, Ho W-K, Yip CH, Mohd-Taib NA, Tan VKM, Tan BK-T, Tan S-M, Tan EY. 2021. Germline breast cancer susceptibility genes, tumor characteristics, and survival. Genome Medicine 13:1-14

[17] Huang D, Sun G, Hao X, He X, Zheng Z, Chen C, Yu Z, Xie L, Ma S, Liu L. 2021a. ANGPTL2-containing small extracellular vesicles from vascular endothelial cells accelerate leukemia progression. The Journal of Clinical Investigation 131:e138986

[18] Huang H, Hu Y, Guo L, Wen Z. 2021b. Integrated bioinformatics analyses of key genes involved in hepatocellular carcinoma immunosuppression. Oncology Letters 22:1-12

[19] Hugo W, Zaretsky JM, Sun L, Song C, Moreno BH, Hu-Lieskovan S, Berent-Maoz B, Pang J, Chmielowski B, Cherry G. 2016. Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell 165:35-44

[20] Jin P, Huang Y, Zhu P, Zou Y, Shao T, Wang O. 2018. CircRNA circHIPK3 serves as a prognostic marker to promote glioma progression by regulating miR-654/IGF2BP3 signaling. Biochemical and Biophysical Research Communications 503:1570-1574

[21] Kassambara A, Kosinski M, Biecek P, Fabian S. 2017. survminer: drawing survival curves using ‘ggplot2’. R package version 03 1 software

[22] Kim JY, Choi JK, Jung H. 2020. Genome-wide methylation patterns predict clinical benefit of immunotherapy in lung cancer. Clinical Epigenetics 12:1-10

[23] Li L, Wang X, Hu K, Liu X, Qiu L, Bai C, Cui Y, Wang B, Wang Z, Wang H. 2022. ZNF133 is a potent suppressor in breast carcinogenesis through dampening L1CAM, a driver for tumor cell invasion.

[24] Li X, An M, Gao Z. 2021. In hepatocellular carcinoma, miRNA-296-3p targets MSL2 and suppresses cell cell proliferation and invasion. Journal of Oncology 7430468

[25] Li Y, Zheng F, Xiao X, Xie F, Tao D, Huang C, Liu D, Wang M, Wang L, Zeng F. 2017. Circ HIPK 3 sponges miR-558 to suppress heparanase expression in bladder cancer cells. EMBO Reports 18:1646-1659

[26] Liao Y, Wang J, Jaehnig EJ, Shi Z, Zhang B. 2019. WebGestalt 2019: gene set analysis toolkit with revamped UIs and APIs. Nucleic Acids Research 47:W199-W205

[27] Liu N, Zhang J, Zhang L, Wang L. 2018. CircHIPK3 is upregulated and predicts a poor prognosis in epithelial ovarian cancer. European Review for Medical and Pharmacological Sciences 22:3713-3718

[28] Mariathasan S, Turley SJ, Nickles D, Castiglioni A, Yuen K, Wang Y, Kadel IIIEE, Koeppen H, Astarita JL, Cubas R. 2018. TGFβ attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells. Nature 554:544-548

[29] Meric-Bernstam F, Zheng X, Shariati M, Damodaran S, Wathoo C, Brusco L, Demirhan ME, Tapia C, Eterovic AK, Basho RK. 2018. Survival outcomes by TP53 mutation status in metastatic breast cancer. JCO Precision Oncology 2:1-15

[30] Paredes F, Williams HC, Martin ASan. 2021. Metabolic adaptation in hypoxia and cancer. Cancer Letters 502:133-142

[31] Peleg Hasson S, Menes T, Sonnenblick A. 2020. Comparison of patient susceptibility genes across breast cancer: implications for prognosis and therapeutic outcomes. Pharmacogenomics and Personalized Medicine 22:7-238

[32] Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. 2015. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research 43:e47

[33] Salmena L, Poliseno L, Tay Y, Kats L, Pandolfi PP. 2011. A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell 146:353-358

[34] Saxena K, Jolly MK, Balamurugan K. 2020. Hypoxia, partial EMT and collective migration: emerging culprits in metastasis. Translational Oncology 13:100845

[35] Sui L, Sanders A, Jiang WG, Ye L. 2022. Deregulated molecules and pathways in the predisposition and dissemination of breast cancer cells to bone. Computational and Structural Biotechnology Journal 30(20):2745-2758

[36] Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. 2021. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: A Cancer Journal for Clinicians 71:209-249

[37] Wang J, Du X, Wang X, Xiao H, Jing N, Xue W, Dong B, Gao W-Q, Fang Y-X. 2022. Tumor-derived miR-378a-3p-containing extracellular vesicles promote osteolysis by activating the Dyrk1a/Nfatc1/Angptl2 axis for bone metastasis. Cancer Letters 526:76-90

[38] Wendt C, Margolin S. 2019. Identifying breast cancer susceptibility genes—a review of the genetic background in familial breast cancer. Acta Oncologica 58:135-146

[39] Xie Y, Yuan X, Zhou W, Kosiba AA, Shi H, Gu J, Qin Z. 2020. The circular RNA HIPK3 (circHIPK3) and its regulation in cancer progression. Life Sciences 254:117252

[40] Xing A-Y, Wang B, Li Y-H, Chen X, Wang Y-W, Liu H-T, Gao P. 2021. Identification of miRNA signature in breast cancer to predict neoadjuvant chemotherapy response. Pathology & Oncology Research 27:1609753

[41] Xu X, Xie Q, Xie M, Zeng Y, Liu Q. 2021. LncRNA SNHG8 serves as an oncogene in breast cancer through miR-634/ZBTB20 axis. Cancer Management and Research 13:3017-3028

[42] Yang F, Zhao WJ, Jia CL, Li XK, Wang Q, Chen ZL, Jiang DQ. 2018. MicroRNA-876-3p functions as a tumor suppressor gene and correlates with cell metastasis in pancreatic adenocarcinoma via targeting JAG2. American Journal of Cancer Research 8:636-649

[43] Yoshimura A, Imoto I, Iwata H. 2022. Functions of breast cancer predisposition genes: implications for clinical management. International Journal of Molecular Sciences 23:7481

[44] Zeng K, Chen X, Xu M, Liu X, Hu X, Xu T, Sun H, Pan Y, He B, Wang S. 2018. CircHIPK3 promotes colorectal cancer growth and metastasis by sponging miR-7. Cell Death & Disease 9:1-15