Proteomic analysis of human cervical adenocarcinoma mucus to identify potential protein biomarkers

Background. Cervical cancer is the most common gynecological cancer, encompassing cervical squamous cell carcinoma, adenocarcinoma, and other epithelial tumors. Many diagnostic methods can detect cervical cancer, but no precise screening tool for cervical adenocarcinoma is currently available.
Materials and methods. Cervical mucus from three normal cervices (Ctrl), three endocervical adenocarcinomas (EA), and three cervical adenocarcinomas in situ (AIS) was collected for proteomic analysis. Proteins were screened using liquid chromatography-mass spectrometry (LC-MS). The biological functions of the differentially expressed proteins were predicted by Gene Ontology (GO) analysis.
Results. A total of 711 proteins were identified, including 237 differentially expressed proteins in the EA/Ctrl comparison, 256 in the AIS/Ctrl comparison, and 242 in the AIS/EA comparison (fold change ≥ 1.5 for up-regulation or ≤ 0.67 for down-regulation). GO functional annotation was performed on 1,056 proteins to identify those that may affect cervical cancer, such as the heme protein myeloperoxidase, which is involved in immune processes, and APOA1, which is associated with lipid metabolism.
Conclusion. We used proteomic analysis to screen for differentially expressed proteins in normal cervical mucus and cervical adenocarcinoma mucus samples. These proteins may be potential biomarkers for the diagnosis and treatment of cervical adenocarcinoma but require additional study.


INTRODUCTION
Cervical cancer is the third most commonly diagnosed gynecological malignancy and the fourth leading cause of cancer-related death in women worldwide, resulting in approximately 530,000 deaths annually (Siegel et al., 2014; Bray et al., 2018). Advances in the treatment of cervical cancer, such as radiotherapy, chemotherapy, surgery, immunotherapy, and targeted therapy, have not improved the survival rate (Dai et al., 2016; Menderes et al., 2016). Cervical adenocarcinoma is the second most common subtype of cervical carcinoma after squamous cell carcinoma, and its incidence and prevalence are increasing in younger populations (Seoud, Tjalma & Ronsse, 2011). Novel biomarkers for the diagnosis and treatment of cervical adenocarcinoma are needed.
A number of studies have addressed the prevention and treatment of cervical cancer by investigating relevant cells, tissues, and subtypes (Farzanehpour et al., 2019; Li, Hong & Wijayakulathilaka, 2019; Siegler et al., 2019). Cervical cancer is commonly diagnosed through cytological screening, known as the Pap test, in conjunction with the detection of high-risk human papillomaviruses (hr-HPVs). However, these tests are expensive and depend on good infrastructure and well-trained personnel (Lynge et al., 2014). Proteomic analysis is a powerful tool for monitoring changes in protein levels to discover new biomarkers in many cancers, such as colorectal cancer, pancreatic cancer, and neuroendocrine cervical cancer (Lin et al., 2014; Pan, Brentnall & Chen, 2015; Chauvin & Boisvert, 2018). Unique membrane proteins have been identified by proteomic analysis of cervical cancer cell lines (Pappa et al., 2018). Lee et al. (2011) studied the proteome of cervical mucus plugs and suggested a role for the plug in maintaining pregnancy and in parturition. However, their approach was not suitable for comparative studies of mucins among different groups. The constitutive protein composition of cervical mucus in fertile women and the changes in the cervical mucus proteome throughout the menstrual cycle have been identified, but interactions between immunoglobulin, defense-binding protein, and other cervical mucus proteins were not studied (Grande et al., 2015).
We sought to discover novel biomarkers through proteomic analysis of nine cervical mucus samples, with the aim of improving the prognosis of cervical cancer and providing a reliable reference for the diagnosis and treatment of patients with cervical adenocarcinoma.

MATERIALS AND METHODS
Patient materials
Cervical mucus samples were obtained from three patients with normal cervices (Ctrl), three patients with endocervical adenocarcinoma (EA), and three patients with adenocarcinoma in situ of the cervix (AIS) who had a total hysterectomy at the Nanjing Maternity and Child Health Care Hospital Affiliated to Nanjing Medical University. The three normal cervical mucus samples served as controls. All samples were stored at −80 °C and an initial histopathological diagnosis was conducted. The cancerous and normal cervical mucus samples were used for proteomic analyses. This study was approved by the ethical committee of Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Care Hospital ((2019) KY-040). All participants voluntarily agreed to participate in this study and provided signed informed consent.

Protein extraction
Protein extraction was performed on the nine cervical mucus samples. Each sample was mixed and homogenized on ice for 5 min in lysis buffer (7 M urea, 2 M thiourea, 4% SDS, 40 mM Tris-HCl, pH 8.5, 1 mM phenylmethanesulfonyl fluoride (PMSF), 2 mM ethylenediaminetetraacetic acid (EDTA)). Dithiothreitol (DTT) (Solarbio, 428F0422) was added to a final concentration of 10 mM, followed by sonication in an ice bath for 15 min and centrifugation at 13,000 × g at 4 °C for 20 min. After centrifugation, the supernatant of each sample was transferred to a new centrifuge tube. Four volumes of cold acetone were added, and the tube was left to stand overnight at −20 °C. The protein precipitate was collected and air-dried. The protein was re-dissolved in 8 M urea/100 mM triethylammonium bicarbonate (TEAB) (BCBC6216; Sigma-Aldrich, St. Louis, MO, USA) (pH 8.0); DTT was then added to a final concentration of 10 mM, and the solution was incubated in a 56 °C water bath for 30 min for reduction. Iodoacetamide (IAM, Aladdin, J1513091) was added to a final concentration of 55 mM, and the solution was kept at room temperature in the dark for 30 min for alkylation. Protein concentrations were measured using the Bradford method.
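The reagent additions in this protocol are specified as final concentrations (10 mM DTT, 55 mM IAM) and as a volume ratio (four volumes of acetone). The sketch below shows the arithmetic for converting these into pipetting volumes. The stock concentrations (1 M) and the 100 µL sample volume are illustrative assumptions; they are not given in the text.

```python
def stock_volume_ul(sample_volume_ul, final_mm, stock_mm):
    """Volume of stock (in µL) to add so that the mixture reaches final_mm.

    Solves final = stock * v / (sample + v) for v, which accounts for the
    dilution caused by the added stock itself.
    """
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the target")
    return final_mm * sample_volume_ul / (stock_mm - final_mm)


def acetone_volume_ul(supernatant_ul, volumes=4):
    """Cold acetone to add at a fixed multiple of the supernatant volume."""
    return volumes * supernatant_ul


# Hypothetical example: 100 µL of sample and 1 M (1000 mM) stocks.
dtt_ul = stock_volume_ul(100, final_mm=10, stock_mm=1000)
iam_ul = stock_volume_ul(100, final_mm=55, stock_mm=1000)
acetone_ul = acetone_volume_ul(200)

print(f"DTT: {dtt_ul:.2f} µL, IAM: {iam_ul:.2f} µL, acetone: {acetone_ul} µL")
```

For small additions the simpler approximation final × sample / stock gives nearly the same value; the exact form matters more for IAM, where the target concentration is a larger fraction of the stock.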

iTRAQ labeling and peptide fractionation
The extracted proteins were digested with trypsin, and the resulting peptides were desalted on a Durashell C18 column (5 μm, 100 Å, 4.6 × 250 mm) (Agela, Tianjin, China) and vacuum-dried. The peptides were dissolved in 0.5 M TEAB and labeled using the iTRAQ 8-plex standard kit (SCIEX, Shanghai, China) according to the manufacturer's instructions. The labeled samples were mixed, and the pooled peptides were fractionated using the Ultimate 3000 HPLC system (Dionex; Thermo, Waltham, MA, USA). Peptides were separated with an increasing acetonitrile (ACN) concentration under alkaline conditions. The flow rate was 1 ml/min, with one fraction collected per minute. A total of 42 fractions were collected and combined into 12 fractions, which were desalted on a Strata-X column and vacuum-dried. All analyses were completed by Wuhan Genecreate Biological Engineering Co. LTD. (Wuhan, China).

Protein identification and quantification
Proteins were identified using the ProteinPilot™ V4.5 search engine. A protein was considered identified when it had at least one unique peptide and an unused score of at least 1.3 (corresponding to a confidence just above 95%). The confidence of each identified peptide was taken into account in protein quantification.

Liquid chromatography-mass spectrometry analysis
Mass spectrometry data were collected using the TripleTOF 5600+ liquid chromatography-mass spectrometry system (SCIEX, Shanghai, China). Peptide samples were dissolved in 2% acetonitrile/0.1% formic acid and analyzed using the TripleTOF 5600+ mass spectrometer. The peptide solution was loaded onto a C18 trap column (5 μm, 100 μm × 20 mm) and eluted on a C18 analytical column (3 μm, 75 μm × 150 mm) over 90 min at a flow rate of 300 nL/min. The two mobile phases were buffer A (2% acetonitrile, 0.1% formic acid, 98% H2O) and buffer B (98% acetonitrile, 0.1% formic acid, 2% H2O). In information-dependent acquisition (IDA) mode, MS1 spectra were acquired with an ion accumulation time of 250 ms, and MS2 spectra of 30 precursor ions were acquired with an ion accumulation time of 50 ms. The MS1 spectrum was collected in the range of 350-1,500 m/z, and the MS2 spectrum was collected in the range of 100-1,500 m/z. The dynamic exclusion time for precursor ions was set at 15 s.

Functional annotation
Gene Ontology (GO) functional annotation analysis was performed for all identified proteins, and the GO terms in the cellular component, biological process, and molecular function categories corresponding to all proteins were analyzed. Detailed information can be found at http://www.geneontology.org. Clusters of Orthologous Groups (COG) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis were used to further determine the function of the proteins.

Statistics
For each comparison, the mean of the ratios across replicate samples was normalized by the median and taken as the fold change between the compared samples. The minimum P value from Student's t tests of the pairwise comparisons between replicate samples was taken as the significance of the difference. Differentially expressed proteins were screened according to fold change and P value: a protein was considered significantly different when its fold change was ≥1.5 for up-regulation or ≤0.67 for down-regulation, with P < 0.05.
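The screening rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline actually used: the function names, the protein labels P1 to P4, and their ratios and P values are hypothetical, and the P values are taken as given rather than recomputed from replicate data.

```python
import math
from statistics import median

UP_THRESHOLD = 1.5     # fold change at or above this counts as up-regulated
DOWN_THRESHOLD = 0.67  # fold change at or below this counts as down-regulated
ALPHA = 0.05           # significance cutoff for the P value


def median_normalize(ratios):
    """Divide each ratio by the median so the typical protein has ratio 1."""
    m = median(ratios.values())
    return {name: value / m for name, value in ratios.items()}


def classify(fold_change, p_value):
    """Return 'up', 'down', or 'unchanged' for one protein."""
    if p_value >= ALPHA:
        return "unchanged"
    if fold_change >= UP_THRESHOLD:
        return "up"
    if fold_change <= DOWN_THRESHOLD:
        return "down"
    return "unchanged"


# Hypothetical (fold change, P value) pairs for four proteins.
example = {
    "P1": (2.10, 0.01),   # large increase, significant
    "P2": (0.50, 0.03),   # large decrease, significant
    "P3": (1.80, 0.20),   # large increase, not significant
    "P4": (1.10, 0.001),  # significant, but the change is too small
}

calls = {name: classify(fc, p) for name, (fc, p) in example.items()}
log2_fc = {name: math.log2(fc) for name, (fc, _) in example.items()}
normalized = median_normalize({"a": 1.0, "b": 2.0, "c": 4.0})

print(calls)
```

On the log2 scale the two thresholds are nearly symmetric around zero (log2 1.5 ≈ 0.585 and log2 0.67 ≈ −0.578), which is why 0.67, roughly 1/1.5, is paired with 1.5.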

RESULTS
Identification of proteins in cervical mucus
The nine samples were divided into three groups to identify differentially abundant proteins. The Ctrl group comprised samples from three healthy individuals, the EA group comprised samples from three individuals with invasive cervical adenocarcinoma, and the AIS group comprised samples from three individuals with cervical adenocarcinoma in situ. The average age of the individuals was 41 years. The clinicopathological characteristics of these patients are summarized in Table 1. A diagnosis of endocervical adenocarcinoma was suggested by the cytomorphological features of the cervical sample. The postoperative pathological diagnosis showed no lymph node metastasis, and immunohistochemical results directed the diagnosis and subsequent treatment. The corresponding pathological images of the nine samples were obtained after HE staining (Fig. 1A; Fig. S1).
A unique peptide is one that is found in only one protein, so its detection uniquely establishes the presence of the corresponding protein in the sample. The distribution of unique peptides per protein is plotted on two axes (Fig. 1B): the abscissa is the number of unique peptides contained in a protein, the left ordinate is the number of proteins corresponding to the abscissa, and the right ordinate is the cumulative proportion of proteins corresponding to the abscissa. There were 757 proteins with at least 2 unique peptides, accounting for 78.12% of the total protein. The length distribution of the identified peptides is shown in Fig. 1C. The average length of the identified peptides was 12.55 amino acids, which is a reasonable peptide length. The more peptides that support a protein, the more reliable its identification; protein coverage can therefore indirectly reflect the overall accuracy of the identification results. Pie charts were used to represent the percentage of proteins within each range of identification coverage (Fig. 1D). The identified range of proteins was 37.56% and proteins with coverage ≥20% accounted for 39.73% of the total protein. The average protein coverage was 21.90%.
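The two identification metrics discussed here, sequence coverage and the share of proteins with a minimum number of unique peptides, can be computed as sketched below. The sequence, peptides, and counts are made up for illustration, and coverage is computed by exact substring matching, which is a simplifying assumption rather than the search engine's own procedure.

```python
def sequence_coverage(protein, peptides):
    """Percentage of residues covered by at least one identified peptide."""
    if not protein:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein)
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:
            # Mark every residue spanned by this occurrence of the peptide.
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


def share_with_min_unique(unique_counts, minimum):
    """Percentage of proteins with at least `minimum` unique peptides."""
    hits = sum(1 for count in unique_counts if count >= minimum)
    return 100.0 * hits / len(unique_counts)


# Hypothetical 10-residue protein with two non-overlapping peptides.
coverage = sequence_coverage("MKTAYIAKQR", ["MKT", "IAKQ"])

# Overlapping peptides are counted once per residue, not once per peptide.
overlap_coverage = sequence_coverage("MKTAYIAKQR", ["MKTA", "TAY"])

# Hypothetical unique-peptide counts for four proteins.
share = share_with_min_unique([1, 2, 3, 1], minimum=2)

print(coverage, overlap_coverage, share)
```

Marking residues rather than summing peptide lengths keeps overlapping peptides from inflating the coverage figure.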

The analysis of differential protein expression profile
Using iTRAQ quantitative analysis with a threshold of ≥1.5 for up-regulation or ≤0.67 for down-regulation, 237, 256, and 242 of the identified proteins were significantly differentially expressed in the EA/Ctrl, AIS/Ctrl, and AIS/EA comparisons, respectively. The number of significantly different proteins in each pairwise comparison is shown in Fig. 2A. The relative quantitative analysis showed that the protein ratios followed a normal distribution pattern. The distribution of the fold changes of all quantifiable proteins is shown in Fig. 2B, where the x-coordinate represents the log2-transformed fold change. Values greater than 0 indicate up-regulated expression, while values less than 0 indicate down-regulated expression. The hierarchical clustering heat map of the 98 significantly differentially expressed proteins shared among the comparison groups visually aggregated samples/proteins with higher similarity (Fig. 2C). The data source is the log2 value of the protein abundance ratio between the two comparison groups. Quantitative information for some significantly different proteins is shown in Table 2. Some of the differentially expressed proteins were

Functional and pathway analysis of identified proteins
As more genomes are sequenced, it becomes increasingly important to analyze gene function and to predict the function of unknown genes to guide further experiments. We provide the commonly used GO and KEGG annotation results, as well as the COG annotation results, to comprehensively reflect the functions of the proteins as recorded in different databases and to reveal their biological significance in various life activities. A total of 1,118 proteins were identified and functionally annotated (Fig. 3A). The annotation information for some proteins varied among the databases due to the limitations of the background annotation library. GO, COG, and KEGG annotated 1,056 proteins, 569 proteins, and 751 proteins, respectively. GO is a community-based bioinformatics resource that supplies information about gene product function using ontologies to represent biological knowledge (Gene Ontology Consortium, 2015). GO was used to annotate 1,056 proteins, and statistical analysis was performed for the biological process, cellular component, and molecular function categories (Figs. 3B-3D). To better analyze the functions of the differentially expressed proteins, we conducted independent functional annotation analyses on the up-regulated and down-regulated proteins. The 237 differentially expressed proteins in EA/Ctrl (98 significantly up-regulated and 139 significantly down-regulated) were used as an example to compare and analyze GO annotation results. Functional annotations were also analyzed for the AIS/Ctrl and AIS/EA groups, and the results are shown as bar charts in Figs. 4A-4C.
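Because one protein can be annotated in several databases, the per-database counts (1,056, 569, and 751) overlap and need not sum to the number of annotated proteins. A minimal sketch of this overlap bookkeeping with sets follows; the accession labels and the function name are hypothetical and stand in for the real annotation tables.

```python
def overlap_summary(go, cog, kegg):
    """Count proteins by which annotation databases cover them."""
    return {
        "any": len(go | cog | kegg),          # annotated in at least one database
        "all_three": len(go & cog & kegg),    # annotated in every database
        "go_only": len(go - cog - kegg),
        "cog_only": len(cog - go - kegg),
        "kegg_only": len(kegg - go - cog),
    }


# Hypothetical accession sets for the three databases.
go_ids = {"P01", "P02", "P03", "P04"}
cog_ids = {"P02", "P05"}
kegg_ids = {"P02", "P03", "P06"}

summary = overlap_summary(go_ids, cog_ids, kegg_ids)
print(summary)
```

Here the three databases hold 4 + 2 + 3 = 9 annotations but only 6 distinct proteins, which is the same effect that makes the database totals in the text exceed the number of proteins.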

DISCUSSION
Cervical cancer is an aggressive gynecological cancer of the uterine cervix. It is the fourth most common cancer among women worldwide, with an estimated 570,000 cases and 311,000 deaths in 2018 (Bray et al., 2018; Takayanagi et al., 2019). Although the incidence of cervical squamous cell carcinoma has decreased, the incidence of cervical adenocarcinoma remains high (Vizcaino et al., 1998; Smith et al., 2000; Fan et al., 2014; Saida et al., 2019; Ouyang et al., 2020). Cervical adenocarcinoma currently accounts for up to 25% of all cases of cervical cancer in many western countries (Parkin & Bray, 2006). It is important to find novel biomarkers relevant to cervical adenocarcinoma to improve the current treatment strategies and prognosis of this disease. We identified differentially expressed proteins between subjects with cervical adenocarcinoma and those with normal cervices using the iTRAQ proteomics approach. A total of 1,118 proteins were identified in three separate trials, with 711 proteins in common. The numbers of proteins with significant differences in the comparison groups (EA/Ctrl, AIS/Ctrl, and AIS/EA) were 237, 256, and 242, respectively. We performed functional annotations, including GO, COG, and pathway analysis, on the significantly differentially expressed proteins. Expression level clustering analysis and functional enrichment analysis were performed on all significantly differentially expressed proteins to determine significant proteins. There are many studies on the proteomics of cervical cancer but few on the proteomic analysis of cervical adenocarcinoma (Kacerovsky & Tosner, 2009; Serafín-Higuera et al., 2016; Zhang et al., 2016). We performed proteomic analysis on samples of cervical adenocarcinoma mucus and normal cervical mucus and used functional annotation to screen for differentially expressed proteins relevant to immune, metabolic, cell adhesion, and other cellular processes, such as myeloperoxidase and APOA1.
Differentially expressed proteins may be used as markers for the early diagnosis of cervical adenocarcinoma; however, our study had some limitations, including its small sample size and the lack of other techniques to verify the accuracy of our analysis.
Myeloperoxidase, also known as MPO or PERM, is an important member of the heme peroxidase-cyclooxygenase superfamily. It is mainly expressed in neutrophils and monocytes (Ndrepepa, 2019). The enzyme is released into phagosomes during phagocytosis, where it catalyzes the synthesis of hypochlorous acid (HOCl) from hydrogen peroxide (H2O2) and chloride ions (Cl−) (Arnhold & Flemmig, 2010; Gomez-Mejiba et al., 2010; Rossmann et al., 2011). HOCl is a potent microbicidal agent that damages DNA, proteins, and lipids. Myeloperoxidase is associated with many cancer types, including lung, ovarian, colorectal, and prostate cancers (Arslan, Pinarbasi & Silig, 2011; Li et al., 2011; Castillo-Tong et al., 2014; Zou et al., 2018), through the −463G/A promoter polymorphism. It has been reported that the −463G/A MPO gene polymorphism is not associated with cervical intraepithelial neoplasia or susceptibility to cervical cancer, and that the GG genotype of MPO, which has higher transcriptional activity, is protective against cervical cancer (Mustea et al., 2007; Castelao et al., 2015; Natter et al., 2016). Additional studies are required to determine the role of myeloperoxidase in cervical tumorigenesis.
APOA1 belongs to the apolipoprotein A1/A4/E family and participates in lipid metabolism, including the intracellular reuse of fatty acids; it is an important carrier and cofactor (Zhang & Yang, 2018). APOA1 is the major component of high-density lipoprotein (HDL). Rhee, Byrne & Sung (2017) reported that the ratio of HDL cholesterol to APOA1 may be a risk marker for cancer mortality. APOA1 is dysfunctional in cervical squamous cell carcinoma and has been identified as a biomarker (Guo et al., 2015). We suggest that APOA1 may be a novel candidate marker for cervical adenocarcinoma; further study is needed to determine its functional mechanisms.
We collected nine cervical mucus samples and summarized their clinical information. Menopause is characterized by the depletion of estrogen secretion and the cessation of menstruation. The majority of women enter menopause between the ages of 49 and 52 (Takahashi & Johnson, 2015), and some studies show no statistically significant difference in cervical cancer rates between premenopausal and postmenopausal women (Holt et al., 2017; Zhang et al., 2018). Eight of the nine individuals in this study were premenopausal, and the postmenopausal individual was within the age range of normal menopause. All individuals met the criteria for normal menstruation and had a uterus of normal size and position; the cervix was enlarged or inflamed in all nine cases.
Our sample size was small due to variable sample collection time, long cycles, and limited scientific research funds. We intend to verify our results in the future and will investigate the mechanism of biomarkers used in the treatment of cervical adenocarcinoma.

CONCLUSION
We performed a non-targeted proteomics study to profile differentially expressed proteins in cervical adenocarcinoma. The proteins studied may serve as potential biomarkers for cervical cancer research and treatment.

ADDITIONAL INFORMATION AND DECLARATIONS
Funding
This study was supported by the National Natural Science Foundation of China (81802105, 81702561); Jiangsu Provincial Key Research and Development Program (BE2017619); and by the Nanjing Technological Development Program (201605049). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.