Computational-approach understanding the structure-function prophecy of Fibrinolytic Protease RFEA1 from Bacillus cereus RSA1

Microbial fibrinolytic proteases are therapeutic enzymes used to ameliorate thrombosis, a fatal cardiac disorder caused by excessive fibrin accumulation in blood vessels. Inadequacies of the available fibrinolytic enzymes, such as low fibrin specificity, lethal after-effects and short life-span, have stimulated an intensive search for novel, efficient and safe substitutes. We therefore propose a novel and potent fibrinolytic enzyme, RFEA1, from Bacillus cereus RSA1 (MK288105). The in-vitro purification, characterization and thrombolytic potential of RFEA1 were established in our previous study. However, knowledge of structure-function traits and mode of action significantly aids the commercialization of an enzyme. Predicting the structural model of a protein from its amino acid sequence is also challenging in computational biology, owing to the intricacy of energy functions and the vast conformational space to be explored. The present study thus reports an in-silico structural-functional analysis of RFEA1. Sequence-based modelling approaches, namely Iterative Threading ASSEmbly Refinement (I-TASSER), SWISS-MODEL, RaptorX and Protein Homology/analogY Recognition Engine V 2.0 (Phyre2), were employed to model the three-dimensional structure of RFEA1, and the modelled RFEA1 was validated with the Structural Analysis and Verification Server (SAVES v6.0). The modelled structure revealed the presence of a high-affinity Ca1 binding site, associated with hydrogen bonds at residues Asp147, Leu181, Ile185 and Val187. RFEA1 is structurally analogous to subtilisin E from Bacillus subtilis 168. Molecular docking analysis using the PATCH DOCK and FIRE DOCK servers was performed to understand the interaction of RFEA1 with its substrate, fibrin. A strong RFEA1-fibrin interaction was observed with high binding affinity (−21.36 kcal/mol), indicating significant fibrinolytic activity and specificity of RFEA1.
Overall, the computational analysis suggests that RFEA1 is a subtilisin-like serine endopeptidase with proteolytic potential, involved in thrombus hydrolysis.

Notably, anticoagulants and antiplatelet drugs such as apixaban, warfarin, dabigatran, aspirin or dipyridamole have been employed for thrombus hydrolysis, but they are highly expensive and cause undesirable after-effects such as haemorrhage, esophagitis, gastrointestinal discomfort and alopecia (Thachil, 2016; Watras, Patel & Arya, 2016; Yoshihide, Eri & Hajime, 2019). Thrombolytic therapy with microbial fibrinolytic enzymes is therefore preferred to combat thrombosis. The extensive industrial and therapeutic applicability of fibrinolytic enzymes has increased interest in understanding their mechanism of action and structure-function properties (Bora et al., 2017). Based on their catalytic mechanism, fibrinolytic enzymes are classified as serine proteases (EC 3.4.21), metalloproteases (EC 3.4.24) and serine metalloproteases (Raju & Divakar, 2014). Serine proteases cleave peptide bonds using a serine residue as the nucleophilic amino acid at the enzyme's active site (Page & Cera, 2008), whereas metalloproteases require metal ions (Zn2+, Ca2+, Co2+, Mg2+ etc.) to perform varied biological functions such as substrate recognition/binding, electron transfer and catalysis (Chen et al., 2019). Fibrinolytic enzymes of the third category, i.e., serine metalloproteases, exhibit both serine protease and metalloprotease properties (Peng, Yang & Zhang, 2005).
Additionally, the research-oriented pharmaceutical industry increasingly requires 3D structural and functional elucidation of proteins using bioinformatic molecular modelling tools (Ferreira et al., 2015). Hence, structure-based functional characterization of a protein is a preeminent goal in the biological sciences (Manjasetty et al., 2012). Biochemical and cellular processes are also mainly controlled by intermolecular protein-ligand interactions with efficient physiological substrate and inhibitor/activator specification. Such computer-aided interaction analysis is carried out with two independent three-dimensional (3D) crystallized ligand co-ordinates to obtain and examine the bound structure model (Bora et al., 2017).
In this work, we therefore investigated the structure-function attributes of fibrinolytic protease RFEA1 via an in-silico approach. Molecular docking was performed to study the RFEA1-substrate interaction. In addition, a multiple sequence alignment and sequence logos were generated to compare RFEA1 with similar sequences and to recognize conserved motifs.

Fibrinolytic protease RFEA1 production and identification
The fibrinolytic protease producing strain Bacillus cereus RSA1 (NCBI Accession number MK288105) was isolated from soil samples collected from numerous garbage dumps of Noida (U.P., India). Bergey's Manual of Determinative Bacteriology and 16S rDNA sequencing were employed for identification of the strain. The fibrinolytic protease production medium was statistically optimized in our previous study (Plackett-Burman Design and Central Composite Design using Design-Expert® 10.0.8.0, Stat-Ease Inc., Minneapolis, MN, USA) and comprised peptone (10 g/L), yeast extract (5 g/L) and glucose (0.5 g/L) with a final pH of 8.0. The cultures were incubated for 24 h at 37 °C with 1% inoculum and an agitation rate of 120 rpm. The enzyme was purified as per the standardized protocol of our previous study, employing chilled ethanol precipitation and gel filtration chromatography on a Sephadex G75 column (50 × 15 mm; Sigma Aldrich, St. Louis, MO, USA), and identified through MALDI-TOF mass spectrometry (Sharma et al., 2020). Further, the amino acid composition statistics of RFEA1 were determined using the Statistical Analysis of Protein Sequences tool (SAPS: http://www.ebi.ac.uk/Tools/seqstats/saps/). SAPS takes the FASTA-formatted amino acid sequence of an individual protein and investigates its composition, repetitive structures, charge distribution, spacing, multiplets and periodicity (Brendel et al., 1992).
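The compositional part of such an analysis can be illustrated with a minimal sketch. It is not the SAPS implementation; it only counts residues in a one-letter sequence and reports percentages. The function name and the toy peptide are assumptions made for illustration, and the peptide is not the RFEA1 sequence.

```python
from collections import Counter


def composition_percent(sequence: str) -> dict[str, float]:
    """Return the percentage of each residue in a one-letter protein sequence."""
    # Remove whitespace/newlines so multi-line FASTA bodies can be pasted directly.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    # most_common() orders residues from most to least frequent.
    return {aa: 100.0 * n / len(seq) for aa, n in counts.most_common()}


# Hypothetical toy peptide used only to exercise the function (NOT RFEA1).
toy_peptide = "MSSAAGSKVA"
for residue, percent in composition_percent(toy_peptide).items():
    print(f"{residue}: {percent:.1f}%")
```

Applied to the full RFEA1 sequence, the same counting would be expected to show serine and alanine as the most frequent residues, in line with the SAPS result reported in this study.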

Determination of structural analogs of RFEA1
TM-align (https://zhanglab.ccmb.med.umich.edu/TM-align/), an algorithm for protein structure alignment and comparison, was employed to identify structural analogs of the validated RFEA1 model in the PDB library. TM-align uses heuristic dynamic programming iterations to produce a residue-to-residue alignment for two protein structures of unknown equivalence. The algorithm reports the top 10 PDB proteins with the closest structural similarity, together with a TM-score (ranging from 0 to 1) that quantifies the degree of structural similarity. The higher the TM-score, the better the structural match (Zhang & Skolnick, 2005).
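To make the TM-score concrete, the following minimal sketch evaluates the standard TM-score expression for one fixed superposition, given the distances between aligned residue pairs. It is not TM-align itself: the real algorithm also searches for the alignment and superposition that maximize this value. The function names are illustrative assumptions, and the length guard is a simplification for short chains.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8 (in Angstrom)."""
    if target_length <= 15:
        # Simplifying assumption for this sketch: very short chains are not handled.
        raise ValueError("this sketch only handles target lengths above 15 residues")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length: int) -> float:
    """TM-score for ONE superposition.

    distances: distance (Angstrom) between each aligned residue pair.
    target_length: number of residues in the target (normalizing) protein.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Identical structures (all distances zero, all residues aligned) give 1.0.
print(tm_score([0.0] * 100, 100))
```

Because each aligned pair contributes a value between 0 and 1 and the sum is divided by the target length, unaligned residues lower the score, which is why coverage matters alongside RMSD.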

Prediction of enzyme commission numbers and gene ontology terms of RFEA1
The COFACTOR (https://zhanglab.ccmb.med.umich.edu/COFACTOR/) online meta-server was employed to obtain molecular and biological functional annotations of RFEA1. COFACTOR threads the query structure through the BioLiP protein function database to identify functional insights such as Enzyme Commission (EC) numbers and Gene Ontology (GO) terms. COFACTOR also draws detailed GO information from the UniProt-GOA and STRING databases. The structure-based function of RFEA1 was thus predicted using COFACTOR (Roy, Yang & Zhang, 2012; Zhang, Freddolino & Zhang, 2017). The server reports a C-score (range 0 to 1) for each prediction, where a higher score indicates higher reliability.

In-vitro validation
In-vitro validation and comparative analysis (with our previous research) of the bioinformatics outcome were performed to support the conclusions. The effect of calcium ions on the efficacy and thermostability of RFEA1 was studied and confirmed in-vitro. RFEA1 was purified as per the standardized protocol of our previous study, employing chilled ethanol precipitation and gel filtration chromatography on a Sephadex G75 column (50 × 15 mm; Sigma Aldrich, St. Louis, MO, USA) (Sharma et al., 2020). The enzyme was incubated for 2 h at 37 °C with different concentrations of CaCl2 (0.5-3.0 mM) and then tested for fibrin hydrolysis. The fibrinolytic activity assay was performed according to Sharma et al. (2020). In another set of experiments, 2 mM CaCl2 was used and the stability of RFEA1 was tested at different temperatures (20-80 °C). A solution without CaCl2 served as the control. All experiments were performed in triplicate and statistically analyzed. The in-silico interaction of RFEA1 with its substrate (fibrin) was compared with our previous in-vitro study.
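As an illustration of how triplicate readings can be expressed relative to a calcium-free control, the sketch below converts each treated replicate to a percentage of the control mean and reports the mean and sample standard deviation. This is one plausible way of summarizing such data and is an assumption, not the statistical procedure used in this study. The function name and all readings are hypothetical.

```python
from statistics import mean, stdev


def relative_activity(treated, control):
    """Return (mean %, sample SD %) of treated readings relative to the control mean."""
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    # Express every treated replicate as a percentage of the control mean.
    percents = [100.0 * reading / control_mean for reading in treated]
    return mean(percents), stdev(percents)


# Hypothetical triplicate readings in arbitrary activity units (NOT data from this study).
control_readings = [10.0, 10.2, 9.8]
calcium_readings = [13.0, 13.2, 12.9]

rel_mean, rel_sd = relative_activity(calcium_readings, control_readings)
print(f"{rel_mean:.2f} ± {rel_sd:.2f} % of control")
```

With this convention, a value above 100% indicates activation by the additive and a value below 100% indicates inhibition.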

Procured amino acid sequence of fibrinolytic protease RFEA1 and Statistical Analysis of Protein Sequences (SAPS)
The protein sequence of RFEA1, derived through MALDI-TOF mass spectrometric analysis in our previous study, with Mr 39,483 Da and a length of 381 amino acids, is given below (Sharma et al., 2020).

Structural modelling of RFEA1 using multiple servers
I-TASSER predicted five models with C-scores of −0.16 (Model 1), −1.33 (Model 2), −2.42 (Model 3), −3.24 (Model 4) and −0.86 (Model 5). The C-score ranges between −5 and 2, where a higher C-score indicates higher confidence in the model. Thus, Model 1, which had the highest confidence score of the five models projected by the I-TASSER online server, was selected (Fig. 1A). The analysis further revealed that RFEA1 exhibited maximum sequence identity (99%) with the crystal structure of the unautoprocessed form of the IS1-inserted Pro-subtilisin E template (IS1-ProS221A) (PDB ID: 3whi.1.A); the structural superposition of RFEA1 with the template protein (3whiA) is displayed in Fig. 1A. The estimated template modeling (TM) score (ranging between 0 and 1) is a scale for evaluating the structural resemblance between two structures, whereas the estimated root-mean-square deviation (RMSD) signifies the average distance between all residue pairs of the 3D model and its experimental structure (Zhang & Skolnick, 2004; Xu & Zhang, 2010). A higher TM-score indicates a better structural match, while a smaller RMSD value signifies a better-quality model. The TM-score of Model 1 is 0.69 ± 0.12 and the RMSD is 7.1 ± 4.1 Å; a TM-score above 0.5 is generally taken to indicate a model of correct topology. The outcome also revealed that the projected model possessed a single chain comprising ten alpha-helices and sixteen beta-strands. Fig. 1B represents structural modelling of RFEA1 using the second server, SWISS-MODEL. With a predicted GMQE of 0.83 and QMEAN of −0.82, RFEA1 showed maximum identity (99.15%) to chain A of '3whi', the pro-subtilisin E of Bacillus subtilis. The modelled RFEA1 presented a calcium binding site (Ca1) as part of its structure, and five residues (Asp147, Leu181, Asn183, Ile185 and Val187) were found within 4 Å for ligand contacts with chain A. Similarly, RFEA1 structural modelling was performed using the RaptorX server (Fig. 1C).
The tool predicted five models with estimated RMSD (Å) of 6.9122 (Model 1), 8.9796 (Model 2), 8.3941 (Model 3), 8.7051 (Model 4) and 10.419 (Model 5). Model 1, with the lowest RMSD value, was selected and further analysed. The selected model comprised strand (20.5%), alpha helix (32.8%), 3-10 helix (2.9%) and other (43.8%). Another server, Phyre2, was employed to generate an RFEA1 3D model (Fig. 1D). This server projected a model with 100% confidence and 99% identity with the template '3whiA'. Alpha helix (29%), beta strand (25%) and TM helix (4%) were present in the predicted model. Even though there is high similarity between RFEA1 and template 3whiA, analysis of the unprocessed structures through model-template alignment (Fig. S1) suggests that a large segment of 29 residues at positions 1-29 (Met1-Ala29) is lacking in the template with respect to RFEA1, while a segment of 13 residues at positions 78-90 (Gly78-Pro90) is lacking in RFEA1 with respect to the template. Furthermore, the alignment clearly indicates marked differences in the positions of amino acid residues between RFEA1 and the template.
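The selection rules described above (highest C-score for I-TASSER, lowest estimated RMSD for RaptorX) can be written down as a short sketch using the values reported in this section. The variable and function names are illustrative assumptions; the servers themselves rank their own models.

```python
# C-scores reported for the five I-TASSER models (higher = more confident).
itasser_c_scores = {1: -0.16, 2: -1.33, 3: -2.42, 4: -3.24, 5: -0.86}

# Estimated RMSD (Angstrom) reported for the five RaptorX models (lower = better).
raptorx_rmsd = {1: 6.9122, 2: 8.9796, 3: 8.3941, 4: 8.7051, 5: 10.419}

best_itasser = max(itasser_c_scores, key=itasser_c_scores.get)
best_raptorx = min(raptorx_rmsd, key=raptorx_rmsd.get)


def segment_length(first: int, last: int) -> int:
    """Number of residues in an inclusive residue range first..last."""
    return last - first + 1


print(f"I-TASSER: Model {best_itasser} (C-score {itasser_c_scores[best_itasser]})")
print(f"RaptorX:  Model {best_raptorx} (RMSD {raptorx_rmsd[best_raptorx]} A)")
print(f"Met1-Ala29 spans {segment_length(1, 29)} residues")
print(f"Gly78-Pro90 spans {segment_length(78, 90)} residues")
```

The inclusive range helper simply confirms the segment sizes quoted from the model-template alignment.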

Validation of the predicted 3D structures of RFEA1
The modelled RFEA1 structures were further validated through the online SAVES v6.0 server by examining the Ramachandran (RC) plot (Fig. 2). Table 1 presents the validation statistics, which indicate that the Phyre2 server predicted a better RFEA1 model than the other servers, with 100% of residues in the acceptable region and none (0.00%) in the disallowed region of the RC plot, an overall quality factor of 89.64, and 99.42% of residues with an averaged 3D-1D score >= 0.2. The validation scores therefore suggest that the Phyre2-modelled RFEA1 can be used for further structure-based molecular docking analysis.
Several other Bacillus enzymes (BPN' and Carlsberg) with fibrinolytic potential have been reported with Ca1 and Ca2 sites (Bryan et al., 1992; McPhalen & James, 1988).

Identification of structural analogs of RFEA1 in Protein Data Bank
The top 10 structural analogs of RFEA1 identified by TM-align are detailed in Table 3, in which the first model (Rank 1), with PDB hit 3whiA, holds the highest TM-score of 0.905. The RMSD, IDEN and Cov scores of this homolog were 0.81, 0.983 and 0.911, respectively.

In-silico interaction of RFEA1 with Fibrin
The PATCH DOCK server implements a docking algorithm based on shape complementarity principles and was employed to study molecular interactions between the fibrinolytic enzyme RFEA1 and fibrin. Figure S4 illustrates the details of the output files of the predicted RFEA1-fibrin complexes. The PATCH DOCK results were further refined and investigated for binding energy and hydrogen bonding in the complex with the FIRE DOCK server. FIRE DOCK refined the top 10 solutions (Fig. S5) and indicated that solution No. 7 of the PATCH DOCK results has the most favourable (most negative) binding energy of −21.36 kcal/mol, with a hydrogen bond contribution of −5.11. The binding energy of a protein-protein complex is governed by the contact region and interface, and determines the interaction strength. A more negative binding energy signifies a stronger interaction between the two proteins. The 3D view of the RFEA1-fibrin interaction was visualized using Chimera software (Fig. S6) and the interacting residues of RFEA1 and fibrin were examined using Ligplot (Fig. 4).
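The ranking step can be sketched as picking the refined solution with the most negative binding energy. Only the values for solution No. 7 come from this study; the other entries, the field names and the function name are hypothetical placeholders, and the sketch does not reproduce the FIRE DOCK output format.

```python
from typing import NamedTuple


class RefinedSolution(NamedTuple):
    number: int            # PATCH DOCK solution number
    binding_energy: float  # kcal/mol, more negative = stronger predicted binding
    hbond_energy: float    # hydrogen bond contribution


def best_solution(solutions):
    """Return the solution with the lowest (most negative) binding energy."""
    return min(solutions, key=lambda s: s.binding_energy)


refined = [
    RefinedSolution(3, -4.80, -1.20),    # hypothetical placeholder
    RefinedSolution(7, -21.36, -5.11),   # values reported in this study
    RefinedSolution(9, -12.05, -2.40),   # hypothetical placeholder
]

top = best_solution(refined)
print(f"Solution No. {top.number}: {top.binding_energy} kcal/mol")
```

Using `min` rather than `max` reflects the sign convention: the strongest predicted interaction is the one with the lowest energy.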

In-vitro validation
The in-vitro analysis revealed a gradual increase in the fibrinolytic activity of RFEA1 in the presence of 0.5-2.00 mM Ca2+ ions after 2 h of incubation (Table 4). An increase of approximately 30% in the activity of RFEA1 (130.87 ± 1.76%) was observed with 2 mM CaCl2. However, a slight reduction in activity (125.71 ± 1.82 and 119.98 ± 1.87%) was detected with 2.5 and 3.0 mM CaCl2. With 2 mM CaCl2, a significant increase in the stability of RFEA1 was also observed at different temperatures. An increase of 45.65, 41.25, 36.98, 30.17, 26.71, 21.12 and 10.58% in fibrinolytic activity was observed at 20, 30, 40, 50, 60, 70 and 80 °C, respectively (Table 5). Further, the in-vitro efficacy of RFEA1 was evaluated using both fibrin and mammalian blood clot as substrates in our previous study (Sharma et al., 2020). That study reported a high affinity of RFEA1 towards fibrin, with Km and Vmax values of 1.093 mg/mL and 52.39 µg/mL/min. The thrombolytic potential of RFEA1 was evaluated in comparison to a commercial thrombolytic agent, streptokinase/myokinase (Biocon, India), using mammalian blood clot. Endogenous fibrinolytic factors such as plasmin and plasminogen were deactivated by thermal treatment of the blood clots. Complete clot dissolution was observed within 4 h with both RFEA1 and streptokinase.
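To show what the reported kinetic constants imply, the sketch below evaluates the Michaelis-Menten rate law, v = Vmax·[S] / (Km + [S]), with the Km and Vmax values quoted above. It assumes RFEA1 follows simple Michaelis-Menten kinetics on fibrin; the function name and the chosen substrate concentrations are illustrative.

```python
def michaelis_menten_rate(substrate: float, km: float, vmax: float) -> float:
    """Initial reaction rate v = vmax * [S] / (km + [S])."""
    if substrate < 0 or km <= 0:
        raise ValueError("substrate must be >= 0 and km must be > 0")
    return vmax * substrate / (km + substrate)


KM_FIBRIN = 1.093    # mg/mL, reported for RFEA1 on fibrin
VMAX_FIBRIN = 52.39  # ug/mL/min, reported for RFEA1 on fibrin

# Illustrative substrate concentrations (mg/mL); at [S] = Km the rate is Vmax / 2.
for s in (0.5, KM_FIBRIN, 5.0, 20.0):
    v = michaelis_menten_rate(s, KM_FIBRIN, VMAX_FIBRIN)
    print(f"[S] = {s:6.3f} mg/mL -> v = {v:6.2f} ug/mL/min")
```

A low Km means the enzyme approaches its maximal rate at low substrate concentration, which is the sense in which Km is read as an indicator of substrate affinity.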

Identification of conserved motifs of RFEA1
Conserved domain analysis of RFEA1 was performed using the top 10 homologous enzymes identified by PSI-BLAST, through a multiple sequence alignment constructed using ClustalW (Fig. 5). The hits included UniProt SUBN_BACNA (P35835), subtilisin NAT (nattokinase; EC 3.4.21.62), with an E-value of 0. The multiple sequence alignment shows numerous fully conserved columns (score 11), indicated by an asterisk (*), and columns with mutations but conserved properties (score 10), marked with a plus (+). The analysis suggests that RFEA1 is a subtilisin-like fibrinolytic serine protease obtained from the novel Bacillus cereus RSA1, with homologs from other sources as well. In addition, sequence logos of the homologous enzymes were generated using the Weblogo server to give a clearer view of the identified conserved domains. The results indicated very low or no significant conservation at positions 1, 27, 57-62 and 70-72, whereas highly conserved domains were present at residue positions 181-188, 207-213, 303-307, 335-340 and 342-350 (Fig. 6).
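The stack heights in a sequence logo are measured in bits of information content. A minimal sketch of that quantity for a protein alignment is given below: for each column it computes log2(20) minus the Shannon entropy of the observed residues. This omits the small-sample correction that logo tools typically apply, ignores gap characters, and uses a hypothetical toy alignment; the function names are assumptions.

```python
import math
from collections import Counter


def column_information(column: str, alphabet_size: int = 20) -> float:
    """Information content (bits) of one alignment column, gaps ignored."""
    residues = [c for c in column.upper() if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Maximum is log2(20) ~ 4.32 bits for a fully conserved protein column.
    return math.log2(alphabet_size) - entropy


def alignment_information(sequences):
    """Per-column information content for equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [column_information("".join(col)) for col in zip(*sequences)]


# Hypothetical toy alignment (NOT the RFEA1 homolog alignment).
toy_alignment = ["GHGTH", "GHGSH", "GHGTH", "AHGVH"]
for position, bits in enumerate(alignment_information(toy_alignment), start=1):
    print(f"position {position}: {bits:.2f} bits")
```

Fully conserved columns reach the maximum, while variable columns score lower, which corresponds to the tall and short stacks seen in a logo.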

DISCUSSION
Few attempts have been made to explore the 3D structures and intermolecular interactions of bacterial fibrinolytic enzymes. In this investigation, we focused on the structural-functional analysis of the fibrinolytic enzyme RFEA1 obtained from Bacillus cereus RSA1. Statistical interpretation, structural depiction and ligand interactions of RFEA1 were examined to gain insight into the enzyme's attributes.
In-vitro validation and comparative analysis were performed to confirm our in-silico predictions. Statistical analysis with the SAPS webserver was used to predict the composition of RFEA1. The server predicted RFEA1 to be a serine (13.1%) and alanine (12.6%) rich protein. In Mycobacterium tuberculosis H37Rv, the product of the Rv3906c gene was reported to be rich in glycine (17.8%) and aspartate (23.7%) residues using the SAPS server (Beg, Thakur & Meena, 2018). The use of multiple structural modelling servers is preferred for the prediction of high-quality 3D protein models (Guleria et al., 2016; Lagares et al., 2020). We used I-TASSER, SWISS-MODEL, RaptorX and Phyre2 to model RFEA1, of which the Phyre2 server predicted the superior model, with 99% identity and 100% confidence with template '3whiA'. The quality and reliability of the RFEA1 model were confirmed using RC, ERRAT and Verify3D plots. Such validation parameters have been used in many scientific studies to achieve similar objectives (Beg, Thakur & Meena, 2018; Gupta et al., 2017; Manochitra & Parija, 2017). Further, the model-template alignment suggests that RFEA1 exhibits marked differences in the positions of residues and lacks a segment of 13 residues with respect to the template (3whiA). Reports also suggest that homologous enzymes might lack a few residues in spite of high identity (Herrera-Zúñiga et al., 2019). COFACTOR, a tool used in many computational studies to analyse molecular/biological annotations of proteins (Beg, Thakur & Meena, 2018; Kinyanyi et al., 2018; Naveed et al., 2018), predicted RFEA1 to be a serine protease. Our analysis further predicts one Ca2+ (Ca1) binding site in RFEA1, which is involved in the regulation of the enzyme's fibrinolytic activity and thermostability. In the presence of CaCl2, RFEA1 showed an increase of 30.87% in activity, with enhanced thermal stability (45.65, 41.25, 36.98, 30.17, 26.71, 21.12 and 10.58%) at 20, 30, 40, 50, 60, 70 and 80 °C.
Studies report that subtilisin enzymes from Bacillus sp. have two calcium binding sites (high-affinity Ca1 and low-affinity Ca2). These bound Ca ions (specifically Ca1) play an important role in thermal stability and protection against autolysis. The Ca1 site is highly conserved in several subtilisins and contributes to protein stabilization (Uehara et al., 2013; Smith et al., 1999). The fibrinolytic enzymes AprE176 and M179 from Bacillus subtilis HK176 and mutated Bacillus subtilis HK176 showed increased thermostability due to the presence of calcium binding sites. AprE176 retained 11% of its activity at 45 °C after 5 h, whereas M179 retained 36% (Jeong et al., 2015). The calcium-bound crystal structure of IS1-ProS221A (3whi) was fully folded and more stable than the calcium-free form by 13.  (Syahbanu et al., 2020), suggesting strong interaction between RFEA1 and fibrin. The in-vitro results support our in-silico findings of high fibrin affinity of RFEA1 (Km: 1.093 mg/mL and Vmax: 52.39 µg/mL/min). In addition, RFEA1 has also been assessed for its thrombolytic potential using mammalian blood clot. The clot dissolution efficacy of RFEA1 (complete dissolution within 4 h) is high when compared to current thrombolytic enzymes (Sharma et al., 2020).
The enzyme (0.2 and 0.5 µg) from Schizophyllum commune resulted in clot dissolution within 8 h (Lu & Chen, 2010), whereas orally administered nattokinase showed complete dissolution of a blood clot within 5 h in a dog model (Yogesh & Halami, 2017). The docking and in-vitro studies thus suggest that RFEA1 has higher substrate binding affinity, specificity and thrombolytic potential than previously reported fibrinolytic enzymes.
Figure 6 Sequence logo for conserved domain analysis of RFEA1 constructed using Weblogo. Each stack of symbols designates an amino acid residue. The color of the stack is displayed according to the hydrophobicity of the residues (hydrophilic residues: blue, neutral residues: green and hydrophobic residues: black). The overall height of the stack indicates the degree of conservation, whereas the symbol height within the stack signifies the relative frequency of each residue at that position. The sequence positions in the conserved domains are represented by numbers on the x-axis, whereas the y-axis denotes the information content estimated in bits. Full-size DOI: 10.7717/peerj.11570/fig-6

Conclusion
The fibrinolytic enzyme RFEA1 from Bacillus cereus RSA1, with its in-vitro thrombus hydrolysis potential, might have considerable potential for industrial and therapeutic deployment in blood clot removal and treatment of cardiovascular thrombosis, respectively. Comprehending the structural attributes of RFEA1 is therefore imperative to obtain further insights into its molecular and biological functional characteristics. Prediction of an in-silico 3D structural model is exceedingly challenging but beneficial for the examination of structure-function aspects of a protein. The presented work is thus an effort to analyse structure-based functional aspects of RFEA1. The SAPS compositional analysis presented RFEA1 as a serine (13.1%) and alanine (12.6%) rich protein with a molecular weight of 39.5 kDa. Validation statistics of the modelled structures revealed that the Phyre2 server predicted the RFEA1 model better than the other servers. Further analysis indicated the presence of a high-affinity calcium binding site in RFEA1. The in-silico molecular docking and in-vitro characterization reflect the high binding affinity (−21.36 kcal/mol) and substrate specificity of RFEA1 towards fibrin. In conclusion, this study provides insight into the structural and functional characteristics of RFEA1 and might be a significant contribution to computational analysis for the detection and identification of such fibrinolytic enzymes. Although in-vitro analyses in our previous study (Sharma et al., 2020) and the present study reported similar characteristics of RFEA1, in-vivo experiments will be essential to confirm these claims.

ADDITIONAL INFORMATION AND DECLARATIONS Funding
• Arti Nigam conceived and designed the experiments, authored or reviewed drafts of the paper, and approved the final draft.
• Rajni Singh conceived and designed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.

DNA Deposition
The following information was supplied regarding the deposition of DNA sequences: Sequences are available at NCBI: MK288105.

Data Availability
The following information was supplied regarding data availability: All the data are available as Figures and Tables in the main article.

Supplemental Information
Supplemental information for this article can be found online at http://dx.doi.org/10.7717/peerj.11570#supplemental-information.