Tuberculosis (TB) and its drug-resistant forms are still the primary causes of mortality, surpassing other infectious diseases (Dande & Samant, 2018) and emphasizing the unmet clinical need for new drugs with novel mechanisms. Owing to the indispensable and specific lipids forming the envelope of Mycobacterium tuberculosis (Dubnau et al., 2000), targeting the synthesis and transport pathways of mycolic acids has always been the main route of TB drug discovery (Bhatt et al., 2007; Brennan & Nikaido, 1995; North, Jackson & Lee, 2014; Wilson et al., 2013).
Strong evidence has established that Pks13 is an essential enzyme in the mycolic acid biosynthesis pathway (Gavalda et al., 2009; Portevin et al., 2004), and Pks13 has been extensively studied as a drug target for TB (Aggarwal et al., 2017; Thanna et al., 2016). The type-1 polyketide synthase Pks13 consists of five domains. The central three are the essential polyketide synthase domains, namely, the ketoacyl synthase (KS) domain, the acyltransferase (AT) domain and the acyl carrier protein (ACP) domain. A second ACP domain lies N-terminal to the KS domain, and the thioesterase (TE) domain forms the C-terminal portion of Pks13. The overall Pks13 domain organization therefore has the order ACP-KS-AT-ACP-TE (Fig. 1A).
The residue Ser55 in the N-ACP domain has been identified as a very important active site for initializing the pathway. The sfp gene encodes phosphopantetheinyl transferase, which modifies ACPs by providing a P-pant arm for the general function of carrying the substrate acyl chain via a thioester bond involving its terminal thiol group (Chalut et al., 2006; Gavalda et al., 2009; Wilson et al., 2013). The meromycoloyl chain on the N-ACP domain is transferred to the KS domain, and the intermediate product α-alkyl β-ketothioester is produced by a Claisen-type condensation reaction with another substrate, the carboxyacyl-CoA loaded by the AT domain. The mycolic acid precursor generated by the C-terminal ACP domain is then released by the TE domain (Abrahams & Besra, 2016; Dubey, Sirakova & Kolattukudy, 2002).
Despite increasing insights into the mechanism of Pks13, no full-length structure has been reported, although the structures of a few individual Pks13 domains have been solved (Bergeret et al., 2012; Herbst et al., 2016).
Here, we report a high-resolution structure of the core motif of the AT domain. The full-length Pks13 protein was purified and subjected to extended crystallization screening, from which the initial crystal was obtained. While attempting to phase the diffraction data, we found that the crystallized protein appeared to be a degraded fragment. The structure was then solved, and the N-terminal sequence of the crystallized protein was identified by mass spectrometry; the results agreed with the phases obtained from the Se-Met crystal dataset. These results indicated that the crystallized protein had been proteolyzed to a fragment (Ala717 to Arg826). The overall crystal structure displayed a fold similar to that of the reported AT domain (Protein Data Bank codes: 3TZW, 3TZZ), apart from several conformational changes. Structural alignment by secondary structure matching (SSM) in Coot superimposed the core motif on the AT domain with an r.m.s.d. of 1.33 Å, which was mainly attributed to the rearrangement of residues Ala796–Ser801. In addition, Ser801, which is reported to be the catalytic residue (Bergeret et al., 2012; Gavalda et al., 2009), was shifted away from the active site. Furthermore, the highly conserved Arg826 lost its hydrogen bond with the side chain of Gln773 in our structure. These features might all contribute to the unique state that survived proteolysis.
We believe that comprehensive structural studies of Pks13 will pave the way for structure-based antimycobacterial drug design and drug screening.
Materials and Methods
Cloning, over-expression, and purification
The codon-optimized gene encoding the full-length Pks13 protein from M. tuberculosis was ligated into the NdeI and XhoI sites of the pET-28b expression plasmid (Novagen, Madison, WI, USA). The sfp gene from Bacillus subtilis str.168, which encodes the P-pant transferase that modifies Ser55 in the N-ACP domain of Pks13 (Chalut et al., 2006), was ligated into the NdeI and XhoI sites of the pET-21b expression plasmid (Novagen, Madison, WI, USA), and a stop codon was added at the C-terminal end. Detailed information on these constructs is shown in Table 1. All constructed plasmids were verified by sequencing.
|Source organism||Mycobacterium tuberculosis (H37Rv)||Bacillus subtilis str.168|
|DNA source||Full-length Pks13||Sfp (P-pant transferase)|
|Expression host||E. coli BL21 (DE3)||E. coli BL21 (DE3)|
|Complete amino acid sequence of the construct produced||MADVAESQENAPAERA……IEADRTSEVGKQLE||MKIYGIYMDRPLSQEENERFMSFISPEKREKCR……PGYKMAVCAAHPDFPEDITMVSYEELL|
The constructed plasmid Pks13-pET-28b was cotransformed with sfp-pET-21b into E. coli strain BL21 (DE3). The bacteria containing these recombinant plasmids were grown at 310 K in M9 medium (6 g/L Na2HPO4, 3 g/L KH2PO4, 1 g/L NH4Cl, 0.5 g/L NaCl, and 0.4% glucose) supplemented with 0.05 g/L kanamycin and 0.1 g/L ampicillin. When the OD600 reached 0.5, the medium was supplemented with amino acids (0.1 g/L l-lysine, l-phenylalanine, and l-threonine; 0.05 g/L l-isoleucine, l-leucine, and l-valine; and 0.1 g/L l-Se-methionine). Protein overexpression was then induced with 0.3 mM IPTG at 289 K for approximately 16 h. Cells were harvested by centrifugation at 4,000 rpm for 10 min and resuspended in suspension buffer (25 mM Tris/HCl (pH 8.0), 150 mM NaCl, and 1 mM PMSF). After sonication, the cell lysate was clarified by centrifugation at 15,000 × g for 30 min. The supernatant containing the modified protein was applied to a nickel-affinity column (Ni-NTA; GE Healthcare, Little Chalfont, UK) pre-equilibrated with suspension buffer.
The resin was gradient washed with ice-cold washing buffer (25 mM Tris/HCl (pH 8.0) and 150 mM NaCl) containing 20, 30, and 40 mM imidazole, and the proteins were eluted with elution buffer (25 mM Tris/HCl pH 8.0, 150 mM NaCl, and 250 mM imidazole). Before loading onto an anion exchange column (Source Q; GE Healthcare), the eluate with 250 mM imidazole was diluted by half with buffer A (25 mM Tris/HCl (pH 8.0) and 3 mM DTT). Subsequently, the peak fractions were collected for further purification by size-exclusion chromatography (Superdex 200 10/300; GE Healthcare) in 10 mM Tris/HCl (pH 8.0) buffer containing 100 mM NaCl. The purity of the protein was determined by 12% SDS-PAGE gels stained by Coomassie brilliant blue. The eluted protein was concentrated by a 10 kDa centrifugal filter and flash-frozen in liquid nitrogen for crystallization.
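The buffer composition after the dilution step can be worked out with a simple volume-weighted average. The sketch below is only an illustration: it assumes that "diluted by half" means mixing equal volumes of eluate and buffer A, and the function name `mix` is a hypothetical helper, not part of any protocol software.

```python
def mix(components):
    """Return final concentrations (mM) after mixing solutions.

    components: list of (volume, {solute: concentration_mM}) pairs.
    Volumes may be in any unit as long as it is the same for all entries.
    """
    total_volume = sum(volume for volume, _ in components)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    solutes = set()
    for _, concentrations in components:
        solutes.update(concentrations)
    # Volume-weighted average; a solute absent from a solution counts as 0 mM.
    return {
        solute: sum(v * c.get(solute, 0.0) for v, c in components) / total_volume
        for solute in solutes
    }


# Compositions taken from the text (mM).
eluate = {"Tris/HCl": 25.0, "NaCl": 150.0, "imidazole": 250.0}
buffer_a = {"Tris/HCl": 25.0, "DTT": 3.0}

# Assumption: equal volumes of eluate and buffer A.
loaded = mix([(1.0, eluate), (1.0, buffer_a)])
for solute in sorted(loaded):
    print(f"{solute}: {loaded[solute]:.1f} mM")
```

Under the equal-volume assumption, the sample loaded onto the anion exchange column contains half the NaCl and imidazole of the eluate, while the Tris/HCl concentration is unchanged because both solutions contain 25 mM.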
The protein encoded by the constructed plasmid and labeled with Se-Met was concentrated to 12 mg/ml. Index (Hampton Research, Aliso Viejo, CA, USA) and PEG/ION (Hampton Research, Aliso Viejo, CA, USA) kits were used for the initial crystallization trials at 293 K by the sitting-drop vapor-diffusion method (Luft & Detitta, 1995). Each drop contained 1 μL of protein solution and an equal volume of reservoir solution.
The initial crystal was obtained from a solution of 300 mM KAc, pH 8.1, and 20% PEG 3,350. Further crystal optimization was performed by systematic variation of the precipitant concentration. Ultimately, the best crystals were obtained in a solution consisting of 300 mM KAc, pH 8.1, and 25% PEG 3,350. The crystals grew to full size in 10 days and were flash-frozen in liquid nitrogen with 10% glycerol added as a cryoprotectant before X-ray diffraction.
X-ray diffraction data were collected at 100 K using a Pilatus3 6M detector. All datasets were obtained at beamline BL19U1 of the Shanghai Synchrotron Radiation Facility (SSRF) (Wang et al., 2016). A total of 360 images were recorded with 0.5 s exposure at a crystal-to-detector distance of 450 mm, covering a total rotation range of 360° in 1.0° oscillation steps.
Protein N-terminal sequence based on mass spectrometry
Regarding the dataset of the crystallized Pks13, the initial trial did not yield a structure containing all of the residues because the electron density was insufficient for many of them. After X-ray diffraction, the crystals were pooled and analyzed by SDS-PAGE with Coomassie brilliant blue staining. The gel, which showed a single low-molecular-weight band, was processed by standard in-gel digestion for mass spectrometric characterization to identify the actual location of the degraded fragment within Pks13 (Shevchenko et al., 2006).
All datasets were processed with HKL-2000 (Brodersen et al., 2006). The crystal structure of the motif was solved by single-wavelength anomalous dispersion (SAD) phasing using the anomalous data collected from the Se-Met crystal. The final model was manually built in Coot (Emsley et al., 2010) and refined in PHENIX (Adams et al., 2010). The final model was validated with MolProbity and deposited in the Protein Data Bank (PDB code 5XUO).
Purification and crystallization of Pks13
The full-length Pks13 protein was successfully overexpressed in E. coli BL21 (DE3), and the initial crystal condition (300 mM KAc, pH 8.1, and 20% PEG 3,350) was screened. The mature lump-like crystals were optimized after a series of crystal optimization experiments, including crystallization with different detergents and additives.
X-ray diffraction datasets for the Se-Met-labeled crystals were obtained at beamline BL19U1 of the Synchrotron Radiation Facility in Shanghai with a wavelength of 0.97852 Å. Diffraction images for the crystals were processed using HKL-2000.
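For orientation, the data-collection wavelength can be converted to a photon energy with E = hc/λ. The short sketch below is an illustrative calculation, not part of the original processing pipeline; the function name is hypothetical. It shows that 0.97852 Å corresponds to roughly 12.67 keV, close to the selenium K absorption edge (about 12.66 keV), which is consistent with the use of Se-Met crystals for SAD phasing.

```python
# Exact SI defining constants (2019 redefinition).
PLANCK_J_S = 6.62607015e-34          # Planck constant, J s
LIGHT_M_PER_S = 299_792_458.0        # speed of light, m/s
ELEMENTARY_CHARGE_C = 1.602176634e-19  # J per eV


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Convert an X-ray wavelength in angstroms to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_angstrom * 1e-10
    energy_joule = PLANCK_J_S * LIGHT_M_PER_S / wavelength_m
    return energy_joule / ELEMENTARY_CHARGE_C / 1000.0


# Wavelength reported for the Se-Met datasets.
energy_kev = wavelength_to_energy_kev(0.97852)
print(f"{energy_kev:.2f} keV")
```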
Protein N-terminal sequence
The prepared gel band was digested with trypsin, and the digest was purified and freeze-dried into peptide powder. The peptides were then analyzed by LC-MS/MS on an Orbitrap Elite. The sequenced peptides were aligned against the full-length Pks13 sequence, and the crystallized fragment was located in the range from Ala717 to Arg826 (Table 2).
This value essentially operates as a p-value, where smaller is better.
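The idea of locating a fragment from its identified peptides can be sketched as a simple string search. The example below is a minimal illustration only: the sequences are made-up toy strings rather than Pks13, the digestion rule is the simplified "cleave after K or R unless followed by P" convention, and the helper names are hypothetical. Real search engines also score spectra and handle missed cleavages and modifications.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def map_peptides(parent, peptides):
    """Return the (first, last) 1-based residue numbers covered by the
    peptides found in parent, or None if no peptide matches.

    Only the first occurrence of each peptide is used, so very short
    peptides may map ambiguously.
    """
    spans = []
    for peptide in peptides:
        index = parent.find(peptide)
        if index >= 0:
            spans.append((index + 1, index + len(peptide)))
    if not spans:
        return None
    return min(s for s, _ in spans), max(e for _, e in spans)


# Toy parent sequence and toy "observed" peptides (not real Pks13 data).
toy_parent = "MKTAYIAKQRLPEGSWVKNDFR"
toy_observed = ["QR", "LPEGSWVK"]

all_peptides = tryptic_digest(toy_parent)
fragment_range = map_peptides(toy_parent, toy_observed)
print(all_peptides)
print(fragment_range)
```

The outermost residues covered by the matched peptides give a lower bound on the extent of the fragment; the true termini may lie beyond them if terminal peptides were not detected.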
The crystal belonged to space group R32, with unit-cell parameters of a = b = 93.694 Å, c = 97.908 Å, α = β = 90°, and γ = 120°. The phases were determined by the SAD method. The final model was manually built in Coot and refined in PHENIX to an Rfree of 26.05% with good stereochemistry. The collected and processed data are presented in Table 3.
|Data set||Core motif of AT domain|
|X-ray source||SSRF BEAMLINE BL19U1|
|Resolution range (Å)||50–2.587|
|Total no. of reflections||90,229|
|No. of unique reflections||5,358(524)|
|Rmsd angles (°)||0.540|
|PDB accession code||5XUO|
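As a quick consistency check on the reported unit-cell parameters, the cell volume can be computed from the general triclinic formula, which reduces to V = a²c·sin γ for the hexagonal setting used here. The sketch below is illustrative only and the function name is hypothetical.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a unit cell (cubic angstroms) from lengths in angstroms
    and angles in degrees, using the general triclinic formula."""
    cos_a = math.cos(math.radians(alpha_deg))
    cos_b = math.cos(math.radians(beta_deg))
    cos_g = math.cos(math.radians(gamma_deg))
    factor = 1.0 - cos_a**2 - cos_b**2 - cos_g**2 + 2.0 * cos_a * cos_b * cos_g
    if factor <= 0:
        raise ValueError("angles do not describe a valid unit cell")
    return a * b * c * math.sqrt(factor)


# Parameters reported in the text for the R32 crystal (hexagonal axes).
volume = unit_cell_volume(93.694, 93.694, 97.908, 90.0, 90.0, 120.0)
print(f"{volume:.0f} cubic angstroms")
```

With the values above, the volume comes to approximately 7.4 × 10^5 Å³.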
Overall architecture and Superimposition with AT domain
The overall structure of the core motif contains one long α helix, five short α helices and two short η turns, in the order α1-α2-α3-α4-η1-η2-α5-α6, which together constitute a compact motif (Fig. 1B). The long α helix, α4, lies in the middle and is surrounded by the five short α helices and the two short η turns (Fig. 2A). Superimposition with the reported structure of the AT domain (PDB code 3TZZ) (Bergeret et al., 2012) suggested that the core motif is located in the central region of the AT domain (Fig. 2B). The crystallized core motif, ranging from Ala717 to Arg826, represents approximately one-third of the AT domain, and the overall crystal structure displays a fold similar to that of the reported AT domain (Fig. 2C).
Although sequence alignment showed 100% identity between the core motif and the AT domain, the secondary structure elements differed slightly from residues Ala796 to Ser801, where refinement indicated two η turns instead of the β strand (highlighted by the red dashed box in Fig. 3A). According to the structure reported by Bergeret et al. (2012), the AT domain contains a parallel six-stranded β-sheet (β13-β12-β4-β5-β10-β11) adjacent to the active site, whereas only the region corresponding to the central β strand, β5, was present in the motif structure, where it refined as a completely different secondary element (Figs. 3B and 3C). Previous studies suggested that the conserved Ser801 and Arg826 serve as the catalytic residue and a binding site, respectively. The active-site Ser801 of the AT domain is located in the nucleophilic elbow between β5 and helix α10 and could directly contact the lipid substrate. Additionally, the active site forms part of the highly conserved consensus sequence Gly-X-Ser-X-Gly that stabilizes the β5 strand shape (Bergeret et al., 2012; Serre et al., 1995). In our structure, residues Ala796 to Ser801 adopted two relatively disordered η turns, and Ser801 was displaced away from the substrate position. Furthermore, the side chain of the binding-site residue Arg826 was stretched to the opposite side of α5 and lost its interaction with Gln773. In the AT domain, by contrast, this side chain forms direct hydrogen bonds with the negatively charged side chain of the lipid substrate and is held in position through a strong hydrogen bond with the side chain of Gln773 (Fig. 4A).
The structural alignment performed by SSM in Coot (Emsley & Cowtan, 2004) also showed that the superimposition of the core motif and the AT domain had an r.m.s.d. of 1.33 Å; along with the conformational changes, this alignment might also suggest a more compact crystal packing state than that of the AT domain. According to a close view of the superimposition, some particular apolar contacts among α1 (His723, Leu730) and the long α4 (Gln773, Ile779, Gln780, and Leu783) and α5 (Ile823) residues all contribute to the stabilization of the unique state (Fig. 4B). Electrostatic calculations of the AT domain (Protein Data Bank code 3TZZ) revealed the presence of an electropositive area corresponding to the floor of the active site cavity due to the presence of Ser801 and Arg826 (Fig. 4C). Comparison of the electrostatic potential surface presentation of the motif indicated that the surface of the active site cavity was transformed to an electronegative state (Fig. 4D).
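For readers unfamiliar with the metric, the r.m.s.d. quoted above is the root of the mean squared distance between matched atom pairs after superposition. The sketch below shows only that final calculation on toy coordinates. It assumes the two coordinate lists are already superposed and matched pair by pair, and it does not reproduce the residue matching or the superposition performed by SSM in Coot; the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) coordinates that are already superposed and paired."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy coordinates in angstroms (not taken from 5XUO or 3TZZ):
# every atom in the second set is displaced by 1 angstrom along z.
toy_motif = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_domain = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

toy_rmsd = rmsd(toy_motif, toy_domain)
print(f"{toy_rmsd:.2f}")
```

Because the deviations are squared before averaging, a local rearrangement such as that of residues Ala796 to Ser801 can dominate the overall value even when the remaining helices superimpose closely.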
The synthesis and transport pathways of mycolic acids in M. tuberculosis have always been critical drug targets. Mycolic acids form the primary defensive barrier of the cell, giving the envelope its low permeability to many hydrophilic molecules. Many biochemical and structural studies have sought to elucidate the participation of Pks13 in the synthesis of the lipid complex. Obtaining the structure of Pks13 is of great significance for drug screening, as many inhibitors have been reported to target Pks13 or its individual domains.
The structure of the AT domain fragment provides a new perspective on a unique state that can evade proteolysis. We have determined the 2.59 Å crystal structure of a partial AT domain from the M. tuberculosis Pks13 protein. The overall structure of the core motif of the AT domain is similar to the corresponding part of the reported AT domain, with slight conformational differences. Some conserved residues showed a completely different secondary structure. Residues Ala796, Val797, Ile798, Gly799, Gln800, and Ser801 formed a β strand in the previously reported AT domain (PDB codes 3TZW and 3TZZ) but refined as a flexible loop in the motif structure. The whole AT domain typically contains a palm-shaped parallel six-stranded β sheet, in which β5 is located in the middle and connects with the other five β strands. In our structure, by contrast, the β-sheet was disrupted and the connections among these β strands were lost owing to the conformational changes. Given the conformational changes from Ala796 to Ser801, the AT domain fold is unlikely to have been retained, which is consistent with the speculation that the conformational changes are a means of evading proteolysis. With the structural alignment performed by SSM in Coot, the superimposition of the core motif and the AT domain shows an r.m.s.d. of 1.33 Å. The newly packed structure formed by these helical bundles seems tighter than the AT domain, which is especially reflected in the apolar contacts among α1 (His723, Leu730), the long α4 (Gln773, Ile779, Gln780, and Leu783) and α5 (Ile823) residues. These apolar contacts might strengthen the interactions of α4 with the other helices to form a more stable packing state.
Additionally, the active site Ser801, which plays a critical role in catalytic activity, was displaced from the substrate cavity to an inner position of the core motif. The nucleophilic elbow of α10 and β5 also changed from an electropositive state to an electronegative state, which suggests a state unsuitable for binding a substrate. In summary, the conformational change of residues Ala796 to Ser801 and the rearrangement of residues Gln773, Ser801, and Arg826 might all suggest that the degraded fragment formed a unique crystal packing state to survive proteolysis. In other words, under such conditions the fragment forms a relatively stable state compared with the AT domain. This work might provide new insight into the core motif of the AT domain and a structural basis for protein engineering.
However, the overall structure of Pks13 remains undetermined, and its mechanism is not yet fully understood. More work is needed, and we hope that the present work will provide some assistance.