Light chain amyloidosis (AL) is the most common form of systemic amyloidosis and originates from plasma cell overproliferation. This lethal disease is primarily characterized by an overproduction of immunoglobulin light chains (LC), followed by pathological deposition of amyloid fibrils in the extracellular space of vital organs, causing organ dysfunction. Non-enzymatic post-translational modifications (PTMs) can profoundly affect protein properties and have been shown to contribute to the pathogenesis of several protein misfolding diseases. However, little is known about the effects of PTMs on LC amyloidogenicity. Here, we investigated the impact of oxidative PTMs, particularly carbonylation by hydroxynonenal (HNE), oxidation and nitration, on the structure, thermodynamic stability and aggregation of Wil, an LC variable domain of the λ6 germline. To this end, we first identified the residues that are prone to oxidative chemical modifications by LC-MS/MS analysis performed after pepsin digestion. Subsequently, we observed that HNE-carbonylation at specific His residues and nitration of specific Tyr side chains modulate the propensity of Wil to self-assemble and to form ThT-positive fibrillar aggregates. Nitration appears to accelerate the formation of aggregates with low cross-β-sheet quaternary structure. This effect was associated with a decrease in thermodynamic stability. In contrast, HNE-conjugation on specific His imidazole groups did not affect the structural stability, although it altered the conformational conversion driving the aggregation process. Oxidation of Wil had no observable effect on its aggregation or structural stability. Thus, both the thermodynamic stability and the physicochemical and structural properties have to be considered concomitantly when evaluating the amyloidogenic propensity of an LC variable domain in the context of AL.
Natural and designed proteins often possess marginal stability. This limits yield and shelf-life, while reducing the activity and usability of biocatalysts and increasing the likelihood of immunogenicity in biotherapeutics. Computational tools predicting stabilizing point mutations, which can be fast and exhaustive, have been much sought after. Over the last decade a plethora of such tools have been reported. Accuracy is typically claimed to be >80% when tested against known mutations. However, later real-world application of these tools to stabilize proteins shows poor success rates of ~25%. Through a detailed analysis we find that many commonly reported performance metrics can be misleading, with the best tools recognizing stabilizing mutations about half the time. Additionally, datasets used for testing poorly reflect the mutations desired in biotechnology applications. To support future developments we provide guidelines for robust performance metrics and highlight the current tools and approaches that give protein engineers the best outcome.
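As an illustration of how a headline accuracy figure can mislead, the minimal Python sketch below computes several metrics from a single confusion matrix. The counts are hypothetical, chosen only to mimic a test set dominated by non-stabilizing mutations; they are not taken from any benchmark or tool discussed here.

```python
import math

def metrics(tp, fp, tn, fn):
    """Binary-classification metrics for a 'stabilizing' vs 'not stabilizing' predictor."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total          # fraction of all calls that are right
    sensitivity = tp / (tp + fn)          # stabilizing mutations that are recognized
    precision = tp / (tp + fp)            # predicted-stabilizing calls that are right
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0  # Matthews correlation
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "mcc": mcc,
    }

# Hypothetical test set: 1000 mutations, of which only 100 are truly stabilizing.
example = metrics(tp=50, fp=150, tn=750, fn=50)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts the accuracy is 0.80, yet only a quarter of the mutations predicted to be stabilizing actually are, which is the quantity a protein engineer cares about when choosing variants to build.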
The Golden Gate strategy is a novel molecular biology approach that has been used with success in the past mainly for synthetic biology applications.1,2 The method entails the use of type IIS restriction enzymes, which cut outside of their recognition sequence, allowing for the design of unique fragments that can readily and seamlessly be recombined.
In our work, we demonstrate that this property can be exploited in biocatalysis as a convenient tool for facilitating the directed evolution of enzymes. In particular, a combinatorial approach is adopted by disassembling the enzymes into their constituent domains, which are then separately mutated and easily re-assembled at a later stage.
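The way cutting outside the recognition sequence yields designable junctions can be sketched in a few lines of Python. The sketch assumes BsaI (recognition site GGTCTC, cutting one nucleotide downstream on the top strand and leaving a 4-nt overhang) purely as an example; the abstract does not name the enzyme used, and the sequences, overhangs and domain names below are hypothetical.

```python
BSAI_SITE = "GGTCTC"  # assumption: BsaI is used only as a familiar type IIS example

def top_strand_overhang(seq, site=BSAI_SITE, offset=1, overhang_len=4):
    """Return the overhang left downstream of the first forward-oriented site, or None."""
    i = seq.find(site)
    if i == -1:
        return None
    start = i + len(site) + offset        # skip the spacer nucleotide(s)
    end = start + overhang_len
    if end > len(seq):
        return None
    return seq[start:end]

def assembly_order(parts, start):
    """Chain parts by matching each right overhang to the next part's left overhang."""
    by_left = {left: (name, right) for name, (left, right) in parts.items()}
    order, current = [], start
    while current in by_left:
        name, right = by_left.pop(current)  # pop so that a cycle cannot loop forever
        order.append(name)
        current = right
    return order

# Hypothetical fragment: site, one spacer base, then a user-chosen 4-nt overhang.
overhang = top_strand_overhang("AAGGTCTCAAATGCCC")

# Hypothetical domain fragments described by their (left, right) overhangs.
parts = {
    "domain_2": ("AGGT", "TTCG"),
    "domain_1": ("AATG", "AGGT"),
    "domain_3": ("TTCG", "GCTT"),
}
order = assembly_order(parts, start="AATG")
print(overhang, order)
```

Because the overhang lies outside the recognition site, its sequence is free to be chosen, so each junction can be made unique and separately mutated domains can only re-assemble in the intended order.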
In our work, Candida antarctica lipase A (Cal-A) was used as a model enzyme. Cal-A has the potential to be applied in the food industry for the hydrolysis of saturated short-chain fatty acids (C4 and C6) vs. saturated C10-C16 fatty acids (a very useful tool for the dairy industry).3
The characteristics that make this enzyme a good candidate for our strategy are: (i) the availability of a crystal structure, (ii) the possibility to easily subdivide the enzyme sequence into three distinct domains and (iii) the ability to screen its enzymatic activity spectrophotometrically using commercially available p-nitrophenyl-derivatized fatty acids.
Our evolution protocol also makes use of molecular dynamics simulations to guide our choice of hot spots for mutations.
Furthermore, faster and better screening of the mutants was achieved through laboratory automation, with the development of a high-throughput screening on agar plates and a subsequent medium-throughput screening in 96-well plates.
1. Engler, C., Gruetzner, R., Kandzia, R. & Marillonnet, S. Golden gate shuffling: a one-pot DNA shuffling method based on type IIs restriction enzymes. PLoS ONE 4, e5553 (2009).
2. Kirchmaier, S., Lust, K. & Wittbrodt, J. Golden GATEway cloning--a combinatorial approach to generate fusion and recombination constructs. PLoS ONE 8, e76117 (2013).
3. Nyyssölä, A. et al. Treatment of milk fat with sn-2 specific Pseudozyma antarctica lipase A for targeted hydrolysis of saturated medium and long-chain fatty acids. International Dairy Journal 41, 16–22 (2015).
Protein engineering towards the development of protein switches requires evolution in opposing directions, producing proteins that are highly active in one state and inactive in the other. Numerous assays exist to aid in evolving protein function; however, these assays are of limited use in the development of condition-dependent protein function, as they do not control for absence of function in the off-state. To enable bi-directional evolution of conditional function, we have engineered an in vivo assay with a selection parameter that can be tuned, allowing for enrichment of the on- or off-state as required by external conditions. This assay combines principles from two previously established in vivo assays: the ‘hitchhiker’ assay1,2, a selection assay for protein-protein interactions based on the production of beta-lactamase; and the ‘band-pass’ assay3, an externally tuneable bacterial selection system for beta-lactamase activity. The resulting genetic circuit enables one to select for cells with specific levels of beta-lactamase activity, which correspond to different protein-protein interaction strengths. We demonstrated the function of this assay using designed leucine zipper peptides with a range of interaction strengths, comparing the results with data from in vitro characterization of the peptides. We will use this assay in the development of a synthetic photo-controlled protein, using the ability of the assay to select for cells with the protein switch in the active or inactive state, depending on the experimental setup. It is anticipated that this assay could be used to facilitate the directed evolution of a wide variety of protein switches involving protein-protein interactions.
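The idea of a tunable selection window can be caricatured with a toy Python model. It treats survival as a simple activity window and is not a model of the genetic circuit itself; the clone names, activity values and thresholds are hypothetical and in arbitrary units.

```python
def band_pass_select(activities, low, high):
    """Return the clones whose reporter activity lies inside the [low, high] window."""
    return sorted(name for name, a in activities.items() if low <= a <= high)

# Hypothetical clones: reporter activity is taken to track interaction strength.
clones = {"weak_pair": 5.0, "medium_pair": 40.0, "strong_pair": 90.0}

# Moving the window enriches either the interacting or the non-interacting state.
on_state = band_pass_select(clones, low=60.0, high=100.0)
off_state = band_pass_select(clones, low=0.0, high=10.0)
print(on_state, off_state)
```

Shifting the window between rounds is what would allow the same population to be enriched first for the on-state and then for the off-state.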
1. Strauch, E. & Georgiou, G. A bacterial two-hybrid system based on the twin-arginine transporter pathway of E. coli. Protein Sci. 16, 1001–1008 (2007).
2. Waraho, D. & DeLisa, M. P. Versatile selection technology for intracellular protein-protein interactions mediated by a unique bacterial hitchhiker transport mechanism. Proc. Natl. Acad. Sci. U. S. A. 106, 3692–7 (2009).
3. Sohka, T. et al. An externally tunable bacterial band-pass filter. Proc. Natl. Acad. Sci. U. S. A. 106, 10135–40 (2009).
Antibodies are tremendously useful for biotechnological applications, diagnostics and therapy. However, their complex architecture has spurred interest in smaller derivatives such as Fab and scFv that can retain the targeting specificity and be more easily produced. We have constructed two highly diverse (>1E10) libraries based on an autonomous human variable heavy (VH) domain. This scaffold was generated by comprehensive mutational analysis of residues in the former light chain interface to identify structurally compatible hydrophilic substitutions that promote autonomous behavior. We integrated a CDRH1 design biased towards Asp, aimed at alleviating the aggregation problems that are commonly associated with human domain antibodies.
The libraries have been used to select binders to all 14 human Eph receptors, many of which play roles in cancer. Our aim is to use these binders to investigate effects of blocking or activation of specific Eph receptor homo- or heterodimers. In contrast to Fab fragments raised against the same antigens, the domain antibodies typically bind the ligand-binding domain and compete with ligand for binding. We have solved the structure of one EphA1-binder and propose a model for ligand blocking. Furthermore, we have analysed the influence of CDRH1 charge in a panel of EphA1 binders and also expanded this strategy to CDRH3 to enable selection of heat tolerant clones by phage display.
Moreover, binders to an intracellular GTPase implicated in Ras-dependent pancreatic cancer have been isolated and screened for potential inhibition of assembly of a signaling complex that activates Ras/MAPK. The VH format may enable intracellular delivery to inhibit Ras-driven tumorigenic signaling.
Protein function is determined by the combined contributions of many weak interactions that form and stabilize the native three-dimensional structure. The ability to return to this conformation after partial or complete denaturation, or resistance to proteolysis by enzymes or chemicals, are hallmarks of protein resilience. Intein-mediated ligation of the N- and C-terminal ends of proteins in vivo or in vitro is gaining attention as a method of protein engineering to improve these traits. Here, we report N- and C-terminal end ligation of a small, thermolabile antifreeze protein (AFP) using the Npu dnaE split inteins. Before construct synthesis and purification, computational modeling and molecular dynamics simulations were used to assess a number of different extein spacer sequences for their theoretical improvement of the entropic properties of the AFP polypeptide. The top-performing candidate was then expressed in bacteria, and the circular AFP protein was isolated from the spliced intein complex. Gratifyingly, this circular AFP showed improved thermostability compared to the wild type as measured by thermal hysteresis activity. The protein also readily recovered a folded conformation and thermal hysteresis activity following heat denaturation. These studies indicate that circular protein engineering expands possibilities for protein use in non-natural environments, such as in industrial settings, and that selection of the correct N- and C-terminal end-joining sequence is important for achieving a positive effect on protein stability.
Supported by CIHR
The 7-kDa type III antifreeze protein (AFP) fused to the much larger (42 kDa) maltose-binding protein was originally used to show that AFPs act individually at the ice surface rather than as an aggregate. Unions of this type led to small increases in antifreeze activity in proportion to the size of the fusion protein. Much larger increases in antifreeze activity have been achieved by lengthening the ice-binding site (IBS) or doubling the number of ice-binding sites that can simultaneously interact with the ice surface. To further increase the activity of AFPs we have linked them to a dendrimer to significantly increase both the size and the number of ice-binding sites. Using a heterobifunctional cross-linker we have been able to attach multiple type III AFPs (6-13) to a second-generation polyamidoamine (G2-PAMAM) dendrimer with 16 reactive termini. The heterogeneous sample of dendrimer-linked type III AFP constructs showed a greater than four-fold increase in antifreeze and ice-recrystallization inhibition activity over monomeric type III AFP. Additionally, attachment of type III AFP to the dendrimer has afforded the AFP an increase in stability in the form of recovery from heat denaturation. Linking AFPs and mixtures of different AFP types together via a dendrimer or other polymers has the potential to generate novel reagents for controlling ice growth, and for exploring the relationship between antifreeze activity, ice recrystallization inhibition (IRI), and ice nucleation.
IB acknowledges grants from the Israel Science Foundation and the European Research Council. PLD holds a Canada Research Chair in Protein Engineering and acknowledges research funding from the Canadian Institutes for Health Research, Natural Science and Engineering Canada, as well as a visiting professorship to HUJI, Rehovot, from the Lady Davis Fellowship Trust.
Antifreeze proteins (AFPs) are capable of binding to and inhibiting the growth of ice crystals, which depresses the freezing point of the ice-containing solution. They also inhibit ice recrystallization, which occurs as water migrates over time from small to large ice crystals. AFPs have previously been connected together through conjugation to a variety of branched molecules, which resulted in an increase in activity of the AFP compared to the corresponding monomer. These conjugation reactions have varying levels of efficiency, however, and can result in a heterogeneous product that lacks complete occupancy of the reactive termini. Self-assembling protein cages represent an alternative method to incorporate multiple AFPs into a single structure, and are composed of multiple copies of one or more subunits that bind through non-covalent interactions to form a multimeric structure. AFP multimers were produced by genetically fusing the 7-kDa type III AFP from the ocean pout (Macrozoarces americanus) or the 25-kDa SfIBP from an Antarctic bacterium (Shewanella frigidimarina) to the C termini of a 24-subunit protein cage. These AFP multimers exhibited greater freezing point depression compared to the monomeric AFP across a range of concentrations, in addition to greater ice recrystallization inhibition. AFPs have previously been used to combat ice recrystallization in frozen foods, and their multimerization using protein cages would be advantageous over some other scaffolds because the entire complex would be edible.
Funded by CIHR
The rise of antibiotic resistance is an emerging health crisis due to the speed at which it is developing and its economic and clinical repercussions. The appearance of new enzymatic activities within bacterial cells is one of the most common causes of antibiotic resistance. Therefore, assessing the capacity of an enzyme to evolve towards new activities, its innovability, is important to understand and counteract this issue.
In the present study, we aim to explore the capacity of the primitive enzyme R67 dihydrofolate reductase (R67 DHFR), which confers resistance to the commonly prescribed antibiotic trimethoprim, to develop new activities.
A site-directed saturation mutagenesis library targeting the residues involved in binding and catalysis in R67 DHFR has been screened against several types of antibiotics. Differential survival of clones allowed the identification of a variant that confers weak resistance to tetracycline, an antibiotic chemically unrelated to trimethoprim.
Characterization of this new variant will help to understand the evolution of this primitive protein and its potential as a source of multi-drug resistance.
To assist affinity maturation of therapeutic antibodies we have developed a platform, ADAPT, that combines binding affinity predictions with stepwise experimental validation. Starting from the crystal structure of an antibody-antigen complex, an efficient workflow intertwines computational predictions with experimental validation from single-point to quadruple mutants. Examples of employing ADAPT for maturation of multiple antibodies will be presented.
Iron is an essential nutrient for most bacteria; however, ferric iron (Fe3+) is scarce under aerobic conditions. In mammalian cells, most Fe3+ is bound to proteins such as transferrin. To scavenge ferric iron from the extracellular environment, many bacteria produce and secrete small iron-chelating molecules known as siderophores. By taking up iron-siderophore complexes, bacteria can survive and proliferate in low-iron environments.
Enterobactin is a catecholate-type siderophore that is synthesized in the E. coli cytoplasm by seven enzymes (EntA-F and EntH). To investigate Ent protein complex formation in the enterobactin biosynthetic pathway, the Bacterial Adenylate Cyclase Two-Hybrid (BACTH) system was employed to map in vivo Ent protein interactions. Using biophysical techniques, our research group had previously reported an interaction between EntA and EntE. In the current study, we employed the BACTH approach to elucidate subunit orientation within the EntA-EntE complex. These studies were complemented and validated by immunoblotting techniques and computational docking. Finally, follow-up studies will be performed to study the effects of disrupting the protein interaction interface by site-directed mutagenesis.
The rhomboid family of intramembrane serine proteases has a unique ability to catalyze proteolysis below the surface of the cell membrane that is required in a diverse array of biological processes, including parasitic host cell invasion and growth factor signalling. High-resolution structures of the E. coli GlpG rhomboid indicate that the active site is sequestered away from the membrane environment by two transmembrane α-helices (TM2, TM5). It has been proposed that these helices act as a lateral gate for substrate entry, potentially representing a key control point for proteolysis. Here, we assessed the effect of several point mutations designed to destabilize the gate and increase gate dynamics. In general, these alterations enhanced the activity of the catalytic transmembrane domain core of GlpG (TMD) against a model transmembrane substrate in dodecylphosphocholine, in line with previous observations made using full-length GlpG in both detergents and phospholipid membranes. However, when the activity of these samples was tested against a water-soluble model substrate, this enhancement was not retained for some of these mutants, in one case actually reducing activity relative to wild-type TMD. This suggests that, in addition to substrate gating, proposed gate residues are also required for key interactions that stabilize the catalytic core structure. In agreement with this hypothesis, solution-state NMR and circular dichroism studies on these loss-of-function mutants suggest a disruption to both the structure and dynamics of rhomboid protease beyond the gate region. Taken together, these results describe an expanded role for the gate region and substrate binding in proteolysis.
A broad range of novel engineering methodologies has emerged over the last 40 years. Laboratory evolution is now a state-of-the-art tool that is frequently used to tailor proteins towards a desired property. Mutagenesis and subsequent screening or selection has facilitated the improvement of a wide variety of protein features, including thermostability, catalytic efficiency, and substrate specificity. Recently, the increasing amount of available sequence, structure and biochemical data has enabled the adoption of more sophisticated approaches that combine the wealth of available biochemical data with the vast amount of unexplored sequences available in public databases.
In our recent work, we have adopted the use of sequence similarity networks (SSNs) to harness the full potential of collective biochemical and bioinformatic data. SSNs enable the simultaneous visualization of sequence relationships of every protein within large protein superfamilies, and furthermore, they allow multiple levels of information to be observed simultaneously. For example, networks can be overlaid with structural or functional information, which facilitates the prediction of functional clusters, highlights unexplored sequence space and enables specific profiling of protein attributes. We highlight the power of this new approach by presenting the comprehensive analysis of the FMN-dependent nitroreductase superfamily. (1) We reveal the wealth of unknown sequence space, even in a “well-characterized” enzyme superfamily. The nitroreductase superfamily has historically been categorized into two broad subgroups based upon cofactor usage. We uncover, however, 23 distinct subgroups, including eight with no currently known function. (2) We expose the evolutionary trajectory that gave rise to a catalytically distinct, and mutually exclusive, reaction mechanism. (3) Novel cofactor requirements were revealed by residue profiling and enabled the prediction and validation of unusual cofactor requirements within selected subgroups. (4) Key structural “hotspots” were found that determine substrate specificity and catalytic activity. We connect structural modifications to specialized function and demonstrate their relevance to biotechnological applications.
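The core of a sequence similarity network can be sketched in Python as nodes joined whenever a pairwise similarity score passes a threshold, with clusters read off as connected components. This is a minimal sketch under strong simplifying assumptions: similarity is plain fractional identity over pre-aligned, equal-length toy sequences (real SSNs are usually built from pairwise alignment scores), and all sequence names, sequences and the 0.8 threshold are hypothetical.

```python
from itertools import combinations

def identity(a, b):
    """Fraction of identical positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def build_network(seqs, threshold):
    """Adjacency sets: an edge joins two sequences whose identity meets the threshold."""
    edges = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if identity(seqs[u], seqs[v]) >= threshold:
            edges[u].add(v)
            edges[v].add(u)
    return edges

def clusters(edges):
    """Connected components of the network, found by depth-first search."""
    seen, result = set(), []
    for node in edges:
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(edges[current] - component)
        seen |= component
        result.append(sorted(component))
    return result

# Hypothetical toy sequences, not real nitroreductases.
toy = {
    "A": "MKTAYIAK",
    "B": "MKTAYIAR",
    "C": "MKSAYIAR",
    "D": "GGLPWEDV",
    "E": "GGLPWEDI",
}
groups = clusters(build_network(toy, threshold=0.8))
print(groups)
```

Note that A and C fall in the same cluster even though their direct identity is below the threshold, because both are linked to B; raising or lowering the threshold is what splits or merges subgroups in such a network.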
Red fluorescent proteins (RFPs) are used extensively in chemical biology research as fluorophores for live cell imaging, as partners in FRET pairs, and as signal transducers in biosensors. For all of these applications, brighter RFP variants are desired. Here, we used rational design to increase the quantum yield of monomeric RFPs in order to improve their brightness. We postulated that we could increase quantum yield by restricting the conformational degrees of freedom of the RFP chromophore. To test our hypothesis, we introduced aromatic residues above the chromophore of mRojoA, a dim RFP containing a π-stacked Tyr residue directly beneath the chromophore, in order to reduce chromophore conformational flexibility via improved packing and steric complementarity. The best mutant identified displayed an absolute quantum yield increase of 0.07, representing an over 3-fold improvement relative to mRojoA. Remarkably, this variant was isolated following the screening of only 48 mutants, a library size that is several orders of magnitude smaller than those previously used to achieve equivalent gains in quantum yield in other RFPs. The crystal structure of the highest quantum yield mutant showed that the chromophore is sandwiched between two Tyr residues in a triple-decker motif of aromatic rings. Presence of this motif increases chromophore rigidity, as evidenced by the significantly reduced temperature factors compared to dim RFPs. Overall, the approach presented here paves the way for the rapid development of fluorescent proteins with higher quantum yield and overall brightness.
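The two figures quoted above jointly bound the quantum yields involved. As a short worked inequality, writing \Phi_0 for the quantum yield of mRojoA and \Phi for that of the best mutant (symbols introduced here for illustration, and treating the reported 0.07 as exact):

```latex
\Phi = \Phi_0 + 0.07, \qquad \frac{\Phi}{\Phi_0} > 3
\;\Longrightarrow\; \Phi_0 + 0.07 > 3\,\Phi_0
\;\Longrightarrow\; \Phi_0 < 0.035, \qquad \Phi < 0.105 .
```

A small absolute gain therefore corresponds to a large fold-improvement only because the starting protein is very dim.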
Lignolytic enzymes are a group of biocatalysts with potential applications in delignification and bioremediation. The enzymatic delignification process is a green chemistry alternative for the pretreatment of lignocellulosic material, providing a means for the efficient removal of lignin and the synthesis of biologically active compounds such as monolignols. Laccases (EC 1.10.3.2) are the most studied enzymes in delignification processes and Basidiomycete fungi are the main source. The aim of this study was to identify the gene sequence and protein structure of a protein band exhibiting laccase activity in an enzymatic extract obtained by solid-state fermentation (SSF) of Dictyopanus pusillus. The enzymatic extract from D. pusillus was concentrated and purified by fast protein liquid chromatography (FPLC) using an anion exchanger. SDS-PAGE and native PAGE were used to determine the molecular weight and activity of the bands obtained. Tryptic digestion and Micro-HPLC-MS analyses were performed to identify peptides belonging to the protein band with laccase activity. From those peptides, degenerate primers were designed to amplify the coding gene sequence from D. pusillus. Three DNA sequences with high identity were obtained and have been used to elucidate the putative laccase gene. These results confirm the expression of a new laccase in D. pusillus, further allowing overproduction of this enzyme in a heterologous system.
Several programs for computational protein design (CPD) have been developed throughout the years and successfully applied to solve various protein design problems. However, few studies have directly compared the performance of various CPD force fields across identical design objectives, making it difficult to determine which one to use for a specific application. Recently, we developed a CPD platform named Triad that can utilize both the Rosetta and Phoenix force fields developed in the laboratories of David Baker and Stephen Mayo, respectively. Herein, we will present a number of validation studies performed with the Triad CPD platform to assess the performance of the Rosetta and Phoenix score functions, including sequence recovery, side-chain placement, sequence design, and loop prediction. These current validation studies follow the direction of earlier published validation studies and will show the similarities and differences between a knowledge-based force field (Rosetta) and a molecular mechanics-based force field (Phoenix). We will also compare torsion space and Cartesian space search protocols.
Proteins are widely used in research, industry, and medicine for their ability to carry out complex molecular processes with high precision and efficiency. It is generally thought that the three-dimensional structure of proteins dictates their function, but increasing evidence demonstrates that complex functions are equally mediated by dynamics. Traditional rational design methodologies, however, do not consider this important property when engineering proteins. This is in part due to the lack of a framework for the structure-based rational design of protein dynamics. To address this issue, we have developed a methodology based on multistate design (MSD), an emerging methodology in computational protein design that optimizes sequences in the context of multiple structural states. As a proof-of-concept for our framework, we predicted and experimentally validated sequences that stabilize and facilitate exchange between two non-native conformations of Trp43 in the streptococcal protein G domain β1 (Gβ1) fold. Four candidate sequences predicted to exchange between core and solvent-exposed conformations of Trp43 were identified by MSD across a sequence space of 1296 possible mutants evaluated on an ensemble of 12,648 unique Gβ1 backbones. 15N-HSQC and ZZ-exchange NMR spectroscopy confirmed that all four candidate sequences are dynamic, with two of them exchanging between two distinct conformations on the 10 to 100 millisecond time scale. Solution structures of two Gβ1 mutants displaying decreased dynamics, whose 15N-HSQC spectra showed peaks with highly similar chemical shifts to those from a single conformer of each dynamic variant, were solved. The structures of these mutants, coupled with an analysis of NOE correlations from the Trp43 Hε1, confirmed that Trp43 adopts conformations corresponding to those designed in the exchanging mutants.
Overall, the successful application of our MSD strategy paves the way to the rational design of protein dynamics as we move towards increasingly complex protein functions.
The creation of enzymes displaying desired substrate specificity is an important objective of enzyme engineering. To help achieve this goal, computational protein design (CPD) can be used to identify sequences that can fulfill interactions required to productively bind a desired substrate. Standard CPD protocols find optimal sequences in the context of a single state, for example an enzyme structure with a single substrate bound at its active site. However, many enzymes require multiple substrates to complete their catalytic cycle. Thus, the design of multi-substrate enzyme specificity requires the ability to evaluate sequences in the context of multiple states (i.e., multiple substrates) because mutations designed to change specificity for one substrate may be detrimental to the binding of a second substrate. This design objective can be tackled using multistate design (MSD), an emerging methodology in CPD that allows sequence selection to be driven by the energetic contributions of multiple states simultaneously. Herein, we report the development and validation of an MSD procedure to enable the rational design of multi-substrate enzyme specificity. As a case study, we used our MSD methodology to redesign E. coli branched-chain amino acid aminotransferase (BCAT) to catalyze the transamination of α-ketoglutarate with the non-native substrate L-His. Using our approach, we obtained BCAT mutants displaying up to 10-fold increased kcat/KM for transamination of α-ketoglutarate with L-His relative to the wild type. Additionally, we developed a negative MSD approach to identify BCAT mutants displaying improved activity towards the desired L-His substrate (i.e., the positive state) while simultaneously displaying decreased activity towards the undesired native substrate L-Leu (i.e., the negative state).
Receiver operating characteristic analysis of predicted sequences demonstrates that consideration of the negative state during calculation results in improved predictions of substrate specificity. Overall, our approach opens the door to the design of multi-substrate enzymes displaying tailored specificity for any biocatalytic application.
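For readers unfamiliar with the metric, the area under a receiver operating characteristic curve can be computed directly as the probability that a randomly chosen positive outscores a randomly chosen negative. The Python sketch below uses that definition; the scores and labels are hypothetical and purely illustrative, not data from this study.

```python
def roc_auc(scores, labels):
    """AUC as the chance that a random positive outscores a random negative (ties count half)."""
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical data: 1 marks a variant found experimentally to have the desired
# specificity; higher score means the calculation ranks the variant as more favourable.
labels = [1, 1, 1, 0, 0, 0]
scores_positive_state_only = [0.9, 0.4, 0.6, 0.5, 0.3, 0.7]
scores_with_negative_state = [0.9, 0.6, 0.8, 0.5, 0.3, 0.7]

auc_single = roc_auc(scores_positive_state_only, labels)
auc_negative = roc_auc(scores_with_negative_state, labels)
print(f"{auc_single:.3f} {auc_negative:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so comparing the two values is one way to quantify whether a scoring scheme ranks the desired variants higher.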
Most chemotherapy targets dividing cells, rather than just tumour cells. This causes the failure of chemotherapy, as dosing limited by the side-effects of the therapy leads to drug resistance and relapse. One strategy to overcome this, referred to as Directed Enzyme Prodrug Therapy (DEPT), involves administration of an inactive prodrug that is activated by a foreign enzyme at the tumour site. In one variant of DEPT, known as Antibody Directed Enzyme Prodrug Therapy (ADEPT), enzyme targeting is achieved using a covalently linked tumour-directed antibody. Clinical trials have demonstrated the effectiveness of ADEPT, but were hampered by the immune response to the foreign enzyme. Paradoxically, the need to use a foreign enzyme to avoid off-target activation by native proteins precludes the use of an immune-compatible human protein. In addition, complex dosing schemes were needed to ensure that the enzyme-antibody complex had cleared circulation and was entirely localized to the tumour before the prodrug was administered. Otherwise, the circulating enzyme would activate the prodrug in circulation, resulting in some systemic toxicity.
We are using computational enzyme design to overcome these shortcomings in model ADEPT systems. In one case, the bacterial enzyme, carboxypeptidase G2 (CPG2), is used to activate nitrogen mustard prodrugs. Using the Rosetta software suite, we are designing a related mammalian enzyme, carnosinase, which is not immunogenic, to have CPG2's activity. We have demonstrated modest but clear CPG2-like activity towards the prodrug in our initial carnosinase “fixed backbone” designs. To further improve activity to usable levels, we have developed a “loop transplant” design protocol to transfer the sequence and conformation of CPG2’s substrate specificity loops to carnosinase. We have also developed a high throughput assay to allow us to rapidly screen a large number of variants.
We are also considering a second model ADEPT system, in which cytosine deaminase (CD) is used to activate the non-toxic 5-fluorocytosine to the toxic 5-fluorouracil. We have made an initial set of fixed backbone designs of the closely related mouse adenosine deaminase (mADA), which we will be testing in the lab shortly.
Finally, we are also designing the ADEPT enzymes to act as activatable “proenzymes.” These modified enzymes will be dormant until reaching the tumour microenvironment, thereby avoiding systemic activation of the prodrug by circulating enzyme. We are considering both light activation and tumour-overexpressed protease activation as systems to locally activate the enzymes in the tumour environment.
Computational Protein Design, with Applications to Predicting Resistance Mutations, and HIV
Bruce R. Donald
Department of Computer Science
Department of Chemistry
Department of Biochemistry
Computational protein design is a transformative field with exciting prospects for advancing both basic science and translational medical research. My laboratory has developed protein design algorithms and used them to design new drugs for leukemia; redesign an enzyme to diversify current antibiotics; design protein-protein interactions; and design drugs for cystic fibrosis. At the heart of this research lies OSPREY, a software package implementing provable design algorithms of considerable intrinsic interest. I will introduce these algorithms for protein design. Then, I will discuss two applications: (1) predicting MRSA resistance mutations to new antibiotics, and (2) designing antibodies against HIV. Computational and experimental (in vitro and in vivo) results will be presented.
If time permits, I will also describe some recent advances in protein design with continuous flexibility, estimating changes in conformational entropy upon binding, multi-state design, and design with general (non-residue-pairwise) energy functions.
Papers on the algorithms and results mentioned above are available at: http://www.ncbi.nlm.nih.gov/pubmed/?term=donald+BR
Proteins can be viewed as interacting networks of amino acid residues. We have previously used NMR methods to delineate amino acid networks in the alpha subunit of tryptophan synthase (aTS). Here, we demonstrate that amino acid substitutions at network positions on the protein surface can either increase or decrease enzyme catalytic activity, despite these perturbations being more than 20 angstroms away from the active site. We also show that covalent modification of network residues can have similar effects. We propose that covalent modification of surface-exposed network residues offers a novel way to control enzyme activity, which may find use in synthetic biology and biosensing applications. aTS is a TIM barrel enzyme, and so the results here may be applicable to a wide range of enzyme activities.
Correlation between conformational dynamics and enzyme function has been well established for discrete enzyme systems; however, approaches for characterizing dynamical properties across diverse sequence homologs and their correlation with enzyme activity remain challenging. Members of the pancreatic-type ribonuclease (RNase) superfamily share similarities in structure and fold, but display large variations in catalytic efficiencies and dynamics, making them ideal model systems to probe the relationship between conformational motions and function. Using a combination of bioinformatics, molecular dynamics simulations and NMR approaches, we characterized the dynamical properties of over 20 diverse RNase homologs, whose three-dimensional structures have been determined using X-ray crystallography or NMR approaches, over a wide range of time-scales. Our results show that while the different RNase homologs used for the analysis share a common structural fold, the dynamical properties of these enzymes are significantly different. Clustering these RNase sequences into evolutionarily distinct sub-families showed similar dynamical properties within sub-family members and significant differences between distinct sub-families. Interestingly, sequences sharing the same biological function also display similar dynamical patterns, suggesting that biological function, among other factors, may potentially impact dynamical properties influencing sequence, structure and function.
The iron(II)- and 2-(oxo)glutarate-dependent (Fe/2OG) oxygenases catalyze a diverse range of chemically difficult and biologically important reactions. For example, the (pro)collagen-modifying prolyl-4-hydroxylase hydroxylates proline and is critical for collagen function. Dysfunction of the enzyme (for example, through a deficiency of the co-factor ascorbic acid) causes scurvy. Another Fe/2OG enzyme, AlkB, reverses DNA methylation by hydroxylating the methyl group to restore the original base and plays a critical role in DNA metabolism. The mechanism of hydroxylation by Fe/2OG enzymes is well understood. The iron-oxygen complex (Fe(IV)-oxo) in the active site abstracts a hydrogen atom from the aliphatic carbon of the substrate, and the resultant OH of the Fe(III)-OH complex rebounds to the substrate radical to generate the hydroxylated product, while the co-substrate 2OG is oxidized to succinate and carbon dioxide. Studies of the mechanisms of other reactions catalyzed by this family of enzymes, such as dehydrogenation, are in progress. Biochemists traditionally rely on site-directed mutagenesis to understand the function of a specific amino acid in the mechanism of an enzymatic reaction. But due to the complexity and divergence of enzyme activity, rational mutation may not lead to variants that will help solve mechanistic puzzles. Directed evolution, on the other hand, is a powerful way of directing the function of an enzyme as desired, hence presenting a path to break the limits of rational design in studies of enzyme mechanism. Here we applied directed evolution to reprogram an arginine 4,5-desaturase, NapI, into an enzyme that catalyzes C5 hydroxylation. Wildtype NapI mainly catalyzes 4,5-desaturation of arginine, with an approximately 5% side reaction that generates glutamate-5-semialdehyde and guanidine, presumably through hydroxylation of C5 and spontaneous hydrolysis of the product.
A proline auxotroph was used to screen the variant library generated by error-prone PCR, and from the first round of variants we were able to identify two mutants that are approximately two times better at catalyzing C5 hydroxylation than the wildtype. A random mutant library based on the two variants is currently being screened with the same method, with the aim of enhancing C5 hydroxylation. We hope to obtain an enzyme whose major reaction is the original side reaction, and by interrogating the mechanism of the new enzyme and comparing it to the wildtype, we expect to gain insights into the mechanism of dehydrogenation by this family of enzymes. The combination of directed evolution and enzymatic mechanism study represents a novel way of understanding the fine-tuning of an enzyme reaction, which would facilitate better application of the Fe/2OG oxygenases.
The ability to sequence millions of nucleotides for pennies has revolutionized many areas of biology, and for protein engineering it is no different. My group has developed a robust experimental pipeline to evaluate the sequence effects on function for thousands of protein variants. In particular, my group has developed deep sequencing methods to see all possible single point mutants in a given protein sequence in a single experiment. In my talk I will discuss the overall experimental pipeline and discuss applications to enzymes and protein-protein binders.
For enzymes, I will show the complete sequence determinants to specificity for an amidase against three different aliphatic substrates. Contrary to expectations, the differential fitness effects vary remarkably with substrate, with up to 20% of mutations showing enhanced fitness over wild-type for a non-native substrate. I will also detail our efforts to understand the biophysical determinants of specificity-modulating mutations from principles of protein stability, aggregation, catalytic rate enhancement, and transition state binding stabilization.
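As an illustration of how a per-variant fitness value can be derived from deep sequencing counts, the sketch below computes a log2 enrichment ratio for each variant, normalized to wild-type. This is a common general approach and not necessarily the scoring used in this work; the function name, the pseudocount, the variant names and the read counts are all hypothetical.

```python
import math

def enrichment_scores(pre_counts, post_counts, wild_type, pseudocount=0.5):
    """Return log2 enrichment of each variant relative to wild-type.

    pre_counts / post_counts map a variant name to its read count before
    and after selection. All names and numbers here are illustrative.
    """
    def ratio(variant):
        # The pseudocount avoids division by zero and log(0) for variants
        # that disappear after selection.
        post = post_counts.get(variant, 0) + pseudocount
        pre = pre_counts[variant] + pseudocount
        return post / pre

    wt_ratio = ratio(wild_type)
    return {v: math.log2(ratio(v) / wt_ratio) for v in pre_counts}

# Hypothetical read counts for wild-type and two single point mutants.
pre = {"WT": 1000, "A45G": 200, "L102P": 300}
post = {"WT": 1000, "A45G": 800, "L102P": 30}
scores = enrichment_scores(pre, post, "WT")
```

A positive score indicates a variant that is enriched relative to wild-type under the selection (here the made-up A45G), and a negative score indicates depletion (here L102P).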
I will also discuss my lab’s advances in a new method of coupling deep sequencing to yeast display to generate conformational epitope maps for antibodies. I will show that this method faithfully recovers the quantitative energetics of binding affinity on mutation and the binding epitope for multiple protein partners, including antibody-antigen interactions and interactions involving multimeric proteins. I will discuss advantages of this method over competing approaches, and describe current efforts to improve the cost, sensitivity, and throughput of the methodology. Finally, I will describe and give examples of ways that this epitope mapping procedure can improve and augment computational protein design for developing rational structure-based vaccines against important human pathogens, including Zika and Dengue flaviviruses.
Synapse formation and function in neurons requires the precise spatiotemporal control of protein synthesis (translation). The dysregulation of translation and of eIF4E, a key regulator of translation initiation, in neurons has been implicated in mental disorders such as autism. Unfortunately, the gene knock-ins and knock-outs used to study these disorders offer only crude spatiotemporal control of protein expression (timescales of days/weeks). Optogenetic tools, in contrast, can be controlled in minutes with a spatial scale as small as a single synapse. Here, we present a new optogenetic tool for the control of translation in neurons. To assess function in vivo, we developed an assay using a strain of Saccharomyces cerevisiae which requires human eIF4E for growth. From a panel of 19 structure-based designs, we identified one promising construct: a fusion of a circularly permuted LOV2 domain from Avena sativa (cLOV) and 4E-BP2, an inhibitor of eIF4E and translation initiation. In blue light, yeast expressing the cLOV-4EBP2 inhibitor grew significantly more slowly than in the dark. When the primary binding site of 4E-BP2 was mutated to a non-binding sequence, inhibition was lost and growth was restored to wild-type levels. In vitro studies showed that blue light caused cLOV-4EBP2 to bind to eIF4E, while dark-state cLOV-4EBP2 was inhibited from binding eIF4E. This process was found to be reversible and repeatable. To improve the light-dark difference in activity, we screened a random library of additional mutants based on the successful structure-based design in the S. cerevisiae strain. One promising construct showing very strong inhibition under low levels of blue light was identified, and studies are currently underway to validate its function.
Human G protein-coupled receptors (GPCRs) can be functionally linked to the mating pathway of the yeast S. cerevisiae, providing a high-throughput screening method for characterising GPCR function and GPCR-drug interactions. However, this powerful technique has rarely been used to engineer novel function in GPCRs.
P2Y2 is the GPCR most sensitive to extracellular ATP, a potent inflammatory signal in mammals. Human P2Y2 was functionally linked to the yeast mating pathway via a chimeric G alpha protein identified in previous studies, but using a novel mating-responsive fluorescent reporter gene. Fluorescence activated cell sorting (FACS) was used to screen a library of >15,000 P2Y2 mutants, with the goal of discovering P2Y2 mutants with improved sensitivity to extracellular ATP. Following several rounds of FACS, the top ten mutants were found to have a 10- to 1000-fold lower ATP EC50, and up to a 2-fold increase in maximum mating pathway response to ATP.
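To make the EC50 comparison concrete, the sketch below uses the standard Hill dose-response model and computes the fold improvement in EC50 between a reference receptor and a mutant. The model choice, the function names and the EC50 values are illustrative assumptions and are not taken from the screen described here.

```python
def hill_response(conc, ec50, top=1.0, bottom=0.0, hill=1.0):
    """Hill dose-response curve: reporter signal at ligand concentration conc.

    conc and ec50 must be positive and in the same units.
    """
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill)

def ec50_fold_improvement(ec50_reference, ec50_mutant):
    """Fold decrease in EC50; values above 1 mean the mutant is more sensitive."""
    return ec50_reference / ec50_mutant

# Hypothetical EC50 values in molar units (not measured data).
ec50_wt = 1e-6
ec50_mut = 1e-8

fold = ec50_fold_improvement(ec50_wt, ec50_mut)
half_max = hill_response(ec50_wt, ec50_wt)      # response at EC50 is half-maximal
mut_signal = hill_response(ec50_wt, ec50_mut)   # mutant response at the same dose
```

At a ligand concentration equal to the reference EC50, the reference receptor gives a half-maximal response, whereas a mutant with a lower EC50 is already close to saturation.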
The mutations conferring this activity were found throughout P2Y2, although specific residues in transmembrane helices 1, 4, and 7 were mutated in eight of the top ten mutants. None of these mutations had previously been identified. Interestingly, no mutations conferring increased ATP sensitivity occurred in the predicted ligand binding pocket. These findings suggest that when ligand binding has already been strongly selected for during evolution, mutations modifying GPCR interactions with G proteins, or residue-specific interactions within transmembrane helices, may have a greater effect on increasing GPCR pathway responsiveness. These ten P2Y2 mutants provide a toolkit of engineered GPCRs with improved sensitivity for extracellular ATP, which will aid in developing novel biosensors of inflammation, and for better understanding of general GPCR structure-function relationships.
Many biological processes involve tight spatial and temporal regulation of protein-protein interactions. Optogenetics, in which a protein of interest is made light-sensitive by fusion to a photo-active domain, offers a means to study these processes. Currently, only a handful of naturally occurring light-switchable protein-protein interactions are known, and these have numerous limitations. Here, we present a general method to select for proteins that bind specifically to one state (light or dark) of a photo-switchable domain. In our approach, a small binding domain was randomized at specific positions and expressed on phage. Sequences showing differential light-dark binding were identified. These sequences were expressed and purified, and protein-protein interactions were tested in vitro using UV-Vis and NMR. Our approach can provide a palette of new light-switchable protein-protein interactions, easily customizable for different optogenetic applications.
Glycosidases are amongst the most widely deployed enzyme catalysts in industry, with uses including bioethanol production, food and drink processing, pulp and paper production, detergents and textiles. While large numbers of enzymes are already available for these purposes, new and improved catalysts are always welcomed, both for these established processes and for completely new opportunities.
Largely unexplored in this application, however, have been the huge numbers of enzymes available within the “silent majority” of currently unculturable bacteria. These can be accessed through metagenomic analysis of environmental DNA. The challenge lies in accessing this diversity in a reasonably efficient manner.
Here we shall describe our use of activity-based, or functional metagenomics to generate a library of over 600 expressed glycosidases. We shall also describe the high-throughput characterisation of these enzymes for substrate specificity, thermal stability, pH profile and mechanism. This library is then used to identify preferred catalysts for cleavage of specific unnaturally modified sugars (e.g. azido sugars) and the generation of "glycosynthase" versions thereof, which can be used to "tag" glycans.
In a specific application we shall describe our efforts to identify and optimise enzymes for removal of the A and B antigens from red blood cells by directed evolution, as a way of generating antigen-null red blood cells with the potential to serve as universal donor blood.
Cyclic depsipeptides are a group of natural products with a vast array of biological activities. They are produced by microbes through massive enzymatic complexes called depsipeptide synthetases, which belong to the family of non-ribosomal peptide synthetases. The production mechanism of depsipeptide synthetases resembles a semi-iterative assembly line, where ketoacids and amino acids are alternately selected, modified and condensed by a series of protein domains arranged in a linear fashion. The first core of the reactions generates a tetrapeptide precursor. Three of these tetrapeptide precursors are then oligomerized and cyclized by a terminal domain, generating the mature cyclic depsipeptide. Here we present three approaches to elucidate the detailed mechanisms involved in this synthesis scheme. First, we evaluate the structural determinants of ketoacid selection and modification by an integrative approach of structural biology and protein engineering. Second, we present the in vitro reconstitution and biochemical characterization of an intact depsipeptide synthetase, a macromolecular machine of 700 kDa. Third, we evaluate the reactions catalyzed by the terminal domain through a combination of protein engineering, chemical biology and structural biology. Overall, our results shed light on the mechanism and structure of these proteins, which could aid the ongoing global effort to engineer these enzymes to produce modified products with more potent or new activities: antibiotics, antitumor agents and insecticides, among many other useful compounds.
It has been recognized that mutational epistasis, i.e. non-additive interactions between mutational effects, constrains evolutionary trajectories. Thus, epistasis may strongly hinder our ability to engineer novel proteins and enzymes. However, it has been unclear to what extent mutational epistasis impairs predictability in protein evolution and our ability to engineer proteins, and what molecular mechanisms underlie mutational epistasis. I will present a global view of the extent of epistasis from our systematic survey of diverse examples of natural and laboratory evolution. Then, I will discuss a foremost but overlooked problem in protein evolution and engineering, i.e., evolutionary contingency: how the initial choice of starting points or mutations can result in significantly different evolutionary outcomes. I will present several examples of comparative laboratory evolution used to investigate evolutionary contingency. Finally, I will discuss the molecular mechanisms underlying strong epistasis and evolutionary contingency, and how they hinder protein engineering and design.
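The notion of non-additivity can be made concrete with a small sketch. Assuming mutational effects are expressed on a log scale (for example, log fold-change in activity relative to the starting sequence), pairwise epistasis is the deviation of the double mutant from the sum of the single-mutant effects. The function names and the numbers below are hypothetical and only illustrate the definition.

```python
def pairwise_epistasis(effect_a, effect_b, effect_ab):
    """Deviation of the double mutant from additivity (log-scale effects).

    Zero means the two mutations combine additively.
    """
    return effect_ab - (effect_a + effect_b)

def has_sign_epistasis(effect_a, effect_b, effect_ab):
    """True if either mutation changes sign depending on the background."""
    a_in_b_background = effect_ab - effect_b
    b_in_a_background = effect_ab - effect_a
    return effect_a * a_in_b_background < 0 or effect_b * b_in_a_background < 0

# Hypothetical log-scale effects relative to the starting sequence.
additive = pairwise_epistasis(1.0, 0.5, 1.5)
negative = pairwise_epistasis(1.0, 0.5, 0.25)
flips = has_sign_epistasis(1.0, 0.5, 0.25)
no_flip = has_sign_epistasis(1.0, 0.5, 1.5)
```

In the second case, each mutation is beneficial on its own but deleterious once the other is present, so the order in which mutations are acquired restricts which trajectories remain accessible.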
D-Alanine aminotransferase (DAAT) catalyzes the synthesis of several D-amino acids having aliphatic, charged, or polar side chains, making it an attractive biocatalyst for the production of enantiopure D-amino acids. Although DAAT displays broad specificity, its catalytic efficiency towards aromatic amino acids is low. To bolster its biocatalytic applicability, improved DAAT variants displaying increased activity towards non-native aromatic substrates are desired. Previously, we engineered a DAAT active site mutant (V33G) that showed a 3-fold increase in catalytic efficiency (kcat/KM) towards D-phenylalanine. Herein, we report the development of additional DAAT mutants with increased activity towards various D-phenylalanine derivatives. These mutants were prepared via rational design of the active site to accommodate substituents on the ortho, para and meta positions of the phenyl ring. The mutants were screened for enzymatic activity against a library of potential D-amino acid substrates, enabling the identification of several mutants with up to 640-fold enhanced catalytic efficiency towards various D-phenylalanine derivatives.
The filamentous bacteriophage M13 has found use in materials and devices due to its self-assembled architecture and functional groups. Its viral capsid contains five different coat proteins: the minor coat proteins p3, p6, p7 and p9, as well as the major coat protein, p8. The major coat protein p8 is present in 2700 copies per M13 phage and therefore considerable modification to the virus can be done by engineering p8. Both genetic and chemical modifications, as well as maintaining the self-assembly of M13, are important considerations to further development of this platform. The production of M13 with available orthogonally-reactive moieties and viral capsid modification will be discussed.
RNA viruses encoding high- or low-fidelity RNA-dependent RNA polymerases (RdRp) are attenuated. The ability to predict residues of the RdRp required for faithful incorporation of nucleotides represents an essential step in any pipeline intended to exploit perturbed fidelity as the basis for rational design of vaccine candidates. We have previously identified the active site loop known as structural motif D as an important fidelity determinant. In particular, we have shown that the T362I motif-D substitution (also found in the Sabin vaccine strain) attenuates poliovirus, likely because it alters the structural dynamics of motif D, leading to a lower fidelity polymerase. Molecular dynamics simulations have suggested residues that can be targeted to control the dynamics of the motif D loop and hence polymerase fidelity. We present our recent kinetic and NMR data showing how these substitutions modulate the catalytic activity and structural dynamics of the poliovirus RdRp.
The p19 protein is a suppressor of RNA silencing endogenous to tombusviruses, which binds small RNA duplexes of any sequence with extremely high affinity. Because of its unique binding properties, recombinant p19 proteins are an excellent platform for tool development surrounding the RNA silencing pathway. Herein we present three areas in which we are developing p19 as a biotechnology tool and are simultaneously gaining insight into p19's mechanism as a suppressor of RNA silencing. We aim to improve the utility of p19 for detecting and sequestering human microRNAs from biological samples. We will present the results of our mutational analysis of p19's binding site, where we observe mutations which dramatically enhance p19's affinity for human miRNA-122. We go on to explore the structural implications of these mutations using X-ray crystallography. Lastly, we have engineered the p19 binding site with an unnatural amino acid capable of photo-crosslinking p19 with its RNA ligands, using organisms with an expanded genetic code. These recombinant unnatural proteins represent irreversible binders with expanded potential applications.
High-energy enzyme states can be found in dynamic equilibrium with stable ground states such as substrate-free and bound states. It has been established for some enzymes that such high-energy states can be indispensable for enzymatic catalysis. High-energy states are inherently difficult to study with spectroscopic methods due to their low abundance, and there is a need to develop strategies that enable their detailed functional characterization. We have been able to directly study the function of two high-energy states that are relevant for adenylate kinase catalysis through involvement in induced fit and conformational selection mechanisms of substrate binding. This was, in part, made possible through the design of a disulfide bond that could arrest substrate-free adenylate kinase in an “active-like” high-energy state. NMR and biophysical measurements performed on the arrested enzyme showed that the motional amplitude during excursions from the substrate-free state corresponds to the full amplitude for closure and activation of adenylate kinase. In addition, we show that the substrate-binding site is accessible to ligands and that the arrested high-energy state has a binding affinity two orders of magnitude stronger than that of the wild-type enzyme. These results provide the structural and functional basis for ligand binding to adenylate kinase with a conformational selection mechanism. In my lab we are also trying to find general rules that can be used to rationally tune the dynamics and hence the activity of enzymes. I will describe our first steps towards this goal. I will also describe our in-cell enzymology assay based on yeast genetics. From the assay we have found that a surprisingly small fraction of the catalytic power of yeast adenylate kinase is required for optimal growth of yeast cells.
The result indicates that the kcat value of adenylate kinase has evolved under a selective pressure that is dominated by the organism’s ability to deal with stress conditions.
Reference: Kovermann et al, Nature Communications, 2015, “Structural basis for catalytically restrictive dynamics of a high-energy enzyme state”
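For a sense of scale, a change in binding affinity can be converted into a change in binding free energy with the standard relation ddG = RT ln(Kd_variant / Kd_reference). The sketch below applies this to a 100-fold tighter dissociation constant; the temperature of 298.15 K and the function name are assumptions made for illustration and are not taken from the study.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def binding_ddg(kd_ratio, temperature_k=298.15):
    """Change in binding free energy in kcal/mol.

    kd_ratio is Kd_variant / Kd_reference; a negative result means the
    variant binds more tightly than the reference.
    """
    return R_KCAL * temperature_k * math.log(kd_ratio)

# A 100-fold lower Kd, i.e. two orders of magnitude stronger binding.
ddg = binding_ddg(1 / 100)
```

Under these assumptions, two orders of magnitude in affinity correspond to roughly 2.7 kcal/mol of additional binding free energy.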
In silico modeling tools have undergone a remarkable evolution in recent decades, with significant improvements in computational power along with important achievements in software development. These have a direct effect on how we study dynamical systems such as proteins. Our interest in understanding protein function has led to several success stories in the description of complex processes and, most importantly, the accurate prediction of favorable mutations. I will show how a detailed view of the protein-substrate recognition process, provided by state-of-the-art PELE [1,2] simulations, opens new possibilities in computer-aided protein design. Furthermore, recent results centered on oxidoreductases [3,4], involving a cooperative experimental and computational effort, will be presented.
This work was only possible thanks to the collaboration of: Centro de Investigaciones Biológicas, CSIC; Instituto de Recursos Naturales y Agrobiología de Sevilla, CSIC; Institute of Catalysis, CSIC; University of Naples Federico II; Wageningen University; Novozymes A/S; JenaBios GmbH and TU Dresden.
Funded by the INDOX (KBBE-2013-7-613549) European Project and CTQ2013-48287 Ministerio de Educación y Ciencia.
1- Borrelli, K. W.; Vitalis, A.; Alcantara, R.; Guallar, V., PELE: Protein Energy Landscape Exploration. A Novel Monte Carlo Based Technique. J. Chem. Theo. Comp. 1, 1304 (2005)
2- Madadkar-Sobhani, A.; Guallar, V., PELE web server: atomistic study of biomolecular systems at your fingertips. Nucleic Acids Research, 41, W322-W328 (2013)
3- Monza, E., Lucas, M.F., Camarero, S., Alejaldre, L., Martínez, A., Guallar, V., Insights into Laccase Engineering from Molecular Simulations: Toward a Binding-Focused Strategy, The Journal of Physical Chemistry Letters, 6 (8), 1447 (2015)
4- Acebes, S., Fernandez-Fueyo, E., Monza, E., Lucas, M.F., Ruiz-Dueñas, F., Lund, H., Martinez, A.T., Guallar, V., Rational enzyme engineering through biophysical and biochemical modeling, ACS Catalysis, 6, 1624 (2016)
The Gag polypeptide is the major structural protein of retroviruses, and its expression alone is sufficient to induce the formation of virus-like particles (VLPs). The myristoyl group added onto its second amino acid targets the polypeptide to the cell membrane, where VLPs form by budding of the membrane. Interestingly, VLPs can be used as a vehicle to deliver proteins into cells. Incorporation of a protein of interest within VLPs is possible by making a chimeric protein with Gag. However, nuclear proteins may be incorporated less efficiently than cytoplasmic proteins because their nuclear retention signals reduce the ability of Gag to drag the proteins to the VLP budding site. In this study, we have fused the C-terminus of Gag from HIV-1 to the green fluorescent protein (GFP) and to different transcription factors: the chimeric transactivator (cTA) from the CR5 inducible promoter and human reprogramming factors (KLF4, Oct4, Sox2 and c-Myc). VLPs were produced by transient transfection of a stable cell line (293SF-pacLV). Analysis of the supernatants from producing cells by western blot confirmed the presence of Gag-GFP, -cTA, -KLF4, -Oct4, -Sox2 and -c-Myc. Confocal microscopy showed that the vast majority of the cells (> 90 %) treated with VLP-GFP were successfully transduced. Insertion of a nuclear localisation signal (NLS) targeted Gag-GFP to the cell nucleus rather than to the cytoplasm, thus demonstrating that the localization of VLP-delivered proteins can be changed by protein engineering. Transduction of cells containing a CR5-GFP reporter cassette with VLP-cTA showed activation of the reporter up to 1,100-fold as measured by flow cytometry. We also created reporter cells responsive to KLF4, but could not detect KLF4 activity in VLP-transduced cells. Therefore, we attempted to improve KLF4 transcriptional activity and VLP production by protein engineering. Addition of a VP16 activation domain improved KLF4 activity 6.25-fold.
Insertion of a nuclear export signal (NES) decreased its activity but improved production to a greater extent, thus increasing the overall activity of VLPs by 2.4-fold. A VP16 domain and an NES were also inserted in Oct4, Sox2 and c-Myc. Although the amounts of the hyperactive versions of Oct4 and c-Myc were substantially decreased in VLP preparations (i.e., more than 10-fold), the amount of hyperactive Sox2 was decreased only by about 3- to 4-fold. In summary, VLPs based on HIV-1 Gag are potent tools to deliver transcription factors. Furthermore, protein engineering can enhance their activity in addition to improving VLP production. In the coming months, we plan to test whether VLPs containing hyperactive versions of the transcription factors can reprogram cells.
Evolution is a unifying theme in the urgent medical and public health problems we face today including cancer, the rise of antibiotic resistance, and the spread of pathogens. But the ability to predict evolution remains a major challenge because it requires bridging several scales of biological organization. Potential evolutionary pathways are determined by the “fitness landscape” (the genotype-phenotype relationship), but how this landscape is explored depends on microbial population dynamics.
In the first half of the talk, I will describe our recent work where we showed that the fitness landscape of norovirus escaping a neutralizing antibody can be projected onto two traits, the capsid folding stability and its binding affinity to the antibody. We then developed a theory based on protein biophysics and population genetics to predict how the fitness landscape might be explored. Using a droplet-based microfluidics “Evolution Chip”, we propagated millions of independent viral sub-populations, and showed that by tuning viral population size per drop, we could control the direction of viral evolution. In the second half of the talk, I will describe how this combined framework of biophysics and evolutionary biology also applies to bacterial evolution due to horizontal gene transfer. Altogether, these stories demonstrate the broad applicability of the techniques and concepts from protein engineering to fundamental problems in evolution and genetics.
I will discuss recent advances we have made in integrating genome mining, molecular modeling, and synthetic biology to discover enzymes of novel function. These tools will be compared with established screening approaches, and the enzymes resulting from each will be shown to enable the engineering of biosynthetic pathways. We believe these tools are broadly applicable and can allow the efficient selection of enzymes to experimentally characterize from rapidly growing genomic databases.
Bacterial cell division requires formation of the division septum at mid-cell position in a process involving MinD and MinE. These proteins undergo dynamic localization driven by MinD-catalyzed ATP hydrolysis stimulated by the MinE anti-MinCD domain (αCD). αCD is buried in a closed MinE structure, but is liberated for interactions with MinD to give rise to an open state; the molecular triggers for this conformational transition are unknown. We show that MinE-membrane interactions induce a structural change mimicking the open state. Mutants deficient in lipid binding showed higher MinD ATP hydrolysis rates than WT MinE, suggesting that membrane interactions and conformational change inhibit MinD ATPase activity. In the absence of lipid binding, an interaction between MinD and an αCD residue in the closed structure is required for conformational change. This suggests that MinE has two modes of MinD interaction: one that is independent of membrane binding, and one promoted by the membrane.
Xylanases catalyze the hydrolysis of xylan, an abundant carbon and energy source with important commercial ramifications. Despite tremendous efforts devoted to the catalytic improvement of xylanases, success remains limited due to relatively poor understanding of their molecular properties. Previous reports suggested the potential role of atomic-scale residue dynamics in modulating the catalytic activity of GH11 xylanases; however, dynamics in these studies was probed on timescales orders of magnitude faster than the catalytic time frame. Here, we used NMR titration, chemical shift projection analysis (CHESPA) and relaxation dispersion experiments (15N-CPMG) in combination with computational simulations to probe conformational motions occurring on the catalytically relevant millisecond time frame in xylanase B2 (XlnB2) and its catalytically impaired mutant E87A from Streptomyces lividans 66. Our results show distinct dynamical properties for the apo and ligand-bound states of the enzymes. The apo form of XlnB2 experiences conformational exchange for residues in the fingers and palm regions of the catalytic cleft while the catalytically impaired E87A variant only displays millisecond dynamics in the fingers, demonstrating the long-range effect of mutation on flexibility. Ligand binding induces enhanced conformational exchange of residues interacting with the ligand in the fingers and thumb loop regions, emphasizing the potential role of residue motions in the fingers and thumb loop regions for recognition, positioning, and/or stabilization of ligands in XlnB2. To the best of our knowledge, this work represents the first experimental characterization of millisecond dynamics in a GH11 xylanase family member. These results offer new insights into the potential role of conformational exchange in GH11 xylanases, providing essential dynamic information to help improve protein engineering and design applications.
This study is the first comprehensive investigation of enzyme-producing bacteria isolated from four sludge samples (primary, secondary, press and machine) collected in a Kraft paper mill. Overall, 41 strains encompassing 11 different genera were identified by 16S rRNA gene analysis and biochemical testing. Both biodiversity and enzymatic activities were correlated with sludge composition. Press sludge hosted the largest variety of bacterial strains and enzymatic activities, which included hydrolytic enzymes such as cellulase, xylanase, lipase and esterase, and ligninolytic enzymes such as lignin peroxidase, laccase and manganese peroxidase. In contrast, strains isolated from secondary sludge were devoid of several enzymatic activities. Most strains were found to metabolize Kraft liquor at its alkaline pH and to decolorize industrial lignin-mimicking dyes. Resistance to lignin or the ability to metabolize this substrate appears to be a prerequisite to survival in any paper mill sludge type. Some strains have potential for unrelated applications, and preliminary data show that they can grow on and metabolize used engine oil. This study revealed that the bacteria found in a typical Kraft paper mill represent a source of novel enzymes for both industrial applications and bioremediation.
Transition metals are crucial components of several metabolic pathways and are critical for DNA, RNA and protein synthesis. However, when found in excess, these metal ions are toxic. The Ferric uptake regulator (Fur) protein is an important regulator of iron homeostasis; however, its functions extend beyond iron metabolism. The Fur protein regulates a wide variety of functionally diverse pathways, including flagellar and capsule biogenesis, energy metabolism, oxidative stress defense, uptake of other metal ions such as molybdate, tungsten, zinc and nickel, as well as the regulation of non-coding RNAs, via four general mechanisms, namely apo-Fur activation/repression and holo-Fur activation/repression. Given that the Fur protein employs diverse regulatory mechanisms, we hypothesized that the ability of Fur to adopt different structural conformations underlies these peculiar functional differences. To address this important question, we solved the crystal structure of the Fur protein from Campylobacter jejuni. Structural analysis revealed that the protein adopts a V-shaped conformation harboring an evolutionarily conserved cluster of positively charged residues on the surface. Using an extensive library of mutants and electrophoretic mobility shift analysis, we found that substituting residues forming the positively charged surface is detrimental for Fur interaction with DNA. Furthermore, our in vivo studies suggest that these positively charged residues are important for the regulation of CjFur target genes and that different mechanisms modulate the activity of the Fur family of metalloregulators depending on the number of occupied metal binding sites. Finally, we showed that the disruption of metal binding sites of CjFur significantly reduces DNA binding in vitro and is deleterious for the repression of Fur target genes in C. jejuni and the colonization of the animal gut.
Overall, our structural studies suggest that the Fur protein employs a common surface and requires intact metal binding sites to bind DNA, regulate gene expression and contribute to bacterial pathogenicity.
The association of the side chains of cysteine and methionine with those of phenylalanine, tyrosine, tryptophan, and histidine is common in protein structures and is believed to contribute to protein function and stability. However, little is known about the structural and energetic properties of these interactions, especially in aqueous solutions and in protein matrices. Given the computational cost of ab initio quantum mechanical calculations, accurate force fields are required for reliable determination of the roles of these interactions in proteins. We are performing ab initio quantum mechanical calculations on complexes of hydrogen sulfide (H2S), methanethiol (MeSH), dimethyl sulfide (Me2S), dimethyl sulfoxide (Me2SO), and dimethyl sulfone (Me2SO2) with benzene, indole, phenol, phenolate anion, imidazole, and imidazolium cation, as well as on complexes of H2O with these sulfur and aromatic ligands, in the gas phase at the MP2(full)/6-311++G(d,p) level of theory. For the first time, we report all stable conformers for each complex. Results show that the oxidation of methionine, protonation of imidazole, or deprotonation of phenol considerably strengthens the sulfur-aromatic interaction. Potential energy curves are generated for each complex and used to calibrate all-atom non-polarizable force fields. Optimized models are used in molecular dynamics simulations to investigate the strength of these sulfur-aromatic interactions in aqueous solution by calculating the potential of mean force between the sulfur ligands and the aromatic molecules in water. In addition to elucidating the strength and directionality of the sulfur-aromatic complexes in the gas phase and in water, the optimized models are important input for reliable molecular dynamics simulation of proteins.
Computational protein design allows for the in silico evaluation of protein sequences on a scale that is experimentally impossible to achieve. Traditionally, these calculations are performed using a single-state design (SSD) approach whereby sequences are searched and evaluated in the context of a single protein structure. While this approach has been successfully applied in a number of protein engineering efforts, the methodology lacks the ability to routinely predict protein properties with quantitative accuracy. To address this shortcoming, we have developed a number of computational procedures based on multistate design (MSD), an emerging methodology in computational protein design that allows for the evaluation of protein sequences in the context of multiple chemical and/or conformational states. Here, we present a comparative study on the application of SSD and MSD to three protein engineering objectives, specifically the prediction of specificity, stability, and dynamics. In each case, predictions made with MSD more accurately reflect experimental results than those made with SSD. We expect that the MSD strategies presented here will lay the foundation for the routine and robust application of protein design calculations in future protein engineering efforts.
1. Alvizo O, Allen BD & Mayo SL (2007) Biotechniques, 42(1):33
2. Davey JA & Chica RA (2012) Protein Science, 21(9):1241
3. Lanouette S, Davey JA, Elisma F, Ning Z, Figeys D, Chica RA & Couture JF (2015) Structure, 23(1):206
4. Davey JA, Damry AM, Euler CK, Goto NK & Chica RA (2015) Structure, 23(11):2011
5. Davey JA, Damry AM, Goto NK & Chica RA (2016) Manuscript in Preparation
Using NMR methods to probe protein motions, we have identified dynamic networks of communication within the prototypical cyclophilin family member, cyclophilin-A, which are directly related to its function. By engineering these pathways, we show that distal regions of the enzyme allosterically influence the active site and modulate enzymatic turnover via altering the inherent conformational sampling responsible for substrate binding and release. These studies demonstrate a direct link between enzyme dynamics and substrate turnover and show how dynamics can be engineered to control enzyme function.
Deep mutational scanning is a foundational tool for addressing functional consequences of large numbers of mutants, yet a more efficient and accessible method for construction of user-defined libraries is needed. Here we present nicking mutagenesis, a single day, single pot saturation mutagenesis method using routinely prepped dsDNA as input substrate.
Correspondence should be addressed to Tim A. Whitehead (email@example.com).
Generalized approaches for the photo-control of protein-protein interactions can permit the development of a variety of useful biomolecular tools for biology. Antibody-like affinity reagents provide a general platform for finding protein binding partners, but these are not photo-controlled. Using azobenzene-based photoswitches to control the binding ability of antibodies could allow photo-control of binding to any target protein of interest. In the present work, we introduce a photoswitchable cross-linker on a protein multimerizing element in an effort to optically control binding via avidity effects. We fused a hexameric coiled-coil forming peptide to the sequence of a fynomer scaffold that was selected for binding to human chymase. Two cysteine residues were introduced into the coiled-coil sequence to permit attachment of a thiol-reactive photo-cross-linker. The dark-adapted (trans azo) cross-linker is designed to promote loss of helicity and thereby monomerization of the fynomer. Upon irradiation, the cross-linker isomerizes to its cis form and promotes hexamer formation. Progress in the development of this photo-controlled multimeric antibody system will be described.
Remote control of protein function by light is a powerful technique for manipulating biological processes in living organisms. One general method to obtain photoswitchable proteins is to couple the photoisomerization of azobenzene derivatives to conformational changes in the protein of interest. The applicability of these compounds in vivo, however, depends largely on the excitation wavelength that is required for their switching. Short wavelengths of visible light are highly scattered and do not penetrate well in cells and live tissues. In the past few years, novel azo derivatives which operate with red wavelengths and are stable in water were designed in our group. We have now applied these compounds to fynomers: small proteins based on Fyn SH3 scaffold that are developed via phage display selection techniques to target biologically relevant proteins. We chose a fynomer that was optimized for inhibiting the activity of human chymase by binding near its active site. Chymase is a serine protease that is secreted by mast cells and is shown to be involved in cardiovascular diseases as well as pathological inflammatory conditions. Two cysteine residues were introduced by point mutations in the sequence of the fynomer. Subsequent crosslinking of the inhibitor with azobenzenes at those residues allows only the cis isomer of the azo moiety to be compatible with the well-folded inhibitor. The dark-adapted (trans azo) crosslinked fynomer was partially unfolded and showed reduced inhibitory activity. Upon its irradiation with red light (635 nm), the inhibitor was largely folded and better suppressed the activity of chymase. Since the Fyn SH3 scaffold can be broadly utilized to target virtually any protein, these results demonstrate the promise of red light switchable azobenzenes for in vivo functional studies and photopharmacology.
The ability to control protein structure and function, by a combination of recombinant DNA and chemical approaches, allows for the production of new molecular scaffolds having precise positioning of functional groups even in biomaterials having nanometer dimensions. Investigations into the design and fabrication of self-assembling protein-based scaffolds capable of serving as zero-dimensional nanostructured biomaterials will be discussed. Recent studies on the controlled encapsulation of various guest molecules within these proteins as well as their surface modification will be presented.
A computationally guided semi-rational protein design approach will be used to improve the enzymatic selectivity and catalytic efficiency of lipase B from Pseudozyma antarctica (CalB) for the synthesis of methyl salicylate. This fatty acid ester is a flavoring and fragrance compound with significant relevance in the biotechnological industry. CalB is one of the most widely used lipases for the enzymatic hydrolysis and synthesis of esters [1,2,3,4,5], offering potential for the biological production of flavoring agents. However, the relatively confined organization of its active site precludes the recognition of more complex substrates. To overcome this limitation, in silico docking analyses of the best clones obtained from a previous mutant library generated in the Doucet lab will be undertaken. This will allow identification of the most significant amino acid residues involved in methyl salicylate precursor binding and recognition. These “hot spots” will be subjected to combinatorial mutagenesis to synthesize a “second generation” library of CalB variants, which will then be screened for the desired activity. Finally, scaled-up production of the most efficient variants will be tested to help develop a biocatalyst for the industrial enzymatic synthesis of this flavor.
[1] Faber K. 2011. Biotransformations in Organic Chemistry. DOI 10.1007/978-3-642-17393-6_2, © Springer-Verlag Berlin Heidelberg.
[2] Jaeger, K. E. & T. Eggert. (2002). Lipases for biotechnology. Current Opinion in Biotechnology 13:390-397.
[3] Reetz, M. T. (2002). Lipases as practical biocatalysts. Current Opinion in Chemical Biology 6:145-150.
[4] Lee, M. Y. & Dordick, J. S. (2002). Enzyme activation for nonaqueous media. Current Opinion in Biotechnology 13:376-384.
[5] Gotor-Fernandez, V., Busto, E. & Gotor, V. (2006). Candida antarctica lipase B: an ideal biocatalyst for the preparation of nitrogenated organic compounds. Advanced Synthesis & Catalysis 348:797-812.
Cell-surface glycans, found on glycolipids and glycoproteins, play an important role in cell recognition, cell signaling, and cell-cell interactions. A great deal of information can be encoded on the oligosaccharides presented on cells by virtue of the many combinations by which different sugars can be linked. Deciphering and manipulating this sugar-encoded information has important implications in health and in understanding biology. Proper communication of this information is important within an organism for normal biological processes and healthy development, whereas miscommunication of glycan signals can have deleterious effects, for example autoimmune disease. Glycan-mediated communication can also be used for deception in the case of pathogen-host interactions when pathogenic microbes hijack and exploit the glycan signaling of host cells.
We show two examples where we have engineered carbohydrate-active enzymes by directed evolution or mechanism-based targeted mutation towards their use in manipulating glycan structure. By directed evolution, we engineered a blood-antigen cleaving glycosidase to improve its removal of carbohydrate A- and B-antigens that prevent the transfusion of blood between mismatched donors and recipients. By mechanism-based targeted mutation, we are also developing “glycosynthases” as enzymatic tools for building keratan sulfate glycosaminoglycan structures. The aim is to decipher the cell signals that these structures communicate, which are important in neuronal function and development yet have also been demonstrated to inhibit neuronal plasticity and regeneration after spinal cord injury.
Galectins are small soluble lectins that bind beta-galactosides via their carbohydrate recognition domain (CRD). Their ability to dimerize is critical for the crosslinking of glycoprotein receptors and subsequent cellular signaling. This is particularly important for their immunomodulatory role via the induction of T-cell apoptosis. Because galectins play a central role in many pathologies, including cancer, they represent valuable therapeutic targets for drugs or as biomarkers. At present, most inhibitors have been directed towards the CRD, a challenging task in terms of specificity given the high structural homology of the CRD among galectins. However, while the CRD β-galactoside binding site remains highly similar throughout galectin homologues, they display little sequence identity. This observation raises the possibility of targeting various galectins through the use of unusual ligands that would specifically bind galectins in a carbohydrate-independent manner. Here, we report non-carbohydrate ligands, porphyrin compounds functionalized with zinc ions, that specifically bind human galectin-7 (hGal-7). The medical appeal and relevance of porphyrins as photosensitizers in cancer treatment have been amply demonstrated, especially in tumor imaging and photodynamic therapy, potentially providing a means to use these binding affinities and intrinsic physicochemical imaging properties as hGal-7 markers in cancerous tissue progression. We used a combination of fluorescence and NMR titration experiments to specifically define and map the low-micromolar, non-carbohydrate binding sites of porphyrins on the surface of hGal-7.
We found that these porphyrin ligands offer limited selectivity with respect to charge and metal, and that their binding affinity to hGal-7 (~20 µM) is stronger than the previously characterized interactions mediated by glycan-binding residues in the CRD pocket, suggesting that the distinctively high porphyrin affinity to hGal-7 may be biologically significant. To our knowledge, these results highlight the first distinct and structurally characterized non-carbohydrate binding site on the surface of hGal-7, in addition to portraying the only structural characterization of porphyrin binding to human galectins to date.
Rhamnolipids are non-toxic and biodegradable surfactants mainly produced by Pseudomonas aeruginosa. They demonstrate excellent potential as substitutes for synthetic surfactants and are currently found in formulations of household cleaning products. Numerous other applications for these biosurfactants are currently being evaluated in cosmetics, detergents and in the bioremediation of soils. The bacterial strains that produce rhamnolipids generate a mixture of congeners with different lipophilic chain lengths. In Pseudomonas aeruginosa, the predominant chain length is 10 carbons. Since the physicochemical properties of rhamnolipids are directly influenced by their molecular structure, modification or improvement of their surfactant properties can be achieved by controlling the length of the carbon chains. In Pseudomonas aeruginosa, the enzyme RhlA is responsible for diverting the 10-carbon hydroxy fatty acids to the β-oxidation pathway, linking them to form dimer precursors of rhamnolipids. We have shown that it is possible to control the length of the fatty acid chain by changing the recognition pattern of the RhlA substrate through semi-rational mutagenesis. The main objective of this project is the functional and structural characterization of RhlA, improving its catalytic efficiency and changing its affinity for hydroxy fatty acids with different chain lengths to produce rhamnolipids with distinct surfactant properties. Preliminary mutagenesis results of RhlA will be presented.
Protein design is a rigorous test of our understanding of protein structure and stability. Almost all efforts in de novo protein design have been focused on creating idealized proteins composed of canonical structural elements. These studies are excellent for exploring the minimal determinants of protein structure, but idealized structures may not be the most effective starting points for engineering novel protein functions. Functional sites in proteins are often located in pockets, grooves or loops that are created from assemblies of secondary structure that do not form canonical or symmetric patterns. We have developed a computer-based strategy for designing proteins, called SEWING, that is not focused on creating a particular idealized structure, but rather can produce a diverse array of structures that all meet a set of predefined requirements. With SEWING, tertiary structures are assembled from structural motifs found in naturally occurring proteins. Motifs are stitched together by superimposing regions of structural similarity in two motifs. Advantages of this approach include the use of building blocks that are inherently designable and the ability to incorporate functional motifs from naturally occurring proteins, for instance protein and ligand binding sites. With SEWING we have been able to design novel helical bundle proteins with very high accuracy (< 1 Å RMSD between the design model and the crystal structure) and we are currently using the technique to embed protein-binding motifs in folded protein structures.
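The RMSD figure quoted above depends on first superimposing two coordinate sets. As an illustration only, the sketch below computes an RMSD after optimal rigid-body superposition using the Kabsch algorithm with NumPy. It is not the SEWING implementation; the function name kabsch_rmsd and the assumption that both inputs are matched (N, 3) arrays of equivalent atoms (for example Cα positions) are ours.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two matched (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove translation by centering both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Covariance matrix between the two sets and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (a reflection) in the optimal solution.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q and measure the residual deviation.
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))
```

In this sketch a rotated and translated copy of a structure gives an RMSD of essentially zero, while a mirror image does not, because reflections are explicitly excluded.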
One of the key elements of directed evolution of proteins is the cyclic use of mutagenesis and selection processes, giving rise to libraries containing millions of mutants. However, analyzing such a large number of mutants is not a trivial task, as the identification of active variants among millions of possibilities quickly becomes exhausting and inefficient. Here we describe a semi-rational combinatorial approach supported by virtual docking to generate smaller and smarter libraries. Because of its ability to perform the synthesis of esters in organic media, lipase B from Pseudozyma antarctica (CalB) was used as an industrially relevant model system. Since CalB displays very low activity towards bulky substrates, the main goal of this project was the development of CalB variants with enhanced synthetic activity towards bulky substrates. Substrate-imprinted docking was used to uncover target positions involved in the stabilization of the enzyme-substrate complex, identifying “hot spots” that are most likely to yield activity improvements for desired ligands. The Iterative Saturation Mutagenesis strategy was employed to sequentially incorporate favorable mutations, further increasing our chances of selecting improved variants with a concomitant reduction in screening effort. We tested a limited number of 164 mutants that explored 6 residue positions in the active-site cavity of CalB. For a single round of mutagenesis and selection against 2 different substrates, a number of variants showed up to a 5-fold increase in activity relative to WT CalB. These results represent the first stage in the development of additional CalB variants with improved activity towards bulky esters.
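To give a sense of why smaller libraries matter, the following minimal sketch compares the number of protein-level variants when all target positions are randomized simultaneously with the number visited when one position is saturated per step. This is a simplified counting model, not the library design used in the study: the function names are hypothetical, each step is assumed to saturate a single residue with the 19 non-parental amino acids, and codon redundancy and oversampling are ignored. It is not meant to reproduce the 164 mutants reported above.

```python
def full_combinatorial_size(n_positions, n_residues=20):
    """Protein-level variants when all positions are randomized at once."""
    if n_positions < 0:
        raise ValueError("n_positions must be non-negative")
    return n_residues ** n_positions

def sequential_saturation_size(n_positions, n_residues=20):
    """Variants along one path that saturates a single position per step.

    Assumes each step tests the (n_residues - 1) non-parental substitutions
    at one position and carries a single best hit forward.
    """
    if n_positions < 0:
        raise ValueError("n_positions must be non-negative")
    return n_positions * (n_residues - 1)

# Compare both strategies for three and six target positions.
for n in (3, 6):
    print(n, full_combinatorial_size(n), sequential_saturation_size(n))
```

Under these assumptions the simultaneous library grows exponentially with the number of positions, whereas the sequential path grows only linearly, at the cost of exploring a single trajectory through sequence space.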
Xylan has been identified as a physical barrier which limits cellulose accessibility by covering the outer surface of fibers and interfibrillar space. Therefore, tracking xylan is a prerequisite for understanding and optimizing lignocellulosic biomass processes.
In this study, we developed a novel xylan tracking approach using a two-domain probe called OC15, which consists of a fusion of Cellvibrio japonicus carbohydrate-binding domain 15 with the fluorescent protein mOrange2. The new probe specifically binds to xylan with an affinity similar to that of CBM15. The sensitivity of the OC15-xylan detection approach was compared to that of standard methods such as X-ray photoelectron spectroscopy (XPS) and chemical composition analysis (NREL/TP-510-42618). All three approaches were used to analyze the variations of xylan content of kraft pulp fibers. XPS, which allows for surface analysis of fibers, did not clearly indicate changes in xylan content. Chemical composition analysis responded to the changes in xylan content, but did not give any specific information related to the fiber surface. Interestingly, only the OC15 probe enabled the highly sensitive detection of xylan variations at the surface of kraft pulp fibers. Unlike the other methods, the OC15 probe can be used in a high-throughput format.
We developed a rapid and high throughput approach for the detection of changes in xylan exposure at the surface of paper fibers. The introduction of this method into the lignocellulosic biomass-based industries should revolutionize the understanding and optimization of most wood biomass processes.
Keywords: Carbohydrate-binding module, Fluorescent protein, Kraft pulp, X-ray photoelectron spectroscopy, Xylan, Xylanase.
A central challenge for protein engineering and evolution is to understand how genetic variation translates into a protein’s function. This is a non-trivial challenge because of both the large number of potential genotypes (e.g. variation at only 3 amino acid positions implies 20 × 20 × 20 = 8,000 possible combinations) and the existence of non-additive functional interactions between sites (i.e. epistasis). We have implemented an analytical method that extracts the genetic determinants of a quantitative function via the construction and comparison of nested linear models from complete sets of measured genetic variants. This analysis quantifies the “first order” effects (i.e. the average effect of specific genetic states at specific sites across all genetic backgrounds), as well as “higher order” epistatic interaction effects. Furthermore, it statistically assesses whether epistatic interactions are significant determinants of function. We have applied this analytical method to the natural evolution of steroid receptor transcriptional regulatory modules and the laboratory evolution of organophosphate-hydrolyzing enzymes. In each case, critical epistatic interactions determine these proteins’ functions. Additionally, we show how environmental variation can be integrated into this linear modeling approach such that non-additive genotype-by-environment interactions can also be statistically quantified and assessed for functional significance. Our findings have implications not only for the evolutionary trajectories that are available to these proteins, but also for the design of functionally optimized variants, for which the identification and understanding of higher order epistatic interactions is particularly vital.
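The idea of comparing nested linear models can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes two states per site coded as -1/+1 (so that each first-order coefficient reflects an effect averaged over all backgrounds), replicate measurements so that residual degrees of freedom remain, and ordinary least squares; all function names are hypothetical.

```python
import itertools
import numpy as np

def design_matrix(genotypes, order):
    """Intercept plus products of site states for all site sets up to `order`."""
    genotypes = np.asarray(genotypes, dtype=float)
    n_sites = genotypes.shape[1]
    cols = [np.ones(len(genotypes))]
    for k in range(1, order + 1):
        for sites in itertools.combinations(range(n_sites), k):
            cols.append(np.prod(genotypes[:, list(sites)], axis=1))
    return np.column_stack(cols)

def fit_rss(X, y):
    """Least-squares coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

def nested_f_statistic(genotypes, y, lower, higher):
    """F statistic testing whether terms above `lower` order improve the fit."""
    y = np.asarray(y, dtype=float)
    X0 = design_matrix(genotypes, lower)
    X1 = design_matrix(genotypes, higher)
    _, rss0 = fit_rss(X0, y)
    _, rss1 = fit_rss(X1, y)
    df_num = X1.shape[1] - X0.shape[1]
    df_den = len(y) - X1.shape[1]
    f_stat = ((rss0 - rss1) / df_num) / (rss1 / df_den)
    return f_stat, df_num, df_den

# Simulated example: 3 sites, all 8 genotypes, 3 replicates each,
# with one built-in pairwise interaction between sites 0 and 1.
rng = np.random.default_rng(1)
G = np.array(list(itertools.product([-1, 1], repeat=3)) * 3)
y = 1.0 + 0.5 * G[:, 0] - 0.3 * G[:, 1] + 0.8 * G[:, 0] * G[:, 1]
y = y + rng.normal(scale=0.05, size=len(y))
print(nested_f_statistic(G, y, lower=1, higher=2))
```

A large F statistic indicates that the pairwise terms explain variance that the additive model cannot. A p-value could then be obtained from the F distribution with the returned degrees of freedom, for example with scipy.stats.f.sf.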
Members of the ribonuclease (RNase) A superfamily have been associated with a great variety of biological functions, in addition to their strictly conserved ribonucleolytic activities. For example, some RNases have antibacterial, cytotoxic, angiogenic, immunosuppressive, antitumoral and antiviral properties. In humans, there are 8 members of the RNase A superfamily. These enzymes have rapidly evolved and possess various degrees of homology and enzymatic activity. The first crystal structure of RNase 6 (or RNase k6) was only recently solved in the presence of sulfate anions that bind at two distinct sites on the enzyme. We have crystallized RNase 6 in the presence of phosphate anions, thus also demonstrating two distinct binding sites for phosphate. One of these sites is located in loop 4 and has never been identified in any other member of the human RNases. The biophysical properties of RNases A, 4 and 6 were also analyzed by nuclear magnetic resonance (NMR) titration and by isothermal titration calorimetry (ITC) with two ligands: 3´-UMP and 5´-AMP. The similarities and differences between these analyses will be presented. The crystal structure of RNase 6 will also be compared with those of RNase A and RNase 4. Finally, the structural differences that may partially explain their functional identity will be discussed, therefore offering many essential clues towards the understanding of their biological functions.
John Haddad, Veronique Tremblay and Jean-Francois Couture
In eukaryotes, the SET1 family of methyltransferases carries out the methylation of Lysine 4 on Histone H3 (H3K4). Alone, these enzymes exhibit low enzymatic activity and require the presence of additional regulatory proteins, which include RbBP5, Ash2L, WDR5 and DPY-30, to stimulate their catalytic activity. While previous structural studies established the structural basis underlying the interaction between RbBP5, Ash2L and WDR5, the molecular underpinnings controlling the formation of the Ash2L/DPY-30 complex have remained largely unexplored. Here we report the crystal structure of the Ash2L/DPY-30 complex solved at 2.2 Å. The structure shows that the Ash2L C-terminus folds into two distinct domains that include a β-sandwich composed of 12 β-strands and a long α-helix located at the C-terminus of the protein. This amphipathic α-helix makes several hydrophobic interactions with the DPY-30 homodimer. Disruption of these interactions is deleterious for Ash2L/DPY-30 complex formation in vitro and in erythroid cells. Interestingly, close inspection of the Ash2L/DPY-30 heterotrimer reveals a large positive Fourier map located on the surface of the complex that is structurally analogous to a lipid. Binding assays show that the Ash2L/DPY-30 complex binds to anionic lipids in vitro with a preference for cardiolipin, a phospholipid found in mitochondrial membranes. Altogether, these results show that hydrophobic interactions are pivotal for the formation of the Ash2L/DPY-30 complex and that lipids may play a role in epigenetic signaling by modulating histone H3K4 methylation.
Structural, functional and evolutionary insights into domain-swapping mechanisms within the antiviral IFIT2 and IFIT3 proteins
The interferon induced proteins with tetratricopeptide repeats (IFITs) are a family of highly inducible, antiviral effectors whose expression is triggered downstream of interferon stimulation or viral infection. In humans, they consist of four well characterized paralogues: IFIT1, IFIT2, IFIT3, and IFIT5, all of which are cytoplasmic and structurally related. IFITs are composed of multiple copies of the tetratricopeptide repeat motif (TPR), a helix-turn-helix motif which usually mediates protein-protein interactions, through which IFITs are implicated in forming a multiprotein complex made up of IFIT1, IFIT2, IFIT3, and other host factors. The TPRs of IFIT proteins have also adapted to recognize RNA. Thus, IFIT1 and IFIT5 can target virus-derived 5´-triphosphate RNA, to inhibit the replication of some negative sense ssRNA viruses; IFIT1 can bind and sequester viral mRNA that is lacking ribose 2´-O methylation, targeting it for translational inhibition; and IFIT2 can bind double-stranded, AU-rich RNAs. IFIT3 has no known RNA binding. The crystal structures of full-length IFIT5, full-length IFIT2, and N-terminal IFIT1 have been determined, which altogether have uncovered a novel helical fold, and shed light on their RNA binding activities. Importantly, the crystal structure of IFIT2 revealed an N-terminal domain swapped interface, which was not present in IFIT1 or IFIT5.
The IFIT complex is poorly characterized, with little or no information on the mechanisms regulating complex formation, or its role in recognizing foreign viral RNA. To gain more insight into the role of IFIT3 within this complex, we determined the crystal structure of N-terminal IFIT3, which reveals an IFIT2-like domain-swapped dimer. Sequence conservation and structural analysis suggest a mechanism for domain swapping in IFIT3 and IFIT2 proteins, which likely arises through deletions in conserved hinge loops. Consistent with this, mutational analysis of IFIT3 and IFIT2 loops reverses domain-swapped dimerization. Altogether, the data suggest that IFIT proteins can be grouped into IFIT2-like (domain-swapped) and IFIT1-like (not domain-swapped). We propose that IFIT2 and IFIT3 form a complex in human cells through domain-swapped hetero-dimerization, and we speculate that IFIT3 may modulate RNA binding by IFIT2.
The enzymes of the ribonuclease (RNase) family possess identical or similar active site residues and a conserved fold architecture. The members of this family preserve varying degrees of ribonucleolytic activity but contribute to different biological functions. Their catalytic efficiencies also differ by 10^5- to 10^6-fold. The role of structure in enzyme catalysis has been investigated for some time. However, only recently have insights into the role of internal protein motions (protein dynamics) in enzyme catalysis become available. It is now believed that dynamics and structure together play a critical role in the function of biomolecules, including enzymes. Using theoretical modeling and atomistic molecular simulations at the microsecond time scale, we are investigating the role of functionally important conformational sub-states in relation to optimal catalysis by the members of the pancreatic RNase enzyme family. Previous studies indicate that there is no preference for a common substrate by all members. To obtain better insights into the mechanism of ribonucleolytic activity, we have modeled 7 human RNases and bovine RNase A, each with two different model substrates (ACAC and AUAU). The results in the ground (reactant) state indicate significant variations in the interactions of the human RNase family members with the model substrates. For some members these model substrates remain strongly bound to the active site, while for other members they are ejected within 10-20 nanoseconds. Overall, these studies are providing us with structural and dynamical insights into the substrate affinity of various members of this family.
Bacteria have evolved several dedicated and sophisticated assemblies to transport proteins across their biological membranes. Recent advances in our understanding of the molecular details governing the specific actions of these protein secretion systems have benefited from an integrated toolbox of X-ray crystallography, NMR, mass spectrometry, electron microscopy, and Rosetta-based molecular modeling. Highlights of recent advances in our piecewise structure/function analysis of the Type III secretion system “injectisome” will be presented. The molecular understanding of Type III systems being garnered from these studies provides the foundation for the development of new classes of vaccines and antimicrobials to combat infection in the clinic and community.
Tumor-associated calcium signal transducer 2 (TROP2, also known as TACSTD2) is a homodimeric type-I transmembrane glycoprotein that induces an invasive and migratory phenotype in tumor cells upon activation (1,2). The ligand and mechanism by which TROP2 becomes activated are unknown. To assess the function of TROP2, we have performed a series of deep mutational scanning and cell-based experiments. We surface displayed the extracellular domain of TROP2 (TROP2Ex) and used high-throughput conformational epitope mapping to identify the epitope of the neutralizing mAb m7E6 (3). We have determined that m7E6 binds to the membrane-distal portion of TROP2 in such a way that each arm of the antibody may bind to a separate subunit in the dimer. Thus, m7E6 may inhibit the dissociation of the TROP2 monomers, blocking cleavage of TROP2 by regulated intramembrane proteolysis and thereby preventing TROP2 signaling. We will report biophysical assays in support of this hypothesis. To further probe the mechanisms of TROP2 activation, we have constructed a panel of scFvs with nanomolar affinity for TROP2Ex. Application of this panel to further elucidate the mechanism and function of TROP2 will be discussed.
1. Stoyanova T, Goldstein AS, Cai H, Drake JM, Huang J, Witte ON. Regulated proteolysis of Trop2 drives epithelial hyperplasia and stem cell self-renewal via β-catenin signaling. Genes Dev. 2012;26(20):2271–85.
2. Vidmar T, Pavšič M, Lenarčič B. Biochemical and preliminary X-ray characterization of the tumor-associated calcium signal transducer 2 (Trop2) ectodomain. Protein Expr Purif. 2013;91(1):69–76. Available from: http://linkinghub.elsevier.com/retrieve/pii/S1046592813001320
3. Kowalsky CA, Faber MS, Nath A, Dann HE, Kelly VW, Liu L, et al. Rapid fine conformational epitope mapping using comprehensive mutagenesis and deep sequencing. J Biol Chem. 2015;jbc.M115.676635. Available from: http://www.jbc.org/lookup/doi/10.1074/jbc.M115.676635
"The origin of life cannot be discovered, it has to be re-invented." This statement by Albert Eschenmoser emphasizes that we cannot go back in time to observe how life originated. Instead, we must devise alternative systems to probe what might be possible. As an initial step toward life reinvented, we designed libraries of artificial proteins encoded by millions of synthetic genes. Many of these novel proteins fold into stable 3-dimensional structures and many bind biologically relevant molecules. Several of the novel proteins function in vivo providing essential functions necessary to sustain the growth of E. coli. These results suggest that (i) the molecular toolkit for life need not be limited to proteins that already exist in nature; (ii) artificial genomes and proteomes can be built from non-natural sequences; and (iii) synthetic organisms (life reinvented?) sustained by non-biological sequences may soon be possible.
The recently discovered squid-ring-tooth (SRT) protein family represents an attractive model system for the study of proteinaceous materials. These silk-like proteins, which make up the tough, flexible sucker rings on the tentacles of diverse squid species, can be produced recombinantly and manufactured into a variety of macroscopic objects via solution-phase or thermal processing. SRT sequences consist of alternating crystal-forming and amorphous domains, imparting a semicrystalline structure to the resulting biological material. However, significant diversity is observed in the amino-acid compositions, lengths, arrangements, and repeat numbers of these modules, and the dependence of the structure and material properties on such parameters is not yet known. To initiate more expansive studies on this system, we designed a minimal amorphous domain and crystalline domain based on a consensus of naturally occurring SRT sequences, and developed a new method to construct libraries of tandem-repeat genes from such building blocks. The resulting synthetic genes yield proteins that behave similarly to naturally occurring SRT proteins isolated from native and recombinant sources. To study the effect of repeat number (i.e., total protein length) on structure and material properties while holding all other parameters fixed, we prepared a panel of synthetic SRT proteins with a series of different lengths. Although the crystallinity, secondary structure, and yield strength of these samples were very similar, their ability to deform plastically ("stretchiness") increased robustly with protein length, suggesting possible molecular mechanisms for this property. These results validate our approach to probing sequence-structure-property relationships in protein materials, and show that synthetic SRTs represent a platform for designing materials with tunable properties, paving the way for high-end applications such as medical implants and photonics.
A long-standing question in evolutionary biology asks whether adaptation relies on stochastic events due to historical contingency, or rather follows a deterministic path, repeatedly imposed by specific constraints. The recent development of high-throughput techniques to determine fitness and sequence now enables the systematic, quantitative exploration of protein fitness landscapes, providing insights into the repeatability of evolutionary trajectories. A growing body of evidence strongly suggests that such trajectories are non-random, and identifies epistasis, i.e. non-additive interactions between mutations, as a key molecular constraint on the evolutionary routes across a fitness landscape. How do initial mutations affect the fixation (identity and order) of later mutations from a specific genetic background? What are the molecular mechanisms, at the genetic and structural levels, underlying alternative evolutionary trajectories?
To explore these issues, we “replayed” the evolution of a phosphotriesterase from P. diminuta (PTEWT) towards its promiscuous arylesterase substrate (2NH). Previously, this trajectory yielded a key mutation, H254R, in the first round, which led to a complete specificity swap upon the addition of 22 subsequent mutations (PTER/R18). By contrast, we used a phenotypically similar variant that carried a single amino acid substitution, H254S, as an evolutionary starting point. We found that over seven rounds of evolution, high arylesterase activity could also be achieved (PTES/R7, 70-fold increase in kcat/KM). In the 254S trajectory, three key mutations were fixed at identical positions (306, 233 and 271) and in a similar order compared with the 254R trajectory, suggesting that evolution is largely repeatable. Yet surprisingly, these three positions were mutated to distinct amino acids. A detailed mutational analysis of the evolved variants disclosed that extensive epistasis would prevent the accumulation of the S254 trajectory mutations on the R254 background, highlighting genetic incompatibility and contingency. Furthermore, structural characterization has revealed a complete reshaping of the active site in PTES/R7 compared to PTER/R18, enabling an entirely new binding orientation of the 2NH substrate that escapes a steric conflict existing in the 254R trajectory. Our findings demonstrate that epistasis stemming from alternative initial substitutions forces evolution to follow a distinct mutational path, and support the hypothesis that the adaptive landscape of PTE is highly rugged, where distinct functional sequences may constitute separate fitness peaks.
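As a rough illustration of what "non-additive interactions between mutations" means in practice, the sketch below scores epistasis as the deviation of a double mutant from the sum of its single-mutant effects on a log scale. This is a common textbook formulation assumed here for illustration, not necessarily the analysis used in the study; the function name and the example fold changes are hypothetical.

```python
import math

def log_epistasis(fold_a: float, fold_b: float, fold_ab: float) -> float:
    """Epistasis as deviation from log-additivity.

    fold_a, fold_b: activity of each single mutant relative to the
                    parental background (e.g. ratio of kcat/KM values).
    fold_ab:        activity of the double mutant relative to the same
                    background.
    Returns 0 for purely additive (multiplicative in activity) effects,
    a positive value for positive epistasis, negative for negative.
    """
    for value in (fold_a, fold_b, fold_ab):
        if value <= 0:
            raise ValueError("fold changes must be positive")
    expected = math.log10(fold_a) + math.log10(fold_b)
    return math.log10(fold_ab) - expected

# Hypothetical numbers, not data from the abstract:
additive = log_epistasis(10.0, 10.0, 100.0)    # singles combine as expected
incompatible = log_epistasis(10.0, 10.0, 1.0)  # gains are lost in combination
```

A strongly negative score for a mutation placed on an alternative background is the kind of signal that would indicate the genetic incompatibility described above.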
By binding to ice, antifreeze proteins (AFPs) depress the freezing point of a solution below its melting temperature while also being able to inhibit ice recrystallization if freezing does occur. Previous work showed that the activity of an AFP was incrementally increased by fusing it to another protein. Even larger increases in activity were achieved by doubling the number of ice-binding sites through dimerization. Here, we have taken two approaches to linking multiple AFPs together. Firstly, using a highly branched polymer known as a dendrimer with 16 reactive termini and a heterobifunctional cross-linker, we attached between 6 and 11 type III AFPs together. This heterogeneous sample of dendrimer-linked type III constructs showed a greater than 4-fold increase in freezing point depression over that of monomeric type III AFP. Additionally, attachment to the dendrimer has afforded the AFP superior recovery from heat denaturation. Alternatively, we used self-assembling protein cages to incorporate multiple AFPs into a single structure. These cages are composed of multiple copies of one or more subunits that bind through non-covalent interactions to form a multimeric structure with specific point group symmetry. AFPs were genetically fused to the termini of these protein-based structures, eliminating the need for conjugation reactions with varying levels of efficiency. Type III AFP bound to protein cages showed a greater increase in freezing point depression when compared to the monomeric AFP or dendrimer-linked AFP. Both versions of multimerized AFPs were particularly effective at ice recrystallization inhibition, likely because they can simultaneously bind multiple ice surfaces. Linking AFPs together via proteins or polymers can generate novel reagents for controlling ice growth and recrystallization.
Funded by CIHR and ERC
The selective degradation of many short-lived proteins in eukaryotic cells is carried out by the ubiquitin system. In eukaryotes, specific degradation signals, called degrons, target proteins for polyubiquitination and subsequent degradation. Degrons include a set of N-terminal signals called N-degrons. These signals are recognized by the N-recognins, a class of E3 ubiquitin ligases that bind to specific destabilizing N-terminal residues of protein substrates. N-degrons, defined by the Arg/N-end rule, relate the in vivo half-life of a protein substrate to the identity of its N-terminal residue and can be classified in two groups: type 1, composed of basic residues (Arg, Lys and His), and type 2, composed of bulky hydrophobic residues (Phe, Leu, Trp, Tyr and Ile). N-recognins share an ~70-residue zinc finger–like motif termed the ubiquitin-recognin (UBR) box. The UBR boxes of UBR1 and UBR2 are highly conserved and are thought to have a similar role in the N-end rule pathway. Previous work from our lab revealed the basis for arginine recognition by the UBR box. However, the mechanisms of recognition of the type 1 N-degrons lysine and histidine, as well as the role of the second and third positions in substrate recognition, remained elusive. Here, we report the crystal structure of the UBR box from UBR2 in complex with a destabilizing peptide bearing an N-terminal histidine. We defined the recognition mechanism of N-terminal lysine and histidine, which explains their lower affinities for the domain in comparison to arginine. We determined the specificity of the second and third positions of the substrate and found that proline in the second position abrogates binding to the UBR box even in the presence of N-terminal arginine. Moreover, we show that hydrophobic residues in the second position improve binding to the UBR box.
Our results explain the loss of function of UBR1 caused by the Val122Leu mutation in Johanson-Blizzard syndrome, which directly affects the hydrophobic pocket that binds the second position. Analysis of high-resolution crystal structures of UBR2 in complex with different N-degrons (0.79 and 1.1 Å) further defined the particularities of N-degron recognition. Finally, we showed that methylation of N-terminal arginine and lysine does not disrupt binding to the UBR box, opening the door to possible additional regulatory roles of the Arg/N-end rule pathway in mammals or to the rational design of inhibitors of the pathway. Our studies expand the understanding of N-degron recognition by the UBR family of proteins and redefine the N-end rule as a regulatory mechanism that depends not only on the N-terminal residue.
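The sequence rules summarized in this abstract can be sketched as a small lookup, shown here only as a reading aid. The function names are hypothetical, peptides are assumed to be given in one-letter code, and the screen encodes just two statements from the text (type 1 versus type 2 N-terminal residues, and proline at position 2 abrogating UBR-box binding); it also assumes the UBR box engages only type 1 residues and makes no attempt to predict affinities.

```python
# Illustrative sketch; names are hypothetical and not from the authors' work.
TYPE1_RESIDUES = {"R", "K", "H"}            # basic: Arg, Lys, His
TYPE2_RESIDUES = {"F", "L", "W", "Y", "I"}  # bulky hydrophobic

def classify_n_degron(peptide: str) -> str:
    """Classify a peptide by its N-terminal residue (one-letter code)."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    first = peptide[0].upper()
    if first in TYPE1_RESIDUES:
        return "type 1"
    if first in TYPE2_RESIDUES:
        return "type 2"
    return "none"

def passes_ubr_box_screen(peptide: str) -> bool:
    """Crude screen: type 1 N-terminus and no proline at position 2.

    Assumption: only type 1 residues are considered for the UBR box.
    """
    if classify_n_degron(peptide) != "type 1":
        return False
    has_pro_at_2 = len(peptide) > 1 and peptide[1].upper() == "P"
    return not has_pro_at_2

# Hypothetical peptides for illustration:
examples = ["RLAA", "RPAA", "HIFS", "FLAA", "MAAA"]
summary = {p: (classify_n_degron(p), passes_ubr_box_screen(p)) for p in examples}
```

Graded effects described in the text, such as the lower affinity of Lys and His relative to Arg or the improvement from a hydrophobic second residue, would need measured binding data and are deliberately left out of this yes/no screen.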
Enzymes are increasingly being used in pharmaceutical and industrial environments, particularly as greener and more efficient alternatives to chemical catalysts. However, engineering new enzyme reactions is an arduous and inefficient process, mainly because the effects of protein engineering on 3D structure, function and dynamics remain difficult to predict. Recent experimental evidence suggests that conformational exchange may be involved in promoting catalysis in many enzyme systems, but the mechanisms underlying this atomic flexibility remain unclear. It is still unknown whether sequence and/or structure are evolutionarily conserved to promote flexibility events linked to biological function among protein homologs. Understanding the phenomena underlying protein dynamics is thus an important step in facilitating protein engineering. To address these questions, we have used NMR to characterize the millisecond-timescale conformational exchange in various members of the ribonuclease A superfamily. While these enzymes display very similar structures, their evolutionary distance and diversified biological activities complicate flexibility-function analyses. To solve this issue, we have investigated mammalian homologs of human ribonuclease 3 (Eosinophil Cationic Protein, ECP), comparing the human enzyme with its close ECP homologs from Pongo pygmaeus and Macaca fascicularis. Our findings show that conformational exchange in the non-human primate enzymes strongly resembles that of their human counterpart, providing insights into the effects of sequence and phylogenetic diversity on protein dynamics. Further experiments are required to determine the exact biological roles of these enzymes and their dependence on atomic flexibility.
Nonribosomal peptide synthetases (NRPSs) are true macromolecular machines, using modular assembly-line logic, a complex catalytic cycle, moving parts and many active sites to make their bio-active products. We have determined a series of crystal structures of the initiation module of an antibiotic-producing NRPS, linear gramicidin synthetase. This module includes the specialized tailoring formylation domain, and we capture states that represent every major step of the assembly-line synthesis in the initiation module. The structures show how the formylation domain is incorporated into the NRPS architecture and how it has evolved to act in concert with the other domains in the synthetic cycle. The transitions between the sequential conformations in the cycle are very large, with both the peptidyl carrier protein and the adenylation subdomain undergoing huge movements to transport substrate between distal active sites. The structures highlight the great versatility of NRPSs, as small domains repurpose and recycle their limited interfaces to interact with their various binding partners. Together our published and unpublished work presents a holistic view of the function of this elegant NRPS initiation module.
Our understanding of the contribution of protein dynamics to function is still emerging. In a protein engineering context, do we need to take dynamics into account in order to maximize the fitness and function of the resulting proteins? Using high-resolution crystal structures, NMR relaxation dispersion and µs molecular dynamics simulations, we compare two naturally evolved homologous class A β-lactamases, TEM-1 and PSE-4, which share a high degree of structural and functional conservation. We observed a conservation of restricted dynamics on a catalytically relevant timescale. This is consistent with dynamics being an evolutionarily conserved feature. However, laboratory-engineered chimeric enzymes obtained by recombination of the two homologs exhibit striking dynamic differences, despite the function and structure being conserved. The laboratory-engineered chimeras are thus functionally and structurally tolerant to modified dynamics on the timescale of catalytic turnover. This tolerance of β-lactamases to dynamic changes could be linked to the high fitness of the naturally evolved proteins and implies that maintenance of native-like protein dynamics may not be essential when engineering functional proteins.