All Publications


  • Unsupervised evolution of protein and antibody complexes with a structure-informed language model. Science (New York, N.Y.) Shanker, V. R., Bruun, T. U., Hie, B. L., Kim, P. S. 2024; 385 (6704): 46-53

    Abstract

    Large language models trained on sequence information alone can learn high-level principles of protein design. However, beyond sequence, the three-dimensional structures of proteins determine their specific function, activity, and evolvability. Here, we show that a general protein language model augmented with protein structure backbone coordinates can guide evolution for diverse proteins without the need to model individual functional tasks. We also demonstrate that ESM-IF1, which was only trained on single-chain structures, can be extended to engineer protein complexes. Using this approach, we screened about 30 variants of two therapeutic clinical antibodies used to treat severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. We achieved up to 25-fold improvement in neutralization and 37-fold improvement in affinity against antibody-escaped viral variants of concern BQ.1.1 and XBB.1.5, respectively. These findings highlight the advantage of integrating structural information to identify efficient protein evolution trajectories without requiring any task-specific training data.

    View details for DOI 10.1126/science.adk8946

    View details for PubMedID 38963838

  • Inverse folding of protein complexes with a structure-informed language model enables unsupervised antibody evolution. bioRxiv : the preprint server for biology Shanker, V. R., Bruun, T. U., Hie, B. L., Kim, P. S. 2023

    Abstract

    Large language models trained on sequence information alone are capable of learning high-level principles of protein design. However, beyond sequence, the three-dimensional structures of proteins determine their specific function, activity, and evolvability. Here we show that a general protein language model augmented with protein structure backbone coordinates and trained on the inverse folding problem can guide evolution for diverse proteins without needing to explicitly model individual functional tasks. We demonstrate inverse folding to be an effective unsupervised, structure-based sequence optimization strategy that also generalizes to multimeric complexes by implicitly learning features of binding and amino acid epistasis. Using this approach, we screened ~30 variants of two therapeutic clinical antibodies used to treat SARS-CoV-2 infection and achieved up to 26-fold improvement in neutralization and 37-fold improvement in affinity against antibody-escaped viral variants-of-concern BQ.1.1 and XBB.1.5, respectively. In addition to substantial overall improvements in protein function, we find inverse folding performs with leading experimental success rates among other reported machine learning-guided directed evolution methods, without requiring any task-specific training data.

    View details for DOI 10.1101/2023.12.19.572475

    View details for PubMedID 38187780

    View details for PubMedCentralID PMC10769282

  • Efficient evolution of human antibodies from general protein language models. Nature biotechnology Hie, B. L., Shanker, V. R., Xu, D., Bruun, T. U., Weidenbacher, P. A., Tang, S., Wu, W., Pak, J. E., Kim, P. S. 2023

    Abstract

    Natural evolution must explore a vast landscape of possible sequences for desirable yet rare mutations, suggesting that learning from natural evolutionary strategies could guide artificial evolution. Here we report that general protein language models can efficiently evolve human antibodies by suggesting mutations that are evolutionarily plausible, despite providing the model with no information about the target antigen, binding specificity or protein structure. We performed language-model-guided affinity maturation of seven antibodies, screening 20 or fewer variants of each antibody across only two rounds of laboratory evolution, and improved the binding affinities of four clinically relevant, highly mature antibodies up to sevenfold and three unmatured antibodies up to 160-fold, with many designs also demonstrating favorable thermostability and viral neutralization activity against Ebola and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pseudoviruses. The same models that improve antibody binding also guide efficient evolution across diverse protein families and selection pressures, including antibiotic resistance and enzyme activity, suggesting that these results generalize to many settings.

    View details for DOI 10.1038/s41587-023-01763-2

    View details for PubMedID 37095349

    View details for PubMedCentralID 4410700

  • Proteasome rearranged peptides are a novel source of antigens for the peptide vaccine immunotherapy of glioblastomas. Fidanza, M., Gupta, P., Shanker, V., Wong, N., Skirboll, S., Lim, M., Wong, A. OXFORD UNIV PRESS INC. 2022: 138

  • Enhancing proteasomal processing improves survival for a peptide vaccine used to treat glioblastoma. Science translational medicine Fidanza, M., Gupta, P., Sayana, A., Shanker, V., Pahlke, S., Vu, B., Krantz, F., Azameera, A., Wong, N., Anne, N., Xia, Y., Rong, J., Anne, A., Skirboll, S., Lim, M., Wong, A. J. 2021; 13 (598)

    Abstract

    Despite its essential role in antigen presentation, enhancing proteasomal processing is an unexploited strategy for improving vaccines. pepVIII, an anticancer vaccine targeting EGFRvIII, has been tested in several trials for glioblastoma. We examined 20 peptides in silico and experimentally, which showed that a tyrosine substitution (Y6-pepVIII) maximizes proteasome cleavage and survival in a subcutaneous tumor model in mice. In an intracranial glioma model, Y6-pepVIII showed 62% and 31% improvements in median survival compared to control animals and pepVIII-vaccinated mice, respectively. Y6-pepVIII vaccination altered tumor-infiltrating lymphocyte subsets and expression of PD-1 on intratumoral T cells. Combination with anti-PD-1 therapy cured 45% of the Y6-pepVIII-vaccinated mice but was ineffective for pepVIII-treated mice. Liquid chromatography-tandem mass spectrometry analysis of proteasome-digested pepVIII and Y6-pepVIII revealed that most fragments were similar but more abundant in Y6-pepVIII digests, and that 77% resulted from proteasome-catalyzed peptide splicing (PCPS). We identified 10 peptides that bound human and murine MHC class I. Nine were PCPS products and only one peptide was colinear with EGFRvIII, indicating that PCPS fragments may be a component of MHC class I recognition. Despite not being colinear with EGFRvIII, two of three PCPS products tested were capable of increasing survival when administered independently as vaccines. We hypothesize that the immune response to a vaccine represents the collective contribution from multiple PCPS and linear products. Our work suggests a strategy to increase proteasomal processing of a vaccine that results in an augmented immune response and enhanced survival in mice.

    View details for DOI 10.1126/scitranslmed.aax4100

    View details for PubMedID 34135109