U of T researchers' AI model designs proteins to deliver gene therapy

Dubbed ProteinVAE, the model can be trained to learn the characteristics of a long protein using limited data

Michael Garton, left, an assistant professor of biomedical engineering, and PhD candidate Suyue Lyu, right, used AI to custom-design variants of hexons that are distinct from natural sequences to help evade the immune system (photo by Qin Dai)

Researchers at the University of Toronto used an artificial intelligence framework to redesign a crucial protein involved in the delivery of gene therapy.

The study, published in Nature Machine Intelligence, describes new work optimizing proteins to mitigate immune responses, thereby improving the efficacy of gene therapy and reducing side effects.

“Gene therapy holds immense promise, but the body’s pre-existing immune response to viral vectors greatly hampers its success. Our research zeroes in on hexons, a fundamental protein in adenovirus vectors, which – but for the immune problem – hold huge potential for gene therapy,” says Michael Garton, an assistant professor at the Institute of Biomedical Engineering in the Faculty of Applied Science & Engineering.

“Immune responses triggered by serotype-specific antibodies pose a significant obstacle in getting these vehicles to the right target; this can result in reduced efficacy and severe adverse effects.”

To address the issue, Garton’s lab used AI to custom-design variants of hexons that are distinct from natural sequences.

“We want to design something that is distant from all human variants and is, by extension, unrecognizable by the immune system,” says PhD candidate Suyue Lyu, who is lead author of the study.

Traditional methods of designing new proteins often involve extensive trial and error and mounting costs. By using an AI-based approach for protein design, researchers can achieve a higher degree of variation, reduce costs and quickly generate simulation scenarios before homing in on a specific subset of targets for experimental testing.

While numerous protein-design frameworks exist, it can be challenging for researchers to design new hexon variants because few natural sequences are available and hexons are relatively large – consisting, on average, of 983 amino acids.

With this in mind, Lyu and Garton developed a different AI framework. Dubbed ProteinVAE, the model can be trained to learn the characteristics of a long protein using limited data. Despite its compact design, ProteinVAE exhibits a generative capability comparable to larger available models.

“Our model takes advantage of pre-trained protein language models for efficient learning on small datasets. We also incorporated many tailored engineering approaches to make the model suitable for generating long proteins,” says Lyu, adding that ProteinVAE was intentionally designed to be lightweight. “Unlike other, considerably larger models that demand high computational resources to design a long protein, ProteinVAE supports fast training and inference on any standard GPUs. This feature could make the model more friendly for other academic labs.

“Our AI model, validated through molecular simulation, demonstrates the ability to change a significant percentage of the protein’s surface, potentially evading immune responses.”

The next step is experimental testing in a wet lab, Lyu adds.

Garton believes the AI model can be used beyond protein design for gene therapy and could be expanded to support protein design for other diseases.

“This work indicates that we are potentially able to design new subspecies and even species of biological entities using generative AI,” he says, “and these entities have therapeutic value that can be used in novel medical treatments.”

The research was supported by the Canadian Institutes of Health Research and the Natural Sciences and Engineering Research Council of Canada.