Project

Evolutron: Deep Learning for Protein Design

Groups

Technological advances in the past decade have allowed us to take a close look at the proteomes of living organisms. As a result, more than 120,000 solved protein structures are readily available, and we are still on an exponential growth curve. By looking at the proteomes of current living organisms, we are essentially taking snapshots of the successful results in this evolutionary process of continuous adaptation to the environment. Could we process the information available to us from nature to design new proteins, without the need for millions of years of Darwinian evolution?

To answer this question, we are developing an integrated Deep Learning framework for the evolutionary analysis, search, and design of proteins, which we call Evolutron. Evolutron is based on a hierarchical decomposition of proteins into a set of functional motif embeddings. Two of our strongest motivations for this work are gene therapy and drug discovery. In both cases, protein analysis and design play a fundamental role in the implementation of safe and effective therapeutics.