RiboGen: RNA Sequence and Structure Co-Generation


RNA is a fundamental biomolecule that stands at the intersection of modern biology and the origins of life. It carries genetic instructions, regulates cellular processes, and participates in countless biological mechanisms through its complex 3D structures. Therefore, understanding and designing RNA is important for various therapeutic applications and biotechnological innovation. 

In our most recent paper we introduce RiboGen, the first deep learning model to simultaneously generate RNA sequence and all-atom 3D structure. While previous approaches could predict individual components of the RNA sequence-structure relationship independently, RiboGen co-generates sequence, backbone and atomic features all at once. This allows for novel exploration of the sequence-structure landscape in ways that weren't previously possible.

RiboGen uses Flow Matching and Discrete Flow Matching, techniques that allow it to process multimodal data, handling both the sequence (a string of nucleotides) and the 3D geometry of RNA molecules. At its core, RiboGen employs Euclidean Equivariant Neural Networks, which respect the symmetries and spatial relationships within molecular structures, creating a powerful new tool for RNA design and exploration.
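
To make the idea of flow matching more concrete, here is a minimal, generic sketch in plain Python. It is not RiboGen's implementation: the straight-line path, the fixed-step Euler sampler, and the mask-based corruption used as a discrete analogue are all illustrative assumptions, and every function name (`interpolate`, `target_velocity`, `euler_sample`, `mask_sequence`) is hypothetical. In a real model, a neural network would be trained to predict the velocity; here a known velocity stands in for it.

```python
import random


def interpolate(x0, x1, t):
    """Point at time t in [0, 1] on the straight line from noise x0 to data x1."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight-line path: the regression target for a learned field."""
    return [b - a for a, b in zip(x0, x1)]


def euler_sample(velocity_fn, x0, steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t = 0 to t = 1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def mask_sequence(seq, t, rng, mask="_"):
    """Assumed discrete analogue: keep each token with probability t, else mask it."""
    return "".join(c if rng.random() < t else mask for c in seq)


if __name__ == "__main__":
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
    data = [1.0, 2.0, 3.0]
    v = target_velocity(noise, data)
    # A trained network would replace this constant "oracle" velocity.
    print(euler_sample(lambda x, t: v, noise))
    print(mask_sequence("GGCUAAGCC", 0.5, rng))
```

With the oracle velocity, the sampler transports the noise vector onto the data point. In the discrete case, `t = 0` yields a fully masked sequence and `t = 1` the original one, mirroring the continuous path from noise to data.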

Our model represents an RNA molecule through three components:

  1. Sequence - The series of nucleotides (A, C, G, U) that form the RNA chain
  2. Coordinates - The 3D positioning of each nucleotide's center
  3. Geometric features - The relative positions of all nucleotide heavy atoms
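
One way to picture this three-part representation is as a small container type. The sketch below is purely illustrative: the class name `RNASample`, its field names, and the choice to store heavy-atom positions as offsets from each nucleotide's center are assumptions made for the example, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
NUCLEOTIDES = "ACGU"


@dataclass
class RNASample:
    sequence: str                    # 1. one letter per nucleotide
    centers: List[Vec3]              # 2. 3D position of each nucleotide's center
    atom_offsets: List[List[Vec3]]   # 3. heavy atoms, relative to the center (assumed)

    def __post_init__(self):
        if any(n not in NUCLEOTIDES for n in self.sequence):
            raise ValueError("sequence may only contain A, C, G, U")
        if not (len(self.sequence) == len(self.centers) == len(self.atom_offsets)):
            raise ValueError("all three components need one entry per nucleotide")

    def all_atom_coordinates(self) -> List[List[Vec3]]:
        """Recover absolute heavy-atom positions by adding each offset to its center."""
        return [
            [(c[0] + o[0], c[1] + o[1], c[2] + o[2]) for o in offsets]
            for c, offsets in zip(self.centers, self.atom_offsets)
        ]


if __name__ == "__main__":
    sample = RNASample(
        sequence="AC",
        centers=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        atom_offsets=[[(0.0, 1.0, 0.0)], [(0.0, 0.0, 2.0)]],
    )
    print(sample.all_atom_coordinates())
```

Keeping the atoms relative to a per-nucleotide center separates coarse placement (where each nucleotide sits) from local geometry (how its atoms are arranged), which is the split the list above describes.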

Our experiments demonstrate that RiboGen can efficiently generate chemically plausible and self-consistent RNA samples. We evaluated our model's performance using:

  • Chemical validity analysis - Examining critical dihedral angles and ribose puckering that define RNA backbone and base conformations
  • Self-consistency validation - Comparing our generated structures to those predicted by RhoFold, a state-of-the-art RNA structure prediction tool
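
The chemical validity check rests on dihedral (torsion) angles, each defined by four consecutive atoms. As a hedged illustration of that building block only, the following sketch computes one dihedral angle from four 3D points using the standard projection formula. It is not the evaluation code used for RiboGen, and the function name `dihedral` is hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees, in (-180, 180], about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    d0, d2 = _dot(b0, b1), _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Four atoms in a planar "cis" arrangement, then a "trans" one.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

Applied to an RNA backbone, the four points would be consecutive backbone atoms; for example, the alpha torsion is defined by O3' of the previous residue, P, O5' and C5'.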

The Future of RNA Design

Our findings suggest that co-generation of sequence and structure is a competitive approach for modeling RNA and opens new possibilities for RNA-based innovations. This is just the beginning: this work is an important step forward in our ability to understand and manipulate RNA, with potential applications ranging from drug development to synthetic biology.

This research was made possible through the support of the Eleven Eleven Foundation, the Center for Bits and Atoms, and the MIT Media Lab Consortium.