RiboGen: RNA Sequence and Structure Co-Generation with Equivariant MultiFlow
Abstract
Ribonucleic acid (RNA) plays fundamental roles in biological systems, from carrying genetic information to performing enzymatic functions. Understanding and designing RNA can enable novel therapeutic applications and biotechnological innovation. To enhance RNA design, we introduce RiboGen, the first deep learning model to simultaneously generate RNA sequence and all-atom 3D structure. RiboGen combines standard Flow Matching with Discrete Flow Matching in a multimodal data representation. It is built on Euclidean equivariant neural networks, which efficiently process and learn three-dimensional geometry. Our experiments show that RiboGen can efficiently generate chemically plausible and self-consistent RNA samples. Our results suggest that co-generation of sequence and structure is a competitive approach for modeling RNA.
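To make the pairing of standard and Discrete Flow Matching concrete, the following is a minimal sketch of how one training example could be corrupted jointly at a time t in [0, 1]. It is an illustration under stated assumptions, not RiboGen's implementation: we assume a linear interpolation path with Gaussian noise for coordinates, a masking path for the sequence, and one 3D point per nucleotide for brevity (the model itself is all-atom). The names `corrupt_pair`, `NUCLEOTIDES`, and `MASK` are hypothetical.

```python
import random

NUCLEOTIDES = "ACGU"
MASK = "_"  # hypothetical mask token for the discrete path (an assumption)


def corrupt_pair(coords, seq, t, rng):
    """Jointly corrupt a structure and its sequence at time t in [0, 1].

    Continuous part (assumed linear path): x_t = (1 - t) * x_0 + t * x_1,
    with x_0 drawn from a standard Gaussian and regression target x_1 - x_0.
    Discrete part (assumed masking path): each token is kept with
    probability t and replaced by MASK otherwise.
    """
    noisy_coords, velocity = [], []
    for point in coords:
        noise = [rng.gauss(0.0, 1.0) for _ in point]
        noisy_coords.append([(1.0 - t) * n + t * x for n, x in zip(noise, point)])
        velocity.append([x - n for n, x in zip(noise, point)])
    noisy_seq = "".join(c if rng.random() < t else MASK for c in seq)
    return noisy_coords, velocity, noisy_seq


# Toy example: three nucleotides, one illustrative 3D point each.
rng = random.Random(0)
coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.5, 0.0]]
seq = "GAU"
x_t, v_target, s_t = corrupt_pair(coords, seq, 0.5, rng)
```

In this sketch a network would receive `x_t` and `s_t` together and be trained to predict the velocity target and the masked nucleotides. At t = 1 both modalities equal the clean data, and at t = 0 the structure is pure noise and the sequence is fully masked.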