Abstract
The RNA structure-function relationship has recently garnered significant attention within the deep learning community, and its importance is expected to grow as nucleic acid structure models advance. However, the absence of standardized and accessible benchmarks for deep learning on RNA 3D structures has impeded the development of models that predict RNA functional characteristics. In this work, we introduce a set of seven benchmarking datasets for RNA structure-function prediction, designed to address this gap. Our library builds on the established Python library rnaglib and offers easy data distribution and encoding, splitters, and evaluation methods, providing a convenient all-in-one framework for comparing models. Datasets are implemented in a fully modular and reproducible manner, facilitating community contributions and customization. Finally, we provide initial baseline results for all tasks using a graph neural network. Source code: https://github.com/cgoliver/rnaglib. Documentation: https://rnaglib.org