HCAF-DTA: Drug-Target Binding Affinity Prediction with Cross-Attention Fused Hypergraph Neural Networks

Abstract

Accurate prediction of the binding affinity between drugs and target proteins is a core task in computer-aided drug design. Existing deep learning methods tend to overlook both the substructure features within drug molecules and the information carried by drug-target interactions, which limits their predictive performance. In this paper, we propose HCAF-DTA, a drug-target binding affinity prediction model based on a cross-attention fused hypergraph neural network. In the feature extraction stage, the model introduces a hypergraph representation: a hypergraph of each drug molecule is constructed with a tree decomposition algorithm, and a hypergraph neural network is fused with a graph neural network through skip connections to extract both substructure and global features, with hyperedges efficiently representing functional groups and other key chemical features. For protein feature extraction, a weighted graph is built from the contact map predicted by the ESM model, and a multilayer graph neural network captures spatial dependencies. In the prediction stage, a bidirectional multi-head cross-attention mechanism models intermolecular interactions from the dual perspectives of atoms and amino acids, and an attention mechanism fuses the correlated cross-modal features. Experiments on benchmark datasets, including Davis and KIBA, show that HCAF-DTA outperforms state-of-the-art methods on all three evaluation metrics, including mean squared error (MSE): it reaches an MSE of 0.198 on Davis and 0.122 on KIBA, an improvement of up to 4% over the best baseline.
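
The bidirectional multi-head cross-attention described in the abstract can be illustrated with a minimal sketch. This is not the HCAF-DTA implementation: the function names, the random projection matrices, the omission of an output projection, and the mean-pool-and-concatenate readout are all assumptions made for illustration, and the code uses only the Python standard library so it runs without dependencies. Atom features attend over residue features, residue features attend over atom features, and the two views are pooled into one joint vector.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context, w_q, w_k, w_v, num_heads):
    """Multi-head attention in which `queries` attend over `context`."""
    q, k, v = matmul(queries, w_q), matmul(context, w_k), matmul(context, w_v)
    d_head = len(q[0]) // num_heads
    out = [[] for _ in queries]
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head  # feature slice owned by head h
        for i, q_row in enumerate(q):
            scores = [
                sum(a * b for a, b in zip(q_row[lo:hi], k_row[lo:hi])) / math.sqrt(d_head)
                for k_row in k
            ]
            weights = softmax(scores)
            # Weighted sum of value slices; heads are concatenated along features.
            out[i].extend(
                sum(w * v_row[c] for w, v_row in zip(weights, v)) for c in range(lo, hi)
            )
    return out


def mean_pool(rows):
    """Average a list of feature rows into one vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def bidirectional_fusion(atoms, residues, params, num_heads=2):
    """Assumed readout: pool each attended view, then concatenate."""
    drug_view = cross_attention(atoms, residues, *params["drug"], num_heads)
    prot_view = cross_attention(residues, atoms, *params["prot"], num_heads)
    return mean_pool(drug_view) + mean_pool(prot_view)


def rand_matrix(n_rows, n_cols):
    return [[random.gauss(0, 1) for _ in range(n_cols)] for _ in range(n_rows)]


random.seed(0)
d = 8                          # hypothetical shared feature width
atoms = rand_matrix(5, d)      # 5 atom embeddings (stand-in for drug encoder output)
residues = rand_matrix(7, d)   # 7 residue embeddings (stand-in for protein encoder output)
params = {side: [rand_matrix(d, d) for _ in range(3)] for side in ("drug", "prot")}
fused = bidirectional_fusion(atoms, residues, params, num_heads=2)
print(len(fused))  # 16
```

In a full model, the fused vector would feed a regression head that outputs the predicted affinity, and the projection matrices would be learned rather than sampled.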