ADAGE: Active Defenses Against GNN Extraction

arXiv

Jing Xu, Franziska Boenisch, Adam Dziedzic

Abstract

Graph Neural Networks (GNNs) achieve high performance in various real-world applications, such as drug discovery, traffic state prediction, and recommendation systems. Because building powerful GNNs requires large amounts of training data, substantial computing resources, and human expertise, these models are lucrative targets for model stealing attacks. Prior work has revealed that the threat vector of stealing attacks against GNNs is large and diverse: an attacker can leverage heterogeneous signals, ranging from node labels to high-dimensional node embeddings, to create a local copy of the target GNN at a fraction of the original training cost. This diversity makes effective, general defenses hard to design, and existing defenses usually focus on one particular stealing setup. Additionally, they only provide means to identify stolen model copies rather than preventing the attack. To close this gap, we propose the first general Active Defense Against GNN Extraction (ADAGE). ADAGE analyzes the queries to the GNN, tracks their diversity in terms of proximity to different communities identified in the underlying graph, and increases the defense strength as the fraction of queried communities grows. In this way, it can prevent stealing in all common attack setups. Our extensive experimental evaluation using six benchmark datasets, four GNN models, and three types of adaptive attackers shows that ADAGE penalizes attackers to the degree of rendering stealing impossible, while not harming predictive performance for legitimate users. ADAGE thereby contributes towards securely sharing valuable GNNs in the future.
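The mechanism outlined in the abstract (track which communities a user's queries touch, then scale the defense with the fraction covered) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify how communities are identified, how coverage maps to defense strength, or how outputs are penalized, so everything below is an assumption made for illustration: the names `CommunityQueryMonitor`, `defense_strength`, and `defend` are hypothetical, the logistic mapping and its `steepness` and `midpoint` parameters are placeholders, and blending predictions with random noise stands in for whatever penalty the paper actually applies.

```python
import math
import random
from collections import defaultdict


class CommunityQueryMonitor:
    """Hypothetical tracker: which communities has each user queried so far?"""

    def __init__(self, node_to_community, num_communities):
        # Assumed to be precomputed by some community detection step.
        self.node_to_community = node_to_community  # node id -> community id
        self.num_communities = num_communities
        self.seen = defaultdict(set)  # user id -> set of community ids

    def record(self, user, node):
        self.seen[user].add(self.node_to_community[node])

    def coverage(self, user):
        """Fraction of all communities this user's queries have touched."""
        return len(self.seen[user]) / self.num_communities


def defense_strength(coverage, steepness=10.0, midpoint=0.5):
    """Assumed monotone mapping from coverage in [0, 1] to strength in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * (coverage - midpoint)))


def defend(probabilities, strength, rng):
    """Blend a prediction with a random distribution (placeholder penalty)."""
    noise = [rng.random() + 1e-9 for _ in probabilities]
    total = sum(noise)
    noise = [n / total for n in noise]
    return [(1 - strength) * p + strength * n
            for p, n in zip(probabilities, noise)]


# Toy graph: five nodes spread over four communities.
monitor = CommunityQueryMonitor({0: 0, 1: 0, 2: 1, 3: 2, 4: 3}, num_communities=4)
rng = random.Random(0)

for node in [0, 1]:          # queries stay inside a single community
    monitor.record("user_a", node)
for node in [0, 2, 3, 4]:    # queries sweep across every community
    monitor.record("user_b", node)

prediction = [0.9, 0.05, 0.05]
for user in ("user_a", "user_b"):
    strength = defense_strength(monitor.coverage(user))
    print(user, round(strength, 3), defend(prediction, strength, rng))
```

In this toy setup, `user_a` queries a single community and receives almost unperturbed outputs, while `user_b` covers every community and receives outputs dominated by noise. The sketch only shows the shape of the idea: a per-user coverage statistic feeding a monotone penalty.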