One Pic is All it Takes: Poisoning Visual Document Retrieval Augmented Generation with a Single Image

Abstract

Multimodal retrieval augmented generation (M-RAG) has recently emerged as a method to inhibit hallucinations of large multimodal models (LMMs) through a factual knowledge base (KB). However, M-RAG also introduces new attack vectors for adversaries that aim to disrupt the system by injecting malicious entries into the KB. In this work, we present a poisoning attack against M-RAG targeting visual document retrieval applications, where the KB contains images of document pages. Our objective is to craft a single image that is retrieved for a variety of different user queries, and consistently influences the output produced by the generative model, thus creating a universal denial-of-service (DoS) attack against the M-RAG system. We demonstrate that while our attack is effective against a diverse range of widely used, state-of-the-art retrievers (embedding models) and generators (LMMs), it can also be ineffective against robust embedding models. Our attack not only highlights the vulnerability of M-RAG pipelines to poisoning attacks, but also sheds light on a fundamental weakness that potentially hinders their performance even in benign settings.
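
The retrieval side of the objective, a single KB entry that is returned for many different queries, can be illustrated with a toy example. The sketch below is not the method of the paper, which crafts an actual image against real embedding models; it works directly on hand-built vectors and assumes, purely for illustration, that every query embedding shares a common component in addition to a topic-specific one. The names `cosine`, `top1`, `docs`, `queries`, and `poison` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top1(query, kb):
    """Index of the KB entry most similar to the query."""
    return max(range(len(kb)), key=lambda i: cosine(query, kb[i]))


n = 4
# Benign KB: entry i covers "topic" i only (one-hot vectors).
docs = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
# Assumption: query i = topic i plus a component shared by all queries.
queries = [[value + 1.0 for value in doc] for doc in docs]
# A single injected entry aligned with the shared component.
poison = [1.0] * n

clean_hits = [top1(q, docs) for q in queries]
poisoned_hits = [top1(q, docs + [poison]) for q in queries]

print(clean_hits)     # each query retrieves its own page
print(poisoned_hits)  # every query retrieves the injected entry (index 4)
```

In this toy setting each query retrieves its matching page from the clean KB, while after injection all four queries return the same entry. Whether real embedding models exhibit such a shared component is not something this sketch establishes.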

One Pic is All it Takes: Poisoning Visual Document Retrieval Augmented Generation with a Single Image - arXiv