Research

arXiv

One Pic is All it Takes: Poisoning Visual Document Retrieval Augmented Generation with a Single Image

Dan Ristea

Abstract

Multimodal retrieval augmented generation (M-RAG) has recently emerged as a method to inhibit hallucinations of large multimodal models (LMMs) through a factual knowledge base (KB). However, M-RAG also introduces new attack vectors for adversaries that aim to disrupt the system by injecting malicious entries into the KB. In this work, we present a poisoning attack against M-RAG targeting visual document retrieval applications, where the KB contains images of document pages. Our objective is to craft a single image that is retrieved for a variety of different user queries, and consistently influences the output produced by the generative model, thus creating a universal denial-of-service (DoS) attack against the M-RAG system. We demonstrate that while our attack is effective against a diverse range of widely used, state-of-the-art retrievers (embedding models) and generators (LMMs), it can also be ineffective against robust embedding models. Our attack not only highlights the vulnerability of M-RAG pipelines to poisoning attacks, but also sheds light on a fundamental weakness that potentially hinders their performance even in benign settings.
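
The retrieval goal stated in the abstract, a single KB entry that is returned for many different queries, can be illustrated at the embedding level. The sketch below is not the paper's attack: it involves no image, no real retriever, and no generator. It assumes synthetic query embeddings that share a common direction, injects one vector at their normalized mean, and measures how often that vector lands in the top-k results. All names (`poison_hit_rate`, `top_k`, the noise scales) are hypothetical and chosen only for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def gaussian(rng, dim, scale=1.0):
    """Draw a vector of independent zero-mean Gaussian samples."""
    return [rng.gauss(0.0, scale) for _ in range(dim)]


def poison_hit_rate(dim=256, n_queries=50, top_k=3, seed=0):
    """Fraction of queries whose top-k results include the injected entry.

    Everything here is synthetic; it is a toy model, not a real M-RAG pipeline.
    """
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(dim)

    # Assumption: query embeddings share a common direction plus per-query noise.
    common = normalize(gaussian(rng, dim))
    queries = [
        normalize([c + n for c, n in zip(common, gaussian(rng, dim, sigma))])
        for _ in range(n_queries)
    ]

    # One benign "page" embedding per query, placed close to that query.
    pages = [
        normalize([q + 0.5 * n for q, n in zip(query, gaussian(rng, dim, sigma))])
        for query in queries
    ]

    # The injected entry: one vector at the normalized mean of all queries.
    poison = normalize([sum(column) / n_queries for column in zip(*queries)])

    # Knowledge base = benign pages plus the single injected entry (last index).
    kb = pages + [poison]
    poison_index = len(kb) - 1

    hits = 0
    for query in queries:
        scores = [dot(query, entry) for entry in kb]
        ranked = sorted(range(len(kb)), key=lambda i: scores[i], reverse=True)
        if poison_index in ranked[:top_k]:
            hits += 1
    return hits / n_queries
```

Calling `poison_hit_rate()` returns the share of synthetic queries that retrieve the injected vector within the top `top_k` results. In this toy setup the share tends to be high, because a vector near the centre of the query embeddings is moderately similar to every query at once. Whether real retrievers behave this way depends on their embedding geometry, which this sketch does not model.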