All Patches Matter, More Patches Better: Enhance AI-Generated Image Detection via Panoptic Patch Learning
Abstract
The exponential growth of AI-generated images (AIGIs) underscores the urgent need for robust and generalizable detection methods. In this paper, we establish two key principles for AIGI detection through systematic analysis. \textbf{(1) All Patches Matter:} Unlike conventional image classification, where discriminative features concentrate on object-centric regions, every patch in an AIGI inherently contains synthetic artifacts because the whole image is produced by a uniform generation process, so every patch is an important artifact source for detection. \textbf{(2) More Patches Better:} Leveraging artifacts distributed across more patches captures complementary forensic evidence and reduces over-reliance on specific patches, thereby improving robustness and generalization. However, our counterfactual analysis reveals an undesirable phenomenon: naively trained detectors often exhibit a \textbf{Few-Patch Bias}, discriminating between real and synthetic images based on only a minority of patches. We identify \textbf{Lazy Learner} behavior as the root cause: detectors preferentially learn conspicuous artifacts in a limited number of patches while neglecting the broader artifact distribution. To address this bias, we propose the \textbf{P}anoptic \textbf{P}atch \textbf{L}earning (PPL) framework, which involves: (1) Random Patch Replacement, which randomly substitutes synthetic patches with real counterparts to compel the model to identify artifacts in underutilized regions, encouraging broader use of more patches; and (2) Patch-wise Contrastive Learning, which enforces consistent discriminative capability across all patches, ensuring that all patches are utilized uniformly. Extensive experiments across two different settings on several benchmarks verify the effectiveness of our approach.
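To make the Random Patch Replacement idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes details the abstract does not specify: images are H x W x C arrays of identical shape, patches lie on a non-overlapping square grid, each replaced patch is taken from the co-located position of a real image, and a fixed fraction of patches is replaced. The function name `random_patch_replacement` and the parameters `patch` and `ratio` are hypothetical.

```python
import numpy as np


def random_patch_replacement(fake, real, patch=16, ratio=0.5, rng=None):
    """Swap a random subset of patches in `fake` for co-located patches of `real`.

    Illustrative assumptions (not taken from the paper): HWC arrays of equal
    shape, a non-overlapping square patch grid, and a fixed replacement ratio.

    Returns the mixed image and a (H // patch, W // patch) grid of patch
    labels, where 1 marks a synthetic patch and 0 marks a real patch.
    """
    if fake.shape != real.shape:
        raise ValueError("fake and real must have the same shape")
    h, w = fake.shape[:2]
    if h % patch or w % patch:
        raise ValueError("image height and width must be multiples of patch")
    rng = np.random.default_rng() if rng is None else rng

    grid_h, grid_w = h // patch, w // patch
    n_patches = grid_h * grid_w
    n_replace = int(round(ratio * n_patches))

    # Pick which patches become real, sampling without replacement.
    chosen = rng.choice(n_patches, size=n_replace, replace=False)
    labels = np.ones(n_patches, dtype=np.int64)
    labels[chosen] = 0
    labels = labels.reshape(grid_h, grid_w)

    # Expand the patch-level labels to a pixel-level boolean mask.
    pixel_mask = np.repeat(np.repeat(labels, patch, axis=0), patch, axis=1)
    pixel_mask = pixel_mask.astype(bool)

    # Keep synthetic pixels where the mask is True, real pixels elsewhere.
    mixed = np.where(pixel_mask[..., None], fake, real)
    return mixed, labels
```

Under these assumptions, the returned label grid could serve as patch-level supervision: a detector that relies on only a few conspicuous patches can no longer classify the mixed image reliably once those patches happen to be replaced by real content.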