KernelFusion: Assumption-Free Blind Super-Resolution via Patch Diffusion

arXiv

Abstract

Traditional super-resolution (SR) methods assume an "ideal" downscaling SR-kernel (e.g., bicubic downscaling) between the high-resolution (HR) image and the low-resolution (LR) image. Such methods fail once the LR image is generated differently. Current blind-SR methods aim to remove this assumption, but are still fundamentally restricted to rather simplistic downscaling SR-kernels (e.g., anisotropic Gaussian kernels), and fail on more complex (out-of-distribution) downscaling degradations. However, using the correct SR-kernel is often more important than using a sophisticated SR algorithm. In "KernelFusion" we introduce a zero-shot diffusion-based method that makes no assumptions about the kernel. Our method recovers the unique image-specific SR-kernel directly from the LR input image, while simultaneously recovering its corresponding HR image. KernelFusion exploits the principle that the correct SR-kernel is the one that maximizes patch similarity across different scales of the LR image. We first train an image-specific patch-based diffusion model on the single LR input image, capturing its unique internal patch statistics. We then reconstruct a larger HR image with the same learned patch distribution, while simultaneously recovering the correct downscaling SR-kernel that maintains this cross-scale relation between the HR and LR images. Empirical results show that KernelFusion vastly outperforms all SR baselines on complex downscaling degradations, where existing state-of-the-art (SotA) blind-SR methods fail miserably. By breaking free from predefined kernel assumptions, KernelFusion pushes blind-SR into a new assumption-free paradigm, handling downscaling kernels previously thought impossible.
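
To make the role of the SR-kernel concrete, the sketch below simulates how an LR image could arise from an HR image. It assumes the image-formation model commonly used in the blind-SR literature, in which the HR image is convolved with a kernel and then subsampled; the abstract does not state this formula, so treat it as an assumption. The names `gaussian_kernel` and `downscale` are hypothetical helpers for illustration only and are not part of KernelFusion.

```python
import numpy as np


def gaussian_kernel(size, sigma_x, sigma_y):
    """Anisotropic Gaussian kernel, normalized to sum to 1 (illustrative helper)."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 / (2 * sigma_x**2) + yy**2 / (2 * sigma_y**2)))
    return k / k.sum()


def downscale(hr, kernel, scale):
    """Assumed degradation model: convolve HR with `kernel`, then keep every
    `scale`-th pixel. Expects a 2-D grayscale image and an odd-sized kernel."""
    kh, kw = kernel.shape
    padded = np.pad(hr, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="reflect")
    # All kh-by-kw neighborhoods of the padded image, one per output pixel.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (kh, kw))
    # Flip the kernel so this is a true convolution rather than a correlation.
    blurred = np.einsum("ijkl,kl->ij", windows, kernel[::-1, ::-1])
    return blurred[::scale, ::scale]


# The same HR image degraded by two different kernels yields different LR images.
rng = np.random.default_rng(0)
hr = rng.random((64, 64))                      # stand-in for a real HR image
k_gauss = gaussian_kernel(7, 1.0, 2.5)         # kernel family typical blind-SR methods assume
k_diag = np.eye(7) / 7.0                       # a non-Gaussian, diagonal-streak kernel
lr_gauss = downscale(hr, k_gauss, scale=2)
lr_diag = downscale(hr, k_diag, scale=2)
print(lr_gauss.shape, lr_diag.shape)
print("mean abs difference:", np.abs(lr_gauss - lr_diag).mean())
```

Under this assumed model, an SR method that inverts the wrong kernel is reconstructing from a mismatched degradation, which is why the abstract stresses that choosing the correct SR-kernel can matter more than the sophistication of the SR algorithm.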