arXiv

Bridging the Gap between Gaussian Diffusion Models and Universal Quantization for Image Compression

Lucas Relic, Yang Zhang

Abstract

Generative neural image compression supports data representation at extremely low bitrates, synthesizing details at the client and consistently producing highly realistic images. By leveraging the similarity between quantization error and additive noise, diffusion-based generative image compression codecs can be built using a latent diffusion model to "denoise" the artifacts introduced by quantization. However, we identify three critical gaps in previous approaches following this paradigm (namely, the noise level, noise type, and discretization gaps) that cause the quantized data to fall outside the data distribution known to the diffusion model. In this work, we propose a novel, theoretically founded quantization-based forward diffusion process that addresses all three gaps. We achieve this through universal quantization with a carefully tailored quantization schedule and a diffusion model trained with uniform noise. Compared to previous work, our proposal produces consistently realistic and detailed reconstructions, even at very low bitrates. In this regime, we achieve the best rate-distortion-realism performance, outperforming previous related work.
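The link between universal quantization and uniform noise can be illustrated with a minimal sketch. It shows only the classical dithered-quantization property: when encoder and decoder share a uniform dither, the reconstruction error behaves like additive uniform noise that is independent of the signal. It is not the quantization schedule or codec proposed in the paper. The function name `universal_quantize`, the step size `step`, and the Gaussian stand-in for latents are assumptions made for illustration.

```python
import numpy as np


def universal_quantize(x, step, rng):
    """Dithered (universal) quantization with a shared uniform dither.

    Hypothetical helper for illustration; `step` is an assumed scalar bin width.
    """
    # Dither u ~ U(-step/2, step/2). In a real codec the encoder and decoder
    # would regenerate it from a shared pseudo-random seed.
    u = rng.uniform(-step / 2, step / 2, size=x.shape)
    # Encoder side: integer symbols that would be entropy coded.
    k = np.round((x + u) / step).astype(np.int64)
    # Decoder side: rescale and subtract the same dither.
    x_hat = k * step - u
    return k, x_hat


rng = np.random.default_rng(0)
x = rng.normal(size=100_000)  # stand-in for latent values (assumption)
step = 0.5
k, x_hat = universal_quantize(x, step, rng)
err = x_hat - x

# The error stays within half a bin, and its variance is close to that of
# U(-step/2, step/2), which is step**2 / 12.
print("max |error|:", np.abs(err).max())
print("error variance:", err.var(), "uniform variance:", step**2 / 12)
print("correlation with signal:", np.corrcoef(x, err)[0, 1])
```

Because the error is statistically equivalent to uniform noise, a denoiser trained on uniform rather than Gaussian noise sees inputs that match its training distribution, which is the noise type gap the abstract refers to.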