PTDiffusion: Free Lunch for Generating Optical Illusion Hidden Pictures with Phase-Transferred Diffusion Model
Abstract
An optical illusion hidden picture is an intriguing visual perceptual phenomenon in which one image is cleverly integrated into another in a way that is not immediately obvious to the viewer. Building on an off-the-shelf text-to-image (T2I) diffusion model, we propose a novel training-free, text-guided image-to-image (I2I) translation framework, dubbed \textbf{P}hase-\textbf{T}ransferred \textbf{Diffusion} Model (PTDiffusion), for hidden art synthesis. PTDiffusion harmoniously embeds an input reference image into arbitrary scenes described by text prompts, producing illusion images that exhibit hidden visual cues of the reference image. At the heart of our method is a plug-and-play phase transfer mechanism that dynamically and progressively transplants the phase spectrum of diffusion features from the denoising process that reconstructs the reference image into the denoising process that samples the generated illusion image, realizing a deep fusion of the reference structural information and the textual semantic information in the latent space of the diffusion model. Furthermore, we propose asynchronous phase transfer to enable flexible control over the degree to which the hidden content is discernible. Our method requires no model training or fine-tuning, while substantially outperforming related text-guided I2I methods in image generation quality, text fidelity, visual discernibility, and contextual naturalness for illusion picture synthesis, as demonstrated by extensive qualitative and quantitative experiments. Our project is publicly available at \href{https://xianggao1102.github.io/PTDiffusion_webpage/}{this web page}.
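To make the idea of transplanting a phase spectrum concrete, the following is a minimal NumPy sketch of a single phase-transfer step between two feature maps. It is an illustration under stated assumptions, not the authors' implementation: the function name `phase_transfer`, the choice to keep the sampling branch's amplitude spectrum unchanged, and the use of a 2D FFT over the last two axes are all assumptions, and the dynamic, progressive, and asynchronous scheduling described in the abstract is not modeled.

```python
import numpy as np


def phase_transfer(sampling_feat, reference_feat):
    """Hypothetical single phase-transfer step (illustrative only).

    Keeps the amplitude spectrum of `sampling_feat` and replaces its
    phase spectrum with that of `reference_feat`. Both inputs are real
    arrays of identical shape; the 2D FFT runs over the last two axes.
    """
    sampling_spec = np.fft.fft2(sampling_feat)
    reference_spec = np.fft.fft2(reference_feat)

    amplitude = np.abs(sampling_spec)   # assumed to carry the text-driven content
    phase = np.angle(reference_spec)    # assumed to carry the reference structure

    mixed_spec = amplitude * np.exp(1j * phase)
    # For real inputs the mixed spectrum stays conjugate-symmetric, so the
    # imaginary part of the inverse transform is only rounding error.
    return np.real(np.fft.ifft2(mixed_spec))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sampling = rng.standard_normal((4, 8, 8))    # stand-in for sampling-branch features
    reference = rng.standard_normal((4, 8, 8))   # stand-in for reconstruction-branch features
    fused = phase_transfer(sampling, reference)
    print(fused.shape)
```

In this sketch, the output has the same amplitude spectrum as the sampling-branch features and the same phase spectrum as the reference features, which is one common way to combine "what" from one signal with "where" from another.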