MAD: Makeup All-in-One with Cross-Domain Diffusion Model

arXiv

Abstract

Existing makeup techniques often require multiple models to handle different inputs and to align features across domains for different makeup tasks (e.g., beauty filter, makeup transfer, and makeup removal), which increases complexity. Another limitation is the absence of text-guided makeup try-on, which is more user-friendly because it requires no reference image. In this study, we make the first attempt to use a single model for various makeup tasks. Specifically, we formulate different makeup tasks as cross-domain translations and leverage a cross-domain diffusion model to accomplish all of them. Unlike existing methods that rely on separate encoder-decoder configurations or cycle-based mechanisms, we propose using different domain embeddings to facilitate domain control. This allows seamless domain switching within a single model by merely changing the embedding, reducing the reliance on additional modules for different tasks. Moreover, to support precise text-to-makeup applications, we introduce the MT-Text dataset, which extends the MT dataset with textual annotations, advancing the practicality of makeup technologies.
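
The idea of switching domains by changing only an embedding can be illustrated with a minimal sketch. This is not the paper's implementation: the class `SharedDenoiser`, the domain names, the additive conditioning, and the toy update rule in `translate` are all assumptions made for illustration, and a real diffusion model would use a trained network and a proper noise schedule. The sketch only shows that one set of weights can serve several domains when a per-domain vector is the sole domain-specific input.

```python
import math
import random

EMBED_DIM = 4
# Hypothetical domain names, chosen for illustration only.
DOMAINS = ("non_makeup", "makeup")


class SharedDenoiser:
    """Toy stand-in for one denoising network shared by every domain."""

    def __init__(self, dim, domains, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # One set of weights, used for all domains.
        self.weights = [
            [rng.gauss(0.0, 0.5) for _ in range(dim)] for _ in range(dim)
        ]
        # One vector per domain; this is the only domain-specific state.
        self.domain_embeddings = {
            name: [rng.gauss(0.0, 1.0) for _ in range(dim)] for name in domains
        }

    def predict_noise(self, x, t, domain):
        """Predict noise for latent x at timestep t, conditioned on a domain."""
        emb = self.domain_embeddings[domain]
        # Assumed additive conditioning: latent + domain embedding + timestep signal.
        h = [xi + ei + math.sin(t) for xi, ei in zip(x, emb)]
        return [
            math.tanh(sum(w * hj for w, hj in zip(row, h)))
            for row in self.weights
        ]


def translate(model, x, target_domain, steps=5, step_size=0.1):
    """Run a few toy denoising updates conditioned on the target domain."""
    for t in range(steps, 0, -1):
        eps = model.predict_noise(x, t, target_domain)
        # Build a new list each step so the caller's latent is not mutated.
        x = [xi - step_size * ei for xi, ei in zip(x, eps)]
    return x


model = SharedDenoiser(EMBED_DIM, DOMAINS)
latent = [0.2, -0.1, 0.4, 0.0]

# Same model, same input; only the domain embedding changes.
as_makeup = translate(model, latent, "makeup")
as_non_makeup = translate(model, latent, "non_makeup")
```

In this sketch, supporting a further task would mean adding one more entry to `domain_embeddings` rather than another encoder-decoder pair, which is the kind of saving the abstract attributes to domain embeddings.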