arXiv

Global Intervention and Distillation for Federated Out-of-Distribution Generalization

Zhuang Qi, Runhui Zhang, Lei Meng, Wei Wu, Yachong Zhang, Xiangxu Meng

Abstract

Attribute skew in federated learning leads local models to learn non-causal associations and pushes them toward inconsistent optimization directions, which inevitably results in performance degradation and unstable convergence. Existing methods typically use data augmentation to increase sample diversity or knowledge distillation to learn invariant representations. However, the unstable quality of generated data and the lack of domain information limit their performance on unseen samples. To address these issues, this paper presents a global intervention and distillation method, termed FedGID, which uses diverse attribute features for backdoor adjustment to break the spurious association between background and label. It includes two main modules. The global intervention module adaptively decouples objects and backgrounds in images and injects background information into random samples to intervene in the sample distribution; this links backgrounds to all categories and prevents the model from treating background-label associations as causal. The global distillation module leverages a unified knowledge base to guide the representation learning of client models, preventing local models from overfitting to client-specific attributes. Experimental results on three datasets demonstrate that FedGID enhances the model's ability to focus on the main subjects in unseen data and outperforms existing methods in collaborative modeling.
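
The background-injection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how objects and backgrounds are decoupled, so the sketch assumes that a per-pixel object mask is already available for every image, and the function names `inject_background` and `intervene_batch` are hypothetical. Each sample keeps its own object (and therefore its label) but receives the background of a randomly chosen sample, so that a given background co-occurs with many classes.

```python
import numpy as np


def inject_background(image, mask, background_source, source_mask):
    """Paste the object of `image` onto the background of `background_source`.

    image, background_source: float arrays of shape (H, W, C).
    mask, source_mask: arrays of shape (H, W) in [0, 1]; 1 marks object pixels.
    Assumption: the masks are given; the paper's adaptive decoupling is not modeled.
    """
    obj = mask[..., None]  # add a channel axis so the mask broadcasts over C
    # Keep only the background pixels of the source; its own object is zeroed out.
    background = background_source * (1.0 - source_mask[..., None])
    # Object pixels come from `image`, all other pixels from the borrowed background.
    return image * obj + background * (1.0 - obj)


def intervene_batch(images, masks, rng):
    """Give every sample the background of a randomly permuted sample.

    Labels are unchanged because each output keeps its original object.
    Returns the mixed images and the permutation that was used.
    """
    perm = rng.permutation(len(images))
    mixed = np.stack([
        inject_background(images[i], masks[i], images[j], masks[j])
        for i, j in enumerate(perm)
    ])
    return mixed, perm
```

A simple limitation of this sketch is that pixels where the source image's object used to be are left at zero rather than filled in; a practical pipeline would need some form of inpainting or feature-level mixing, which the abstract does not detail.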