Abstract
We introduce Split Unlearning, a novel machine unlearning technique designed for Split Learning (SL) that enables the first implementation of Sharded, Isolated, Sliced, and Aggregated (SISA) unlearning in SL frameworks. In existing SL frameworks, the tight coupling between clients and the server results in frequent bidirectional data flows and iterative training across all clients. This violates the "Isolated" principle and makes it difficult to implement SISA for independent and efficient unlearning. To address this, we propose SplitWiper, which uses a new one-way-one-off propagation scheme. The scheme leverages the inherently "Sharded" structure of SL and decouples neural signal propagation between clients and the server, enabling effective SISA unlearning even when some clients are absent. We further design SplitWiper+ to enhance client label privacy: it integrates differential privacy with a label expansion strategy to protect client labels against the server and other potential adversaries. Experiments across diverse data distributions and tasks demonstrate that SplitWiper achieves 0% accuracy on unlearned labels and 8% higher accuracy on retained labels than non-SISA unlearning in SL. Moreover, the one-way-one-off propagation maintains constant overhead, reducing computational and communication costs by 99%. SplitWiper+ preserves 90% of label privacy when sharing masked labels with the server.