Abstract
This paper explores scene affinity (AIScene), namely intra-scene consistency and inter-scene correlation, for semi-supervised LiDAR semantic segmentation in driving scenes. Following the teacher-student paradigm, AIScene uses a teacher network to generate pseudo-labeled scenes from unlabeled data, which then supervise the student network. Most methods feed all points of a pseudo-labeled scene through the forward pass but backpropagate only through the pseudo-labeled points. AIScene instead removes the points without pseudo-labels, so that the forward and backward passes operate on the same set of points within a scene. This simple point erasure strategy prevents unlabeled, semantically ambiguous points, which are excluded from backpropagation, from affecting the learning of pseudo-labeled points. Moreover, AIScene incorporates patch-based data augmentation that mixes multiple scenes at both the scene and instance levels. Compared with existing augmentation techniques, which typically perform scene-level mixing between two scenes, our method enhances the semantic diversity of labeled (or pseudo-labeled) scenes and thereby improves the semi-supervised performance of segmentation models. Experiments show that AIScene outperforms previous methods on two popular benchmarks across four settings, with notable improvements of 1.9% and 2.1% in the most challenging setting, where only 1% of the data is labeled.
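The two ideas above, point erasure and multi-scene patch mixing, can be illustrated with a minimal NumPy sketch. It is not the AIScene implementation. The abstract does not say how pseudo-labels are selected or how patches are defined, so the confidence threshold, the `IGNORE_LABEL` marker, the azimuth-sector patches, and all function names below are assumptions made for illustration. The mixing shown is scene-level only; instance-level mixing is omitted.

```python
import numpy as np

IGNORE_LABEL = -1  # hypothetical marker for "no pseudo-label"


def pseudo_label(teacher_probs, threshold=0.9):
    """Keep the teacher's argmax class where it is confident (assumed rule).

    teacher_probs: (N, C) per-point class probabilities from the teacher.
    Points whose top probability is below `threshold` get IGNORE_LABEL.
    """
    confidence = teacher_probs.max(axis=1)
    labels = teacher_probs.argmax(axis=1)
    labels[confidence < threshold] = IGNORE_LABEL
    return labels


def erase_unlabeled_points(points, labels):
    """Point erasure: drop points without a pseudo-label before the forward
    pass, so forward and backward passes see the same set of points."""
    keep = labels != IGNORE_LABEL
    return points[keep], labels[keep]


def mix_scenes_by_sector(scenes, num_sectors=4):
    """Assumed patch scheme: split the ground plane into azimuth sectors and
    take sector k from scene k % len(scenes).

    scenes: list of (points, labels), points shaped (N, >=2) with x, y first.
    """
    edges = np.linspace(-np.pi, np.pi, num_sectors + 1)
    mixed_points, mixed_labels = [], []
    for k in range(num_sectors):
        points, labels = scenes[k % len(scenes)]
        azimuth = np.arctan2(points[:, 1], points[:, 0])
        # Map each point to a sector index in [0, num_sectors - 1].
        sector = np.clip(np.digitize(azimuth, edges) - 1, 0, num_sectors - 1)
        mask = sector == k
        mixed_points.append(points[mask])
        mixed_labels.append(labels[mask])
    return np.concatenate(mixed_points), np.concatenate(mixed_labels)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scenes = []
    for _ in range(3):
        pts = rng.normal(size=(1000, 3))
        probs = rng.dirichlet(np.full(5, 0.2), size=1000)
        scenes.append(erase_unlabeled_points(pts, pseudo_label(probs)))
    mixed_pts, mixed_lbl = mix_scenes_by_sector(scenes, num_sectors=6)
    print(mixed_pts.shape, mixed_lbl.shape)
```

In this sketch, erasure runs before mixing, so the mixed scene contains only points that will also receive a gradient. Whether AIScene applies the two steps in that order is not stated in the abstract.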