Building Blocks for Robust and Effective Semi-Supervised Real-World Object Detection

Abstract

Semi-supervised object detection (SSOD) based on pseudo-labeling significantly reduces dependence on large labeled datasets by leveraging both labeled and unlabeled data. However, real-world applications of SSOD often face critical challenges, including class imbalance, label noise, and labeling errors. We present an in-depth analysis of SSOD under real-world conditions, uncovering the causes of suboptimal pseudo-labeling and the key trade-offs between label quality and quantity. Based on our findings, we propose four building blocks that can be seamlessly integrated into an SSOD framework. Rare Class Collage (RCC): a data augmentation method that enhances the representation of rare classes by creating collages of rare objects. Rare Class Focus (RCF): a stratified batch sampling strategy that ensures a more balanced representation of all classes during training. Ground Truth Label Correction (GLC): a label refinement method that identifies and corrects false, missing, and noisy ground truth labels by leveraging the consistency of teacher model predictions. Pseudo-Label Selection (PLS): a selection method that removes low-quality pseudo-labeled images, guided by a novel metric that estimates the missing detection rate while accounting for class rarity. We validate our methods through comprehensive experiments on autonomous driving datasets, achieving an increase of up to 6% in SSOD performance. Overall, our analysis and the proposed data-centric, broadly applicable building blocks enable robust and effective SSOD in complex, real-world scenarios. Code is available at https://mos-ks.github.io/publications.
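To make the idea of stratified batch sampling more concrete, the following is a minimal sketch of one generic way to give every class an equal share of batch slots. The abstract does not specify how RCF is implemented, so this should not be read as the authors' procedure: the function name `class_balanced_batches`, the `image_classes` mapping, and the round-robin scheme are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def class_balanced_batches(image_classes, batch_size, num_batches, seed=0):
    """Hypothetical round-robin sampler (an illustration, not the paper's RCF).

    image_classes maps an image id to the set of class names it contains.
    Batch slots cycle through the classes, and each slot is filled with a
    random image that contains the slot's class, so rare classes get the
    same number of slots as frequent ones.
    """
    if not image_classes:
        raise ValueError("image_classes must not be empty")

    rng = random.Random(seed)

    # Index the images by every class they contain.
    by_class = defaultdict(list)
    for image_id, classes in image_classes.items():
        for cls in classes:
            by_class[cls].append(image_id)

    class_names = sorted(by_class)
    if not class_names:
        raise ValueError("no image contains any class")

    batches = []
    slot = 0  # global slot counter, so the class cycle continues across batches
    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            cls = class_names[slot % len(class_names)]
            batch.append(rng.choice(by_class[cls]))
            slot += 1
        batches.append(batch)
    return batches
```

In this sketch, images of a rare class are drawn with replacement and therefore repeat often, which is the usual cost of oversampling; in practice it would typically be paired with augmentation so that the repeated images are not identical.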