Abstract
Precise segmentation that demands minimal clinician effort could greatly streamline clinical workflows. Recent interactive segmentation models, inspired by Meta's Segment Anything, have made significant progress but face critical limitations in 3D radiology. These include impractical human interaction requirements, such as slice-by-slice prompting when 2D models are applied to 3D data, and a lack of iterative refinement. Prior studies have been hindered by inadequate evaluation protocols, resulting in unreliable performance assessments and inconsistent findings across studies. The RadioActive benchmark addresses these challenges by providing a rigorous and reproducible evaluation framework for interactive segmentation methods in clinically relevant scenarios. It features diverse datasets, a wide range of target structures, and the most impactful 2D and 3D interactive segmentation methods, all within a flexible and extensible codebase. We also introduce advanced prompting techniques that reduce the number of interaction steps, enabling fair comparisons between 2D and 3D models. Surprisingly, SAM2 outperforms all specialized medical 2D and 3D models in a setting that requires only a few interactions to generate prompts for a 3D volume. This challenges prevailing assumptions and indicates that general-purpose models can surpass specialized medical approaches. By open-sourcing RadioActive, we invite researchers to integrate their own models and prompting techniques, ensuring continuous and transparent evaluation of 3D medical interactive models.