Out-of-distribution evaluations of channel agnostic masked autoencoders in fluorescence microscopy

Abstract

Developing computer vision models for high-content screening is challenging because changes in experimental conditions, perturbagens, and fluorescent markers each introduce distribution shift. Typical evaluations based on transfer learning confound the effects of these different sources of distribution shift, which limits interpretation of how changes to model design and training affect generalisation. We propose an evaluation scheme that isolates sources of distribution shift using the JUMP-CP dataset, allowing researchers to evaluate generalisation with respect to each source separately. We then present $\mathbf{Campfire}$, a channel-agnostic masked autoencoder that uses a single decoder shared across all channels and therefore scales effectively to datasets containing many different fluorescent markers. We show that it generalises to out-of-distribution experimental batches, perturbagens, and fluorescent markers, and that it transfers successfully from one cell type to another.
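
As a rough illustration of what isolating one source of distribution shift can mean in practice, the sketch below partitions well-level metadata so that each test split differs from the training data in exactly one factor. It is a minimal sketch under stated assumptions, not the evaluation protocol of the paper: the `Well` record, the `isolate_shifts` function, the split names, and the toy batch and perturbagen labels are all hypothetical, and the fluorescent-marker factor is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Well:
    """Hypothetical metadata record; these field names are illustrative only."""
    batch: str
    perturbagen: str


def isolate_shifts(wells, held_out_batches, held_out_perturbagens):
    """Partition wells so each test split is out-of-distribution in one factor.

    Assumption: a well that is new in both factors would confound the two
    sources of shift, so it is set aside rather than evaluated.
    """
    splits = {"train": [], "ood_batch": [], "ood_perturbagen": [], "confounded": []}
    for well in wells:
        new_batch = well.batch in held_out_batches
        new_perturbagen = well.perturbagen in held_out_perturbagens
        if new_batch and new_perturbagen:
            splits["confounded"].append(well)
        elif new_batch:
            splits["ood_batch"].append(well)
        elif new_perturbagen:
            splits["ood_perturbagen"].append(well)
        else:
            splits["train"].append(well)
    return splits


# Toy metadata: every combination of three batches and three perturbagens.
wells = [Well(b, p) for b in ("B1", "B2", "B3") for p in ("P1", "P2", "P3")]
splits = isolate_shifts(wells, held_out_batches={"B3"}, held_out_perturbagens={"P3"})
```

Under this partition, a drop in performance on `ood_batch` can be attributed to unseen experimental batches alone, because every perturbagen in that split also appears in `train`, and the converse holds for `ood_perturbagen`.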
