A smooth multi-group Gaussian Mixture Model for cellwise robust covariance estimation

Abstract

Do data groups that are predefined by expert opinion or medical diagnosis correspond to the groups obtained from statistical modeling? And why might some observations be inconsistent with their assigned group? This contribution addresses both questions by proposing a novel multi-group Gaussian mixture model that accounts for the given group context while remaining highly flexible. This is achieved by assuming that the observations of a particular group originate not from a single distribution but from a Gaussian mixture of all group distributions. Moreover, the model is robust against cellwise outliers, that is, against atypical individual cells of the observations. The objective function can be formulated as a likelihood problem and optimized efficiently. We also derive the theoretical breakdown point of the estimators, a new result in this setting that quantifies the degree of robustness to cellwise outliers. Simulations demonstrate the strong performance of the method and its advantages over alternative models and estimators. Applications from different areas illustrate the strength of the method, particularly for investigating observations that lie in the overlap of different groups.
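The central modeling assumption, that an observation labelled with one group follows a Gaussian mixture over all group distributions, can be illustrated with a minimal sketch. The function names, the weight matrix `weights` (row g holds the mixing probabilities for observations labelled g), and all numeric values below are assumptions chosen for illustration. They are not taken from the paper, and the sketch leaves out the cellwise-robust estimation and any constraints the authors place on the weights.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal N(mean, cov) evaluated at point x."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    p = mean.shape[0]
    diff = x - mean
    # Solve cov @ z = diff rather than inverting cov explicitly.
    maha = diff @ np.linalg.solve(cov, diff)
    log_det = np.linalg.slogdet(cov)[1]
    return float(np.exp(-0.5 * (p * np.log(2 * np.pi) + log_det + maha)))

def group_mixture_density(x, g, weights, means, covs):
    """Density of an observation labelled g: a mixture over ALL group distributions.

    weights[g, l] is the (hypothetical) probability that an observation labelled g
    actually follows the distribution of group l; each row sums to 1.
    """
    return sum(
        weights[g, l] * gaussian_pdf(x, means[l], covs[l])
        for l in range(len(means))
    )

# Illustrative values only (not from the paper): two groups in two dimensions.
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), np.eye(2)]
weights = np.array([[0.9, 0.1],
                    [0.2, 0.8]])

x = np.array([1.5, 1.5])  # a point midway between the two group centres
density_g0 = group_mixture_density(x, 0, weights, means, covs)
density_g1 = group_mixture_density(x, 1, weights, means, covs)
```

In this toy setting the point `x` is equally far from both centres, so both component densities coincide and the group label makes no difference to the mixture density. This is the kind of overlap region the abstract singles out as particularly interesting.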