Precise Asymptotic Generalization for Multiclass Classification with Overparameterized Linear Models
Abstract
We study the asymptotic generalization of an overparameterized linear model for multiclass classification under the Gaussian covariates bi-level model introduced in Subramanian et al. '22, where the number of data points, features, and classes all grow together. We fully resolve the conjecture posed in Subramanian et al. '22, matching the predicted regimes for generalization. Furthermore, our new lower bounds are akin to an information-theoretic strong converse: they establish that the misclassification rate goes to 0 or 1 asymptotically. One surprising consequence of our tight results is that the min-norm interpolating classifier can be asymptotically suboptimal relative to noninterpolating classifiers in the regime where the min-norm interpolating regressor is known to be optimal. The key to our tight analysis is a new variant of the Hanson-Wright inequality which is broadly useful for multiclass problems with sparse labels. As an application, we show that the same type of analysis can be used to analyze the related multilabel classification problem under the same bi-level ensemble.
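
For readers unfamiliar with the object under study, the following is a minimal sketch of a min-norm interpolating classifier in an overparameterized setting (more features than data points). It is an illustration only and not the setup analyzed in the paper: the isotropic Gaussian features (rather than the bi-level covariance), the plain one-hot label encoding, the rule that a label is the index of the largest of the first k features, and the names `min_norm_interpolator`, `n`, `d`, `k` are all assumptions made here for the example.

```python
import numpy as np


def min_norm_interpolator(X, Y):
    """Return the minimum Frobenius-norm W solving X @ W = Y.

    Assumes X (n x d) has full row rank, which holds almost surely
    for Gaussian X with d > n. Then W = X^T (X X^T)^{-1} Y.
    """
    return X.T @ np.linalg.solve(X @ X.T, Y)


rng = np.random.default_rng(0)
n, d, k = 20, 200, 4  # hypothetical sizes: data points, features, classes

# Assumption: isotropic Gaussian covariates, not the bi-level covariance.
X = rng.standard_normal((n, d))

# Assumption: the label is the index of the largest of the first k features.
labels = np.argmax(X[:, :k], axis=1)
Y = np.eye(k)[labels]  # one-hot label matrix, one nonzero entry per row

W = min_norm_interpolator(X, Y)

# The classifier predicts the class with the largest linear score.
train_pred = np.argmax(X @ W, axis=1)
```

Because `W` fits the one-hot labels exactly, the training predictions coincide with the training labels; the abstract's results concern how such a classifier behaves on fresh test points as the number of data points, features, and classes grow together.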