Integrated Subset Selection and Bandwidth Estimation Algorithm for Geographically Weighted Regression

Abstract

This study proposes a mathematical programming-based algorithm for integrated variable subset selection and bandwidth estimation in geographically weighted regression, a local regression method that allows the kernel bandwidth and regression coefficients to vary across the study area. Standard approaches in the literature estimate the bandwidth and the regression parameters separately for each focal point, using different criteria for each. In contrast, our model uses a single objective function, based on the regression likelihood function and variance modeling, to estimate the regression and bandwidth parameters jointly across all focal points. The proposed model further integrates a procedure that selects a single subset of independent variables for all focal points, whereas existing approaches may return heterogeneous subsets across focal points. We then propose an alternative direction method to solve the nonconvex mathematical model and show that it converges to a partial minimum. Computational experiments indicate that the proposed algorithm provides competitive explanatory power with stable spatially varying patterns, and that it can select the best subset and account for additional constraints.
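For orientation, the conventional local fit that this abstract contrasts against can be sketched as follows. This is a minimal illustration of standard geographically weighted regression, not the integrated model proposed here: it assumes a Gaussian kernel, takes the bandwidth of each focal point as given, and solves an independent weighted least squares problem per focal point. All function names are hypothetical.

```python
import math


def gaussian_weights(coords, focal, bandwidth):
    """Gaussian kernel weights w_i = exp(-0.5 * (d_i / h)^2) (assumed kernel)."""
    return [math.exp(-0.5 * (math.dist(c, focal) / bandwidth) ** 2) for c in coords]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def local_wls(X, y, weights):
    """Weighted least squares: solve (X^T W X) beta = X^T W y."""
    p = len(X[0])
    xtwx = [[sum(w * row[a] * row[b] for row, w in zip(X, weights))
             for b in range(p)] for a in range(p)]
    xtwy = [sum(w * row[a] * yi for row, yi, w in zip(X, y, weights))
            for a in range(p)]
    return solve_linear(xtwx, xtwy)


def gwr_fit(coords, X, y, bandwidths):
    """One coefficient vector per focal point; each observation is a focal point."""
    return [local_wls(X, y, gaussian_weights(coords, focal, h))
            for focal, h in zip(coords, bandwidths)]


# Toy data: y = 1 + 2x exactly, so every local fit should recover [1, 2].
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 5.0]]
y = [1.0 + 2.0 * row[1] for row in X]
betas = gwr_fit(coords, X, y, bandwidths=[1.0] * len(coords))
```

In this sketch the bandwidths are fixed inputs and each focal point is fitted in isolation, which is the separation that the proposed single-objective formulation is meant to remove.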