Collective Wisdom: Policy Averaging with an Application to the Newsvendor Problem

Abstract

We propose a Policy Averaging Approach (PAA) that synthesizes the strengths of existing approaches to create more reliable, flexible, and justifiable policies for stochastic optimization problems. One important component of the PAA is risk diversification, which reduces the randomness of policies. A second component emulates model averaging from statistics. A third component uses cross-validation to diversify and optimize the weights among candidate policies. We demonstrate the use of the PAA for the newsvendor problem. For that problem, model-based approaches typically rely on specific and potentially unreliable assumptions of either independently and identically distributed (i.i.d.) demand or feature-dependent demand with covariates or autoregressive functions. Data-driven approaches, including sample averaging and the use of functions of covariates to set order quantities, typically suffer from overfitting and provide limited insight to justify recommended policies. By integrating concepts from statistics and finance, the PAA avoids these problems. Using theoretical analysis, a simulation study, and an empirical study, we show that the PAA outperforms all of those earlier approaches. The demonstrated benefits of the PAA include reduced expected cost, more stable performance, and improved insight to justify recommendations. Extensions to consider tail risk and the use of stratified sampling are discussed. Beyond the newsvendor problem, the PAA is applicable to a wide variety of decision-making problems under uncertainty.
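
To make the idea of cross-validated weights among candidate policies concrete, the following is a minimal illustrative sketch in Python, not the PAA algorithm itself. It assumes only two candidate newsvendor policies (a sample-average quantile and an i.i.d. normal model), a convex combination of their order quantities, and a simple grid search over the weight using K-fold cross-validation. All function names, the choice of candidates, the grid, and the unit underage cost `b` and overage cost `h` are assumptions introduced here for illustration.

```python
import numpy as np
from statistics import NormalDist


def newsvendor_cost(q, demand, b, h):
    """Average underage (b) plus overage (h) cost of ordering q."""
    demand = np.asarray(demand, dtype=float)
    under = np.maximum(demand - q, 0.0)
    over = np.maximum(q - demand, 0.0)
    return float(np.mean(b * under + h * over))


def saa_policy(train, b, h):
    """Data-driven candidate: empirical quantile at the critical ratio."""
    return float(np.quantile(train, b / (b + h)))


def normal_policy(train, b, h):
    """Model-based candidate: assumes i.i.d. normal demand (an assumption)."""
    z = NormalDist().inv_cdf(b / (b + h))
    return float(np.mean(train) + z * np.std(train, ddof=1))


def cv_weight(demand, b, h, n_folds=5, grid=21, seed=0):
    """Pick the weight on the SAA candidate by K-fold cross-validation."""
    demand = np.asarray(demand, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(demand)), n_folds)
    weights = np.linspace(0.0, 1.0, grid)
    cv_cost = np.zeros(grid)
    for k in range(n_folds):
        test = demand[folds[k]]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        train = demand[train_idx]
        q_saa = saa_policy(train, b, h)
        q_norm = normal_policy(train, b, h)
        # Score every candidate weight on the held-out fold.
        for i, w in enumerate(weights):
            q = w * q_saa + (1.0 - w) * q_norm
            cv_cost[i] += newsvendor_cost(q, test, b, h)
    return float(weights[np.argmin(cv_cost)])


def averaged_order(demand, b, h, **kw):
    """Refit both candidates on all data and combine with the CV weight."""
    w = cv_weight(demand, b, h, **kw)
    q = w * saa_policy(demand, b, h) + (1.0 - w) * normal_policy(demand, b, h)
    return q, w


if __name__ == "__main__":
    # Synthetic demand, used only to exercise the sketch.
    sample = np.random.default_rng(1).gamma(4.0, 25.0, size=200)
    q, w = averaged_order(sample, b=3.0, h=1.0)
    print(f"weight on SAA: {w:.2f}, averaged order quantity: {q:.1f}")
```

Because the averaged order quantity is a convex combination, it always lies between the two candidate quantities; the cross-validated weight only decides where in that interval to place it. A fuller treatment would presumably allow more candidates, feature-dependent policies, and the tail-risk and stratified-sampling extensions mentioned above, none of which this sketch attempts.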