Free Parametrization of L2-bounded State Space Models

Abstract

Structured state-space models (SSMs) have emerged as a powerful architecture in machine learning and control. They are built from stacked layers, each consisting of a linear time-invariant (LTI) discrete-time system followed by a nonlinearity. While SSMs are computationally efficient and excel at long-sequence prediction, their adoption in applications such as system identification and optimal control is hindered by the difficulty of guaranteeing stability and robustness. We introduce L2RU, a novel parametrization of SSMs that guarantees input-output stability and robustness by enforcing a prescribed L2-bound for all parameter values. This design removes the need for complex constraints, so L2RUs can be trained by unconstrained optimization with standard methods such as gradient descent. Leveraging tools from system theory and convex optimization, we derive a non-conservative parametrization of square discrete-time LTI systems with a specified L2-bound, which forms the foundation of the L2RU architecture. Additionally, we enhance its performance with a bespoke initialization strategy optimized for long input sequences. Through a system identification task, we validate L2RU's superior performance, showcasing its potential in learning and control applications.
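To make the idea of a prescribed L2-bound that holds "for all parameter values" concrete, the following is a minimal sketch of one SSM layer (a square LTI system followed by tanh) whose input-output L2 gain is at most gamma for any unconstrained weight matrix W. It is not the L2RU parametrization derived in this work: it relies on a simpler, conservative sufficient condition, namely that the block matrix [[A, B], [C/gamma, D/gamma]] has spectral norm at most 1, enforced here by Frobenius-norm scaling. All names (contractive, make_layer, run_layer, l2_norm) are hypothetical and chosen only for illustration.

```python
import math
import random

def contractive(W):
    """Scale W so its spectral norm is at most 1.
    Assumption: uses the Frobenius norm, which upper-bounds the spectral norm."""
    fro = math.sqrt(sum(w * w for row in W for w in row))
    scale = max(1.0, fro)
    return [[w / scale for w in row] for row in W]

def make_layer(W, n, gamma):
    """Map a free (n+m) x (n+m) matrix W to (A, B, C, D) with L2 gain <= gamma.
    Conservative sketch, NOT the non-conservative L2RU parametrization."""
    M = contractive(W)
    A = [row[:n] for row in M[:n]]
    B = [row[n:] for row in M[:n]]
    C = [[gamma * v for v in row[:n]] for row in M[n:]]
    D = [[gamma * v for v in row[n:]] for row in M[n:]]
    return A, B, C, D

def affine(M1, x, M2, u):
    """Return M1 @ x + M2 @ u for list-of-lists matrices."""
    return [sum(a * b for a, b in zip(r1, x)) + sum(a * b for a, b in zip(r2, u))
            for r1, r2 in zip(M1, M2)]

def run_layer(layer, inputs):
    """Simulate x[k+1] = A x[k] + B u[k], y[k] = tanh(C x[k] + D u[k]), x[0] = 0."""
    A, B, C, D = layer
    x = [0.0] * len(A)
    outputs = []
    for u in inputs:
        y = affine(C, x, D, u)
        # tanh is 1-Lipschitz with tanh(0) = 0, so it cannot increase the norm
        outputs.append([math.tanh(v) for v in y])
        x = affine(A, x, B, u)
    return outputs

def l2_norm(seq):
    """L2 norm of a finite sequence of vectors."""
    return math.sqrt(sum(v * v for vec in seq for v in vec))

# Usage: any random W yields a layer whose output norm is at most gamma * input norm.
random.seed(0)
n, m, gamma = 3, 2, 0.5
W = [[random.gauss(0.0, 1.0) for _ in range(n + m)] for _ in range(n + m)]
layer = make_layer(W, n, gamma)
u_seq = [[random.gauss(0.0, 1.0) for _ in range(m)] for _ in range(200)]
y_seq = run_layer(layer, u_seq)
print(l2_norm(y_seq) / l2_norm(u_seq))  # empirical gain, bounded by gamma
```

Because the bound holds for every W, gradient descent can update W directly without projections or constraint checks. The price of this simple construction is conservatism: many systems with gain at most gamma cannot be reached by it, which is the gap a non-conservative parametrization is meant to close.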