LSEAttention is All You Need for Time Series Forecasting

Abstract

Transformer-based architectures have achieved remarkable success in natural language processing and computer vision. In multivariate long-term forecasting, however, they often fall short of simpler linear baselines. Previous research has identified the traditional attention mechanism as a key factor limiting their effectiveness in this domain. To bridge this gap, we introduce LATST, a novel approach designed to mitigate entropy collapse and training instability, two common challenges in Transformer-based time series forecasting. We rigorously evaluate LATST on multiple real-world multivariate time series datasets and show that it outperforms existing state-of-the-art Transformer models. Notably, on certain datasets LATST achieves competitive performance with fewer parameters than some linear models, highlighting its efficiency and effectiveness.