VecTrans: LLM Transformation Framework for Better Auto-vectorization on High-performance CPU
Abstract
Large language models (LLMs) have demonstrated strong capabilities in code generation, yet applying them effectively to compiler optimization remains an open challenge because of issues such as hallucination and a lack of domain-specific reasoning. Vectorization, a crucial optimization for improving code performance, often fails because the compiler cannot recognize complex code patterns, and handling such patterns commonly requires extensive empirical expertise. LLMs, with their ability to capture intricate patterns, thus offer a promising solution to this challenge. This paper presents VecTrans, a novel framework that leverages LLMs to enhance compiler-based code vectorization. VecTrans first employs compiler analysis to identify potentially vectorizable code regions. It then uses an LLM to refactor these regions into patterns that are more amenable to the compiler's auto-vectorization. To ensure semantic correctness, VecTrans further integrates a hybrid validation mechanism at the intermediate representation (IR) level. In this way, VecTrans combines the adaptability of LLMs with the precision of compiler vectorization, thereby opening up additional vectorization opportunities. Experimental results show that, of the 50 TSVC functions that Clang, GCC, and BiShengCompiler all fail to vectorize, VecTrans successfully vectorizes 23 (46%) and achieves an average speedup of 2.02x, greatly surpassing state-of-the-art performance.
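To make the idea of refactoring a loop into a more vectorization-friendly pattern concrete, the following is a minimal sketch in C. It is not taken from the paper: the function names `avg_wraparound`, `avg_peeled`, and `outputs_match` are hypothetical, and the loop is only modeled on the wrap-around-variable style of kernel found in TSVC. The first function carries a scalar index across iterations; the second peels the first iteration so the remaining loop reads only `b[i]` and `b[i - 1]`. The helper compares the two versions on sample input, which is a plain runtime check and not the IR-level hybrid validation the abstract describes.

```c
#include <stddef.h>

#define MAX_N 64

/* Original form: the scalar `prev` carries a value from one iteration
 * to the next, which can obscure the access pattern for a vectorizer. */
void avg_wraparound(float *a, const float *b, size_t n) {
    if (n == 0) return;
    size_t prev = n - 1;
    for (size_t i = 0; i < n; i++) {
        a[i] = (b[i] + b[prev]) * 0.5f;
        prev = i;
    }
}

/* Refactored form: the first iteration is peeled, so the loop body
 * depends only on the induction variable i. The operations and their
 * order are unchanged, so the results are identical. */
void avg_peeled(float *a, const float *b, size_t n) {
    if (n == 0) return;
    a[0] = (b[0] + b[n - 1]) * 0.5f;
    for (size_t i = 1; i < n; i++) {
        a[i] = (b[i] + b[i - 1]) * 0.5f;
    }
}

/* Illustrative runtime check: returns 1 if both versions produce the
 * same output for a deterministic input of length n (n <= MAX_N). */
int outputs_match(size_t n) {
    float b[MAX_N], a1[MAX_N], a2[MAX_N];
    if (n > MAX_N) return 0;
    for (size_t i = 0; i < n; i++) {
        b[i] = (float)(i * 3 + 1);
        a1[i] = 0.0f;
        a2[i] = 0.0f;
    }
    avg_wraparound(a1, b, n);
    avg_peeled(a2, b, n);
    for (size_t i = 0; i < n; i++) {
        if (a1[i] != a2[i]) return 0;
    }
    return 1;
}
```

Whether a particular compiler vectorizes either form depends on its version and flags. Compiler reports such as Clang's `-Rpass=loop-vectorize` and `-Rpass-missed=loop-vectorize`, or GCC's `-fopt-info-vec`, can be used to see what happened for each loop.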