Abstract
Advanced Large Language Models (LLMs) have transformed many fields, including communication networks, driving a wave of innovation that has produced new applications and services and significantly improved existing solutions. Despite these developments, most LLMs require substantial computational resources, which results in very high energy consumption. This study therefore proposes an end-to-end pipeline for investigating the trade-off between energy efficiency and model performance of an LLM during fault ticket analysis in communication networks. The pipeline is evaluated on two real-world datasets for the tasks of root cause analysis and response feedback in a communication network. Our results show that an appropriate combination of quantization and pruning techniques can reduce energy consumption while significantly improving model performance.