Repetitions are not all alike: distinct mechanisms sustain repetition in language models

Abstract

Text generated by language models (LMs) can degrade into repetitive cycles, in which identical word sequences are repeated one after another. Prior research has typically treated repetition as a unitary phenomenon. However, repetitive sequences emerge across diverse tasks and contexts, raising the possibility that repetition is driven by multiple underlying factors. Here, we experimentally explore the hypothesis that repetition in LMs can result from distinct mechanisms, reflecting different text generation strategies used by the model. We examine the internal workings of LMs under two conditions that prompt repetition: one in which repeated sequences emerge naturally after human-written text, and another in which repetition is explicitly induced through an in-context learning (ICL) setup. Our analysis reveals key differences between the two conditions: the model exhibits different levels of confidence, relies on different attention heads, and shows distinct patterns of change in response to controlled perturbations. These findings suggest that distinct internal mechanisms can interact to drive repetition, with implications for how repetition is interpreted and mitigated. More broadly, our results highlight that the same surface behavior in LMs may be sustained by different underlying processes, acting independently or in combination.

Repetitions are not all alike: distinct mechanisms sustain repetition in language models - arXiv