魔法学:通过符号突变增强定理证明能力

Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation

摘要 Abstract

形式化证明即使对经验丰富的专家而言也极具挑战性。神经定理证明(NTP)的最新进展显示出加速这一过程的潜力。然而,互联网上的形式化语料库与普通文本相比极为有限,这对NTP构成了显著的数据稀缺性挑战。为了解决这一问题,本研究提出了一种名为Alchemy的一般数据合成框架,该框架通过符号突变构造形式化定理。具体来说,对于Mathlib中的每个候选定理,我们确定所有可以用于重写或应用到该定理的可调用定理。随后,通过将陈述中的相应项替换为其等价形式或前提条件,对候选定理进行突变。结果表明,我们的方法使Mathlib中的定理数量增加了两个数量级,从11万增加到600万。此外,我们在这个扩充语料库上对大型语言模型进行了持续预训练和有监督微调。实验结果表明,我们的方法的有效性,在Leandojo基准测试中实现了4.70%的绝对性能提升。此外,基于合成数据的分布外miniF2F基准测试显示,我们的方法获得了2.47%的绝对性能增益。为了提供进一步的见解,我们对合成数据组成和训练范式进行了全面分析,为开发强大的定理证明器提供了宝贵的指导。

Formal proofs are challenging to write even for experienced experts. Recent progress in Neural Theorem Proving (NTP) shows promise in expediting this process. However, the formal corpora available on the Internet are limited compared to the general text, posing a significant data scarcity challenge for NTP. To address this issue, this work proposes Alchemy, a general framework for data synthesis that constructs formal theorems through symbolic mutation. Specifically, for each candidate theorem in Mathlib, we identify all invocable theorems that can be used to rewrite or apply to it. Subsequently, we mutate the candidate theorem by replacing the corresponding term in the statement with its equivalent form or antecedent. As a result, our method increases the number of theorems in Mathlib by an order of magnitude, from 110k to 6M. Furthermore, we perform continual pretraining and supervised finetuning on this augmented corpus for large language models. Experimental results demonstrate the effectiveness of our approach, achieving a 4.70% absolute performance improvement on Leandojo benchmark. Additionally, our approach achieves a 2.47% absolute performance gain on the out-of-distribution miniF2F benchmark based on the synthetic data.To provide further insights, we conduct a comprehensive analysis of synthetic data composition and the training paradigm, offering valuable guidance for developing a strong theorem prover.

魔法学:通过符号突变增强定理证明能力 - arXiv