加权与无权树编辑距离及APSP等价的更快算法

Faster Weighted and Unweighted Tree Edit Distance and APSP Equivalence

摘要 Abstract

给定两个根有序且具有$n$个节点的树,其节点从字母表$\Sigma$中标记,树编辑距离(TED)是通过插入、删除和重新标记节点的一系列有效操作将一棵树转换为另一棵树所需的最小代价。树编辑距离是字符串编辑距离的一个著名推广,并自1970年代以来被广泛研究。多年来,不断改进使得TED问题的复杂度降低至$O(n^3)$[DMRW 2010]。细粒度复杂性研究表明,TED的真正亚立方时间算法的存在性等价于All-Pairs Shortest Paths (APSP)的真正亚立方时间算法的存在性[BGMW 2020]。因此,在流行的APSP假设下,不存在真正亚立方时间的TED算法。然而,与许多基于APSP条件硬度的问题不同,这些问题是与APSP等价的,TED是否可以归约为APSP仍然未知。本文解决了这一问题:我们不仅证明了TED在细粒度复杂性下等价于APSP,而且我们的归约足够紧致,结合目前最快的APSP算法[Williams 2018],得到了第一个真正亚立方时间的TED算法,运行时间为$n^3/2^{\Omega(\sqrt{\log{n}})}$。此外,我们还考虑了无权树编辑距离问题,其中每种编辑操作的成本为1。对于无权TED,由于Mao [Mao 2022]的工作,已经存在一个真正亚立方时间的算法,后来由D\"{u}rr [D\"{u}rr 2023]稍作改进,运行时间为$O(n^{2.9148})$。他们的算法依赖于有界单调min-plus积作为关键子程序,而该积的最佳运行时间是$\tilde{O}(n^{\frac{3+\omega}{2}})\leq O(n^{2.6857})$(其中$\omega$是快速矩阵乘法的指数)。在本工作中,我们填补了这一差距,给出了一个运行时间为$\tilde{O}(n^{\frac{3+\omega}{2}})$的无权TED算法。

The tree edit distance (TED) between two rooted ordered trees with $n$ nodes labeled from an alphabet $\Sigma$ is the minimum cost of transforming one tree into the other by a sequence of valid operations consisting of insertions, deletions and relabeling of nodes. The tree edit distance is a well-known generalization of string edit distance and has been studied since the 1970s. Years of steady improvements have led to an $O(n^3)$ algorithm [DMRW 2010]. Fine-grained complexity casts light onto the hardness of TED showing that a truly subcubic time algorithm for TED implies a truly subcubic time algorithm for All-Pairs Shortest Paths (APSP) [BGMW 2020]. Therefore, under the popular APSP hypothesis, a truly subcubic time algorithm for TED cannot exist. However, unlike many problems in fine-grained complexity for which conditional hardness based on APSP also comes with equivalence to APSP, whether TED can be reduced to APSP has remained unknown. In this paper, we resolve this. Not only we show that TED is fine-grained equivalent to APSP, our reduction is tight enough, so that combined with the fastest APSP algorithm to-date [Williams 2018] it gives the first ever subcubic time algorithm for TED running in $n^3/2^{\Omega(\sqrt{\log{n}})}$ time. We also consider the unweighted tree edit distance problem in which the cost of each edit is one. For unweighted TED, a truly subcubic algorithm is known due to Mao [Mao 2022], later improved slightly by D\"{u}rr [D\"{u}rr 2023] to run in $O(n^{2.9148})$. Their algorithm uses bounded monotone min-plus product as a crucial subroutine, and the best running time for this product is $\tilde{O}(n^{\frac{3+\omega}{2}})\leq O(n^{2.6857})$ (where $\omega$ is the exponent of fast matrix multiplication). In this work, we close this gap and give an algorithm for unweighted TED that runs in $\tilde{O}(n^{\frac{3+\omega}{2}})$ time.