摘要 Abstract
我们推广了Acharya等人在2015年首次提出的从字符串的子串组成中重构字符串的问题,该问题受到利用质谱技术的聚合物先进数据存储系统的启发。具体来说,我们将字符串视为带标号的路径图,并尝试重构带标号的图。对于给定整数t,子图组成要么包含每个t阶(t-多重集组成)连通子图的标号向量,要么包含所有t阶连通子图的所有标号之和(t-和组成)。我们研究了在已知图结构并能查询组成信息的预言机条件下,能否重构图的标号。如果可以,则该图可重构;否则,它是不可区分的,具有相同组成的两个带标号图称为等组成图。我们证明通过暴力算法进行重构是非常低效的,然后给出了使用尽可能少的组成重构若干图类的方法。我们也得到了一些否定结果,找到了最小的不可区分图和树,以及具有大量非同构等组成图的族。一个有趣的结果出现在对路径的一片叶子进行孪生操作时:某些路径是不可区分的,将一片叶子变为孪生体使图在可重构和不可区分之间交替变化,具体取决于路径的奇偶性;而将一片叶子变为假孪生体则使得图在任何情况下都可以仅通过和组成来重构。
We generalize the problem of reconstructing strings from their substring compositions first introduced by Acharya et al. in 2015 motivated by polymer-based advanced data storage systems utilizing mass spectrometry. Namely, we see strings as labeled path graphs, and as such try to reconstruct labeled graphs. For a given integer t, the subgraph compositions contain either vectors of labels for each connected subgraph of order t (t-multiset-compositions) or the sum of all labels of all connected subgraphs of order t (t-sum-composition). We ask whether, given a graph of which we know the structure and an oracle whom you can query for compositions, we can reconstruct the labeling of the graph. If it is possible, then the graph is reconstructable; otherwise, it is confusable, and two labeled graphs with the same compositions are called equicomposable. We prove that reconstructing through a brute-force algorithm is wildly inefficient, before giving methods for reconstructing several graph classes using as few compositions as possible. We also give negative results, finding the smallest confusable graphs and trees, as well as families with a large number of equicomposable non-isomorphic graphs. An interesting result occurs when twinning one leaf of a path: some paths are confusable, creating a twin out of a leaf sees the graph alternating between reconstructable and confusable depending on the parity of the path, and creating a false twin out of a leaf makes the graph reconstructable using only sum-compositions in all cases.