近似求解Klee测度问题及并集体积估计的下界

Approximating Klee's Measure Problem and a Lower Bound for Union Volume Estimation

摘要 Abstract

并集体积估计是一个经典的算法问题。给定一组对象 \( O_1, \ldots, O_n \subseteq \mathbb{R}^d \),我们希望近似计算它们并集的体积。在特殊情况下,当所有对象均为盒子(也称为超矩形)时,该问题被称为Klee测度问题。现有最先进的算法[Karp, Luby, Madras '89]在常数维度 \( d \) 下,通过使用总查询次数 \( O(n/\varepsilon^2) \)(包括三种形式:(i) 查询 \( O_i \) 的体积;(ii) 从 \( O_i \) 中均匀采样一点;(iii) 查询某点是否包含于 \( O_i \))以常数成功概率提供了一个 \( (1+\varepsilon) \)-近似解。我们证明了如果只能通过上述三种查询方式与对象进行交互,则[Karp, Luby, Madras '89]的查询复杂度是最佳的,即 \( \Omega(n/\varepsilon^2) \) 次查询是必要的。我们的下界甚至适用于二维空间中等面积的轴对齐多边形的并集估计,并且即使允许算法检查从多边形中采样的点的坐标,仍然成立,即便允许查询任意未采样的点的包含关系。受此下界启发,我们为Klee测度问题提供了一种更高效的近似算法,将时间复杂度从 \( O(n/\varepsilon^2) \) 改进到 \( O((n+\frac{1}{\varepsilon^2}) \cdot \log^{O(d)}n) \)。通过利用Klee测度问题的几何特性,我们实现了这一改进:(1) 由于可以访问盒子的坐标,我们可以将盒子分为形状相似的类;(2) 在每一类中,我们展示了如何通过正交范围搜索从所有盒子的并集中采样;(3) 我们利用不同类别的盒子之间的小交集特性,对于大多数类对成立。

Union volume estimation is a classical algorithmic problem. Given a family of objects $O_1,\ldots,O_n \subseteq \mathbb{R}^d$, we want to approximate the volume of their union. In the special case where all objects are boxes (also known as hyperrectangles) this is known as Klee's measure problem. The state-of-the-art algorithm [Karp, Luby, Madras '89] for union volume estimation and Klee's measure problem in constant dimension $d$ computes a $(1+\varepsilon)$-approximation with constant success probability by using a total of $O(n/\varepsilon^2)$ queries of the form (i) ask for the volume of $O_i$, (ii) sample a point uniformly at random from $O_i$, and (iii) query whether a given point is contained in $O_i$. We show that if one can only interact with the objects via the aforementioned three queries, the query complexity of [Karp, Luby, Madras '89] is indeed optimal, i.e., $\Omega(n/\varepsilon^2)$ queries are necessary. Our lower bound already holds for estimating the union of equiponderous axis-aligned polygons in $\mathbb{R}^2$, and even if the algorithm is allowed to inspect the coordinates of the points sampled from the polygons, and still holds when a containment query can ask containment of an arbitrary (not sampled) point. Guided by the insights of the lower bound, we provide a more efficient approximation algorithm for Klee's measure problem improving the $O(n/\varepsilon^2)$ time to $O((n+\frac{1}{\varepsilon^2}) \cdot \log^{O(d)}n)$. We achieve this improvement by exploiting the geometry of Klee's measure problem in various ways: (1) Since we have access to the boxes' coordinates, we can split the boxes into classes of boxes of similar shape. (2) Within each class, we show how to sample from the union of all boxes, by using orthogonal range searching. And (3) we exploit that boxes of different classes have small intersection, for most pairs of classes.