AgRowStitch: A High-fidelity Image Stitching Pipeline for Ground-based Agricultural Images
Abstract
Agricultural imaging often requires individual images to be stitched together into a final mosaic for analysis. However, agricultural images can be particularly challenging to stitch: repeated textures make feature matching across images difficult, plants are non-planar, and mosaics built from many images can accumulate errors that cause drift. Although these issues can be mitigated by using georeferenced images or by taking images at high altitude, there is no general solution for images taken close to the crop. To address this, we created a user-friendly, open-source pipeline for stitching ground-based images of a linear row of crops that does not rely on additional data. First, we use SuperPoint and LightGlue to extract and match features within small batches of images. We then stitch the images in each batch in series while imposing constraints on the camera movement. After each batch mosaic is straightened and rescaled, all batch mosaics are stitched together in series and straightened into a final mosaic. We tested the pipeline on images collected along 72 m long crop rows by two different agricultural robots and by a camera carried manually over the row. In all three cases, the pipeline produced high-quality mosaics that could be used to georeference real-world positions with a mean absolute error of 20 cm. This approach provides accessible leaf-scale stitching to users who need to coarsely georeference positions within a row but do not have access to accurate positional data or sophisticated imaging systems.
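To make the georeferencing claim concrete, the following minimal Python sketch shows one way a position in a straightened mosaic could be converted to a distance along the row and scored against ground measurements. It assumes the mosaic spans the whole row at a roughly uniform scale, so that a pixel column maps linearly to distance. That assumption, the function names `pixel_to_row_position` and `mean_absolute_error`, and all numbers in the usage lines are illustrative and are not taken from AgRowStitch or its results.

```python
from typing import Sequence


def pixel_to_row_position(x_px: float, mosaic_width_px: int, row_length_m: float) -> float:
    """Map a mosaic pixel column to a distance along the crop row, in metres.

    Assumption (not from the paper): the straightened mosaic covers the
    full row from end to end at uniform scale, so the mapping is linear.
    """
    if mosaic_width_px <= 0:
        raise ValueError("mosaic_width_px must be positive")
    return x_px / mosaic_width_px * row_length_m


def mean_absolute_error(predicted_m: Sequence[float], measured_m: Sequence[float]) -> float:
    """Average absolute difference between predicted and measured positions."""
    if len(predicted_m) != len(measured_m) or len(predicted_m) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - m) for p, m in zip(predicted_m, measured_m))
    return total / len(predicted_m)


# Illustrative values only: marker columns in a hypothetical 100,000 px wide
# mosaic of a 72 m row, compared with hypothetical tape-measured positions.
marker_columns_px = [12_500, 50_000, 87_500]
measured_positions_m = [9.2, 35.9, 63.1]

predicted_positions_m = [
    pixel_to_row_position(x, mosaic_width_px=100_000, row_length_m=72.0)
    for x in marker_columns_px
]
error_m = mean_absolute_error(predicted_positions_m, measured_positions_m)
print(f"Mean absolute error: {error_m:.2f} m")
```

A linear mapping is the simplest possible choice; if the mosaic scale varies along the row, interpolating between several known reference points would be a natural refinement.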