arXiv

CoRanking: Collaborative Ranking with Small and Large Ranking Agents

Wenhan Liu, Xinyu Ma, Yutao Zhu, Lixin Su, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou

Abstract

Large Language Models (LLMs) have demonstrated superior listwise ranking performance. However, this performance often relies on large-scale parameters (e.g., GPT-4) and a repetitive sliding window process, which introduces significant efficiency challenges. In this paper, we propose CoRanking, a novel collaborative ranking framework that combines small and large ranking models for efficient and effective ranking. CoRanking first employs a small reranker to pre-rank all the candidate passages, bringing relevant ones to the top of the list (e.g., the top 20). Then, the LLM listwise reranker is applied only to these top-ranked passages instead of the whole list, substantially improving overall ranking efficiency. Although this is more efficient, previous studies have shown that LLM listwise rerankers have significant positional biases with respect to the order of input passages. Directly feeding the top-ranked passages from the small reranker to the LLM listwise reranker may therefore result in sub-optimal performance. To alleviate this problem, we introduce a passage order adjuster trained via reinforcement learning, which reorders the top passages from the small reranker to align with the LLM's preferences for passage order. Extensive experiments on three IR benchmarks demonstrate that CoRanking significantly improves efficiency (reducing ranking latency by about 70%) while achieving even better effectiveness than using only the LLM listwise reranker.
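The three-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `coranking_pipeline`, the `Reranker` callable type, and the toy stand-ins (`overlap_reranker`, `reverse_adjuster`) are all hypothetical. Keeping the passages below the cutoff in the small reranker's order is also an assumption, since the abstract does not say how the tail is handled.

```python
from typing import Callable, List, Sequence

Passage = str
# Hypothetical interface: a reranker takes a query and passages and
# returns the same passages in a new order.
Reranker = Callable[[str, Sequence[Passage]], List[Passage]]


def coranking_pipeline(
    query: str,
    candidates: Sequence[Passage],
    small_reranker: Reranker,
    order_adjuster: Reranker,
    llm_reranker: Reranker,
    top_k: int = 20,
) -> List[Passage]:
    """Sketch of the small-then-large collaborative ranking flow."""
    # Stage 1: the small reranker pre-ranks every candidate.
    pre_ranked = list(small_reranker(query, candidates))
    head, tail = pre_ranked[:top_k], pre_ranked[top_k:]

    # Stage 2: the order adjuster rearranges only the head so that its
    # order suits the LLM reranker (trained with RL in the paper; here
    # it is just an opaque callable).
    adjusted_head = order_adjuster(query, head)

    # Stage 3: the expensive LLM listwise reranker sees only top_k passages.
    reranked_head = list(llm_reranker(query, adjusted_head))

    # Assumption: the tail keeps the small reranker's order.
    return reranked_head + tail


# Toy stand-ins so the sketch runs end to end (not real models).
def overlap_reranker(query: str, passages: Sequence[Passage]) -> List[Passage]:
    q_terms = set(query.lower().split())
    return sorted(
        passages,
        key=lambda p: len(q_terms & set(p.lower().split())),
        reverse=True,
    )


def reverse_adjuster(query: str, passages: Sequence[Passage]) -> List[Passage]:
    return list(reversed(passages))


docs = ["cats sleep", "ranking with llm agents", "llm ranking", "weather today"]
ranked = coranking_pipeline(
    "llm ranking agents",
    docs,
    small_reranker=overlap_reranker,
    order_adjuster=reverse_adjuster,
    llm_reranker=overlap_reranker,
    top_k=2,
)
print(ranked)
```

In this sketch the expensive model is called once on `top_k` passages rather than repeatedly over the whole candidate list, which is where the efficiency gain described above would come from.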