Abstract
We present BASKET, a large-scale basketball video dataset for fine-grained skill estimation. BASKET contains 4,477 hours of video capturing 32,232 basketball players from all over the world. Compared to prior skill estimation datasets, our dataset includes a massive number of skilled participants with unprecedented diversity in gender, age, skill level, and geographical location, among other attributes. BASKET covers 20 fine-grained basketball skills, challenging modern video recognition models to capture the intricate nuances of player skill through in-depth video analysis. Given a long highlight video (8-10 minutes) of a particular player, the model must predict the skill level (e.g., excellent, good, average, fair, poor) for each of the 20 basketball skills. Our empirical analysis reveals that current state-of-the-art video models struggle with this task and lag significantly behind the human baseline. We believe that BASKET could be a useful resource for developing new video models with advanced long-range, fine-grained recognition capabilities. In addition, we hope that our dataset will be useful for domain-specific applications such as fair basketball scouting, personalized player development, and many others. Dataset and code are available at https://github.com/yulupan00/BASKET.