Dynamics of Algorithmic Content Amplification on TikTok

Abstract

Intelligent algorithms increasingly shape the content we encounter and engage with online. TikTok's "For You" feed exemplifies extreme algorithm-driven curation, tailoring the stream of video content almost exclusively to users' explicit and implicit interactions with the platform. Despite growing attention, the dynamics of content amplification on TikTok remain largely unquantified. How quickly, and to what extent, does TikTok's algorithm amplify content aligned with users' interests? To address these questions, we conduct a sock-puppet audit, deploying bots with different interests to engage with the "For You" feed. We find that content aligned with the bots' interests undergoes strong amplification, with rapid reinforcement typically occurring within the first 200 videos watched. Although amplification is observed consistently across all interests, its intensity varies by interest, indicating the emergence of topic-specific biases. Time-series analyses and Markov models uncover distinct phases of recommendation dynamics, including persistent content reinforcement and a gradual decline in content diversity over time. Although TikTok's algorithm preserves some content diversity, we find a strong negative correlation between amplification and exploration: as the amplification of interest-aligned content increases, engagement with unseen hashtags declines. These findings contribute to discussions of socio-algorithmic feedback loops in the digital age and of the trade-offs between personalization and content diversity.