Research

arXiv

LLMs generate structurally realistic social networks but overestimate political homophily

Serina Chang, Alicja Chaszczewicz, Emma Wang

Abstract

Generating social networks is essential for many applications, such as epidemic modeling and social simulations. The emergence of generative AI, especially large language models (LLMs), offers new possibilities for social network generation: LLMs can generate networks without additional training or the need to define network parameters, and users can flexibly define individuals in the network using natural language. However, this potential raises two critical questions: 1) are the social networks generated by LLMs realistic, and 2) what are the risks of bias, given the importance of demographics in forming social ties? To answer these questions, we develop three prompting methods for network generation and compare the generated networks to a suite of real social networks. We find that more realistic networks are generated with "local" methods, where the LLM constructs relations for one persona at a time, compared to "global" methods that construct the entire network at once. We also find that the generated networks match real networks on many characteristics, including density, clustering, connectivity, and degree distribution. However, we find that LLMs emphasize political homophily over all other types of homophily and significantly overestimate political homophily compared to real social networks.
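
To make the comparison in the abstract concrete, the following is a minimal sketch of how a generated network might be scored on density and on homophily for different demographic attributes. It is an illustration under stated assumptions, not the authors' evaluation code: the six personas, their attributes, the edge list, and the function names `density` and `same_group_edge_ratio` are all hypothetical, and the homophily measure used here (observed share of same-group edges divided by the share expected if ties ignored group membership) is one common choice that may differ from the metric used in the paper.

```python
from itertools import combinations


def density(n_nodes, edges):
    """Fraction of possible undirected edges that are present."""
    possible = n_nodes * (n_nodes - 1) / 2
    return len(edges) / possible if possible else 0.0


def same_group_edge_ratio(edges, group):
    """Observed share of same-group edges divided by the share expected
    if ties were formed independently of group membership.
    Values above 1 suggest homophily; values below 1 suggest heterophily."""
    nodes = list(group)
    all_pairs = len(nodes) * (len(nodes) - 1) / 2
    same_pairs = sum(1 for u, v in combinations(nodes, 2) if group[u] == group[v])
    expected = same_pairs / all_pairs
    observed = sum(1 for u, v in edges if group[u] == group[v]) / len(edges)
    return observed / expected


# Hypothetical toy network: six personas with two attributes each.
party = {"a": "D", "b": "D", "c": "D", "d": "R", "e": "R", "f": "R"}
gender = {"a": "F", "b": "M", "c": "F", "d": "M", "e": "F", "f": "M"}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("c", "d")]

net_density = density(len(party), edges)
party_ratio = same_group_edge_ratio(edges, party)
gender_ratio = same_group_edge_ratio(edges, gender)

print(f"density: {net_density:.3f}")
print(f"party homophily ratio: {party_ratio:.3f}")
print(f"gender homophily ratio: {gender_ratio:.3f}")
```

In this toy network the ratio for party is well above 1 while the ratio for gender is below 1, which is the kind of imbalance across attributes that the abstract describes. Comparing such ratios between generated and real networks would indicate whether a given type of homophily is overestimated.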