On-device Federated Learning in Smartphones for Detecting Depression from Reddit Posts
Abstract
Depression detection with deep learning models has been widely studied, largely because social media posts supply large amounts of data. These posts carry valuable information about individuals' mental health and can be used to train models that identify patterns in the data. Distributed learning approaches, however, remain underexplored in this domain. In this study, we adopt Federated Learning (FL) to enable decentralized training on smartphones while protecting user data privacy. We train three neural network architectures, a Gated Recurrent Unit (GRU), a Recurrent Neural Network (RNN), and a Long Short-Term Memory (LSTM) network, on Reddit posts to detect signs of depression, and we evaluate their performance under heterogeneous FL settings. To optimize training, we use a common tokenizer across all client devices, which reduces the computational load. We also analyze resource consumption and communication costs on smartphones to assess their impact in a real-world FL environment. Our experimental results show that the federated models achieve performance comparable to that of the centralized models. This study highlights the potential of FL for decentralized mental health prediction by providing a secure and efficient model training process on edge devices.