Memory-efficient Low-latency Remote Photoplethysmography through Temporal-Spatial State Space Duality

Abstract

Remote photoplethysmography (rPPG) enables non-contact physiological monitoring by analyzing light reflected from the face, but the accuracy gains brought by deep learning have come at the cost of prohibitive computational resource demands. This paper proposes ME-rPPG, a memory-efficient algorithm built on temporal-spatial state space duality that resolves the trilemma of model scalability, cross-dataset generalization, and real-time constraints. Leveraging a transferable state space, ME-rPPG efficiently captures subtle periodic variations across facial frames while keeping computational overhead minimal, which enables both training on extended video sequences and low-latency inference. With cross-dataset mean absolute errors (MAEs) of 5.38 on MMPD, 0.70 on VitalVideo, and 0.25 on PURE, ME-rPPG outperforms all baselines, with improvements ranging from 21.3% to 60.2%. Our solution runs real-time inference with only 3.6 MB of memory and 9.46 ms of latency, and in real-world deployments it surpasses existing methods by 19.5%-49.7% in accuracy and by 43.2% in user satisfaction. The code and demos are released for reproducibility at https://github.com/Health-HCI-Group/ME-rPPG-demo.
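To illustrate why a state that can be carried between calls permits both long-sequence processing and frame-by-frame inference, the following is a minimal sketch using a generic scalar linear state-space recurrence. It is not the ME-rPPG architecture: the function `ssm_scan`, the parameters `a`, `b`, `c`, and the synthetic 30 fps pulse-like signal are all assumptions made for illustration.

```python
import numpy as np


def ssm_scan(x, a, b, c, h0=0.0):
    """Run a scalar linear state-space recurrence over a 1-D signal.

    h[t] = a * h[t-1] + b * x[t]
    y[t] = c * h[t]

    Returns the outputs and the final hidden state so that a later
    call can resume exactly where this one stopped.
    """
    h = h0
    y = np.empty(len(x), dtype=float)
    for t, x_t in enumerate(x):
        h = a * h + b * x_t
        y[t] = c * h
    return y, h


# Hypothetical toy input: a 1.2 Hz sinusoid sampled at 30 fps plus noise,
# standing in for a per-frame facial color trace.
rng = np.random.default_rng(0)
time_s = np.arange(300) / 30.0
signal = np.sin(2 * np.pi * 1.2 * time_s) + 0.1 * rng.standard_normal(time_s.size)

# Illustrative (assumed) recurrence parameters.
a, b, c = 0.9, 0.1, 1.0

# Offline mode: process the whole sequence in one call.
y_full, _ = ssm_scan(signal, a, b, c)

# Streaming mode: process one frame per call, carrying only the state h.
h = 0.0
y_stream = np.empty_like(signal)
for i, frame_value in enumerate(signal):
    y_t, h = ssm_scan(np.array([frame_value]), a, b, c, h0=h)
    y_stream[i] = y_t[0]

# Both modes apply the same recurrence, so their outputs should agree.
outputs_match = np.allclose(y_full, y_stream)
print(outputs_match)
```

In this sketch the streaming loop keeps only a single scalar state between frames, so its memory use does not grow with the length of the video; this is the general property that state-carrying recurrences offer for low-latency inference.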
