arXiv

Automated diagnosis of lung diseases using Vision Transformer: a comparative study on chest X-ray classification

K. Sajid, M. Hasnain, Muhammad Jalal, Grigori Sidorov

Abstract

Background: Lung disease is a significant health issue, particularly in children and elderly individuals. It often results from lung infections and is one of the leading causes of mortality in children. Globally, lung-related diseases claim many lives each year, making early and accurate diagnosis crucial. Radiographs are valuable tools for diagnosing such conditions. The most prevalent lung diseases, including pneumonia, asthma, allergies, chronic obstructive pulmonary disease (COPD), bronchitis, emphysema, and lung cancer, represent significant public health challenges. Early prediction of these conditions is critical, as it allows risk factors to be identified and preventive measures to be taken to reduce the likelihood of disease onset.

Methods: In this study, we used a dataset of 3,475 chest X-ray images sourced from Mendeley Data, provided by Talukder, M. A. (2023) [14], and categorized into three classes: normal, lung opacity, and pneumonia. To classify these images, we applied five pre-trained deep learning models (CNN, ResNet50, DenseNet, CheXNet, and U-Net) as well as two transfer learning algorithms, namely Vision Transformer (ViT) and Shifted Window (Swin). This approach aims to address diagnostic issues in lung abnormalities by reducing reliance on human intervention through automated classification systems. Our analysis was conducted in both binary and multiclass settings.

Results: In the binary classification, we focused on distinguishing between normal and viral pneumonia cases, whereas in the multiclass classification, all three classes (normal, lung opacity, and viral pneumonia) were included. Our proposed methodology (ViT) achieved accuracy rates of 99% for binary classification and 95.25% for multiclass classification.
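As a minimal sketch of how the binary and multiclass accuracy figures relate, the snippet below scores a set of predictions over all three classes and then over the normal-versus-pneumonia subset. The helper names (`accuracy`, `confusion_counts`, `binary_subset`), the label strings, and the toy labels are all assumptions made for illustration; they are not the study's code or data, and the study may have trained a separate two-class model rather than filtering three-class predictions as done here.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def binary_subset(y_true, y_pred, keep=("normal", "pneumonia")):
    """Keep only samples whose true label is in `keep`.

    This is one possible way to form a binary evaluation set (an assumption).
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t in keep]
    return [t for t, _ in pairs], [p for _, p in pairs]


# Toy labels invented for illustration only; not data from the study.
y_true = ["normal", "pneumonia", "lung_opacity", "normal", "pneumonia", "lung_opacity"]
y_pred = ["normal", "pneumonia", "normal", "normal", "pneumonia", "lung_opacity"]

multiclass_acc = accuracy(y_true, y_pred)
bin_true, bin_pred = binary_subset(y_true, y_pred)
binary_acc = accuracy(bin_true, bin_pred)

print(f"multiclass accuracy: {multiclass_acc:.3f}")
print(f"binary accuracy:     {binary_acc:.3f}")
```

On these toy labels the only error is a lung opacity image predicted as normal, so it lowers the multiclass score but does not affect the binary subset. This is one reason a two-class accuracy can sit above the three-class accuracy for the same model.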