Machine Learning-Based Respiratory Disease Classification Using Lung Sounds

IEEE BASE PAPER TITLE:

Triplet Multi-Kernel CNN for Detection of Pulmonary Diseases From Lung Sound Signals

IEEE BASE PAPER ABSTRACT:

Recent studies have demonstrated the notable success of Convolutional Neural Networks (CNNs) in detecting respiratory diseases from Lung Sound (LS) signals. However, traditional Single-Kernel CNN (SK-CNN) methods frequently face limitations when depending solely on a singular kernel type to extract essential information. To overcome these limitations, we present a Multi-Kernel CNN (MK-CNN), designed specifically to capture a wider range of information from LS signals, which increases the accuracy and reliability of Pulmonary Disease (PD) detection. We also present a Triplet MK-CNN (TMK-CNN) model that combines the benefits of multi-kernel feature extraction with a triplet-based architecture to enhance detection performance. The effectiveness of these models was evaluated using LS data from a publicly accessible dataset provided by King Abdullah University Hospital (KAUH). Experimental results show that the MK-CNN model achieves an accuracy of 93.94%, a 1.17 percentage-point improvement over the SK-CNN baseline of 92.77%. The TMK-CNN model raises classification accuracy to 97.98%, a 4.04 percentage-point improvement over MK-CNN and a 5.21 percentage-point improvement over SK-CNN. These findings indicate the significant potential of MK-CNN and TMK-CNN architectures in enhancing automated PD identification, allowing for more reliable, user-friendly, and clinically useful diagnostic tools.
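The improvements quoted in the abstract are differences in percentage points, not relative percentages. The short sketch below recomputes them from the three reported accuracies; the function names are illustrative only and do not come from the paper.

```python
# Accuracies (in percent) as reported in the base paper abstract.
ACCURACY = {"SK-CNN": 92.77, "MK-CNN": 93.94, "TMK-CNN": 97.98}


def percentage_point_gain(new_acc, base_acc):
    """Absolute difference between two accuracies, in percentage points."""
    return round(new_acc - base_acc, 2)


def relative_gain(new_acc, base_acc):
    """Improvement expressed as a percentage of the baseline accuracy."""
    return round(100 * (new_acc - base_acc) / base_acc, 2)


if __name__ == "__main__":
    pairs = [("MK-CNN", "SK-CNN"), ("TMK-CNN", "MK-CNN"), ("TMK-CNN", "SK-CNN")]
    for new, base in pairs:
        pp = percentage_point_gain(ACCURACY[new], ACCURACY[base])
        rel = relative_gain(ACCURACY[new], ACCURACY[base])
        print(f"{new} vs {base}: {pp} percentage points ({rel}% relative)")
```

The relative figure is always slightly larger than the percentage-point figure here because the baseline is below 100%.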

ALGORITHM / MODEL USED:

Random Forest Classifier.

OUR PROPOSED PROJECT ABSTRACT:

Respiratory diseases are among the most prevalent health conditions worldwide, making early and accurate diagnosis essential for effective treatment and patient care. Conventional diagnostic practices, such as manual lung auscultation, depend heavily on physician expertise and are often subjective, leading to variability in diagnosis. Recent advances in machine learning and digital signal processing have enabled the automated analysis of lung sounds, offering a reliable, non-invasive approach for identifying respiratory abnormalities and supporting clinical decision-making.

The need for an automated respiratory disease classification system arises from the increasing demand for accessible, cost-effective, and consistent diagnostic tools, particularly in remote and resource-limited healthcare settings. Lung sounds contain critical acoustic signatures, such as crackles and wheezes, which are indicative of underlying respiratory conditions but are difficult to analyze accurately through traditional methods. Machine learning techniques provide the capability to extract and learn meaningful patterns from these complex audio signals, thereby improving diagnostic reliability.

In this project, a machine learning-based respiratory disease classification system using lung sounds is developed. The system is implemented using Python for backend processing, with HTML, CSS, and JavaScript for the frontend interface, and Flask as the web framework for seamless integration. The dataset used in this work consists of 920 lung sound recordings in .wav format along with 920 corresponding annotation .txt files. These recordings, ranging from 10 to 90 seconds in duration, were collected from 126 patients and together represent approximately 5.5 hours of audio data. The dataset includes 6,898 annotated respiratory cycles, of which 1,864 contain crackles, 886 contain wheezes, and 506 contain both crackles and wheezes. Both clean and noisy recordings are included to reflect real-world clinical conditions, and the patient population spans all age groups, including children, adults, and the elderly. The dataset covers multiple respiratory conditions with an imbalanced class distribution, including Chronic Obstructive Pulmonary Disease (COPD), Pneumonia, Healthy, Upper Respiratory Tract Infection (URTI), Bronchiectasis, Bronchiolitis, Lower Respiratory Tract Infection (LRTI), and Asthma.
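As a rough illustration of how the annotation files could be read, the sketch below parses respiratory cycles and tallies them by type. The file layout is an assumption: each line is taken to hold a start time, an end time, a crackle flag, and a wheeze flag (0 or 1) separated by whitespace. The project description does not specify the layout, and the function names are hypothetical.

```python
from collections import Counter


def parse_cycles(annotation_text):
    """Parse lines of 'start end crackles wheezes' (assumed layout) into dicts."""
    cycles = []
    for line in annotation_text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        cycles.append({
            "start": float(parts[0]),      # cycle start, in seconds
            "end": float(parts[1]),        # cycle end, in seconds
            "crackles": parts[2] == "1",
            "wheezes": parts[3] == "1",
        })
    return cycles


def count_cycle_types(cycles):
    """Tally cycles as normal, crackles only, wheezes only, or both."""
    counts = Counter()
    for cycle in cycles:
        if cycle["crackles"] and cycle["wheezes"]:
            counts["both"] += 1
        elif cycle["crackles"]:
            counts["crackles"] += 1
        elif cycle["wheezes"]:
            counts["wheezes"] += 1
        else:
            counts["normal"] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample text, not taken from the dataset.
    sample = "0.0\t1.5\t0\t0\n1.5\t3.2\t1\t0\n3.2\t5.0\t1\t1\n"
    print(count_cycle_types(parse_cycles(sample)))
```

If the three counts given above are read as mutually exclusive, 1,864 + 886 + 506 = 3,256 cycles contain an adventitious sound, which leaves 6,898 - 3,256 = 3,642 cycles with neither crackles nor wheezes.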

Lung sound recordings are preprocessed and transformed into representative acoustic features using Mel-Frequency Cepstral Coefficients (MFCCs). These features are used to train a Random Forest Classifier, the core machine learning model, which learns disease-specific acoustic patterns and performs the classification. The model achieves an overall accuracy of 91.74%, demonstrating its effectiveness in distinguishing between different respiratory conditions.
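The MFCC and Random Forest pipeline can be sketched roughly as follows, using Librosa and Scikit-learn. This is a minimal sketch rather than the project's actual code: the number of coefficients (13), the mean and standard-deviation summary over time, the 200 trees, the balanced class weights, and the 80/20 stratified split are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def summarise(mfcc):
    """Collapse the time axis: per-coefficient mean and standard deviation."""
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])


def mfcc_vector(wav_path, n_mfcc=13):
    """Load one recording and return a fixed-length MFCC feature vector."""
    import librosa  # imported lazily; only needed when real audio is read

    signal, sr = librosa.load(wav_path, sr=None)  # keep the native sampling rate
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    return summarise(mfcc)


def train(features, labels, seed=0):
    """Fit a Random Forest on a stratified 80/20 split; return (model, test accuracy)."""
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.2, stratify=labels, random_state=seed
    )
    model = RandomForestClassifier(
        n_estimators=200,          # assumed value, not from the project
        class_weight="balanced",   # one way to counter the class imbalance
        random_state=seed,
    )
    model.fit(x_train, y_train)
    return model, model.score(x_test, y_test)
```

In use, `mfcc_vector` would be called once per .wav file, the resulting vectors stacked into a feature matrix, and that matrix passed to `train` together with the per-recording diagnosis labels.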

The trained model is integrated into a web-based application that allows users to upload lung sound recordings and receive predicted disease outcomes through an intuitive interface. The proposed system demonstrates the effectiveness of machine learning in analyzing lung sounds and provides a scalable, non-invasive solution for assisting respiratory disease diagnosis.
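A minimal sketch of how the upload-and-predict step might look in Flask is given below. The route name `/predict`, the form field name `audio`, and the `predict_fn` hook (a callable that takes the raw file bytes and returns a label) are assumptions made for illustration; the project's actual routes and model-loading code are not described here.

```python
import io

from flask import Flask, jsonify, request


def create_app(predict_fn):
    """Build a Flask app around predict_fn(raw_bytes) -> label (hypothetical hook)."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        uploaded = request.files.get("audio")  # form field name is an assumption
        filename = (uploaded.filename or "") if uploaded is not None else ""
        if not filename.lower().endswith(".wav"):
            return jsonify(error="please upload a .wav file"), 400
        label = predict_fn(uploaded.read())  # feature extraction and model call live here
        return jsonify(prediction=label)

    return app


if __name__ == "__main__":
    # Smoke test with a stub predictor instead of a trained model.
    demo_app = create_app(lambda raw_bytes: "Healthy")
    client = demo_app.test_client()
    response = client.post(
        "/predict",
        data={"audio": (io.BytesIO(b"RIFF"), "sample.wav")},
        content_type="multipart/form-data",
    )
    print(response.get_json())
```

Passing the predictor into `create_app` keeps the web layer separate from feature extraction and the classifier, so the route can be exercised with a stub before a trained model is available.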

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

System : Intel Core i3 Processor.
Hard Disk : 20 GB.
Monitor : 15" LED.
Input Devices : Keyboard, Mouse.
RAM : 8 GB.

SOFTWARE REQUIREMENTS:

Operating System : Windows 10 / 11.
Coding Language : Python 3.12.0.
Web Framework : Flask.
Frontend : HTML, CSS, JavaScript.

REFERENCE:

Pumin Duangmanee, Khomdet Phapatanaburi, Wongsathon Pathonsuwan, Talit Jumphoo, Atcharawan Rattanasak, Khwanjit Orkweha, Patikorn Anchuen, Monthippa Uthansakul, and Peerapong Uthansakul, “Triplet Multi-Kernel CNN for Detection of Pulmonary Diseases From Lung Sound Signals,” IEEE Access, vol. 13, 2025.

Frequently Asked Questions (FAQs) and Answers

1. What is the objective of this project?

The objective of this project is to develop an automated system that classifies respiratory diseases using lung sound recordings. The system analyzes auscultation audio signals with machine learning techniques to assist in identifying respiratory conditions in a non-invasive and efficient manner.

2. What type of data is used in this project?

The project uses lung sound recordings in .wav format along with corresponding annotation files. The dataset includes recordings of varying durations collected from patients across different age groups and contains both clean and noisy respiratory sounds to reflect real-world conditions.

3. Which respiratory diseases are classified by the system?

The system is designed to classify multiple respiratory conditions, including COPD, Pneumonia, Asthma, Bronchiectasis, Bronchiolitis, Upper Respiratory Tract Infection (URTI), Lower Respiratory Tract Infection (LRTI), and Healthy cases.

4. What machine learning algorithm is used in the project?

The project uses a Random Forest Classifier as the core machine learning model. This algorithm is selected due to its ability to handle complex feature relationships, robustness to noise, and efficient performance with moderate computational requirements.

5. How are features extracted from lung sound recordings?

Features are extracted using Mel-Frequency Cepstral Coefficients (MFCCs). MFCCs capture important frequency-domain characteristics of audio signals and are effective in representing lung sound patterns associated with different respiratory diseases.

6. What is the overall accuracy achieved by the system?

The system achieves an overall classification accuracy of 91.74%, demonstrating its effectiveness in distinguishing between multiple respiratory disease classes.
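Overall accuracy is the fraction of recordings whose predicted class matches the true class. Because the dataset is imbalanced, per-class recall can be a useful companion figure. The sketch below shows both calculations on made-up labels; the labels are purely illustrative and are not results from this project.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose prediction equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its samples predicted correctly."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return {label: hits[label] / totals[label] for label in totals}


if __name__ == "__main__":
    # Hypothetical toy labels: 8 COPD recordings and 2 Asthma recordings.
    y_true = ["COPD"] * 8 + ["Asthma"] * 2
    y_pred = ["COPD"] * 8 + ["COPD", "Asthma"]
    print(overall_accuracy(y_true, y_pred))
    print(per_class_recall(y_true, y_pred))
```

In this toy case a single error on the minority class barely moves the overall accuracy but halves that class's recall, which is why both numbers are worth reporting.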

7. Is the system capable of handling noisy lung sound recordings?

Yes, the dataset used for training includes both clean and noisy recordings. This enables the model to learn robust acoustic patterns and perform effectively under real-world recording conditions.

8. What technologies are used to develop the system?

The system is developed using Python for backend processing, Flask as the web framework, and HTML, CSS, and JavaScript for the frontend interface. Machine learning and audio processing are implemented using Scikit-learn and Librosa.

9. How does the web application work?

Users upload lung sound recordings through the web interface. The system preprocesses the audio, extracts MFCC features, applies the trained machine learning model, and displays the predicted respiratory disease on the interface.

10. Is the system intended to replace medical professionals?

No. The system is designed as a decision-support tool to assist healthcare professionals. It does not replace clinical judgment or professional medical diagnosis.
