2.1. Wearable, skin-attached real-time LSMP sensor
We developed a thin, flexible device that attaches to the human body and is controlled wirelessly for continuous monitoring and long-term analysis of lung sounds. The key design requirements were that the data be reliable and sensitive enough for lung sounds to be clearly distinguished, and that the skin not be irritated even when the device is attached for several days. Figure 1(a) shows schematics of the LSMP device, and Fig. 1(b) includes a photo of the flexible printed circuit board (fPCB) (left) and the biocompatible silicone used as a cover. The cover has openings for the slide-type power switch and for the port used to charge the battery. A blue LED indicates the operation status: blinking (waiting) or steadily lit (pairing). All other components are sealed under the cover to protect the circuit and reduce noise. The components of the LSMP are listed in Table S1.
Figure 1(c) shows a block diagram of the major parts driving the LSMP device, whose embedded software was written in C. The power source is a 3.7 V battery that supplies 3.3 V to the circuit via the power management unit (red hatched part). A MEMS microphone (purple hatched part) receives clock signals and power through the MCU, and its bioacoustic signals, in pulse density modulation (PDM) format, are transmitted to a mobile device over Bluetooth Low Energy (BLE) (blue hatched part). A blue LED indicates the operation status.
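To illustrate what the PDM format mentioned above carries, the following minimal Python sketch encodes a few sample values as a 1-bit pulse-density stream and recovers them by block averaging. It is an illustration only: the source does not describe how or where the PDM stream is decoded, and the first-order sigma-delta modulator, the boxcar decimator, and the oversampling factor of 64 are all assumptions made for this example.

```python
def pdm_encode(samples, oversample=64):
    """First-order sigma-delta modulator: floats in [-1, 1] -> list of 0/1 bits.

    Assumption for illustration; a real MEMS microphone does this in hardware.
    """
    bits = []
    integrator = 0.0
    feedback = 0.0
    for x in samples:
        for _ in range(oversample):
            # Accumulate the error between the input and the previous output bit.
            integrator += x - feedback
            bit = 1 if integrator >= 0.0 else 0
            feedback = 1.0 if bit else -1.0
            bits.append(bit)
    return bits


def pdm_decode(bits, decimation=64):
    """Boxcar decimation: the density of ones in each block maps back to [-1, 1]."""
    pcm = []
    for start in range(0, len(bits) - decimation + 1, decimation):
        block = bits[start:start + decimation]
        density = sum(block) / decimation
        pcm.append(2.0 * density - 1.0)
    return pcm


pcm_in = [0.5, -0.25, 0.0]          # hypothetical sample values
pcm_out = pdm_decode(pdm_encode(pcm_in))
```

The recovered values approximate the inputs because the density of ones in the bit stream tracks the signal amplitude; practical decoders typically use better low-pass filters than a plain block average.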
We conducted a test to evaluate the basic characteristics of the developed LSMP. Figure 1(d) shows a schematic of the experimental setup used to evaluate the acoustic response with the LSMP and a mobile device. The LSMP is attached to the Child Sim and collects bioacoustic signals through the acoustic path formed at the fPCB. The bioacoustic signal is transmitted wirelessly to the mobile device over BLE. Figure S1 shows a photograph of the experimental setup.
Streaming data are received through a customized iOS app on a mobile device, whose screen is shown in Fig. 1(e); the bioacoustic signal can be played, visualized, and analyzed in real time through the app. AI-based data analysis is not currently included in the app because of the limited memory of the mobile device. The functions of the customized iOS app are explained in detail in the Experimental section and Movie S1.
Because the MEMS microphone is highly sensitive, unexpected external noise can mix with the bioacoustic signal. To evaluate the effect of external noise as a function of microphone directionality, the sounds generated by 1/3-octave-band tuning forks were recorded and reproduced at the same intensity as reference signals. First, we collected bioacoustic signals from the Child Sim using the LSMP fitted with one of two MEMS microphones (one unidirectional, the other omnidirectional) in the presence of 60 dBA reference noise signals centered at different frequencies and produced by the noise generator. Figure 1(f) shows the opposite directional acoustic response ratio of the noise measured by the two MEMS microphones for the different frequency bands (along with the intensity of the noise in those bands). The results demonstrate a difference of up to 15% between the unidirectional and omnidirectional microphones, with lower noise levels detected by the unidirectional microphone. We therefore chose a unidirectional microphone for the LSMP sensor to help reduce the influence of external noise.
We next assessed the effect of different types of Bluetooth (BT) module antenna on the received signal strength indicator (RSSI) of the system. We equipped the LSMP with either an embedded or an external antenna, then attached it to the Child Sim as shown in Figure S1. Figure 1(g) shows the RSSI values for different distances between the antenna and the receiver. The average signal strength is −70 dBm up to 5 m when an external antenna is used; this kind of antenna allows greater circuit flexibility than an embedded antenna, but the reception strength is lower. It has been reported that smooth communication can be achieved without delay or interruption when the RSSI value is above −80 dBm 38. This shows the value of conventional BT-based communication with the embedded antenna type, which is the most commercially available, and indicates that data communication is smooth under the proper conditions. Photographs and schematic illustrations of the LSMP with the different antenna types are shown in Figure S2.
For continuous monitoring with the LSMP, we used medical-grade adhesives to attach our sensor to the skin and performed a simple in-vivo test to investigate their durability and compatibility with the skin. We attached four samples to the posterior (auscultation position between the scapula and vertebral line) of a human subject and removed one after each of 1, 3, 5, and 7 days of daily living; Fig. 1(h) shows the surface of the skin after the test. After seven days of having the sensor attached, the skin became slightly reddened, but the sensor did not fall off, and skin irritation did not occur. We have therefore verified that the LSMP can be attached to human skin for more than five days for use as a long-term continuous pulmonary monitoring device.
2.2. Characterization of the LSMP sensor with a normal subject
We demonstrated the performance of the LSMP with an artificial simulator that generates sound signals, as shown in Figures S3 and S4 and described in SI section 1. The feasibility of classifying respiratory symptoms was also confirmed by assessing differences in the acoustic properties of wheezing and crackling sounds, both of which are types of adventitious breath sounds. However, auscultation of an actual human body captures sounds from other organs in addition to the lungs, and these sounds must be separated for further analysis. In this experiment, we evaluated the performance of the LSMP with a human subject. Figure 2 shows the complete data analysis process and the algorithm developed to classify the original bioacoustic signals recorded by the LSMP for heart rate (HR) and respiratory rate (RR). These are processed simultaneously for interpretation, as shown in Figs. 2(a) and 2(b), respectively. The processing steps are performed in the app after the entire auscultation signal is acquired. Figure 2(c) shows the original 12-second bioacoustic signal acquired using the LSMP from the posterior left lung field of a healthy volunteer (36 years old, male). Prior to processing, the original bioacoustic signal is an indiscriminate mixture of information from the heart (blue line) and lungs (red line); the boundary between inhalation and exhalation is unclear, and it is challenging to distinguish the systolic (S1) and diastolic (S2) heart sounds. Figures 2(d) and 2(e) show the results of processing the original sound, which yields heart and respiration sounds, respectively, that can be used for further classification. Next, the estimated heart rate and respiration rate are calculated by counting the cardiac and respiration cycles, respectively, for 10 seconds and are presented in beats per minute and breaths per minute. Figure 2(f) shows an expansion of the red dotted box in Fig. 2(d), highlighting the S1 and S2 signals within a cardiac cycle. HR and heart rate variability (HRV) can be calculated from the cardiac S1 and S2 information, which can provide cardiologists with important information (e.g., on arrhythmia and heart failure).
Figure 2(g) shows a spectrogram of the entire auscultation signal, corresponding to normal breathing and the heartbeat. The time-intensity plot contains every frequency component of the signal but cannot display them separately; the spectrogram is therefore used to analyze the frequency components of the wave file and their intensity over time simultaneously. The heartbeat appears strongly in the 20–100 Hz band, and normal respiration appears as a soft, broad signal in the 100–1000 Hz band. Since the main purpose of the LSMP is continuous long-term monitoring of lung sounds over a wireless link, communication must be maintained without interruption even when the sensor and cell phone are separated by a certain distance or the subject is wearing clothes. Figure 2(h) shows the RSSI values obtained with and without the subject wearing clothes as a function of distance from the mobile device. Since the antenna embedded within the LSMP and the mobile device communicate directly, the distance at which the data can be received varies greatly depending on the experimental conditions. The presence of obstacles (clothes) results in an average difference of 10 dB within 5 m and an average signal strength of −78 dBm, without any other communication problems. These findings indicate that when attached to the skin of a healthy subject, the LSMP can maintain a steady communication strength within 5 m of the receiver without interruption. In summary, the data processing algorithm of our system can distinguish the HR and RR of a healthy subject from the bioacoustic data acquired through the LSMP, which captured a soft, breezy, broadband breath sound.
We also confirmed that the LSMP can classify heart and lung sounds even when they are mixed. Furthermore, it is possible to classify the S1 and S2 periods of the heartbeat, indicating that the use of the LSMP can be extended to the monitoring of cardiovascular diseases.
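The rate estimate described in this section (counting cycles over 10 seconds and reporting them per minute) can be sketched as follows. This is a minimal illustration, not the app's implementation: the onset times are made-up values, the assumption that cycle onsets have already been detected is ours, and the standard deviation of beat-to-beat intervals is used only as one common way to summarize HRV, which the source does not specify.

```python
from statistics import pstdev


def rate_per_minute(onsets_s, window_s=10.0):
    """Count the onsets inside [0, window_s) and scale the count to one minute."""
    count = sum(1 for t in onsets_s if 0.0 <= t < window_s)
    return count * 60.0 / window_s


def intervals_ms(onsets_s):
    """Intervals between consecutive onsets, in milliseconds."""
    return [(b - a) * 1000.0 for a, b in zip(onsets_s, onsets_s[1:])]


def interval_sd_ms(onsets_s):
    """Standard deviation of the onset-to-onset intervals (a simple HRV summary)."""
    return pstdev(intervals_ms(onsets_s))


# Hypothetical onset times in seconds (not measured data).
s1_onsets_s = [0.1 + 0.8 * k for k in range(15)]   # one S1 every 0.8 s
breath_onsets_s = [0.5, 4.2, 8.0, 11.7]            # one breath every ~3.7 s

heart_rate_bpm = rate_per_minute(s1_onsets_s)
resp_rate_bpm = rate_per_minute(breath_onsets_s)
hrv_sd_ms = interval_sd_ms(s1_onsets_s)
```

With a 10-second window the estimate is quantized in steps of 6 per minute, so a longer window or interval-based averaging would be needed for finer resolution.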
2.3. Clinical study of pediatric patients with asthma
Pulmonary function tests (PFTs) are known as the gold standard for diagnosing respiratory diseases, including asthma and chronic obstructive pulmonary disease (COPD). Given the findings of our assessment of the proposed device, we expect long-term monitoring using the LSMP to help evaluate the degree of worsening in pediatric asthma patients under 6 years old, for whom PFTs cannot be performed 39. To confirm the acoustic characteristics of pediatric asthma patients, we conducted additional measurements and analyses with the LSMP in a patient from this population. Figure 3(a) shows a photograph of the pediatric asthma patient recruited for this experiment; the red dashed box shows a magnified view of the attached LSMP. The LSMP was attached to the patient's back according to the clinician’s instructions, and bioacoustic data were recorded for 15 minutes. Figure 3(b) shows 12 seconds of representative time-series data, in which normal and abnormal breathing were recorded. The plot shows both the inhalation and exhalation cycles, and clear differences in intensity between normal and abnormal breathing. In particular, the 1st, 2nd, 3rd, and 6th abnormal breaths show relatively strong intensity, which can be used as an indication of an abnormal breathing sound and follows a trend similar to that of typical wheezing in the time-intensity plot from the Child Sim, shown in Figure S3(b). Figures 3(c) and 3(d) show spectrograms of the normal breathing period (blue dotted box) and the abnormal breathing period (red dotted box), respectively, which give the quantitative details of individual physiological events during normal and abnormal breathing. The spectrograms reveal that during normal breathing, no specific frequency peak develops, and exhalations are slightly stronger than inhalations. During the abnormal breathing period, in contrast, wheezing signatures (duration > 200 ms) were confirmed.
A clear wheezing signature appears four times during the exhalation phase, as marked by the black dotted boxes. Figures 3(e) and 3(f) show the FFT of the signal during exhalation in the normal and abnormal breathing periods, respectively. During normal breathing, no signal characteristics other than the background sound component can be observed, while the FFT of the abnormal breathing period reveals a characteristic peak for wheezing. In summary, we used the LSMP with a pediatric asthma patient to analyze the inhalation/exhalation phases of normal breathing and to identify wheezing during abnormal respiration. In addition, the RR of this pediatric asthma patient was estimated at 57 breaths per minute on average, which is markedly higher than the average RR of 28 to 46 breaths per minute in healthy children of the same age.
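The idea of reading a characteristic wheeze peak from the spectrum of an exhalation frame can be sketched with a direct discrete Fourier transform. This is a minimal illustration under stated assumptions: the 4000 Hz sampling rate, the 100 ms frame, and the synthetic 400 Hz and 60 Hz tones are hypothetical values chosen for the example, and the 100–1000 Hz search band is borrowed from the respiration band reported in Section 2.2. The source does not describe its FFT settings.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Single-sided DFT magnitudes (bins 0..N/2), computed directly in O(N^2)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc) / n)
    return mags


def dominant_frequency(samples, fs_hz, f_lo_hz=100.0, f_hi_hz=1000.0):
    """Frequency (Hz) of the largest spectral bin inside [f_lo_hz, f_hi_hz]."""
    n = len(samples)
    mags = dft_magnitudes(samples)
    in_band = [k for k in range(len(mags)) if f_lo_hz <= k * fs_hz / n <= f_hi_hz]
    best = max(in_band, key=lambda k: mags[k])
    return best * fs_hz / n


FS_HZ = 4000   # assumed sampling rate (not stated in the source)
N = 400        # 100 ms frame, giving 10 Hz frequency resolution

# Synthetic frame: a "wheeze-like" 400 Hz tone plus a stronger 60 Hz component
# standing in for heart-band energy that the band limits should ignore.
frame = [
    math.sin(2 * math.pi * 400 * t / FS_HZ)
    + 2.0 * math.sin(2 * math.pi * 60 * t / FS_HZ)
    for t in range(N)
]

peak_hz = dominant_frequency(frame, FS_HZ)
```

Restricting the search to the respiration band keeps the stronger low-frequency component from being reported as the peak; a production implementation would use an FFT library and windowing rather than this direct transform.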
2.4. Clinical study of elderly patients with COPD
We also conducted measurements and analyses of the acoustic characteristics of COPD patients with the LSMP. Figure 4(a) shows a photograph of an elderly patient with COPD. The LSMP was attached to the patient's back, and bioacoustic data were recorded for 15 minutes. Figure 4(b) shows 12 seconds of representative time-series data, selected from the continuously measured data, in which normal and abnormal breathing were recorded in succession.
Although the inhalation/exhalation cycle can be observed during normal breathing, differences in intensity from the abnormal breathing are hard to assess in the time-intensity plot because of the noisy environment. However, the three breaths captured from 6 to 12 seconds have a relatively long duration and strong intensity and, despite the extensive noise, can be interpreted as a sign of abnormal respiration. Figure 4(c) shows the spectrogram for the normal breathing period, revealing external heating, ventilation, and air conditioning (HVAC) system noise as well as very short-duration, relatively broadband noise between respirations. Figure 4(d) shows the spectrogram for the abnormal respiration period, in which different wheezing signals of varying duration and frequency band can be observed during the three exhalations. The wheezing signature consists of strong intensity at a certain frequency for a certain duration and appears as an inflection line (red box) that is easily distinguished from the background sound for each adventitious exhalation. Figures 4(e) and 4(f) show the FFT during one exhalation for normal and abnormal breathing, respectively. During the 400 ms expiratory phase of normal breathing (red box in Fig. 4(c)), no characteristic signal other than the background sound can be observed. In Fig. 4(f), FFT analysis was performed to quantify the three types of wheezing signals observed during the expiratory phases of the abnormal breathing period. The analysis reveals a 400 ms monophonic wheezing sound with a strong, single 600 Hz peak, and two polyphonic wheezing sounds with a strong 400 Hz peak and a 580 Hz peak lasting 1 second. Polyphonic wheezes are a known symptom of patients with extensive airflow obstruction (asthma, COPD, chronic bronchitis, etc.) and manifest as a high-pitched wheeze during breathing when the airway is narrow or stiff 40.
As with the pediatric asthma patient, we showed that the LSMP can be used to distinguish between normal and abnormal breathing through short, continuous monitoring in a noisy environment. Given our findings, we expect that long-term monitoring using the LSMP can be useful for classifying the characteristics of abnormal breathing in elderly patients with respiratory diseases.
2.5. AI-based wheezing counting algorithm for long-term lung sound analysis
Machine learning is an effective tool for classifying sounds, requiring only a well-structured model and raw data. We conducted a simple demonstration to show how machine learning can be used with the LSMP to classify breathing sounds, as shown in Fig. 5.
Open-access database recordings were used as the learning data, an example of which is shown in Fig. 5(a). The constructed database consists of 18 types of normal and 11 types of wheezing breathing sounds extracted from these recordings. We modified the length of each extracted sound by changing the playback speed from 1x to 0.5x in 0.1x decrements, as shown in Fig. 5(b). The data were then converted into Log-Mel spectrogram images, as shown in Fig. 5(c). Figure 5(d) shows the flow of the data processing in the deep learning architecture. We used max-pooling for computational efficiency and memory saving, and a dropout layer to prevent overfitting. For the output layer, we used the softmax function because its outputs sum to 1, which can be used effectively in the counting algorithm. The model uses binary cross-entropy as the loss function and Adaptive Moment Estimation (Adam) as the optimizer. The learning data were divided into training and validation data at a ratio of 8:2; additional details of the model architecture and the data splitting are shown in Figs. 5(a) and S6(b), respectively. The receiver operating characteristic (ROC) curve of the trained model, depicted in Fig. 5(e) and Figure S7, indicates that the model had excellent training efficiency. More detailed information about the AI algorithm is given in SI section 2.
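Several of the steps above are simple enough to sketch directly: the list of playback speeds used for augmentation, the 8:2 split, the softmax output, and the binary cross-entropy loss. The Python sketch below is illustrative only. The function names, the use of 10 placeholder clips, the random seed, and the example logits are hypothetical, and the source does not state which framework or splitting procedure was actually used.

```python
import math
import random


def speed_factors(start=1.0, stop=0.5, step=0.1):
    """Playback speeds from start down to stop (inclusive) in equal decrements."""
    n_steps = round((start - stop) / step)
    return [round(start - i * step, 1) for i in range(n_steps + 1)]


def split_train_val(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and cut it at the training fraction (8:2 by default)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def softmax(logits):
    """Numerically stable softmax; the outputs are positive and sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def binary_cross_entropy(label, p_wheeze, eps=1e-12):
    """Loss for one sample with label 1 (wheeze) or 0 (normal)."""
    p = min(max(p_wheeze, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


factors = speed_factors()
# 10 placeholder clip ids, each paired with every speed factor (hypothetical data).
samples = [(clip_id, f) for clip_id in range(10) for f in factors]
train, val = split_train_val(samples)
probs = softmax([2.0, -1.0])   # example logits for the (wheeze, normal) labels
```

Because the two softmax outputs sum to 1, the value for the wheeze label alone is enough to drive a threshold-based counter. Note that splitting after augmentation, as done here for brevity, can place stretched copies of the same clip in both sets; splitting by clip first avoids that leakage.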
The use of the LSMP to continuously monitor respiratory patients and quantify the extracted breathing sounds represents a novel clinical diagnostic application that can overcome the limitations of the intermittent use of conventional stethoscopes. The LSMP was attached to a patient's anterior right lung field by a clinician and used to record the patient’s lung sounds, as shown in Fig. 6(a). Bioacoustic data were continuously recorded using the LSMP for 79 minutes while the patient lay on an air mattress and received oxygen therapy. Figure 6(b) shows a time-series plot of the entire continuously measured waveform; three 12-second portions of the data are highlighted and enlarged in the next three panels as examples of the characteristics of the patient’s breathing. During forced respiration, as seen in Fig. 6(c), the signal is stronger than in the normal respiration segments of the pediatric data in Fig. 3(b) and the COPD patient data in Fig. 4(b). The strong, regular intensity and breathing duration observed during normal breathing are caused by the oxygen therapy device, as shown in Figure S8(a, d). A simple time-series analysis could cause forced respiration to be confused with abnormal respiration; however, this signal is the result of the artificial ventilation provided to the patient, which produces relatively rough, high-intensity respiration but no abnormal respiration. The abnormal respiration shown in Fig. 6(d) is rare among the symptoms of respiratory diseases: a low-pitched wheeze in both the inhalations and exhalations. In addition to the strong intensity signal due to forced breathing, the wheezing signature due to the deformation of the airway can be observed, and although a strong background sound component is present, it can be distinguished by a distinct inflection line, as shown in Figure S8(b). The abnormal respiration depicted in Fig. 6(e) shows the wheezing signature in the exhalation, a typical symptom of asthma and COPD, as well as the presence of polyphonic rather than single-component wheezing.
To reduce the clinician’s labor and the risk of misdiagnosis, we developed an AI-based event counting algorithm to monitor the time-varying symptoms in COPD patient clinical data recorded with the LSMP. Figure 6(f) shows a 30-second segment of clinical data used for the preliminary test, covering 12 cycles of inhalation and exhalation. The blue line is the bioacoustic data used as test data, and the trained model’s predictions are marked as yellow dots. Through its softmax activation, the trained model outputs a value between 0 and 1 for each label; here, we use the prediction values for the ‘wheeze’ label. The clinical data were sliced with a fixed 0.6-second window taken every 0.06 seconds, and each slice was input into the model to obtain a predicted value for that segment of the lung sound. The prediction resolution was sufficiently high to detect normal and wheezing sounds within one breath cycle. Because the trained model produces a prediction every 0.06 seconds, we included an algorithm to count the number of wheezing events. The predicted values indicate how close the input data are to a wheezing sound: a value of 1.0 indicates a wheeze, and a value of 0.0 indicates normal breathing. We set the thresholds for wheezing and normal breathing to 0.9 and 0.1, respectively, because these values were appropriate hyperparameters for high prediction accuracy. When the predicted value dropped from 0.9 (wheezing) to 0.1 (normal breathing), the AI counted a single wheezing event; this is shown schematically using an example wheezing sound in the dashed red square box in Fig. 6(f), which is enlarged in Fig. 6(g). The plotted yellow rectangles mark the ranges over which a wheezing sound is predicted. The number of counted events and their timing precisely matched the 12 wheezing events.
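The windowing and counting logic described above can be sketched as a small hysteresis counter. This is a minimal Python illustration, assuming that a wheeze state is entered when the prediction reaches 0.9 and that one event is counted when it subsequently falls to 0.1; whether the thresholds are inclusive, and how the authors handle a wheeze still in progress at the end of the recording, are not stated in the source, and the prediction sequence below is made up.

```python
def window_starts_ms(duration_ms, window_ms=600, hop_ms=60):
    """Start times (ms) of all fixed-length windows that fit inside the recording."""
    if duration_ms < window_ms:
        return []
    return list(range(0, duration_ms - window_ms + 1, hop_ms))


def count_wheeze_events(predictions, high=0.9, low=0.1):
    """Count drops from the wheeze state (>= high) to the normal state (<= low).

    Values between the two thresholds keep the current state, so brief dips
    inside one wheeze are not counted as separate events.
    """
    events = 0
    in_wheeze = False
    for p in predictions:
        if not in_wheeze and p >= high:
            in_wheeze = True
        elif in_wheeze and p <= low:
            in_wheeze = False
            events += 1
    return events


# A 30-second segment sliced with a 0.6 s window every 0.06 s.
starts = window_starts_ms(30_000)

# Hypothetical 'wheeze' probabilities, one per window (not model output).
preds = [0.0, 0.2, 0.95, 0.97, 0.5, 0.92, 0.05, 0.0, 0.3, 0.91, 0.4, 0.08, 0.95]
n_events = count_wheeze_events(preds)
```

In this example the dip to 0.5 does not end the first wheeze, and the final rise to 0.95 is not counted because the prediction never falls back to 0.1 before the data end. Using two thresholds rather than one avoids double counting when the prediction fluctuates around a single cutoff.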
Figure 6(h) shows a comparison of the number of events counted over time by the AI algorithm and a human observer (pulmonologist). The AI algorithm and the clinician each counted the wheezing events in every 5-minute interval of the 79-minute COPD lung sound recording described above, which contained a total of 1630 breaths. The total count was 1450 for the clinician and 1430 for the AI; Figure S9(a) shows that the average match rate over the entire set of counts was 80.5%. The results show that the AI algorithm can classify normal and wheezing sounds with high accuracy, especially in asthma or COPD patients, indicating that the LSMP can monitor lung sounds to determine the severity of symptoms and their change over time. As a supplement, we compare the count trajectories at one-minute intervals (Figure S9(b)) and plot the predictions for three extracted regions, each approximately 18 seconds long (Figure S10).
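The gap between nearly equal totals and a lower average match rate is what one would expect if agreement is scored per interval and then averaged. The sketch below illustrates this with one possible definition, the ratio of the smaller to the larger count in each interval. This definition is an assumption made for the example, since the source does not state how the match rate in Figure S9(a) was computed, and the counts are toy values, not the study data.

```python
def interval_match_rate(ai_count, clinician_count):
    """Ratio of the smaller to the larger count; 1.0 when both counts are zero."""
    if ai_count == 0 and clinician_count == 0:
        return 1.0
    return min(ai_count, clinician_count) / max(ai_count, clinician_count)


def average_match_rate(ai_counts, clinician_counts):
    """Mean of the per-interval match rates over paired intervals."""
    if len(ai_counts) != len(clinician_counts):
        raise ValueError("count lists must cover the same intervals")
    rates = [interval_match_rate(a, c) for a, c in zip(ai_counts, clinician_counts)]
    return sum(rates) / len(rates)


# Toy per-interval counts (hypothetical): the totals agree exactly,
# but the interval-by-interval agreement is much lower.
ai_counts = [50, 100, 80, 120]
clinician_counts = [100, 50, 120, 80]

total_ratio = sum(ai_counts) / sum(clinician_counts)
mean_rate = average_match_rate(ai_counts, clinician_counts)
```

Here the totals match exactly while the averaged per-interval rate is well below 1, because over- and under-counts in different intervals cancel in the total but not in the per-interval score.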