Abstract
1. Introduction
2. Cough detection model
3. Sound camera
4. Results and discussion
5. Conclusion
Declaration of Competing Interest
Acknowledgments
References
Abstract
Coughing is a typical symptom of COVID-19. To detect and localize coughing sounds remotely, a deep learning model based on a convolutional neural network (CNN) was developed in this work and integrated with a sound camera to visualize the cough sounds. The cough detection model is a binary classifier whose input is a two-second acoustic feature and whose output is one of two inferences (Cough or Others). Data augmentation was performed on the collected audio files to alleviate class imbalance and to reflect the various background noises found in practical environments. To characterize cough sounds effectively, conventional features such as spectrograms, mel-scaled spectrograms, and mel-frequency cepstral coefficients (MFCC) were reinforced with their velocity (V) and acceleration (A) maps. VGGNet, GoogLeNet, and ResNet were simplified to binary classifiers and named V-net, G-net, and R-net, respectively. To find the best combination of features and networks, training was performed for a total of 39 cases, and performance was evaluated using the test F1 score. Finally, a test F1 score of 91.9% (test accuracy of 97.2%) was achieved by G-net with the MFCC-V-A feature (named Spectroflow), an acoustic feature effective for cough detection. The trained cough detection model was integrated with a sound camera (i.e., a camera that visualizes sound sources using a beamforming microphone array). In a pilot test, the cough detection camera detected coughing sounds with an F1 score of 90.0% (accuracy of 96.0%), and the cough location in the camera image was tracked in real time.
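As an illustration of how a feature map can be reinforced with velocity and acceleration maps, the following minimal sketch stacks a two-dimensional feature (for example, an MFCC matrix) with its first and second differences along the time axis. It rests on assumptions that do not come from the paper: the function name `stack_velocity_acceleration` is hypothetical, the derivatives are approximated with `numpy.gradient`, and the input is random placeholder data with an arbitrary shape rather than real MFCCs. The exact definition of the V and A maps used for Spectroflow may differ.

```python
import numpy as np


def stack_velocity_acceleration(feature_map: np.ndarray) -> np.ndarray:
    """Stack a (n_coeffs, n_frames) map with its 1st and 2nd time differences.

    Assumption: velocity and acceleration are approximated by finite
    differences along the time (frame) axis. This is an illustrative
    choice, not necessarily the definition used in the paper.
    """
    if feature_map.ndim != 2:
        raise ValueError("expected a 2-D array of shape (n_coeffs, n_frames)")
    # First-order difference over time: the "velocity" (V) map.
    velocity = np.gradient(feature_map, axis=1)
    # Difference of the velocity over time: the "acceleration" (A) map.
    acceleration = np.gradient(velocity, axis=1)
    # Channel-first stack, analogous to a 3-channel image fed to a CNN.
    return np.stack([feature_map, velocity, acceleration], axis=0)


# Hypothetical placeholder standing in for the MFCCs of a two-second clip:
# 20 coefficients by 87 frames (both sizes are arbitrary).
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((20, 87))

features = stack_velocity_acceleration(mfcc)
print(features.shape)  # (3, 20, 87)
```

Stacking the three maps as channels lets an image-style CNN see both the spectral content and how it changes over time, which is one plausible reading of why the V and A maps help characterize short, impulsive sounds such as coughs.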
Introduction
Since the COVID-19 outbreak began, there has been increasing demand for a monitoring system that can detect symptoms of human infection in real time in the field. The most common symptoms of infectious diseases, including COVID-19, are fever and cough. While fever can be detected remotely using a thermal imaging camera, there is still no widespread monitoring system able to detect coughing. Since coughing is a major cause of virus transmission through airborne droplets, detecting coughing is very important for preventing the spread of infectious diseases. Although cough detection alone is not sufficient to detect COVID-19, it is expected to be effective in preventing the spread of COVID-19 infection during the pandemic.
In previous studies on cough sound detection, various acoustic features were used with conventional machine learning methods. Barry et al. (2006) developed a program that calculates characteristic spectral coefficients from audio recordings, which are then classified into cough and non-cough events using probabilistic neural networks (PNN). Liu et al. (2013) introduced gammatone cepstral coefficients (GTCC) as a new feature and applied a support vector machine (SVM) as the classifier for cough recognition. You et al. (2017a) extracted subband features using a gammatone filterbank and then trained SVM, k-nearest neighbors (k-NN), and random forest (RF) classifiers on those features, making the final decision with an ensemble method. Further, You et al. (2017b) exploited non-negative matrix factorization (NMF) to find the difference between cough and other sounds in a compact representation.
Conclusion
Since the outbreak of COVID-19, there has been increasing demand for systems that can detect infection symptoms in real time in the field. Detecting coughs is very important for preventing the spread of infectious diseases, because droplets released during coughing are one of the transmission pathways of viral disease. From this perspective, a cough detection camera was developed that can monitor coughing sounds in real time in the field, and its detection performance was evaluated in a pilot test in an office environment. From a review of the modeling process and the pilot test, we confirmed that the data augmentation (DA) technique, Spectroflow, and the inception module of G-net contribute significantly to the performance of the cough detection camera. For future work, cough detection performance could be improved further by reflecting real environmental noise or by proposing a network that maximizes the advantages of the inception module. In addition, through data collection using IoT devices, the cough detection camera is expected to be used as a medical device that automatically monitors patient conditions in hospitals or as a monitoring system to detect epidemics in public places such as schools, offices, and restaurants.