Abstract
1 Introduction
2 Methods of feature extraction
3 Stuttered speech recognition: Traditional Machine Learning & Deep Learning based approaches
4 Discussion and analysis
5 Conclusion and future work
Declarations
References
Abstract
Stuttered speech recognition is a well-studied topic in speech signal processing. Classification of speech disorders is the main focus of this study. Classification of stuttered speech is becoming more important with the advancement of machine learning and deep learning. In this study, some of the most recent and influential stuttered speech recognition methods are reviewed, with a discussion of the different categories of stuttering. The stuttered speech recognition process is divided mainly into four stages: input speech pre-emphasis, segmentation, feature extraction, and stutter classification. Each of these stages is briefly described and the related research is discussed. It is observed that a variety of traditional machine learning and deep learning classification approaches have been employed to recognize stuttered speech over the last few decades. A comprehensive analysis of the different feature extraction and classification methods, along with their efficiency, is presented.
Introduction
Humans use speech to communicate and to express their feelings, ideas, and thoughts. A speech problem in which the flow of speech is interrupted is known as stuttering, also commonly called stammering. It is a speech disorder in which sufferers know what they want to say but have difficulty saying it. People who stutter experience difficulty when communicating with others, which often affects their quality of life and interpersonal relationships. It can also have a negative influence on job performance and opportunities. More than 70 million people worldwide, about 1% of the total population, are affected by this problem [41]. It is observed that, when they communicate, listeners often feel irritated by the prolonged words and much of the time do not understand what is said. E. Charles Healey, in his article, discusses children's reactions to stuttering, the impact of stuttering on listeners' recall and comprehension of story information, the effect of stuttering on listeners' reactions, and listeners' reactions to stuttering strategies and therapy programs [21]. The extant literature provides a large body of evidence-based information on these topics. Stuttering, aging processes and several speech-related neurological diseases can be identified from muscular stiffness and by analyzing the latency times of verbal reactions and the coordination and activity patterns of the muscles (respiratory, glottal, oromandibular) involved in speaking [50]. As an interdisciplinary field spanning domains such as speech pathology, psychology, speech physiology, acoustics and signal analysis, stuttered speech recognition has been an area of interest for researchers over the previous few decades. Traditionally, stuttering is assessed by manually counting and classifying the occurrences of disfluencies in stuttered speech.
The proportion of disfluent time in the total speech is also used as a measure for assessing stuttered speech. However, this kind of manual assessment varies from one speech-language pathologist (SLP) to another, and it is time consuming and prone to error.
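The two manual measures described above (counting disfluencies by type, and the share of speaking time that is disfluent) can be illustrated with a minimal sketch. The annotation format, the labels, and the function name `summarize_disfluencies` are assumptions made for illustration only; they are not taken from the reviewed studies, and the events are assumed not to overlap in time.

```python
from collections import Counter


def summarize_disfluencies(events, total_duration_s):
    """Summarize hand-labeled disfluency events for one recording.

    events: list of (label, start_s, end_s) tuples. This annotation
            format is hypothetical and assumes non-overlapping events.
    total_duration_s: total length of the recording in seconds.

    Returns (counts per label, fraction of time that is disfluent).
    """
    if total_duration_s <= 0:
        raise ValueError("total_duration_s must be positive")

    # Count how many events carry each label.
    counts = Counter(label for label, _, _ in events)

    # Sum the duration of every labeled disfluent interval.
    disfluent_time_s = sum(end - start for _, start, end in events)

    return counts, disfluent_time_s / total_duration_s


# Illustrative annotations only, not real data.
events = [
    ("repetition", 1.0, 1.5),
    ("prolongation", 4.0, 5.0),
    ("repetition", 8.0, 8.5),
]
counts, ratio = summarize_disfluencies(events, total_duration_s=20.0)
print(dict(counts), ratio)
```

Because the labels and interval boundaries are set by a human annotator, two SLPs marking the same recording can obtain different counts and ratios, which is the variability the text refers to.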
Conclusion and future work
Speech is the carrier through which humans communicate their thoughts, feelings and ideas. Stuttering, or stammering, is a speech disorder that affects millions of people around the globe. In the field of stuttered speech recognition, different machine learning models have been applied for analysis and classification over the last few decades. In this study, different machine learning and deep learning models and their application to stuttered speech recognition are discussed. Three major classifiers, namely ANNs, HMMs and SVMs, have been used to classify different types of stuttering. Deep learning algorithms, discussed briefly in this study, have recently become more popular than traditional machine learning algorithms for stuttered speech recognition. Major challenges are observed, such as the small volume of data, the lack of labels, and the similarity between different stuttering classes. Moreover, an input speech file sometimes contains more than one type of stuttering, which makes labeling difficult. Most of the research has concentrated on the prolongation and repetition types of stuttering. Some work has also been done on the interjection type, but work on classifying broken words, revisions and incomplete phrases is almost nil. Most researchers manually labeled differing numbers of stuttered speech samples from the UCLASS database in order to train their models. Different features such as LPC, LPCC, PLP and MFCC were used in previous research to train and test the models; among them, MFCC features were the most extensively used. Reviews and comparisons of earlier research have been highlighted in this paper. Accuracy in the recognition and correction of stuttered speech may be improved by employing modified feature extraction algorithms and different deep learning based algorithms on large databases.