Abstract
I. INTRODUCTION
II. LITERATURE REVIEW
III. PROPOSED DESIGN METHODOLOGY
IV. RESULTS AND DISCUSSIONS
V. CONCLUSION
REFERENCES
Abstract
The aim of this paper is to develop an algorithm that enhances speech recognition for stuttered speech. Stuttering is a disorder that affects the fluency of speech through involuntary repetition, prolongation of words or syllables, or involuntary silent intervals. Current speech recognition systems fail to recognize stuttered speech. Methods to detect stutter have been reported in the literature, but efficient techniques for stutter correction have not. This paper addresses this gap and proposes methods to detect and correct stutter within acceptable time limits. To remove prolongations from a sample, amplitude thresholding through neural networks is developed. Repetitions are removed through a string repetition removal algorithm that uses an existing Text-to-Speech (TTS) system. The output signal, free of stutters, therefore yields better speech recognition.
Introduction
Stuttering is a speech disorder characterized by repetition of sounds, syllables, or words and by prolongation of sounds. An individual with this disorder knows what he or she intends to say but is unable to produce fluent speech. Millions of people today suffer from speech disorders such as stuttering, lisp, and articulation disorder. This often leaves them unable to use things that most people take for granted, such as speech recognition systems.
Stuttering is characterized by disruptions in the production of speech sounds, called disfluencies. Disfluencies are not necessarily a problem; however, they can hinder communication when a person produces too many of them. Most people produce brief disfluencies from time to time: some words are repeated or prolonged, while others are preceded by an ‘um’ or ‘uh’. In most cases, stuttering affects at least some daily activities. The specific activities that a person finds challenging vary across individuals. For some people, communication difficulties arise only during particular activities, such as talking on the phone, speaking before large groups, or using everyday tools that take speech as input. As one author puts it, “Stuttering cannot be permanently cured; it may go into remission for a time, or clients can learn to shape their speech into fluent speech with the appropriate speech pathology treatment” [1].
The main objective of this paper is to present an algorithm that efficiently detects and corrects stutter in a speech segment from a person with a stuttering speech disability. The proposed algorithm achieves an accuracy of 86% on 50 stuttered speech samples. Two algorithms are combined to build a more precise stutter removal system that can be implemented on any device.
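As a rough illustration of the repetition removal step named in the abstract, the sketch below collapses immediately repeated words and short phrases in a transcript. It is not the authors' algorithm: it assumes the speech has already been transcribed to plain text, and the function name `remove_repetitions` and the parameter `max_phrase_len` are hypothetical names chosen for this example.

```python
def remove_repetitions(text: str, max_phrase_len: int = 3) -> str:
    """Collapse immediately repeated words or short phrases (illustrative only).

    Assumption: `text` is a whitespace-separated transcript of the utterance.
    """
    words = text.split()
    # Handle longer phrases first, then shorter ones, down to single words.
    for n in range(max_phrase_len, 0, -1):
        out = []
        i = 0
        while i < len(words):
            chunk = [w.lower() for w in words[i:i + n]]
            previous = [w.lower() for w in out[-n:]]
            # Skip the chunk when it repeats the n words just emitted.
            if len(chunk) == n and len(out) >= n and previous == chunk:
                i += n
            else:
                out.append(words[i])
                i += 1
        words = out
    return " ".join(words)


print(remove_repetitions("I I I want want to to go"))  # I want to go
print(remove_repetitions("I want I want to go"))       # I want to go
```

A simple rule like this also removes legitimate repetitions (for example "that that"), so a practical system would likely need additional cues, such as timing or acoustic features, before deleting a repeated word.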
The developed system can be incorporated into any existing speech recognition system. It can also serve as a speech therapy aid, where a user who stutters can model their speech on the corrected output obtained from the system. People who stutter can therefore use the device to access existing virtual assistant services or to talk to others with confidence. This would improve communication for people with this disorder.