A customised grammar framework

Article title: A customised grammar framework for query classification
Journal/Conference: Expert Systems with Applications
Related fields of study: Computer Engineering
Related specialisations: Artificial Intelligence
Keywords: Natural language processing (NLP), Information retrieval, Text classification, Query classification, Machine learning
Article type: Research Article
DOI: https://doi.org/10.1016/j.eswa.2019.06.010
University: School of Computing, University of Portsmouth, United Kingdom
Pages (English article): 17
Publisher: Elsevier
Publication type: Journal
Indexing: ISI
Publication year: 2019
Impact factor: 5.891 (2018)
H-index: 162 (2019)
SJR: 1.190 (2018)
ISSN: 0957-4174
Quartile: Q1 (2018)
English article format: PDF
Translation status: Not translated
Price of English article: Free
Is this a base article: No
Has a conceptual model: No
Has a questionnaire: No
Has variables: No
Product code: E13561
References: In-text citations and a reference list at the end of the article
Table of contents

Abstract

1. Introduction

2. Categories of queries

3. Related studies

4. Customised grammar framework

5. CGF for query classification

6. Experiments

7. Performance comparison

8. Discussion

9. Conclusions and future work

CRediT authorship contribution statement

Declaration of Competing Interest

Appendix A. Grammar terms and corresponding abbreviations

References

Excerpt from the article

Abstract

In real-life classification problems, prior information about the problem and expert knowledge about the domain are often used to obtain reliable and consistent solutions. This is especially true in fields where the data is ambiguous, such as text, where the same words can appear in seemingly similar texts yet have different meanings. A promising avenue for text classification is machine learning, which has been shown to perform well in a variety of applications, including query classification and sentiment analysis. Many of the proposed approaches rely on the bag-of-words representation, which loses the information about the structure of the text. In this paper, we propose a Customised Grammar Framework for text classification, which exploits domain-related information and a new way to represent text as a series of syntactic categories forming syntactic patterns. The framework employs a formal grammar approach to transform the text into the syntactic pattern representation. We applied the framework to the query classification problem, and our results show that it outperforms previous approaches in terms of classification performance.

Introduction

In many real-world classification problems, some prior information about the structure of the problem is known in advance, such as the relations between some attributes or the patterns that are likely to appear in certain instances. Moreover, the features extracted from many real-world problems are not completely independent, and the meaning of each feature may be influenced by other attributes and/or by the position of the attribute in the instance. For example, in signal processing, the same set of signal features may have different meanings (and thus belong to different classes) depending on the sequence in which these features appear in the signal. Another example is text classification: in addition to the words in the text, the syntax plays an important role in defining the meaning of the text. Text classification is an important task in Natural Language Processing with many applications, such as web search (e.g. Hernández, Gupta, Rosso, & Rocha, 2012; Højgaard, Sejr, & Cheong, 2016; Shi, Yao, Tian, & Jiang, 2016; Wu, Zhang, Zhao, & Liu, 2010), question answering (e.g. Hardy & Cheah, 2013; Li, Su, Chen, & Yuan, 2017; Zhang & Lee, 2003) and sentiment analysis (e.g. Altrabsheh, Cocea, & Fallahkhair, 2014; Glorot, Bordes, & Bengio, 2011; Taboada, Brooke, Tofiloski, Voll, & Stede, 2011; Yang et al., 2017). However, traditional text classifiers often rely on many human-designed features, such as dictionaries, knowledge bases and special tree kernels, rather than on the relations between the entities and the types of those entities and relations, which carry much more information for representing the texts (Wang, Song, Li, Zhang, & Han, 2016). The selection of distinctive features is essential for text classification (Uysal, 2016; Uysal & Gunal, 2012).
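
The contrast between a bag-of-words representation and an ordered sequence of syntactic categories can be illustrated with a small sketch. This is not the framework described in the paper: the lexicon, the category tags (`AV`, `PN`, `ADJ`, `UNK`) and the function names below are hypothetical and chosen only for illustration. The sketch assumes whitespace tokenisation and a hand-written word-to-category table.

```python
from collections import Counter

# Hypothetical toy lexicon mapping words to category tags.
# These tags are illustrative assumptions, not the grammar terms of the paper.
TOY_LEXICON = {
    "download": "AV",  # action verb
    "python": "PN",    # proper noun
    "free": "ADJ",     # adjective
}


def bag_of_words(query: str) -> Counter:
    """Return unordered word counts; word order is discarded."""
    return Counter(query.lower().split())


def syntactic_pattern(query: str, lexicon=TOY_LEXICON) -> tuple:
    """Return the ordered sequence of category tags for a query.

    Words missing from the lexicon are tagged 'UNK'.
    """
    return tuple(lexicon.get(token, "UNK") for token in query.lower().split())


q1 = "download python free"
q2 = "python free download"

# Both queries contain the same words, so their bags are identical.
print(bag_of_words(q1) == bag_of_words(q2))

# The ordered category sequences differ, so the structure is preserved.
print(syntactic_pattern(q1))
print(syntactic_pattern(q2))
```

In this toy setting the two queries are indistinguishable as bags of words, while their category sequences differ. A real system would need a proper tokeniser and tagger in place of the hand-written table.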