Heuristic target class selection for advancing performance of coverage-based rule learning

Persian article title: انتخاب نوع هدف اکتشافی برای پیشبرد عملکرد پوشش مبتنی بر قانون یادگیری
English article title: Heuristic target class selection for advancing performance of coverage-based rule learning
Journal/Conference: Information Sciences
Related fields of study: Computer Engineering
Related specializations: Software Engineering, Algorithms and Computation Engineering, Computer Programming, Artificial Intelligence
Keywords: Machine learning, Rule-based systems, Rule-based classification, Decision tree learning, Rule learning, Prism
Article type: Research Article
DOI: https://doi.org/10.1016/j.ins.2018.12.001
Affiliation: School of Computer Science and Informatics, Cardiff University, Queen’s Buildings, 5 The Parade, Cardiff, CF24 3AA, United Kingdom
English article page count: 40
Publisher: Elsevier
Presentation type: Journal
Article indexing: ISI
Publication year: 2019
Impact factor: 6.774 (2018)
H-index: 154 (2019)
SJR: 1.620 (2018)
ISSN: 0020-0255
Quartile: Q1 (2017)
English article format: PDF
Translation status: Not translated
English article price: Free
Is this a base article: No
Product code: E11240
Table of contents (English)

Abstract

1- Introduction

2- Preliminaries

3- Related work

4- Rule learning driven by PrismCTC

5- Experimental results

6- Conclusions

References

Excerpt from the article (English)

Abstract

Rule learning is a popular branch of machine learning, which can provide accurate and interpretable classification results. In general, the two main strategies of rule learning are referred to as ‘divide and conquer’ and ‘separate and conquer’. Decision tree generation, which follows the former strategy, has a serious drawback known as the replicated sub-tree problem, resulting from the constraint that all branches of a decision tree must have one or more common attributes. This problem is likely to result in high computational complexity and a risk of overfitting, which motivates the development of rule learning algorithms (e.g., Prism) that follow the separate and conquer strategy. The replicated sub-tree problem can be effectively solved using the Prism algorithm, but the trained models are still complex due to the need to train an independent rule set for each selected target class. In order to reduce the risk of overfitting and the model complexity, we propose in this paper a variant of the Prism algorithm referred to as PrismCTC. The experimental results show that the PrismCTC algorithm advances classification performance and reduces model complexity, in comparison with the C4.5 and Prism algorithms.

Introduction

Rule learning is a popular form of machine learning that essentially aims at the production of rule based systems [32]. In general, rule learning is undertaken through two well-known strategies, namely ‘divide and conquer’ (DAC) and ‘separate and conquer’ (SAC). The DAC strategy is also known as Top-Down Induction of Decision Trees (TDIDT), since it aims at the generation of a decision tree that can be directly converted into a set of if-then rules. For example, ID3 [41] and C4.5 [43] are commonly known TDIDT algorithms with high popularity in real-world applications. On the other hand, the SAC strategy is also referred to as the covering approach, since it learns one rule at a time: each learned rule covers some training instances, which are then deleted before the learning of the next rule is initiated. The representative of the covering approach is the Prism algorithm [5].

Because the DAC strategy produces rules that are automatically represented in the form of a decision tree, whereas ‘if-then’ rules can be directly generated from training instances through the SAC strategy [26], the former strategy is referred to as ‘decision tree learning’ and the latter as ‘rule learning’ (in a narrow sense) in the rest of this paper.

The nature of decision tree learning leads to the parallel growth of different branches that can be converted into several rules, i.e., the DAC strategy enables different rules to be learned in parallel. Since decision tree learning starts with attribute selection for the root node, all rules must have this selected attribute as a common attribute. This constraint on the common attribute is likely to result in a decision tree containing redundant parts (i.e., the replicated sub-tree problem) [5]. In this case, after the decision tree is transformed into a set of rules, some of these rules would have redundant terms.
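To make the covering (separate and conquer) strategy concrete, the loop below is a minimal, illustrative sketch of a Prism-style learner for a single target class, assuming categorical attributes. The function and variable names are our own and are not taken from the paper or any library; tie-breaking and stopping details of the actual Prism algorithm may differ.

```python
def learn_prism_rules(instances, target):
    """Prism-style separate-and-conquer sketch for one target class.

    `instances` is a list of (attributes_dict, label) pairs with
    categorical attribute values.  Returns a list of rules, each a
    dict mapping attribute -> required value.
    """
    remaining = list(instances)
    rules = []
    while any(label == target for _, label in remaining):
        covered = list(remaining)
        rule = {}
        # 'Conquer': greedily add the attribute-value test with the
        # highest precision for the target class until the rule covers
        # only target-class instances (or no untested attribute remains).
        while any(label != target for _, label in covered):
            best, best_prec = None, -1.0
            for attrs, _ in covered:
                for a, v in attrs.items():
                    if a in rule:
                        continue
                    subset = [lab for at, lab in covered if at.get(a) == v]
                    prec = sum(lab == target for lab in subset) / len(subset)
                    if prec > best_prec:
                        best, best_prec = (a, v), prec
            if best is None:  # every attribute already used; stop specialising
                break
            a, v = best
            rule[a] = v
            covered = [(at, lab) for at, lab in covered if at.get(a) == v]
        rules.append(rule)
        # 'Separate': delete the instances covered by the new rule
        # before learning the next rule.
        remaining = [(at, lab) for at, lab in remaining
                     if not all(at.get(a) == v for a, v in rule.items())]
    return rules
```

Note how, unlike decision tree growth, each rule is specialised independently: no attribute is forced to be shared across rules, which is exactly why the covering approach avoids the replicated sub-tree problem.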