Abstract
The multiconlitron is a general theoretical framework for constructing piecewise linear classifiers. However, it typically contains a relatively large number of linear functions, resulting in a complicated model structure and poor generalization ability. Learning to prune redundant or excessive components is therefore a necessary step. We propose a novel greedy method, the greedy support multiconlitron algorithm (GreSMA), to simplify the multiconlitron. In GreSMA, a greedy selection procedure is first used. It generates the initial linear boundaries, each of which separates the maximum number of training samples at the current iteration. In this way, a minimal set of decision functions is established. In the second stage of GreSMA, a boundary adjustment procedure retrains the classification boundary between the convex hulls of local subsets, instead of between individual samples, so that the adjusted boundary fits the data more closely. Experiments on both synthetic and real-world datasets show that GreSMA produces a minimal multiconlitron with better performance. It meets the criterion of ‘‘Occam’s razor’’, since a simpler model helps prevent over-fitting and improves generalization ability. More significantly, the proposed method contains no parameters that depend on the datasets and makes no assumptions about the underlying statistical distribution of the samples. Therefore, it can be regarded as an attractive advancement of piecewise linear learning within the general framework of the multiconlitron.
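The greedy selection stage described above can be illustrated with a minimal sketch. This is not the authors' exact procedure: it assumes a precomputed pool of candidate hyperplanes and, set-cover style, repeatedly picks the candidate that separates the most not-yet-covered negative samples while keeping all positive samples on its positive side. The function name `greedy_select` and the candidate-pool formulation are hypothetical simplifications.

```python
import numpy as np

def greedy_select(candidates, X_neg, X_pos):
    """Greedy-selection sketch (hypothetical simplification of GreSMA's
    first stage): from a pool of candidate hyperplanes (w, b), repeatedly
    pick the one that separates the most still-uncovered negative samples
    from all positive samples, until every negative is covered."""
    remaining = set(range(len(X_neg)))
    selected = []
    while remaining:
        best, best_cov = None, set()
        for w, b in candidates:
            # a valid boundary must keep every positive sample on its + side
            if np.any(X_pos @ w + b <= 0):
                continue
            # negatives this hyperplane separates (strictly on the - side)
            cov = {i for i in remaining if X_neg[i] @ w + b < 0}
            if len(cov) > len(best_cov):
                best, best_cov = (w, b), cov
        if best is None:
            break  # no candidate separates any remaining negatives
        selected.append(best)
        remaining -= best_cov
    return selected
```

Because each iteration covers the largest remaining group of negatives, the selected set of decision functions tends to be small, which mirrors the "minimal set of decision functions" goal of the greedy stage.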
Introduction
In pattern recognition, piecewise linear classifiers (PLCs) are effective when a statistical model cannot express the underlying distribution of the samples [1]. A PLC approximates the true classification boundary by a combination of hyperplanes. Since each piece is linear, a PLC is simple to implement and has low memory requirements. It therefore has the potential to be applied in small reconnaissance robots, intelligent cameras, embedded and real-time systems, and portable devices [2]. Despite this simplicity of implementation, constructing a PLC usually requires a complex computational procedure [3]. In general, two criteria need to be considered: selecting an appropriate number of hyperplanes and minimizing the classification error.

Under the guidance of these criteria, many methods for synthesizing PLCs have been presented over the last few decades. Hierarchical partitioning is one common approach. In 1996, Chai et al. [4] used a genetic algorithm to build a binary-tree-structured PLC in the sense of maximum impurity reduction. To simplify the construction of a decision-tree PLC, in 2006 Kostin [2] developed and implemented a simple and fast multi-class PLC with acceptable classification accuracy, based on tree division of subregion centroids. In 2016, Wang et al. [5] proposed hierarchically mixing linear support vector machines (SVMs) for nonlinear classification, which can be seen as a special form of PLC. Furthermore, Ozkan et al. [6] designed a highly dynamic self-organizing decision tree structure to mitigate overtraining issues.
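The idea of approximating a classification boundary by a combination of hyperplanes can be sketched concretely. The max-of-mins decision rule below is one illustrative form (it matches the convex-region structure used in multiconlitron-style models, though the helper name `plc_decision` and the exact rule are assumptions for illustration): each group of linear functions defines a convex region via a minimum, and the overall score is the maximum over groups.

```python
import numpy as np

def plc_decision(x, pieces):
    """Illustrative piecewise linear decision rule (assumed form):
    `pieces` is a list of groups, each group a list of (w, b) pairs.
    The min over a group carves out one convex region; the max over
    groups combines the regions. x is labeled +1 when the score > 0."""
    score = max(min(float(w @ x + b) for w, b in group) for group in pieces)
    return 1 if score > 0 else -1
```

With a single group [(w1, b1), (w2, b2)] defining the strip 0 < x1 < 1, points inside the strip score positive and points outside score negative, so the piecewise boundary is exactly the union of the two hyperplanes.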