Abstract
1- Introduction
2- Literature review of machine-learning methods for bankruptcy prediction
3- Data
4- Preliminary statistical analysis
5- Analysis of feature importance
6- Bankruptcy prediction
7- Conclusions
References
Abstract
Bankruptcy prediction remains an important topic that receives notable attention. Information about an imminent bankruptcy threat is crucial to the decision-making of managers, financial institutions, and government agencies. In this paper, we utilize a newly acquired dataset comprising financial parameters derived from the annual reports of small- and medium-sized companies. The data, which reveal the true ratio between bankrupt and non-bankrupt companies, are severely imbalanced and contain only a small fraction of bankrupt companies. To overcome this challenging scenario of imbalanced learning, we adopt three one-class classification methods (a least-squares approach to anomaly detection, an isolation forest, and one-class support vector machines) and compare them with conventional support vector machines. We provide a comprehensive analysis of the financial attributes and identify those that are most relevant to bankruptcy prediction. The highest prediction performance in terms of the geometric mean score is 91%. The results are validated on two datasets from the manufacturing and construction industries.
Introduction
The recent financial crisis showed the increasing vulnerability of firms involved in complex business relations, relations with financial institutions, obligations toward tax agencies, etc. The threat of financial contagion is rising with the growing complexity of the economy. The crisis provided evidence of the fragile financial stability of numerous firms. These companies are prone to turbulent financial shocks that originate in the external environment. Even though many studies have been devoted to bankruptcy prediction, a general methodology that would enable a firm to identify business partners in financial distress has not yet been proposed.
The uniqueness of the bankruptcy prediction problem lies in the nature of the data being analyzed. The majority of studies are based on a variety of financial ratios derived from annual financial statements. The annual financial statements usually consist of two documents: the balance sheet and the income statement. The balance sheet contains information regarding the assets, liabilities, and owners’ equity, whereas the income statement covers the costs, revenues, and eventual profit or loss. Because the data are reported annually, the information in the financial ratios is condensed and may conceal important fluctuations between two reporting periods. The quality of the data is usually determined by the type of companies included in the analysis. In general, larger firms or firms listed on the stock exchange are more likely to disclose more information (Firth, 1979), thereby allowing a more meaningful analysis of their current financial condition.