Abstract
1. Introduction
2. Problem formalization
3. Methodology
4. Available data
5. Experiments discussion
6. Conclusion and future directions
Declaration of Competing Interest
CRediT authorship contribution statement
References
Abstract
Stock market prediction is one of the most challenging problems and has occupied both researchers and financial analysts for more than half a century. To tackle it, two contrasting approaches emerged: technical analysis and fundamental analysis. Technical analysis bases its predictions on mathematical indicators constructed from stock prices, while fundamental analysis exploits information retrieved from news, profitability, and macroeconomic factors. The competition between these schools of thought has led to many interesting achievements; to date, however, no satisfactory solution has been found. Our work aims to combine technical and fundamental analysis through the application of data science and machine learning techniques. In this paper, the stock market prediction problem is mapped to a classification task on time series data. Indicators of technical analysis and the sentiment of news articles are both used as input. The outcome is a robust predictive model able to forecast the trend of a portfolio composed of the twenty most capitalized companies listed in the NASDAQ100 index. As proof of the practical effectiveness of our approach, we use the predictions to run a high-frequency trading simulation, reaching an annualized return of more than 80%. This project represents a step toward combining technical and fundamental analysis and provides a starting point for developing new trading strategies.
1. Introduction
Stock market prediction is a challenging problem, and its complexity is strictly related to the multiple factors that can affect price changes. Researchers and practitioners from different fields have taken up the challenge, so research units composed of mathematicians, data scientists, philosophers, and financial analysts are common. The heterogeneity of this environment has led to important advances in market theory. In fact, two theoretical hypotheses have been formulated to explain market behavior: the Efficient Market Hypothesis (EMH) and the Adaptive Market Hypothesis (AMH). The EMH (Fama, 1991) states that the current market price fully reflects all published news, so that past and current information is immediately incorporated into stock prices. Thus, price changes are due solely to new information or news and are independent of existing information. Since news is unpredictable by nature, stock prices should, in theory, follow a random walk, and the best bet for the next price is the current price. In practice, the EMH states that it is not possible to ‘beat the market’ because stocks always trade at their fair value; buying undervalued stocks or selling stocks at inflated prices should therefore be impossible. The AMH (Lo, 2004), in contrast, tries to reconcile the rational EMH with the irrational principles of behavioral finance, which attempts to explain stock market anomalies through psychology-based theories. The AMH applies the principles of evolution and behavior to financial interactions. According to the AMH, it is possible to exploit weaknesses in market efficiency to obtain positive returns from a portfolio of stocks.
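The random-walk implication of the EMH described above can be illustrated with a small simulation. The sketch below is not part of the method proposed in this paper; it assumes a simple additive Gaussian random walk, and all function names and parameter values (starting price, step size, seed) are hypothetical choices made only for illustration. It compares the naive forecast "next price equals current price" with a momentum forecast that extrapolates the last price change.

```python
import random


def simulate_random_walk(n_steps, start_price=100.0, step_std=1.0, seed=0):
    """Simulate prices whose changes are independent zero-mean Gaussian noise.

    This is an illustrative assumption, not a model of real stock prices.
    """
    rng = random.Random(seed)
    prices = [start_price]
    for _ in range(n_steps):
        prices.append(prices[-1] + rng.gauss(0.0, step_std))
    return prices


def naive_forecast(previous_price, current_price):
    """Predict that the next price equals the current price."""
    return current_price


def momentum_forecast(previous_price, current_price):
    """Predict that the last price change repeats itself."""
    return current_price + (current_price - previous_price)


def mean_squared_error(prices, forecast):
    """Average squared one-step-ahead error of a forecast rule."""
    errors = []
    for t in range(2, len(prices)):
        predicted = forecast(prices[t - 2], prices[t - 1])
        errors.append((prices[t] - predicted) ** 2)
    return sum(errors) / len(errors)


prices = simulate_random_walk(5000)
naive_mse = mean_squared_error(prices, naive_forecast)
momentum_mse = mean_squared_error(prices, momentum_forecast)

print(f"naive MSE:    {naive_mse:.3f}")
print(f"momentum MSE: {momentum_mse:.3f}")
```

Under this assumption, the expected squared error of the naive forecast equals the variance of a single price change, while the momentum forecast roughly doubles it, because its error is the difference of two independent changes. If prices really followed such a process, past changes would carry no usable information, which is the sense in which the EMH rules out beating the market.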