Blogs

Abstract – Misclassified Sentiment; an Obstacle to Predictive Analytics in Algorithmic Trading



Sentiment misclassification by the news analytics systems that drive high-frequency trading (HFT) systems is a topic of international importance, according to the Board of Governors of the Federal Reserve System; the Wharton School, University of Pennsylvania; and the Institut Européen d’Administration des Affaires (INSEAD) (von Beschwitz, Keim, & Massa, 2013, 2018). Sentiment misclassification found in news articles may temporarily thwart the performance of HFT algorithmic systems that combine quantitative financial data with qualitative sentiment analytics from news articles. These HFT algorithmic systems exploit inefficiencies in the financial market by implementing investment strategies and executing transactions within milliseconds. Most algorithmic trading systems use quantitative financial data and advanced analytics to manage assets for 401(k) mutual funds, pension funds, hedge funds, and investment banks. Isidore (2018) states that 50% to 60% of New York Stock Exchange (NYSE) trading volume is processed by HFT algorithms. In a continual pursuit of competitive advantage and higher profits, traders are now augmenting their quantitative input datasets with news sentiment. Regulators and high-frequency traders should be aware of the root causes of sentiment misclassifications. Predictive analytics and artificial intelligence techniques, including text analytics, machine learning, and natural language processing, facilitate sentiment analysis that gauges the mood and feeling gleaned from unstructured text. It is reasonable to assume that some algorithmic trading predictions to purchase, hold, or sell stocks have been, and continue to be, based on an understanding of emotions regarding current news events that might affect stocks or companies. Fitbit and NVIDIA were chosen as representative samples of two distinct types of business models: business-to-consumer and business-to-business, respectively.
This study used Google’s RankBrain search engine recommended pages (SERPs) as proxies for news article sources. Two hundred eighty SERP titles and summary snippets from March 18, 2018 to March 31, 2018 were manually scored and used to establish ground-truth samples of sentiment. Surprisingly, the Naïve Bayes algorithm consistently misclassified true negative sentiment at a higher rate than true positive sentiment, for both article titles and summary snippets. Results show a highest negative precision score of 0.3 and a highest positive precision score of 0.76. Misclassified news sentiment presents challenges to developers of HFT algorithms.
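
The precision scores quoted above compare a classifier's predicted labels against manually scored ground truth, one sentiment class at a time. As a minimal sketch of that calculation, the Python below computes per-class precision, defined as true positives divided by all predictions of that class. The function name `per_class_precision` and the toy label lists are hypothetical illustrations; they are not the study's data or code.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Return {label: precision}, where precision = TP / (TP + FP)."""
    true_pos = Counter()   # predictions that match the manual score
    predicted = Counter()  # all predictions made for each label
    for truth, pred in zip(y_true, y_pred):
        predicted[pred] += 1
        if truth == pred:
            true_pos[pred] += 1
    # Only labels the classifier actually predicted have a defined precision.
    return {label: true_pos[label] / predicted[label] for label in predicted}


# Hypothetical toy labels (NOT the study's 280 SERP samples).
manual = ["pos", "pos", "pos", "neg", "neg", "pos", "neg", "pos"]  # ground truth
model = ["pos", "pos", "neg", "neg", "pos", "pos", "pos", "neg"]   # classifier output

precision = per_class_precision(manual, model)
for label, score in sorted(precision.items()):
    print(f"{label} precision: {score:.2f}")
```

A low negative precision in this sense means that many items the classifier labelled negative were scored otherwise by the human raters.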