Sentiment analysis to predict cryptocurrency prices

This research seeks to predict the direction of cryptocurrency prices, in particular the price of Bitcoin, by evaluating the sentiment and volume of opinions expressed on social media, which may directly or indirectly affect these prices. Since public opinion could sway the interest of potential investors, it is worth investigating whether sentiment could also be used to help investors make better-informed decisions about future price movements. Twitter is generally used as the source of sentiment, since it is widely used by people to express their opinions on various topics, including cryptocurrency price changes, forecasts and other factors that might cause fluctuations in price. However, Twitter data is known to present several obstacles, so it has to be cleaned before the actual models can be constructed.

The solution proposed in this study involves determining the relationship between overall sentiment and price by testing several time granularities, with the aim of discovering the optimal time interval at which the sentiment expressed would begin to affect the price in question. The study involved the implementation of two types of models: one that predicts the direction of the price on the following day, and another that predicts the magnitude of the following day’s price change as a multi-class classification task. Each model type was constructed using three different neural network architectures (two based on recurrent neural networks and one based on a convolutional neural network) in order to compare their performance on each task. Finally, a prediction model was constructed that combines the two types of predictions to forecast the price direction.
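The data preparation described above can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the `lag_days` interpretation (sentiment from `lag_days` earlier paired with the next-day price change) and the magnitude class boundaries of ±2% are all assumptions chosen for illustration, and each tweet is assumed to have already been given a numeric sentiment score.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def bin_sentiment(tweets, granularity):
    """Group (timestamp, score) pairs into bins of the given timedelta.

    Returns {bin_start: (mean_score, tweet_count)} so that both sentiment
    and volume are available as features.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for timestamp, score in tweets:
        # Floor the timestamp to the start of its bin.
        bin_index = (timestamp - EPOCH) // granularity
        bin_start = EPOCH + bin_index * granularity
        totals[bin_start][0] += score
        totals[bin_start][1] += 1
    return {start: (total / count, count)
            for start, (total, count) in sorted(totals.items())}


def direction_label(close_today, close_next):
    """Binary label: 1 if the price rises on the following day, else 0."""
    return 1 if close_next > close_today else 0


def magnitude_label(close_today, close_next, edges=(-0.02, 0.0, 0.02)):
    """Multi-class label from the relative price change.

    The edges are hypothetical; with the defaults there are four classes:
    0 (below -2%), 1 (-2% to 0%), 2 (0% to 2%), 3 (2% and above).
    """
    change = (close_next - close_today) / close_today
    return bisect.bisect_right(edges, change)


def build_samples(daily_sentiment, daily_close, lag_days=0):
    """Pair sentiment from `lag_days` earlier with next-day price labels."""
    samples = []
    for day in sorted(daily_close):
        feature_day = day - timedelta(days=lag_days)
        next_day = day + timedelta(days=1)
        # Skip days without lagged sentiment or without a next-day price.
        if feature_day not in daily_sentiment or next_day not in daily_close:
            continue
        today, following = daily_close[day], daily_close[next_day]
        samples.append((daily_sentiment[feature_day],
                        direction_label(today, following),
                        magnitude_label(today, following)))
    return samples
```

Calling `bin_sentiment` with `timedelta(hours=1)` instead of `timedelta(days=1)` changes the granularity without touching the rest of the pipeline, which is one way the different time intervals could be compared.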

The results of the study suggest that the recurrent neural networks perform better at predicting the price direction, while the convolutional neural network performs better at predicting the magnitude of the change. Additionally, the study indicates that the mean accuracy of the models decreases as the lag in the datasets increases.

Although the dataset contained over 16 million tweets and more daily records than related work, binning the tweets into days left too few records to train and test the models adequately. Consequently, a larger tweet dataset spanning more days could be expected to improve the accuracy of the results.

Figure 1. Reading sentiment from Twitter to predict the price of Bitcoin
Student: Jacques Vella Critien
Course: B.Sc. (Hons.) Computing Science
Supervisor: Dr Joshua Ellul
Co-supervisor: Prof. Albert Gatt