Project: Sentiment Analysis of Tweets with Application to Stock markets

Github Link

This Python code performs stock status prediction using tweet sentiment information. It has six stages of data processing, as shown in the figure.

Stage 1:

This program downloads the tweets about each company posted between the specified dates. The list of company abbreviations is given in the file companyAbbvs.txt in the folder companyAbbvs. The downloaded tweets are stored in tweetsForAllCompany.JSON in the folder twitterData.
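
The collection step can be sketched as follows. This is only an illustration, not the project's actual code: it assumes companyAbbvs.txt holds one abbreviation per line, and `fetch` is a hypothetical callable standing in for whatever Twitter client the project uses. The tweet fields ("date", "text") are also assumed.

```python
import json
from datetime import date


def parse_abbreviations(text):
    """Return company abbreviations, one per non-empty line (assumed layout)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def collect_tweets(abbreviations, start, end, fetch):
    """Group tweets by company, keeping those dated within [start, end].

    `fetch` is a hypothetical callable standing in for the real Twitter
    client: it takes an abbreviation and returns a list of dicts with
    "date" (ISO string) and "text" keys.
    """
    collected = {}
    for abbv in abbreviations:
        collected[abbv] = [
            tweet for tweet in fetch(abbv)
            if start <= date.fromisoformat(tweet["date"]) <= end
        ]
    return collected


def fake_fetch(abbv):
    # Stand-in data so the sketch runs without network access.
    return [
        {"date": "2017-01-03", "text": f"{abbv} looks strong today"},
        {"date": "2017-03-01", "text": f"{abbv} outside the date range"},
    ]


abbreviations = parse_abbreviations("AAPL\nMSFT\n\n")
tweets = collect_tweets(
    abbreviations, date(2017, 1, 1), date(2017, 1, 31), fake_fetch
)
# This string is what would be written to twitterData/tweetsForAllCompany.JSON.
tweets_json = json.dumps(tweets, indent=2)
```

Passing the fetch function in as an argument keeps the date filtering and file layout testable without Twitter credentials.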

Stage 2:

This program performs sentiment analysis of the tweets. It reads the tweetsForAllCompany.JSON file and computes the sentiment of each tweet. The computed sentiments are written to the file tweetsSentiScoreAndCls.csv in the folder twitterData.
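
As a rough illustration of this stage, the sketch below scores tweets with a tiny word lexicon and produces CSV text with a score and a class per tweet. The README does not say which sentiment method the project uses, so the lexicon approach, the word lists, and the column names are all assumptions made for the example.

```python
import csv
import io
import re

# Tiny illustrative lexicon (assumed); a real analysis would use a larger one.
POSITIVE = {"good", "great", "strong", "gain", "up", "profit"}
NEGATIVE = {"bad", "weak", "loss", "down", "drop", "fall"}


def sentiment_score(text):
    """Return (positive words - negative words) / total words, 0.0 if empty."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    return hits / len(words)


def sentiment_class(score):
    """Map a numeric score to a class label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def to_csv(tweets_by_company):
    """Build CSV text with one row per tweet (column names are assumed)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["company", "text", "score", "class"])
    for company, tweets in tweets_by_company.items():
        for tweet in tweets:
            score = sentiment_score(tweet["text"])
            writer.writerow(
                [company, tweet["text"], f"{score:.3f}", sentiment_class(score)]
            )
    return buffer.getvalue()


sample = {"AAPL": [{"text": "Great profit"}, {"text": "stock is down"}]}
# This text is what would be written to twitterData/tweetsSentiScoreAndCls.csv.
csv_text = to_csv(sample)
```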

Stage 3:

This program fetches the financial data for the given list of company abbreviations from Yahoo Finance. The output is stored in the files stockPriceOpenAllCompany.JSON and stockPriceCloseAllCompany.JSON in the folder yahooFinData.

Stage 4:

This program combines the sentiment information collected from tweets with the finance data. It prepares the file stockpredict.txt and stores it in the folder twFeaturesAndCls.
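
One way to picture the combination step is shown below. The actual layout of stockpredict.txt is not described here, so this sketch assumes one row per company and day, with the mean tweet sentiment and tweet count as features and a label of 1 when the closing price is above the opening price. The data structures and the label rule are assumptions for illustration.

```python
from statistics import mean


def build_rows(sentiments, open_prices, close_prices):
    """Join daily sentiment scores with open/close prices.

    sentiments:   {company: {date: [scores]}}   (assumed structure)
    open_prices:  {company: {date: price}}      (assumed structure)
    close_prices: {company: {date: price}}      (assumed structure)
    """
    rows = []
    for company, by_date in sentiments.items():
        for day, scores in sorted(by_date.items()):
            open_p = open_prices.get(company, {}).get(day)
            close_p = close_prices.get(company, {}).get(day)
            if open_p is None or close_p is None:
                continue  # no price data for that day, e.g. a weekend
            label = 1 if close_p > open_p else 0  # assumed: 1 means price rose
            rows.append((company, day, mean(scores), len(scores), label))
    return rows


def format_rows(rows):
    """Render rows as tab-separated lines (assumed file layout)."""
    return "\n".join(
        f"{company}\t{day}\t{avg:.4f}\t{count}\t{label}"
        for company, day, avg, count, label in rows
    )


sentiments = {"AAPL": {"2017-01-03": [0.5, -0.1], "2017-01-07": [0.2]}}
open_prices = {"AAPL": {"2017-01-03": 115.80}}
close_prices = {"AAPL": {"2017-01-03": 116.15}}

rows = build_rows(sentiments, open_prices, close_prices)
# This text is what would be written to twFeaturesAndCls/stockpredict.txt.
output_text = format_rows(rows)
```

Days with tweets but no price data are skipped, so every row carries both a sentiment feature and a label.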

Stage 5:

This program predicts the stock status. The data set is read from the file stockpredict.txt in the folder twFeaturesAndCls. An SVM classifier is created and evaluated on the data set using cross-validation.
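
A minimal sketch of SVM training with cross-validation, assuming scikit-learn is available, might look like this. Synthetic data stands in for the features and labels loaded from stockpredict.txt, and the kernel, the value of C, the feature scaling, and the five folds are illustrative choices rather than the project's documented settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the features and labels read from stockpredict.txt.
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=0, random_state=0
)

# Scale features before the SVM; kernel and C are illustrative choices.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

# 5-fold cross-validation; each entry is the accuracy on one held-out fold.
scores = cross_val_score(model, X, y, cv=5)
print("accuracy per fold:", scores)
print(f"mean accuracy: {scores.mean():.3f}")
```

Placing the scaler inside the pipeline means it is refit on the training folds only, which avoids leaking information from the held-out fold.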

Research Paper

This model of stock status prediction using tweet sentiment information has appeared in the research paper:

Cite this work

Please cite as:

Siddhaling Urolagin, "Text Mining of Tweet for Sentiment Classification and Association with Stock Prices", IEEE International Conference on Computer and Applications (ICCA), pp. 384-388, Dubai, 2017.

Further Projects and Contact

For further reading and other projects please visit