Collecting Financial Data from Yahoo

This Python code fetches financial data from Yahoo Finance between the specified dates. The companies for which financial data is collected are listed in ‘companyAbbvs.txt’.
The opening and closing stock prices of the companies are collected from Yahoo Finance and stored in JSON files.

import datetime
import time
from urllib.request import urlopen
from bs4 import BeautifulSoup as bs
import simplejson as json

The folder ‘companyAbbvs’ contains a file ‘companyAbbvs.txt’, which maintains a list of company abbreviations, for example AAPL and MCD.


The user specifies the period for which financial data is fetched from Yahoo with two dates: the from date in ‘fromDateDataToFetch’ and the previous (earlier) till date in ‘tillDateDataToFetch’. The data between these two dates is fetched.
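
As a rough illustration of the date handling, the sketch below sets the two dates and converts them to numeric values with time.mktime(). The date values and the helper name toNumericDate are assumptions made for this example; only the variable names fromDateDataToFetch and tillDateDataToFetch come from the post.

```python
import datetime
import time

def toNumericDate(dateText):
    """Convert a 'YYYY-MM-DD' string to seconds since the epoch (local time)."""
    parsed = datetime.datetime.strptime(dateText, '%Y-%m-%d')
    return time.mktime(parsed.timetuple())

# Example values only (assumed); the till date is taken to be the earlier one.
fromDateDataToFetch = '2018-11-14'
tillDateDataToFetch = '2018-11-08'

startDateNumeric = toNumericDate(tillDateDataToFetch)
endDateNumeric = toNumericDate(fromDateDataToFetch)
print(int(startDateNumeric), int(endDateNumeric))
```

Because time.mktime() interprets the date in the local time zone, the printed numbers differ from machine to machine.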


The financial data for a given company abbreviation is fetched with a URL read operation. The URL consists of a link to Yahoo Finance, the company abbreviation, the start date and the end date. Both dates are converted to numeric values with time.mktime() before they are placed in the URL.
Using the BeautifulSoup module, all the table objects are located in the URL response, and the rows of the first table are collected into the Python list gatherRows.
From each row, the fields date, open price, high price, low price, closing price, adjusted close price and volume are extracted as key-value pairs and appended to the trendHistory list.
The function returns trendHistory, which holds the trade history information.

def trendHistoryFromYahoo(name, eDate, sDate):
    trendHistory = []
    # Convert the 'YYYY-MM-DD' date strings to numeric timestamps
    startDateNumeric = time.mktime(datetime.datetime.strptime(sDate, '%Y-%m-%d').timetuple())
    endDateNumeric = time.mktime(datetime.datetime.strptime(eDate, '%Y-%m-%d').timetuple())
    # NOTE: the lines that build the Yahoo Finance URL and read the response
    # into dataFetchdYahoo are missing from this listing.
    gatherRows = bs(dataFetchdYahoo).findAll('table')[0].tbody.findAll('tr')
    for each_row in gatherRows:
        divisions = each_row.findAll('td')
        # Skip dividend rows
        if divisions[1].span.text != 'Dividend':
            # Remove thousands separators, then convert the six numeric columns
            values = [float(d.span.text.replace(',', '')) for d in divisions[1:7]]
            keys = ['Open', 'High', 'Low', 'Close', 'AdjClose', 'Volume']
            record = {'Date': divisions[0].span.text}
            record.update(zip(keys, values))
            trendHistory.append(record)
    return trendHistory
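
The row-to-dictionary step can be tried without a network call. The sketch below assumes the cell texts of one table row have already been extracted into a list of strings. The helper name rowToRecord and the sample values are made up for illustration, and the helper checks the number of cells rather than the 'Dividend' label, so it is a simplification and not the original code.

```python
FIELDS = ['Open', 'High', 'Low', 'Close', 'AdjClose', 'Volume']

def rowToRecord(cellTexts):
    """Turn the 7 cell texts of a price row into a record; return None otherwise."""
    if len(cellTexts) != 7:
        return None  # not a regular price row (assumed: e.g. a dividend row)
    record = {'Date': cellTexts[0]}
    for key, text in zip(FIELDS, cellTexts[1:]):
        # Remove thousands separators before converting to float
        record[key] = float(text.replace(',', ''))
    return record

# Made-up sample values, not real market data
sampleRow = ['Nov 14, 2018', '1,000.50', '1,010.00', '990.25',
             '1,005.75', '1,005.75', '2,500,000']
print(rowToRecord(sampleRow))
print(rowToRecord(['Nov 09, 2018', '0.73 Dividend']))
```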

All the company abbreviations from the text file are stored in the list companyAbbv.
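
The code that reads the file is not shown in the post. A minimal sketch, assuming the file holds one abbreviation per line and sits at companyAbbvs/companyAbbvs.txt (both assumptions); the helper name readCompanyAbbvs is hypothetical:

```python
import os

def readCompanyAbbvs(path):
    """Return the non-empty, stripped lines of the file as a list."""
    with open(path, 'r') as f:
        return [line.strip() for line in f if line.strip()]

abbvPath = os.path.join('companyAbbvs', 'companyAbbvs.txt')  # assumed location
if os.path.exists(abbvPath):
    companyAbbv = readCompanyAbbvs(abbvPath)
else:
    companyAbbv = []  # file not found, so there is nothing to fetch
print(companyAbbv)
```

Dropping blank lines at this stage means the later loop never sees an empty abbreviation.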


For each company abbreviation, the historic trend between the two dates is fetched from Yahoo. Four dictionaries are maintained per company: stockPriceOpen, stockPriceClose, highestPrice and lowestPrice. The open, close, highest and lowest prices of the stock are stored in these dictionaries, keyed by date.
For instance, after updating, the stockPriceClose dictionary becomes {“Nov 14 2018”: 186.8, “Nov 13 2018”: 192.23, “Nov 12 2018”: 194.17, “Nov 09 2018”: 204.47, “Nov 08 2018”: 208.49}.
Two further dictionaries, stockPriceOpenAllCpmy and stockPricesCloseAllCpmy, store the open and close prices of all the companies.
stockPriceOpenAllCpmy keeps the open prices of all the stocks and stockPricesCloseAllCpmy keeps the close prices of all the stocks, each keyed by company abbreviation.
For example, the closing prices for the stocks AAPL and MCD are stored as
{“AAPL”: {“Nov 14 2018”: 186.8, “Nov 13 2018”: 192.23, “Nov 12 2018”: 194.17, “Nov 09 2018”: 204.47, “Nov 08 2018”: 208.49}, “MCD”: {“Nov 14 2018”: 183.85, “Nov 13 2018”: 184.01, “Nov 12 2018”: 184.37, “Nov 09 2018”: 185.94, “Nov 08 2018”: 185.48}}
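
To make the nesting concrete, the sketch below builds an all-company closing-price dictionary from per-company trend histories. The histories are hard-coded from the closing prices quoted above instead of being fetched, and the variable name sampleHistories is an assumption for this example.

```python
# Closing prices taken from the example above; two dates per company for brevity.
sampleHistories = {
    'AAPL': [{'Date': 'Nov 14, 2018', 'Close': 186.8},
             {'Date': 'Nov 13, 2018', 'Close': 192.23}],
    'MCD': [{'Date': 'Nov 14, 2018', 'Close': 183.85},
            {'Date': 'Nov 13, 2018', 'Close': 184.01}],
}

stockPricesCloseAllCpmy = {}
for abbv, trendHistory in sampleHistories.items():
    stockPriceClose = {}
    for row in trendHistory:
        date = row['Date'].replace(',', '')  # 'Nov 14, 2018' -> 'Nov 14 2018'
        stockPriceClose[date] = row['Close']
    # Outer key: company abbreviation; inner key: date
    stockPricesCloseAllCpmy[abbv] = stockPriceClose

print(stockPricesCloseAllCpmy['AAPL']['Nov 14 2018'])  # 186.8
```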


# For each company collect trend history
for ci in range(len(companyAbbv)):
    if not companyAbbv[ci]:
        continue  # skip empty entries in the abbreviation list
    print('Collecting yahoo finance data for:' + companyAbbv[ci])
    trendHistory = trendHistoryFromYahoo(companyAbbv[ci], fromDateDataToFetch, tillDateDataToFetch)
    stockPriceOpen = {}
    stockPriceClose = {}
    highestPrice = {}
    lowestPrice = {}
    for i in range(len(trendHistory)):
        # 'Nov 14, 2018' becomes 'Nov 14 2018' and is used as the key
        date = trendHistory[i]['Date'].replace(",", "")
        stockPriceOpen.update({date: trendHistory[i]['Open']})
        stockPriceClose.update({date: trendHistory[i]['Close']})
        highestPrice.update({date: trendHistory[i]['High']})
        lowestPrice.update({date: trendHistory[i]['Low']})

Two files, ‘stockPriceOpenAllCompany.json’ and ‘stockPriceCloseAllCompany.json’, are created to store the open and close prices of all companies in JSON format.
The dictionary stockPriceOpenAllCpmy is written to file ‘stockPriceOpenAllCompany.json’ and stockPricesCloseAllCpmy is dumped to the file ‘stockPriceCloseAllCompany.json’.
# write in JSON file format
with open(financeDataFldr+'/'+stockPrzOpenFn,'w') as f:
    json.dump(stockPriceOpenAllCpmy, f)
with open(financeDataFldr+'/'+stockPrzCloseFn,'w') as f:
    json.dump(stockPricesCloseAllCpmy, f)
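
A quick way to check the output is to write a small dictionary and read it back. The sketch below uses the standard-library json module instead of simplejson (an assumption that the two behave the same for a plain dump and load) and a temporary folder in place of financeDataFldr. The sample dictionary reuses closing prices from the example earlier in the post.

```python
import json
import os
import tempfile

stockPricesCloseAllCpmy = {'AAPL': {'Nov 14 2018': 186.8},
                           'MCD': {'Nov 14 2018': 183.85}}

financeDataFldr = tempfile.mkdtemp()  # stand-in for the real output folder
stockPrzCloseFn = 'stockPriceCloseAllCompany.json'
outPath = os.path.join(financeDataFldr, stockPrzCloseFn)

# Write the nested dictionary as JSON
with open(outPath, 'w') as f:
    json.dump(stockPricesCloseAllCpmy, f)

# Read it back to confirm nothing was lost
with open(outPath) as f:
    loaded = json.load(f)

print(loaded == stockPricesCloseAllCpmy)  # True
```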

