Using Python and urllib to get data from Yahoo Finance

Question 1

You have not escaped the forward slash in your regex. Change your regex from:

<span id="yfs_l84_%s">(.+?)</span>

to

<span id="yfs_l84_goog">(.+?)<\/span>

This will fix your problem, assuming you enter the company's ticker symbol as the input to your code, e.g. goog for Google.
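As a quick way to check the pattern without hitting the network, here is a minimal sketch that runs it against a made-up HTML fragment. The fragment, the price in it, and the helper name extract_price are assumptions for illustration only, not Yahoo's actual markup. It also substitutes the symbol into the id instead of hard-coding goog.

```python
import re


def extract_price(html, symbol):
    """Return the text inside <span id="yfs_l84_SYMBOL">...</span>, or None."""
    # re.escape guards against symbols that contain regex metacharacters.
    # In Python's re module, "\/" simply matches a literal "/".
    pattern = r'<span id="yfs_l84_%s">(.+?)<\/span>' % re.escape(symbol)
    match = re.search(pattern, html)
    return match.group(1) if match else None


# Made-up fragment for illustration only; not real Yahoo markup or a real quote.
sample_html = '<div><span id="yfs_l84_goog">123.45</span></div>'
print(extract_price(sample_html, "goog"))
```

Returning None when nothing matches avoids an exception if the page layout changes or the symbol is not on the page.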

That said, regex is a bad choice for what you are trying to do. As others have suggested, explore BeautifulSoup, a Python library for pulling data out of HTML. With BeautifulSoup your code can be as simple as:

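from bs4 import BeautifulSoup
import requests

name = raw_input('>')  # ticker symbol, e.g. goog (use input() on Python 3)
url = 'http://finance.yahoo.com/q?s={}'.format(name)
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')  # name the parser explicitly
# The id needs a {} placeholder so the symbol is actually substituted in.
data = soup.find('span', attrs={'id': 'yfs_l84_{}'.format(name)})
print(data.text)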

Question 2

Any reason you can't use pandas? It has good support for financial data scraping and time series analysis.

http://pandas.pydata.org/pandas-docs/stable/remote_data.html

Here's the Yahoo example straight from the documentation:

In [1]: import pandas.io.data as web
In [2]: import datetime
In [3]: start = datetime.datetime(2010, 1, 1)
In [4]: end = datetime.datetime(2013, 1, 27)
In [5]: f=web.DataReader("F", 'yahoo', start, end)
In [6]: f.ix['2010-01-04']
Out[6]: 
Open               10.17
High               10.28
Low                10.05
Close              10.28
Volume       60855800.00
Adj Close           9.75
Name: 2010-01-04 00:00:00, dtype: float64

Question 3

The best way to get data from Yahoo Finance using Python 2 or Python 3 is to use a POST request.
You can easily test this with a REST client such as Postman.

Open Postman, set the method to POST, and use the URL from the code below; you will then see the data. Simply recreate this in Python:

import requests

# The query string must follow the "?" directly, with no space.
url = "https://query1.finance.yahoo.com/v7/finance/download/GOOG?period1=1519938930&period2=1522354530&interval=1d&events=history&crumb=.tLvYBkGDu3"

response = requests.post(url)
print(response.text)

I used to get the data using urllib2, but it now gives an authorization error. They are probably filtering everything through REST-style methods like GET and POST.
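Rather than pasting the long URL by hand, the query string can be assembled from its parts. The sketch below assumes that period1 and period2 are Unix timestamps in seconds (they look like it, but the answer does not say so), and the helper names to_epoch and build_download_url are made up for illustration. The crumb is passed through unchanged; the value here is simply the one from the URL above and may not be valid for another session.

```python
import calendar
import datetime

try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

BASE_URL = "https://query1.finance.yahoo.com/v7/finance/download/"


def to_epoch(dt):
    """Convert a naive datetime, taken as UTC, to whole seconds since the epoch."""
    return calendar.timegm(dt.timetuple())


def build_download_url(symbol, start, end, crumb):
    """Assemble the download URL; parameter names are copied from the URL above."""
    # A list of pairs keeps the parameters in a fixed, readable order.
    params = [
        ("period1", to_epoch(start)),
        ("period2", to_epoch(end)),
        ("interval", "1d"),
        ("events", "history"),
        ("crumb", crumb),
    ]
    return BASE_URL + symbol + "?" + urlencode(params)


url = build_download_url(
    "GOOG",
    datetime.datetime(2018, 3, 1, 21, 15, 30),
    datetime.datetime(2018, 3, 29, 20, 15, 30),
    ".tLvYBkGDu3",
)
print(url)
```

The resulting string can be handed to requests.post(url) exactly as in the snippet above; building it with urlencode avoids stray spaces and takes care of escaping.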

Question 4

This guide will show you how to build Yahoo Finance queries that return CSV files. Then you can use the csv library to parse them easily.
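Once the CSV text is in hand, parsing it takes only a few lines. The following is a rough sketch for Python 3, not taken from the guide: the column layout is an assumption, the helper name parse_quotes is made up, and the single sample row reuses the Ford (F) figures for 2010-01-04 from the pandas output under Question 2 instead of live data.

```python
import csv
import io

# Sample text standing in for a downloaded file. The column order is assumed;
# the values are the 2010-01-04 figures for F quoted under Question 2.
sample_csv = (
    "Date,Open,High,Low,Close,Volume,Adj Close\n"
    "2010-01-04,10.17,10.28,10.05,10.28,60855800,9.75\n"
)


def parse_quotes(text):
    """Parse CSV text into a list of dicts, converting price fields to float."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {"Date": record["Date"]}  # keep the date as a string
        for key, value in record.items():
            if key != "Date":
                row[key] = float(value)
        rows.append(row)
    return rows


quotes = parse_quotes(sample_csv)
print(quotes[0]["Close"])
```

csv.DictReader uses the header row for the keys, so the code keeps working if the columns arrive in a different order.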

If you really want to try hacking through the HTML, use BeautifulSoup. HTML can't be parsed easily with regexes.