I know it's not exactly what you asked for, but I thought I would show a way of converting the dates from your link text into the format shown in your example of desired output (dd/mm/yy). I used BeautifulSoup to read elements from the HTML.
from bs4 import BeautifulSoup
import datetime as dt
import re
html = '<a href="data/self/dated/station1_140208.txt">Saturday, February 08, 2014</a><br/>'
p = re.compile(r'.*/station1_\d+\.txt')
soup = BeautifulSoup(html, 'html.parser') # name the parser explicitly so the result does not depend on what is installed
a_tags = soup.find_all('a', {"href": p})
>>> print a_tags # would be a list of all a tags in the html with relevant href attribute
[<a href="data/self/dated/station1_140208.txt">Saturday, February 08, 2014</a>]
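If you want to check what the href pattern accepts before running it against a whole page, here is a minimal sketch using only the re module. The second and third hrefs are made up for illustration; only the first comes from the example above.

```python
import re

# Same pattern as above: any path ending in /station1_<digits>.txt
p = re.compile(r'.*/station1_\d+\.txt')

# Only the first href is from the example; the other two are hypothetical
hrefs = [
    'data/self/dated/station1_140208.txt',
    'data/self/dated/station2_140208.txt',
    'data/self/index.html',
]

# Keep the hrefs the pattern finds a match in
matching = [h for h in hrefs if p.search(h)]
print(matching)
```

Note that the pattern requires a slash before station1, so a bare file name with no directory part would not match.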
names = [str(a.get('href')).split('/')[-1] for a in a_tags] #str because they will be in unicode
dates = [dt.datetime.strptime(str(a.text), '%A, %B %d, %Y') for a in a_tags]
names and dates are both built with list comprehensions.
strptime parses each date string into a datetime object (%d is the day of the month).
>>> print names # would be a list of all file names from hrefs
['station1_140208.txt']
>>> print dates # would be a list of all dates as datetime objects
[datetime.datetime(2014, 2, 8, 0, 0)]
toFileData = ["{0}: {1}".format(d.strftime('%d/%m/%y'), n) for d, n in zip(dates, names)]
zip pairs each date with its own file name, and strftime reformats the date into the format in your example:
>>> print toFileData
['08/02/14: station1_140208.txt']
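To try the date conversion on its own, without BeautifulSoup, here is a small round trip using only the standard library. It is written for Python 3 and assumes an English locale, since %A and %B match weekday and month names in the current locale.

```python
import datetime as dt

text = 'Saturday, February 08, 2014'  # link text from the example

# %A = weekday name, %B = month name, %d = day of month, %Y = 4-digit year
parsed = dt.datetime.strptime(text, '%A, %B %d, %Y')

# %d/%m/%y = day / month / 2-digit year
formatted = parsed.strftime('%d/%m/%y')
print(formatted)  # 08/02/14
```

The directive to watch is %d for the day of the month: %m is the month number and %w is the weekday number, so neither gives the day.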
Then write the entries in toFileData to a file.
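A minimal sketch of that last step, in Python 3, could look like this. The output path and the name dates.txt are placeholders I made up (a temp directory keeps the example self-contained), and the single entry is just a sample in the "date: filename" shape.

```python
import os
import tempfile

# Sample entry in the same "date: filename" shape as toFileData
toFileData = ['08/02/14: station1_140208.txt']

# Hypothetical output location; swap in whatever path you actually want
out_path = os.path.join(tempfile.mkdtemp(), 'dates.txt')

# Write one entry per line
with open(out_path, 'w') as f:
    for entry in toFileData:
        f.write(entry + '\n')

# Read the file back to confirm what was written
with open(out_path) as f:
    written = f.read()
print(written)
```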
For info on the methods I used in the code above, such as soup.find_all() and a.get(), I recommend you look at the BeautifulSoup docs via the link at the top. Hope this helps.