You didn't mention what data structure you're looking for, i.e. what operations you intend to perform on the parsed data. In the simplest case, you could massage the file into a list of 8-tuples, the last element being either '*' or an empty string. That is as simple as:
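def tokenize(s):
    # Split from the right, so any whitespace in the leading field stays intact.
    # str.rsplit is used because string.rsplit no longer exists in Python 3.
    if s.endswith('*'):  # unlike s[-1], this does not raise on an empty line
        return s.rsplit(None, 7)
    else:
        return s.rsplit(None, 6) + ['']

tokens = (tokenize(line.rstrip()) for line in open('so21712204.txt'))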
To be fair, this doesn't make tokens a list of 8-tuples but rather a generator (which is more space-efficient) of lists, each of which has 8 elements.
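If you really do want a list of 8-tuples, wrapping each result in tuple() and collecting them in a list comprehension should be enough. Here is a minimal sketch; the sample lines are made up for illustration (I don't know your actual file layout), and it assumes the trailing '*' is separated from the previous field by whitespace.

```python
def tokenize(s):
    # Split from the right, so whitespace in the leading field stays intact.
    if s.endswith('*'):
        return s.rsplit(None, 7)
    return s.rsplit(None, 6) + ['']

# Hypothetical sample input; replace with the lines of your real file.
sample_lines = [
    "first field with spaces a b c d e f *\n",
    "another name 1 2 3 4 5 6\n",
]

# Materialize everything as a list of 8-tuples.
rows = [tuple(tokenize(line.rstrip())) for line in sample_lines]

for row in rows:
    print(row)
```

When reading from a real file, you may prefer to build the list inside a `with open(...) as f:` block so the file is closed as soon as the list is built. The trade-off is memory: the list holds every row at once, whereas the generator produces them one at a time.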