Question

I am writing code to read a file and print its content as (fileID, sentenceID, wordID, word). It keeps failing at for word in line[0].split('ZZ'): with IndexError: string index out of range. How do I fix this? Thanks.

lineCount = 0
wordCount = 0
for line in file[0].split('ZZ'):
    lineCount +=1
    for word in line[0].split('ZZ'):
        wordCount +=1
        print fileNumber + '|' + str(lineCount) + '|' + str(wordCount) + word +'\n'

Solution

Try for word in line.split('ZZ'): instead of for word in line[0].split('ZZ'):.

file[0].split('ZZ') returns a list of strings, so line is one of those strings. line.split('ZZ') again returns a list of strings, and word is then one of those strings.

EDIT: Here is an example for the question in your comment:

line = "one-two threeZZfour five-six seven eight nineZZten"
for word in line.split('ZZ'):
    print word

output>>
one-two three
four five-six seven eight nine
ten

for word in line.split('-'):
    print word
output>>
one
two threeZZfour five
six seven eight nineZZten

for word in line.split():  # or split(' ')
    print word
output>>
one-two
threeZZfour
five-six
seven
eight
nineZZten
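
Putting the pieces together, the whole loop could be sketched roughly as follows. This assumes that sentences are separated by 'ZZ' and words by whitespace, which the question does not confirm. format_records, file_number and text are hypothetical names used only for illustration, and word IDs restart in each sentence here, unlike the running wordCount in the question.

```python
def format_records(file_number, text, sentence_sep='ZZ'):
    """Return 'fileID|sentenceID|wordID|word' strings for one file's text."""
    records = []
    # Assumption: sentences are separated by sentence_sep.
    for sentence_id, sentence in enumerate(text.split(sentence_sep), 1):
        # Assumption: words are separated by whitespace.
        # split() with no argument never yields empty strings.
        for word_id, word in enumerate(sentence.split(), 1):
            fields = [str(file_number), str(sentence_id), str(word_id), word]
            records.append('|'.join(fields))
    return records


for record in format_records('1', 'one twoZZthree'):
    print(record)
# 1|1|1|one
# 1|1|2|two
# 1|2|1|three
```

Note that the string is never indexed with [0] anywhere, so an empty sentence simply produces no records instead of raising IndexError.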

OTHER TIPS

Ok let's see what we get, step by step:

for line in file[0].split('ZZ'):

If this line is correct, then file must be a list of strings (file[0] has to be a string for split to work). What is line, then? split returns a list of strings, so line is a string.

for word in line[0].split('ZZ'):

Since line is a string, line[0] is a single character, provided line is not empty. This is where things stop making sense. The error you get is caused by trying to index an empty string, i.e.

>>> ''[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range

However, that's not all. Applying split('ZZ') to a single character returns a list with one element: that character. So the for word part doesn't make sense either, since you're iterating over a list whose only element is a single character. I don't think this is what you want...
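
As for where the empty string may come from: if the text starts or ends with the separator, or contains it twice in a row, split returns an empty string at that position. A small illustration with made-up input (the text value is an assumption, not the asker's data):

```python
# Hypothetical input that ends with the separator.
text = 'first sentenceZZsecond sentenceZZ'

parts = text.split('ZZ')
print(parts)  # ['first sentence', 'second sentence', '']

# Slicing with [:1] is safe on an empty string, unlike indexing with [0].
first_chars = [part[:1] for part in parts]
print(first_chars)  # ['f', 's', '']

# Indexing the trailing empty string reproduces the error from the question.
try:
    parts[-1][0]
except IndexError as error:
    print(error)  # string index out of range
```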

Since file is apparently a list of strings, this is probably what you're looking for:

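for line in file[0].split('ZZ'):
    lineCount += 1
    for word in line.split('ZZ'):
        # loop body as in the question
        wordCount += 1
        print fileNumber + '|' + str(lineCount) + '|' + str(wordCount) + word + '\n'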
Licensed under: CC-BY-SA with attribution