Question

I'm trying to take a list of strings containing sentences, such as:

sentence = ['Here is an example of what I am working with', 'But I need to change the format', 'to something more useable']

and convert it into the following:

word_list = ['Here', 'is', 'an', 'example', 'of', 'what', 'I', 'am',
'working', 'with', 'But', 'I', 'need', 'to', 'change', 'the', 'format',
'to', 'something', 'more', 'useable']

I tried using this:

for item in sentence:
    for word in item:
        word_list.append(word)

I thought it would take each string and append each item of that string to word_list. However, the output is something along the lines of:

word_list = ['H', 'e', 'r', 'e', ' ', 'i', 's' .....etc]

I know I am making a stupid mistake, but I can't figure out what it is. Can anyone help?

Solution

You need str.split() to split each string into words:

word_list = [word for line in sentence for word in line.split()]

OTHER TIPS

Just .split and .join:

word_list = ' '.join(sentence).split(' ')

You haven't told it how to distinguish a word. By default, iterating through a string simply iterates through the characters.
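
A small sketch of that difference, using a shortened string (the variable names are only for illustration):

```python
item = 'Here is'

# Iterating over a string yields one character at a time, spaces included.
chars = [ch for ch in item]
print(chars)  # ['H', 'e', 'r', 'e', ' ', 'i', 's']

# Splitting the string first yields whole words instead.
words = [w for w in item.split()]
print(words)  # ['Here', 'is']
```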

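You can use .split(' ') to split a string on spaces. So this would work:

word_list = []
for item in sentence:
    for word in item.split(' '):
        word_list.append(word)

Alternatively, call .split() with no arguments to split on any run of whitespace:

word_list = []
for item in sentence:
    for word in item.split():
        word_list.append(word)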
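
The two separators behave differently when the input has repeated or trailing spaces. A minimal sketch, assuming a made-up input string with a double space and a trailing space:

```python
text = 'to  something more '  # two spaces after 'to', one trailing space

# split(' ') treats every single space as a separator, so empty strings appear.
by_space = text.split(' ')
print(by_space)  # ['to', '', 'something', 'more', '']

# split() with no arguments splits on runs of whitespace and drops empty strings.
by_whitespace = text.split()
print(by_whitespace)  # ['to', 'something', 'more']
```

For the sentences in the question both forms give the same result, but split() is usually the safer default for messy input.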

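To split a single sentence into words, call the method on the string itself rather than on the list (with no arguments, rsplit() behaves like split()):

print(sentence[0].rsplit())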
Licensed under: CC-BY-SA with attribution