Question

I have a text file and my goal is to generate an output file with all the words that are between two specific words.

For example, if I have this text:

askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj.

And I want to obtain all the words between "my" and "Alex".

Output:

my name is Alex

I have a rough idea, but I don't know how to select the range:

if 'my' in open(out).read():
        with open('results.txt', 'w') as f:
            if 'Title' in open(out).read():
                f.write('*')
        break

I want an output file with the sentence "my name is Alex".


Solution

You can use regex here:

>>> import re
>>> s = "askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj."
>>> re.search(r'my.*Alex', s).group()
'my name is Alex'

If the string contains several occurrences of Alex after my and you want only the shortest match, use the non-greedy .*?:

With ?:

>>> s = "my name is Alex and you're Alex too."
>>> re.search(r'my.*?Alex', s).group()
'my name is Alex'

Without ?:

>>> re.search(r'my.*Alex', s).group()
"my name is Alex and you're Alex"
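If the text may contain several separate "my ... Alex" spans and all of them are wanted, the non-greedy pattern can be combined with re.findall. This is only a small sketch; the sample string below is made up for illustration and is not from the question.

```python
import re

# Hypothetical sample text with two separate "my ... Alex" spans
s = "my name is Alex and my friend is Alex too"

# Non-greedy .*? stops at the nearest "Alex"; findall returns every
# non-overlapping match, scanning left to right
matches = re.findall(r'my.*?Alex', s)
print(matches)
```

With the greedy my.*Alex, the same call would return a single match running from the first my to the last Alex.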

Code:

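import re

# Read the whole input file and write the matched span to the output file
with open('infile') as f1, open('outfile', 'w') as f2:
    data = f1.read()
    # re.DOTALL lets . match newlines, so the span may cross line breaks
    match = re.search(r'my.*Alex', data, re.DOTALL)
    if match:
        f2.write(match.group())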
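The same idea can be generalized to arbitrary start and end words. The sketch below is one possible way to do it, not part of the original answer: the function name extract_between is hypothetical, re.escape protects against words that contain regex metacharacters, and \b keeps my from matching inside a longer word such as army. It assumes both words begin and end with word characters, and it returns the shortest span.

```python
import re

def extract_between(text, start, end):
    """Return the shortest span from `start` to `end` (inclusive), or None.

    Hypothetical helper; assumes `start` and `end` begin and end with
    word characters so that the \\b word boundaries behave as expected.
    """
    pattern = r'\b' + re.escape(start) + r'\b.*?\b' + re.escape(end) + r'\b'
    # re.DOTALL lets the span cross line breaks
    match = re.search(pattern, text, re.DOTALL)
    return match.group() if match else None

if __name__ == '__main__':
    sample = "askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj."
    print(extract_between(sample, 'my', 'Alex'))
```

To write the result to a file, check the return value for None before calling write, as in the code above.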

Other tips

You can use the regular expression my.*Alex:

import re

data = "askdfghj... Hello world my name is Alex and I am 18 years all ...askdfgj"
# search() returns the first match; group() gives the matched text
print(re.search(r"my.*Alex", data).group())

Output

my name is Alex
Licensed under: CC-BY-SA with attribution