Question

How can I parse tokens separated by line breaks, such as the ones below:

Wolff PERSON
is O
in O    
Argentina LOCATION

The O
US LOCATION
Envoy O 
noted O

into full sentences like these, using Python?

Wolff is in Argentina
The US Envoy noted

Solution

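You can use itertools.groupby for this. It groups consecutive lines by whether they are blank, so each run of non-blank lines becomes one sentence: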

>>> from io import StringIO
>>> from itertools import groupby
>>> s = '''Wolff PERSON
... is O
... in O    
... Argentina LOCATION
... 
... The O
... US LOCATION
... Envoy O 
... noted O'''
>>> c = StringIO(s)  # stands in for an open file object
>>> for k, g in groupby(c, key=str.isspace):
...     if not k:  # skip the runs of blank lines
...         print(' '.join(x.split(None, 1)[0] for x in g))
...
Wolff is in Argentina
The US Envoy noted
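
To see why the `if not k` check is needed, it can help to look at what groupby actually yields. The following is a minimal sketch on a shortened input (the `lines` list and the `groups` name are made up for illustration): each key is True for a run of blank lines and False for a run of token lines.

```python
from itertools import groupby

# Shortened, hypothetical input: two blocks separated by one blank line.
lines = ["Wolff PERSON", "is O", "", "The O", "US LOCATION"]

# Materialize each group so the (key, items) pairs can be inspected.
groups = [
    (is_blank, list(group))
    for is_blank, group in groupby(lines, key=lambda line: not line.strip())
]

for is_blank, items in groups:
    print(is_blank, items)
# False ['Wolff PERSON', 'is O']
# True ['']
# False ['The O', 'US LOCATION']
```

Only the groups whose key is False contain tokens, which is why the loop above skips the rest.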

If the input is actually a string rather than a file, split it into lines first:

>>> for k, g in groupby(s.splitlines(), key=lambda x: not x.strip()):
...     if not k:  # skip the runs of blank lines
...         print(' '.join(x.split(None, 1)[0] for x in g))
...
Wolff is in Argentina
The US Envoy noted
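
Since the `not x.strip()` key treats both '' and '\n' as blank, the two variants can probably be folded into one helper that accepts any iterable of lines. This is only a sketch for Python 3, and the name `tokens_to_sentences` is made up here, not part of any library:

```python
from itertools import groupby


def tokens_to_sentences(lines):
    """Yield one sentence per blank-line-separated block of 'token TAG' lines."""
    for is_blank, group in groupby(lines, key=lambda line: not line.strip()):
        if not is_blank:
            # Keep only the first whitespace-separated field (the token).
            yield " ".join(line.split(None, 1)[0] for line in group)


text = (
    "Wolff PERSON\nis O\nin O    \nArgentina LOCATION\n"
    "\n"
    "The O\nUS LOCATION\nEnvoy O \nnoted O"
)
sentences = list(tokens_to_sentences(text.splitlines()))
print(sentences)
# ['Wolff is in Argentina', 'The US Envoy noted']
```

An open file object can be passed in directly instead of `text.splitlines()`, because lines that still carry their trailing newline are handled the same way.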
Licensed under: CC-BY-SA with attribution