I'm new to python and struggle with data types concept and their conversions.
I have sentences in NLTK Tree format (obtained from Stanford parser and converted to an NLTK tree). I need to apply functions written for NLTK Chunker. However, NLTK tree format is different from NLTK Chunker format. Both formats are NLTK trees, but elements structure seems to be different (see below).
Could you please help me to convert an NLTK tree to an NLTK Chunker output format?
Thanks in advance!
Here is an NLTK Chunker output:
(S
(NP Pierre/NNP Vinken/NNP)
,/,
(NP 61/CD years/NNS old/JJ)
,/,
will/MD
join/VB
(NP the/DT board/NN)
as/IN
(NP a/DT nonexecutive/JJ director/NN Nov./NNP 29/CD)
./.)
Now printed by element and each element type:
class 'nltk.tree.Tree' (NP Pierre/NNP Vinken/NNP)
type 'tuple' (',', ',')
class 'nltk.tree.Tree' (NP 61/CD years/NNS old/JJ)
type 'tuple' (',', ',')
type 'tuple' ('will', 'MD')
type 'tuple' ('join', 'VB')
class 'nltk.tree.Tree' (NP the/DT board/NN)
type 'tuple' ('as', 'IN')
class 'nltk.tree.Tree' (NP a/DT nonexecutive/JJ director/NN Nov./NNP 29/CD)
type 'tuple' ('.', '.')
Here is an NLTK "pure" Tree output (exactly as in NLTK doc):
(S
(NP
(NP (NNP Pierre) (NNP Vinken))
(, ,)
(ADJP (NP (CD 61) (NNS years)) (JJ old))
(, ,))
(VP
(MD will)
(VP
(VB join)
(NP (DT the) (NN board))
(PP (IN as) (NP (DT a) (JJ nonexecutive) (NN director) (NNP Nov.) (CD 29)))
))
(. .))
Now printed by element and each element type:
class 'nltk.tree.Tree' (NP
(NP (NNP Pierre) (NNP Vinken))
(, ,)
(ADJP (NP (CD 61) (NNS years)) (JJ old))
(, ,))
class 'nltk.tree.Tree' (NP (NNP Pierre) (NNP Vinken))
class 'nltk.tree.Tree' (NNP Pierre)
type 'str' Pierre
class 'nltk.tree.Tree' (NNP Vinken)
type 'str' Vinken
class 'nltk.tree.Tree' (, ,)
type 'str' ,
class 'nltk.tree.Tree' (ADJP (NP (CD 61) (NNS years)) (JJ old))
class 'nltk.tree.Tree' (NP (CD 61) (NNS years))
class 'nltk.tree.Tree' (CD 61)
type 'str' 61
class 'nltk.tree.Tree' (NNS years)
type 'str' years
class 'nltk.tree.Tree' (JJ old)
type 'str' old
class 'nltk.tree.Tree' (, ,)
type 'str' ,
class 'nltk.tree.Tree' (VP
(MD will)
(VP
(VB join)
(NP (DT the) (NN board))
(PP (IN as) (NP (DT a) (JJ nonexecutive) (NN director)))
(NP (NNP Nov.) (CD 29))))
class 'nltk.tree.Tree' (MD will)
type 'str' will
class 'nltk.tree.Tree' (VP
(VB join)
(NP (DT the) (NN board))
(PP (IN as) (NP (DT a) (JJ nonexecutive) (NN director)))
(NP (NNP Nov.) (CD 29)))
class 'nltk.tree.Tree' (VB join)
type 'str' join
class 'nltk.tree.Tree' (NP (DT the) (NN board))
class 'nltk.tree.Tree' (DT the)
type 'str' the
class 'nltk.tree.Tree' (NN board)
type 'str' board
class 'nltk.tree.Tree' (PP (IN as) (NP (DT a) (JJ nonexecutive) (NN director)))
class 'nltk.tree.Tree' (IN as)
type 'str' as
class 'nltk.tree.Tree' (NP (DT a) (JJ nonexecutive) (NN director))
class 'nltk.tree.Tree' (DT a)
type 'str' a
class 'nltk.tree.Tree' (JJ nonexecutive)
type 'str' nonexecutive
class 'nltk.tree.Tree' (NN director)
type 'str' director
class 'nltk.tree.Tree' (NP (NNP Nov.) (CD 29))
class 'nltk.tree.Tree' (NNP Nov.)
type 'str' Nov.
class 'nltk.tree.Tree' (CD 29)
type 'str' 29
class 'nltk.tree.Tree' (. .)
type 'str' .