Question

I have a nested XML that looks like this:

<data>foo <data1>hello</data1> bar</data>

I am using minidom, but no matter how I try to get the values between "data", I only get "foo" and not "bar".

It is even worse if the XML is like this:

<data><data1>hello</data1> bar</data>

I only get "None", which is consistent with the behavior above. So I came across this: http://levdev.wordpress.com/2011/07/29/get-xml-element-value-in-python-using-minidom and concluded that it is due to a limitation of minidom?

So I used the method in that blog and I now get

foo <data1>hello</data1> bar

and

<data1>hello</data1> bar

which is acceptable. However, if I try to create a new node (createTextNode) using the output above as node values, the XML becomes:

<data>foo &lt;data1&gt;hello&lt;/data1&gt; bar</data>

and

<data>&lt;data1&gt;hello&lt;/data1&gt; bar</data>

Is there any way that I can create it so that it looks like the original? Thank you.


Solution 3

As @pandubear pointed out, the XML:

<data>foo <data1>hello</data1> bar</data>

does have two text nodes, containing "foo " and " bar", so the solution is to iterate through all the child nodes of data and collect their values.
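Here is a minimal sketch of that iteration, assuming the sample XML from the question. The second half is my own assumption about one way to address the escaping problem: instead of passing serialized markup to createTextNode (which escapes it), it copies the child nodes themselves with cloneNode. The variable names are only illustrative.

```python
from xml.dom import minidom

doc = minidom.parseString("<data>foo <data1>hello</data1> bar</data>")
data = doc.documentElement

# Collect only the direct text children of <data>; the <data1> element is skipped.
texts = [child.data for child in data.childNodes
         if child.nodeType == child.TEXT_NODE]
print(texts)  # ['foo ', ' bar']

# Rebuild the mixed content in a new element by cloning each child node,
# so nested elements stay elements instead of becoming escaped text.
copy = doc.createElement("data")
for child in data.childNodes:
    copy.appendChild(child.cloneNode(True))  # True = deep copy
print(copy.toxml())  # <data>foo <data1>hello</data1> bar</data>
```

Cloning keeps the original tree untouched; appending the original child nodes directly would move them out of their current parent.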

Other tips

You can use ElementTree for XML. It is very efficient for both retrieving and creating nodes.

Have a look at the ElementTree tutorials on mixed-content XML.

Here is an example of creating nodes:

import xml.etree.ElementTree as ET

data = ET.Element('data')

# .text is the text before the first child; .tail is the text after an element's end tag
data1 = ET.SubElement(data, 'data1', attr="value")
data1.text = "hello"
data.text = "bar"
data1.tail = "some code"
ET.dump(data)

Output: <data>bar<data1 attr="value">hello</data1>some code</data>
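The same text/tail model also works for reading. As a small sketch, assuming the XML from the question, the text before the nested element is in the parent's text and the text after it is in the child's tail:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<data>foo <data1>hello</data1> bar</data>")
child = root[0]  # the <data1> element

print(repr(root.text))   # 'foo '   (text before the first child)
print(repr(child.text))  # 'hello'  (text inside <data1>)
print(repr(child.tail))  # ' bar'   (text after </data1>)

# itertext() walks all text and tail values in document order.
all_text = "".join(root.itertext())
print(all_text)  # foo hello bar
```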

First of all, use the following function to prettify your XML so it is a lot easier to read:

import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET

def prettify(elem):
    """Return a pretty-printed XML string for the Element.  Props goes
    to Maxime from stackoverflow for this code."""
    # Serialize the ElementTree element, then let minidom re-indent it
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="\t")

That makes stepping through the tree visually a lot simpler.

Next, I would suggest a modification to your XML that I think will make your life a whole lot easier.

Instead of:

<data>foo
    <data1>hello</data1>
    bar
</data>

which is well-formed mixed content but awkward to process, I would save your 'foo' and 'bar' as attributes of the data element,

so that it looks like this:

<data var1='foo' var2='bar'>
    <data1>hello</data1>
</data>

To do this using xml.etree.ElementTree:

import xml.etree.ElementTree as ET

# Attributes are passed as a dict of name: value pairs
data = ET.Element('data', {'var1': 'foo', 'var2': 'bar'})
data1 = ET.SubElement(data, 'data1')
data1.text = 'hello'
print(prettify(data))
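With that layout, reading the values back is straightforward. A short sketch, assuming the attribute names var1 and var2 from the example above:

```python
import xml.etree.ElementTree as ET

data = ET.fromstring("<data var1='foo' var2='bar'><data1>hello</data1></data>")

# Attributes are read with .get(); the nested element is located with .find()
var1 = data.get('var1')
var2 = data.get('var2')
hello = data.find('data1').text

print(var1, var2, hello)  # foo bar hello
```

Note that attributes do not record where the text sat relative to data1, so this only fits if that position does not matter to you.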
License: CC-BY-SA with attribution