Problem
I have a nested XML that looks like this:
<data>foo <data1>hello</data1> bar</data>
I am using minidom, but no matter how I try to get the values between "data", I only get "foo" and never "bar".
It is even worse if the XML is like this:
<data><data1>hello</data1> bar</data>
I only get "None", which is consistent with the behavior above. I then came across this: http://levdev.wordpress.com/2011/07/29/get-xml-element-value-in-python-using-minidom and wondered: is this due to a limitation of minidom?
So I used the method in that blog and I now get
foo <data1>hello</data1> bar
and
<data1>hello</data1> bar
which is acceptable. However, if I try to create a new node (createTextNode) using the output above as the node value, the markup is escaped and the XML becomes:
<data>foo &lt;data1&gt;hello&lt;/data1&gt; bar</data>
and
<data>&lt;data1&gt;hello&lt;/data1&gt; bar</data>
Is there any way that I can create it so that it looks like the original? Thank you.
Solution 3
As @pandubear pointed out, the XML:
<data>foo <data1>hello</data1> bar</data>
does have two text nodes, containing "foo " and " bar", so the solution is to iterate over all the child nodes of data and collect their values.
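A minimal sketch of that iteration with minidom, assuming the XML from the question is parsed from a string. The variable names (texts, new_doc, new_data, rebuilt) are illustrative only. The second half shows one possible way to rebuild the element by copying nodes with importNode instead of passing markup to createTextNode, which keeps data1 as a real element:

```python
from xml.dom import minidom

doc = minidom.parseString("<data>foo <data1>hello</data1> bar</data>")
data = doc.documentElement

# Keep only the text nodes that are direct children of <data>.
texts = [n.data for n in data.childNodes if n.nodeType == n.TEXT_NODE]
print(texts)  # ['foo ', ' bar']

# Rebuild the element in a new document by copying nodes, not strings,
# so that <data1> stays an element instead of becoming escaped text.
new_doc = minidom.Document()
new_data = new_doc.createElement("data")
new_doc.appendChild(new_data)
for n in data.childNodes:
    new_data.appendChild(new_doc.importNode(n, True))  # True = deep copy
rebuilt = new_data.toxml()
print(rebuilt)  # <data>foo <data1>hello</data1> bar</data>
```

Copying nodes avoids the escaping seen in the question because no markup ever passes through a text node.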
Other tips
You can use ElementTree for XML; it is efficient for both retrieving and creating nodes.
Have a look at the ElementTree tutorials on mixed XML content.
Some examples of creating nodes:
import xml.etree.ElementTree as ET
data = ET.Element('data')
data1= ET.SubElement(data, 'data1',attr="value")
data1.text="hello"
data.text="bar"
data1.tail="some code"
ET.dump(data)
Output: <data>bar<data1 attr="value">hello</data1>some code</data>
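The same text and tail attributes also work in the other direction. As a small sketch, assuming the XML from the question is available as a string, the mixed content can be read back like this (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

source = "<data>foo <data1>hello</data1> bar</data>"
data = ET.fromstring(source)
data1 = data.find("data1")

# Text before the first child lives in .text of the parent;
# text after a child lives in .tail of that child.
print(repr(data.text))   # 'foo '
print(repr(data1.text))  # 'hello'
print(repr(data1.tail))  # ' bar'

# itertext() walks all text in document order.
joined = "".join(data.itertext())
print(joined)  # foo hello bar

# Serializing again reproduces the original markup.
round_trip = ET.tostring(data, encoding="unicode")
print(round_trip)
```

Because " bar" is stored as the tail of data1 rather than as a separate node, nothing is lost when the tree is written back out.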
First of all, use the following function to pretty-print your XML so that it is a lot easier to read.
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET

def prettify(elem):
    """Return a pretty-printed XML string for the Element. Props go
    to Maxime from Stack Overflow for this code."""
    # Serialize with ElementTree, then re-parse with minidom to indent it.
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="\t")
That makes stepping through the tree visually a lot simpler.
Next, I would suggest a change to your XML that I think will make your life a whole lot easier.
Instead of:
<data>foo
<data1>hello</data1>
bar
</data>
which mixes text and child elements and is awkward to process (although it is valid XML), I would save your 'foo' and 'bar' as attributes of data,
so that it looks like this:
<data var1='foo' var2='bar'>
<data1>hello</data1>
</data>
To do this using xml.etree.ElementTree:
import xml.etree.ElementTree as ET
data = ET.Element('data', {'var1': 'foo', 'var2': 'bar'})
data1 = ET.SubElement(data, 'data1')
data1.text = 'hello'
print(prettify(data))
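With the attribute-based layout, reading the values back no longer depends on text nodes. A short sketch, assuming the XML shown above is parsed from a string (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

data = ET.fromstring("<data var1='foo' var2='bar'><data1>hello</data1></data>")

# Attributes are looked up by name; get() returns None if one is missing.
var1 = data.get("var1")
var2 = data.get("var2")
print(var1, var2)  # foo bar

# The nested element keeps its own text.
inner = data.find("data1").text
print(inner)  # hello
```

The trade-off is that attributes cannot record where 'foo' and 'bar' sat relative to data1, so this only fits if that ordering does not matter.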