Question

I sent a POST request in Python, based on this answer on SO. The website responds with an XML representation that looks like this:

<status>Active</status>
<registeredname>MyTestName</registeredname>
<companyname>TEST</companyname>
<email>mytestemail@gmail.com</email>
<serviceid>8</serviceid>
<productid>1</productid>
<productname>Some Test Product</productname>
<regdate>2013-08-06</regdate>
<nextduedate>0000-00-00</nextduedate>
<billingcycle>One Time</billingcycle>
<validdomain>testing</validdomain>
<validip>XX.XX.XXX.XX</validip>
<validdirectory>/root</validdirectory>
<configoptions></configoptions>
<customfields></customfields>
<addons></addons>
<md5hash>58z9f70a9d738a98b18d0bf4304ac0c6</md5hash>

Now I would like to convert this into a Python dictionary of the form:

{"status": "Active", "registeredname": "MyTestName".......}

The PHP code I am porting from does something like this:

preg_match_all('/<(.*?)>([^<]+)<\/\\1>/i', $data, $matches);

My corresponding Python code is as follows:

matches = {}
matches = re.findall('/<(.*?)>([^<]+)<\/\\1>/i', data)

'data' is the XML representation that I receive from the server. When I run this, 'matches' remains empty. Is there something wrong with the regex? Or is re.findall the wrong tool in the first place?

Thanks in advance


Solution

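Remove the leading and trailing / from the regular expression: Python patterns are plain strings with no delimiters, so those slashes are matched literally. For the same reason there is no need to escape /. Pass flags=re.IGNORECASE instead of the trailing i modifier.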

matches = re.findall('<(.*?)>([^<]+)</\\1>', data, flags=re.IGNORECASE)
print(dict(matches))

With a raw string, there is no need to escape the backslash:

matches = re.findall(r'<(.*?)>([^<]+)</\1>', data, flags=re.IGNORECASE)
print(dict(matches))

Both snippets print:

{'status': 'Active', 'companyname': 'TEST', ...}
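
For reference, here is a self-contained sketch of the regex approach. The sample string is a shortened copy of the response from the question, and the names TAG_PATTERN and parse_fields are only illustrative. Note that [^<]+ requires at least one character of content, so empty elements such as <configoptions></configoptions> are left out of the result.

```python
import re

# Shortened sample of the server response shown in the question.
data = """<status>Active</status>
<registeredname>MyTestName</registeredname>
<companyname>TEST</companyname>
<configoptions></configoptions>
<serviceid>8</serviceid>"""

# Group 1 captures the tag name, group 2 the text content;
# \1 requires the closing tag to repeat the opening tag name.
TAG_PATTERN = re.compile(r'<(.*?)>([^<]+)</\1>', flags=re.IGNORECASE)


def parse_fields(text):
    """Return a dict mapping tag name to text for every non-empty element."""
    # findall returns a list of (tag, text) tuples, which dict() accepts.
    return dict(TAG_PATTERN.findall(text))


fields = parse_fields(data)
print(fields)
```

If the empty elements should appear as keys as well, changing [^<]+ to [^<]* is one option, assuming tag contents never contain a nested <.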

Non-regex alternative: lxml

This uses lxml.html instead of lxml.etree because the data is incomplete: it is a fragment with no single root element, which a strict XML parser rejects.

import lxml.html
print({x.tag: x.text for x in lxml.html.fromstring(data)})
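
If installing lxml is not an option, a standard-library variant is possible too. This is a different technique from the one above: it wraps the fragment in an artificial root element so that xml.etree.ElementTree accepts it. It is a sketch that assumes the response is well-formed XML apart from the missing root; the wrapper name "root" and the function name fragment_to_dict are arbitrary.

```python
import xml.etree.ElementTree as ET

# Shortened sample of the server response shown in the question.
data = """<status>Active</status>
<companyname>TEST</companyname>
<addons></addons>"""


def fragment_to_dict(fragment):
    """Parse a rootless XML fragment into a {tag: text} dict."""
    # Wrap the fragment so the parser sees exactly one root element.
    root = ET.fromstring("<root>" + fragment + "</root>")
    # child.text is None for empty elements; map that to an empty string.
    return {child.tag: (child.text or "") for child in root}


fields = fragment_to_dict(data)
print(fields)
```

Unlike the regex version, this keeps empty elements such as addons, with an empty string as their value.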
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow