Element.findall gives 'invalid predicate' with XPath and Element Tree

https://stackoverflow.com/questions/21943866

14-10-2022
Question

I am trying to parse a SOAP response, which I got using Suds, with ElementTree, and I am getting the error:

Traceback (most recent call last):
...
  File "C:\Python27\lib\xml\etree\ElementPath.py", line 263, in iterfind
    selector.append(ops[token[0]](next, token))
  File "C:\Python27\lib\xml\etree\ElementPath.py", line 224, in prepare_predicate
    raise SyntaxError("invalid predicate")
SyntaxError: invalid predicate

I am working with XML that looks like this:

<sitesResponse>
    <queryInfo></queryInfo>
    <site>
        <siteInfo>
            <siteName>name</siteName>
        </siteInfo>
    </site>
    <site />
    <site />
    <site />
     ....
</sitesResponse>

My objective is to access "name" (the text of the siteName node) in each <site> and put it in a list. My code looks like this:

from suds.client import Client
import xml.etree.ElementTree as ET
url="http://worldwater.byu.edu/interactive/dr/services/index.php/services/cuahsi_1_1.asmx?WSDL"
def getNames(url):
    client = Client(url,cache=None)
    response = client.service.GetSites()
    response_string=str(response)
    root=ET.fromstring(response_string)
    names=[]
    for i in root.findall(".//siteName[*]"):
        name=sites.find(".//siteName[i]/*").text
        names.append(name)
    return names

names_list= getNames(url)
names_list.sort()
for i in names_list:
    print names_list[i]

Solution 2

Thanks for your help! As it turns out, the issue was that I needed to account for the namespacing. I also revamped the code to make it a module, using the ideas that Eugene put forth.

from suds.client import Client
import xml.etree.ElementTree as ET

def getNames(url,namespace):
    ###Suds Call###
    client = Client(url,cache=None)
    response = client.service.GetSites()
    ###         ###

    response_string=str(response)

    ###ElementTree Parsing###
    root=ET.fromstring(response_string)
    siteNameTags = root.findall("{0}site/{0}siteInfo/{0}siteName".format(namespace)) #must include {0} due to namespacing (this is where I need to add generality)
    ###                   ###

    siteNames=[]
    for i in siteNameTags:
        siteNames.append(i.text)
    siteNames.sort()
    return siteNames

###Example###    
url="http://worldwater.byu.edu/interactive/dr/services/index.php/services/cuahsi_1_1.asmx?WSDL"        
namespace="{http://www.cuahsi.org/waterML/1.1/}"
names_list= getNames(url,namespace)
for i in names_list:
    print(i)  # a plain print is enough; the namespace only affects tag lookup, not the element text
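
On the "need to add generality" comment above: one possible approach, sketched here under assumptions, is to read the namespace from the root element's tag instead of passing it in. The sample XML and the helper names `namespace_of` and `site_names` are hypothetical stand-ins (the Suds call is left out so the snippet runs offline), and the sketch assumes every element shares the root's namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the Suds response; the namespace URI
# is the one used in the answer above.
SAMPLE = """<sitesResponse xmlns="http://www.cuahsi.org/waterML/1.1/">
    <queryInfo></queryInfo>
    <site>
        <siteInfo>
            <siteName>Beta</siteName>
        </siteInfo>
    </site>
    <site>
        <siteInfo>
            <siteName>Alpha</siteName>
        </siteInfo>
    </site>
</sitesResponse>"""


def namespace_of(element):
    """Return the '{uri}' prefix of an element's tag, or '' if there is none."""
    tag = element.tag
    if tag.startswith("{"):
        return tag[: tag.index("}") + 1]
    return ""


def site_names(xml_text):
    """Collect and sort the text of every site/siteInfo/siteName element."""
    root = ET.fromstring(xml_text)
    ns = namespace_of(root)  # assumption: children use the root's namespace
    path = "{0}site/{0}siteInfo/{0}siteName".format(ns)
    return sorted(el.text for el in root.findall(path))


print(site_names(SAMPLE))
```

ElementTree stores a namespaced tag as `{uri}localname`, so slicing off the braced prefix of the root tag yields the same string that was hard-coded in `namespace` above.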

Other tips

You can use something like:

for sitename in root.findall(".//siteName"):
    names.append(sitename.text)
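
As a rough illustration of this tip, the sketch below runs the plain path against a sample modeled on the XML in the question and also reproduces the original error. The sample has no namespace declaration, which is an assumption made for the demo. The variable `predicate_error` is introduced here only for illustration.

```python
import xml.etree.ElementTree as ET

# Sample modeled on the XML in the question (no namespace declared here).
SAMPLE = """<sitesResponse>
    <queryInfo></queryInfo>
    <site>
        <siteInfo>
            <siteName>name</siteName>
        </siteInfo>
    </site>
    <site />
</sitesResponse>"""

root = ET.fromstring(SAMPLE)

# A plain descendant path needs no predicate to match every siteName.
names = [el.text for el in root.findall(".//siteName")]
print(names)

# "[*]" is not one of the predicate forms in ElementTree's XPath subset,
# so the path fails to compile and raises SyntaxError.
try:
    root.findall(".//siteName[*]")
    predicate_error = None
except SyntaxError as exc:
    predicate_error = str(exc)
print(predicate_error)
```

If the response declares a default namespace, as the solution above found, `.//siteName` matches nothing and the tag has to be namespace-qualified.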

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow