Question

I'm very new to ElementTree and I'm trying to parse an XML file for a TV add-on for XBMC. Below is the code I'm having trouble with. I think my XPath is not correct, and the placeholder in the attribute predicate is not working!

This is the XML file I'm working with: http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930

    seasonnum = root2.findall("/Show/Episodelist/Season[@no='%s']/episode/seasonnum" % (season))


    import xml.etree.ElementTree as ET
    import urllib
    tree2 = ET.parse(urllib.urlopen(url))
    root2 = tree2.getroot()
    seasonnum = tree2.findall("./Episodelist/Season[@no='%s']/episode/seasonnum" % '1')
    print seasonnum

This is the error I get: SyntaxError: expected path separator ([)


Solution 4

    import xml.etree.ElementTree as ET
    import urllib
    content = urllib.urlopen(url).read()
    tree2 = ET.fromstring(content)
    tvrage_seasons = tree2.findall('.//Season' )

I had to do it this way because, for some reason, the ElementTree that ships with XBMC seems to have a bug or limitation that keeps the attribute predicate from working. This approach worked for me!
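If the attribute predicate is what the bundled ElementTree rejects (as far as I know, `[@attr='value']` support arrived with ElementTree 1.3, so an older copy would choke on the `[`), one workaround is to fetch every Season and filter on the attribute in plain Python. Here is a minimal sketch of that idea. The `SAMPLE` document and the `season_numbers` helper are made up for illustration; they only mimic the layout of the feed and are not real TVRage data.

```python
import xml.etree.ElementTree as ET

# Made-up sample that mimics the feed's layout (assumption, not real data).
SAMPLE = """<Show>
  <name>Example</name>
  <Episodelist>
    <Season no="1">
      <episode><seasonnum>01</seasonnum></episode>
      <episode><seasonnum>02</seasonnum></episode>
    </Season>
    <Season no="2">
      <episode><seasonnum>01</seasonnum></episode>
    </Season>
  </Episodelist>
</Show>"""


def season_numbers(root, season):
    """Return the seasonnum texts of one season without using [@no=...]."""
    wanted = str(season)
    result = []
    # Only simple paths are used here, no predicates.
    for season_elem in root.findall('.//Season'):
        if season_elem.get('no') != wanted:
            continue
        for num in season_elem.findall('episode/seasonnum'):
            result.append(num.text)
    return result


root = ET.fromstring(SAMPLE)
print(season_numbers(root, 1))
```

With the real feed you would pass the root obtained from `ET.fromstring(content)` instead of the sample.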

Other suggestions

Using ElementTree:

>>> from xml.etree import ElementTree
>>> import urllib2
>>> url = 'http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930'
>>> request = urllib2.Request(url, headers={"Accept" : "application/xml"})
>>> u = urllib2.urlopen(request)
>>> tree = ElementTree.parse(u)
>>> rootElem = tree.getroot()
>>> [s.text for s in rootElem.findall('.//Season[@no="2"]/episode/seasonnum')]
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', 
 '15', '16', '17', '18', '19', '20', '21', '22']

According to the xml.etree.ElementTree documentation on XPath support:

This module provides limited support for XPath expressions for locating elements in a tree. The goal is to support a small subset of the abbreviated syntax; a full XPath engine is outside the scope of the module.

You may need a third-party library like lxml to use full XPath.

Example:

>>> import lxml.etree
>>>
>>> url = 'http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930'
>>> tree = lxml.etree.parse(url)
>>> tree.xpath("/Show/Episodelist/Season[@no='%s']/episode/seasonnum/text()" % 1)
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']

UPDATE

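To use xml.etree.ElementTree, the XPath should be slightly modified, because findall resolves the path relative to the root element (Show) and does not accept the full absolute path: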

>>> import urllib
>>> import xml.etree.ElementTree as ET
>>>
>>> f = urllib.urlopen(url)
>>> tree = ET.parse(f)
>>> [e.text for e in tree.findall("./Episodelist/Season[@no='%s']/episode/seasonnum" % 1)]
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']
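To illustrate the point about the path: the root element is already Show, so a path given to findall is resolved relative to it. The sketch below uses a made-up sample document (an assumption, not the real feed) and compares three spellings of the path. In the Python versions I have tried, a leading slash on an Element raises SyntaxError, but treat that as an assumption for other versions.

```python
import xml.etree.ElementTree as ET

# Made-up sample that mimics the feed's layout (assumption, not real data).
SAMPLE = """<Show>
  <Episodelist>
    <Season no="1">
      <episode><seasonnum>01</seasonnum></episode>
      <episode><seasonnum>02</seasonnum></episode>
    </Season>
  </Episodelist>
</Show>"""

doc_root = ET.fromstring(SAMPLE)
path = "Episodelist/Season[@no='%s']/episode/seasonnum" % 1

# Relative to the root element: this is the form that matches.
relative = [e.text for e in doc_root.findall("./" + path)]

# Repeating the root tag looks for a Show *inside* Show, so nothing matches.
with_root_tag = doc_root.findall("./Show/" + path)

# An absolute path on an Element is rejected outright.
try:
    doc_root.findall("/Show/" + path)
    absolute_error = None
except SyntaxError as exc:
    absolute_error = str(exc)

print(doc_root.tag)
print(relative)
print(with_root_tag)
print(absolute_error)
```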

I have tried your example and it works. Here is a condensed, complete version:

    import urllib
    import xml.etree.ElementTree as ET

    url = 'http://services.tvrage.com/myfeeds/episode_list.php?key=ag6txjP0RH4m0c8sZk2j&sid=2930'
    tree = ET.parse(urllib.urlopen(url))
    seasons = tree.findall("./Episodelist/Season[@no='%s']/episode/seasonnum" % '1')

    for s in seasons:
        print s.text

The only problem I can think of is that you somehow downloaded a partial XML document. That is unlikely, but I do not know of any other explanation. Note that the script above is taken from your question; I only added the for loop.
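One way to rule the partial-download theory in or out is to check whether the downloaded text parses at all. A rough sketch follows, assuming an ElementTree recent enough to expose `ET.ParseError`. The `is_well_formed` helper and both sample strings are hypothetical and only serve the illustration. As far as I can tell, a truncated document fails at parse time, before findall is ever reached.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the downloaded content (not real feed data).
COMPLETE = '<Show><Episodelist><Season no="1"/></Episodelist></Show>'
TRUNCATED = COMPLETE[:30]  # simulates a download that was cut off


def is_well_formed(text):
    """Return True if the text parses as a complete XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(COMPLETE))
print(is_well_formed(TRUNCATED))
```

If the content from the feed passes this check, the partial-download explanation can probably be set aside.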

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow