Parsing strings in Python from an XML file

https://stackoverflow.com/questions/21889587

13-10-2022
Question

I'm trying to parse information from an XML file that's hosted on a site. I'm making a TV addon for XBMC, and my issue is that all the info is on one page, while I only want to parse it in sections, such as all of season 1. The file lists Season 1 once, then all of its episodes below it, then moves on to Season 2. I'm not sure how to write code that pulls up only season 1 when I click on Season 1. Below is what I have so far:

    if type == 'tv_seasons':
        # One menu entry per <Season no="..."> tag found in the page.
        match = re.compile('<Season no="(.+?)">').findall(content)
        for seasonnumber in match:
            item_url = new_url
            item_title = 'Season ' + seasonnumber
            item_id = common.CreateIdFromString(title + ' ' + item_title)
            self.AddContent(list, indexer, common.mode_Content, item_title, item_id, 'tv_episodes', url=item_url, name=name, season=seasonnumber)

    elif type == 'tv_episodes':
        from entertainment.net import Net
        net = Net()
        content2 = net.http_GET(url).content
        # This matches every <episode> in the file, not just the selected season.
        match = re.compile('<episode><epnum>.+?</epnum><seasonnum>(.+?)</seasonnum>.+?<link>(.+?)</link><title>(.+?)</title>').findall(content2)
        for item_v_id_2, link_url, item_title in match:
            # <seasonnum> is used here as the episode number within the season.
            item_v_id_2 = str(int(item_v_id_2))
            item_url = link_url
            item_id = common.CreateIdFromString(name + '_season_' + season + '_episode_' + item_v_id_2)
            self.AddContent(list, indexer, common.mode_File_Hosts, item_title, item_id, type, url=item_url, name=name, season=season, episode=item_v_id_2)

So now I'm working with this, but it's still not working for me:

        tree2 = ET.parse(urllib.urlopen(url))
        root2 = tree2.getroot()
        seasonnum = root2.findall("Show/Episodelist/Season[@no='%s']/episode/seasonnum" % season)
        seasonnumtext = seasonnum.text
        title = root2.findall("Show/Episodelist/Season[@no='%s']/episode/title" % season)
        item_title = title.text
        item_v_id_2 = str(int(seasonnumtext))
        item_url = url
        item_id = common.CreateIdFromString(name + '_season_' + season + '_episode_' + item_v_id_2)
        self.AddContent(list, indexer, common.mode_File_Hosts, item_title, item_id, type, url=item_url, name=name, season=season, episode=item_v_id_2)

Answer

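I would recommend using a Python XML parser, such as the standard library's xml.etree.ElementTree, rather than regular expressions. You can then traverse the XML tree in much the same way as nested Python dictionaries and lists, and select only the season you need.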
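As a rough illustration, here is a minimal sketch of that approach. The sample XML is hypothetical: its layout is only inferred from the tag names in the question's regex and XPath (Show, Episodelist, Season, episode, seasonnum, link, title), so the real feed may differ. The helper names list_seasons and episodes_for_season are made up for this example. Note that findall() returns a list of elements, so the text has to be read from each element in a loop rather than from the list itself, which is probably why the second attempt in the question fails.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shaped after the tags used in the question.
SAMPLE_XML = """\
<Show>
  <Episodelist>
    <Season no="1">
      <episode><epnum>1</epnum><seasonnum>01</seasonnum><link>http://example.com/s1e1</link><title>Pilot</title></episode>
      <episode><epnum>2</epnum><seasonnum>02</seasonnum><link>http://example.com/s1e2</link><title>Second</title></episode>
    </Season>
    <Season no="2">
      <episode><epnum>3</epnum><seasonnum>01</seasonnum><link>http://example.com/s2e1</link><title>Return</title></episode>
    </Season>
  </Episodelist>
</Show>
"""


def list_seasons(root):
    """Return the 'no' attribute of every <Season> element, in document order."""
    return [season.get("no") for season in root.iter("Season")]


def episodes_for_season(root, season):
    """Return one dict per <episode> inside the <Season> whose no == season."""
    # './/' searches at any depth, so this works whatever the root element is.
    path = ".//Season[@no='%s']/episode" % season
    episodes = []
    for ep in root.findall(path):  # findall() gives a list; read each element
        episodes.append({
            # As in the question, <seasonnum> is the episode number in the season.
            "episode": str(int(ep.findtext("seasonnum"))),
            "title": ep.findtext("title"),
            "link": ep.findtext("link"),
        })
    return episodes


root = ET.fromstring(SAMPLE_XML)
print(list_seasons(root))
for ep in episodes_for_season(root, "1"):
    print(ep)
```

In the addon, the root would come from the downloaded content instead of SAMPLE_XML (for example ET.fromstring(content2)), and each dict returned by episodes_for_season would feed one self.AddContent(...) call for the selected season.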

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow