Question

I am trying to import data from this XML feed: http://www.lnv.fr/xml/ajaccio/calendrier.xml. I am having some trouble because some of the data I want to extract contains French accent marks.

import requests
from bs4 import BeautifulSoup

url = 'http://www.lnv.fr/xml/ajaccio/calendrier.xml'
r = requests.get(url)
soup = BeautifulSoup(r.content)
matches = soup.findAll('match')

When I do this

for match in matches:
    print match.equipedomicile.string

It prints them out as it should. There is no problem with a team name that has accent marks, like Sète for example.

But when I do this

def GetGames():
    homeTeamList = []
    for match in matches:
        homeTeam = unicode(match.equipedomicile.text)
        homeTeamList.append(homeTeam)
    return homeTeamList

and call the function, the team names with accent marks in the returned list don't come out right, i.e. Sète now becomes u'S\xe8te'

Solution

What you're getting is the repr of the unicode string. When a list is displayed, each element is shown using its repr, which escapes non-ASCII characters. Use print on the individual elements of the list and you'll get the correct output.

>>> a = [u'S\xe8te']
>>> a
[u'S\xe8te']
>>> print a[0]
Sète
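
If you want to show the whole list at once rather than one element at a time, one option is to join the elements into a single string and print that. The following is only a minimal sketch: the `teams` list is made-up sample data standing in for the list returned by GetGames(), and `format_teams` is a hypothetical helper name, not something from the question. It is written so that it should run on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Hypothetical sample data standing in for the list returned by GetGames().
teams = [u'S\xe8te', u'Ajaccio']


def format_teams(names):
    """Join unicode team names into one unicode string.

    Printing the joined string displays the characters themselves,
    whereas displaying the list shows the repr of each element.
    """
    return u', '.join(names)


line = format_teams(teams)
print(line)
```

The list itself still holds the correct text. Only the way it is displayed differs, so nothing in GetGames() needs to change.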
Licensed under: CC-BY-SA with attribution