Question

The title is terrible, so here is what I mean. I'm using the Wolfram|Alpha API, and while parsing its response I get these god-awful strings, like this one (from querying "spider-man"):

"year | title | medium 1962 | Amazing Fantasy #15 | comic book 1967 | Spider-Man | animation > 1977 | The Amazing Spider-Man | television 1978 | Questprobe #2 Spider-Man | video game 2002 > | Spider-Man | movie"

And this is actually a string representation of what should be lists like this:

[year, title, medium]

[1962, Amazing Fantasy #15, comic book]

[1967, Spider-Man, animation]

[2002, Spider-Man, movie]

I can easily split this into one big list, but I can't think of a simple way to get the items into lists like the ones shown above. Any suggestions other than converting to one big list, then building a list of lists by starting a new list every third item I iterate through?

Example of my idea (the long way):

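# Note: splitting on "|" alone leaves each year attached to the
# previous cell (e.g. "medium 1962"), so the rows come out misaligned.
listA = textRepresentation.split("|")
listB = list()  # finished rows
listC = list()  # row currently being filled
i = 1
for item in listA:
  listC.append(item.strip())
  if i == 3:
    # third item reached: store the row and start a new one
    listB.append(listC)
    listC = list()
    i = 1
  else:
    i += 1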

Solution

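import re
# text holds the string returned by the API.
# list() is needed on Python 3, where zip returns an iterator.
list(zip(*[(i.strip() for i in re.split(r'(\d{4})|\||>', text) if i and i.strip())]*3))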

out:

[('year', 'title', 'medium'),
 ('1962', 'Amazing Fantasy #15', 'comic book'),
 ('1967', 'Spider-Man', 'animation'),
 ('1977', 'The Amazing Spider-Man', 'television'),
 ('1978', 'Questprobe #2 Spider-Man', 'video game'),
 ('2002', 'Spider-Man', 'movie')]
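
The one-liner packs three steps together, so here is a sketch that unpacks it. The function name `parse_table` and its `columns` parameter are hypothetical, introduced only for illustration; the regular expression is the one from the answer. It assumes every row starts with a four-digit year and that no title contains a four-digit number (a title such as "Spider-Man 2099" would be split in the wrong place).

```python
import re

def parse_table(text, columns=3):
    """Hypothetical helper: split the flattened table into rows of `columns` cells."""
    # Split on a 4-digit year, on '|' and on '>'. The year sits in a capture
    # group, so re.split keeps it in the result; when '|' or '>' matches
    # instead, the group's slot comes back as None.
    pieces = re.split(r"(\d{4})|\||>", text)
    # Drop None and whitespace-only pieces, and trim the rest.
    cells = [p.strip() for p in pieces if p and p.strip()]
    # The same iterator repeated `columns` times: zip pulls from it in turn,
    # so each tuple holds `columns` consecutive cells. An incomplete trailing
    # group is silently dropped.
    it = iter(cells)
    return [list(row) for row in zip(*[it] * columns)]

text = ("year | title | medium 1962 | Amazing Fantasy #15 | comic book "
        "1967 | Spider-Man | animation > 1977 | The Amazing Spider-Man | "
        "television 1978 | Questprobe #2 Spider-Man | video game 2002 > "
        "| Spider-Man | movie")

rows = parse_table(text)
for row in rows:
    print(row)
```

Returning lists instead of tuples matches the shape asked for in the question; if tuples are fine, `list(zip(*[it] * columns))` is enough.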
Licensed under: CC-BY-SA with attribution