Question

Edit: Translated

I have an RSS feed that I want to parse. It's a podcast, and I only want the MP3 URLs so that I can download them with wget.

This is the podcast: http://feeds.feedburner.com/Film-UndKino-trailerVideopodcast

The title should include "(de)" so that I get only the German episodes, and the publish date should be today.

It would be great if someone could help me. This is how far I got:

wget -q -O- view-source:http://feeds.feedburner.com/Film-UndKino-trailerVideopodcast?format=xml| awk 'BEGIN{RS=""}
/(date +'%d %M %Y')/{
gsub(/.*|.*/,"")
print
}

But it doesn't work.

Thanks in advance, arneb3rt


Solution

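You need to drop the "view-source:" prefix from the wget command; it is a browser-only prefix, not part of the URL. You also need to run the date command outside of the awk script, because awk does not execute shell commands placed inside a regex, and use %b (abbreviated month name) instead of %M, which prints the minute. The bash script below uses grep instead of awk to extract the URLs from which wget can fetch the podcasts.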
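As a quick illustration of the date-format point, here is a minimal sketch comparing %M and %b. It assumes GNU date (for the -d option), and the sample timestamp is made up for the demonstration.

```shell
# Assumes GNU date (-d); LC_ALL=C forces English month names.
sample='2011-12-24 10:37'   # hypothetical timestamp, for illustration only

wrong=$(LC_ALL=C date -d "${sample}" +'%d %M %Y')   # %M is the minute
right=$(LC_ALL=C date -d "${sample}" +'%d %b %Y')   # %b is the abbreviated month

echo "with %M: ${wrong}"   # with %M: 24 37 2011
echo "with %b: ${right}"   # with %b: 24 Dec 2011
```

Only the %b form produces a string like "24 Dec 2011" that can be matched against a date in the feed.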

Note that, probably due to the holidays, the feed has had no new podcasts since 24 Dec 2011, so I hard-coded the date of the last podcast for testing:

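url='http://feeds.feedburner.com/Film-UndKino-trailerVideopodcast?format=xml'
# Today's date, e.g. "24 Dec 2011"; LC_ALL=C keeps the month name in English.
d=$(LC_ALL=C date +'%d %b %Y')
# Hard-coded for testing; remove this line to use today's date.
d="24 Dec 2011"
echo "Checking podcasts for date: ${d}"
# Keep the German items, then the lines around the date, then extract the .mp4 URLs.
wget -q -O- "${url}" |
 grep -A6 "(de)" |
 grep -A1 "${d}" |
 grep -Eo 'http[^ ]*de\.mp4' |
 sort -u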

The output of the above bash script lists two URLs (one feedburner and the other iTunes):

Checking podcasts for date: 24 Dec 2011
http://feedproxy.google.com/~r/Film-UndKino-trailerVideopodcast/~5/pzeSvkVK-3A/trailer01_de.mp4
http://www.moviemaze-trailer.de/ipod/6841/trailer01_de.mp4

Therefore, you could wget the 24 Dec 2011 podcast from either of the above URLs.
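To go from the printed URLs to an actual download, something along these lines might work. It is only a sketch: direct_urls_only is a hypothetical helper name, and dropping the feedproxy entries assumes that both URLs serve the same file, as the answer suggests.

```shell
# Hypothetical helper: drop the feedproxy.google.com duplicates so that each
# episode is fetched only once (assumes both URLs serve the same file).
direct_urls_only() {
    grep -v 'feedproxy\.google\.com'
}

# The two URLs printed by the script for 24 Dec 2011.
urls='http://feedproxy.google.com/~r/Film-UndKino-trailerVideopodcast/~5/pzeSvkVK-3A/trailer01_de.mp4
http://www.moviemaze-trailer.de/ipod/6841/trailer01_de.mp4'

selected=$(printf '%s\n' "${urls}" | direct_urls_only)
echo "${selected}"

# To download instead of print, let wget read the URLs from stdin:
#   printf '%s\n' "${selected}" | wget -q -i -
```

The wget call is left commented out so that the snippet can be tried without touching the network; "-i -" tells wget to read its URL list from standard input.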

Licensed under: CC-BY-SA with attribution