Question

I'm looking for a way to turn an iTunes podcast id into the RSS feed that the podcast producer serves.

I'm aware of the RSS generator, which can be used to generate a feed of links to podcasts, but these links are to HTML pages.

If you have iTunes open, you can manually export the list of podcasts by exporting to OPML, so we can surmise that iTunes eventually knows how to decode them (i.e. they're not exclusively going through an iTMS host).

I have looked at the Affiliate API document, which gives you some nice JSON back. This gives you a collectionViewUrl, which is the same as the ones given by the RSS generator and, incidentally, the iTunes Link Generator. It also gives you the id and a whole load of other things, including a preview audio file which is not hosted on phobos.

At this point, I'm looking for anything that would help me solve this, in any language, official or not.

(in actual fact, I'd prefer something vaguely supported, and in Java, that didn't involve HTML scraping).


Solution

Through a combination of answers from these two questions, I have found a way to do what I want.

Example of finding podcasts

First: grab a list of podcasts from iTunes, using the RSS generator. I'm not sure how the query parameters work yet, but here is an RSS feed for top tech podcasts in the US.

http://ax.itunes.apple.com/WebObjects/MZStoreServices.woa/ws/RSS/toppodcasts/sf=143441/limit=25/genre=1318/xml
  • sf relates to country, and is optional. I would guess that this defaults to global if absent.
  • genre relates to genre, and is optional. I would guess that this defaults to "all genres" if absent.
  • limit is optional, and seems to default to 9.
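As a rough sketch of how those path segments fit together, here is a small Python helper that rebuilds the URL above. The function name top_podcasts_url and the segment order (sf, limit, genre, then xml) are assumptions copied from the one example URL, not documented behaviour.

```python
# Base of the RSS generator URL, taken from the example above.
BASE = "http://ax.itunes.apple.com/WebObjects/MZStoreServices.woa/ws/RSS/toppodcasts"


def top_podcasts_url(limit=None, genre=None, sf=None):
    """Build a top-podcasts feed URL (hypothetical helper).

    Each parameter is left out of the path when it is None, on the
    assumption that the service treats all three as optional.
    """
    parts = [BASE]
    if sf is not None:
        parts.append(f"sf={sf}")        # storefront / country
    if limit is not None:
        parts.append(f"limit={limit}")  # number of entries
    if genre is not None:
        parts.append(f"genre={genre}")  # genre id
    parts.append("xml")
    return "/".join(parts)


print(top_podcasts_url(limit=25, genre=1318, sf=143441))
```

Called with the values shown, this reproduces the top tech podcasts URL for the US given above.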

This gives you an Atom feed of podcasts. You'll need to do some spelunking with XPath to get to the ITMS id of each podcast, but you're looking for the numeric id contained in the URL found at the following XPath:

/atom:feed/atom:entry/atom:link[@rel='alternate']/@href

For example, the excellent JavaPosse has an id of 81157308.
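A minimal Python sketch of that extraction, using only the standard library. The sample feed and its href are made up for illustration, and the regular expression assumes the id appears either as /id<digits> or as an id=<digits> query parameter; real feeds may differ.

```python
import re
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Illustrative sample only; a real feed has many more elements per entry.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <link rel="alternate"
          href="http://itunes.apple.com/us/podcast/the-java-posse/id81157308?uo=2"/>
  </entry>
</feed>"""


def podcast_ids(atom_xml):
    """Return the numeric ids found in each entry's alternate link."""
    root = ET.fromstring(atom_xml)  # root is the atom:feed element
    ids = []
    for link in root.findall("atom:entry/atom:link[@rel='alternate']", ATOM_NS):
        # Accept both ".../id81157308" and "...?id=81157308" (assumed forms).
        match = re.search(r"[/?&]id=?(\d+)", link.get("href", ""))
        if match:
            ids.append(match.group(1))
    return ids


print(podcast_ids(SAMPLE))
```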

The Answer to the Question

Once you have that id, you can get another document which will tell you the last episode, and the original feed URL. The catch here is that you need to use an iTunes user-agent to get this document.

e.g.

wget --user-agent iTunes/7.4.1 \
     --no-check-certificate \
     "https://buy.itunes.apple.com/WebObjects/MZFinance.woa/wa/com.apple.jingle.app.finance.DirectAction/subscribePodcast?id=81157308&wasWarnedAboutPodcasts=true"

This is a plist containing some metadata about the podcast, including the feed URL.

<key>feedURL</key><string>http://feeds.feedburner.com/javaposse</string>

The XPath for this could be something like:

//key[text()='feedURL']/following-sibling::string[1]/text()
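The same two steps can be sketched in Python. The helper names build_request and find_feed_url are hypothetical, and the sample plist is a cut-down guess at the response shape built around the feedURL line shown above. The request is only constructed here, not sent, since I can't promise the endpoint still answers.

```python
import urllib.request
import xml.etree.ElementTree as ET

SUBSCRIBE_URL = (
    "https://buy.itunes.apple.com/WebObjects/MZFinance.woa/wa/"
    "com.apple.jingle.app.finance.DirectAction/subscribePodcast"
    "?id={podcast_id}&wasWarnedAboutPodcasts=true"
)

# Cut-down sample; the real plist carries more keys (assumed structure).
SAMPLE = """<plist version="1.0"><dict>
  <key>feedURL</key><string>http://feeds.feedburner.com/javaposse</string>
</dict></plist>"""


def build_request(podcast_id):
    """Build (but do not send) the request with an iTunes user-agent."""
    return urllib.request.Request(
        SUBSCRIBE_URL.format(podcast_id=podcast_id),
        headers={"User-Agent": "iTunes/7.4.1"},
    )


def find_feed_url(plist_xml):
    """Return the <string> right after <key>feedURL</key>, or None."""
    root = ET.fromstring(plist_xml)
    for parent in root.iter():
        children = list(parent)
        # Walk adjacent sibling pairs looking for key -> string.
        for key, value in zip(children, children[1:]):
            if key.tag == "key" and key.text == "feedURL" and value.tag == "string":
                return value.text
    return None


print(find_feed_url(SAMPLE))
```

To try it against the live service you would pass build_request("81157308") to urllib.request.urlopen and feed the response body to find_feed_url.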

Disclaimer

Not entirely sure how stable any of this is, or how legal it is. YMMV.

OTHER TIPS

As soon as you have the id you can use it in lookup as defined in

https://www.apple.com/itunes/affiliates/resources/documentation/itunes-store-web-service-search-api.html

You should get what you need by parsing the JSON response.

To elaborate on @juhariis' answer, here are the basics of extracting the feed URL from the JSON (Python 3):

from urllib.request import urlopen
from urllib.parse import urlparse
import json

podcast_url = 'https://itunes.apple.com/us/podcast/grow-big-always/id1060318873'
ITUNES_URL = 'https://itunes.apple.com/lookup?id='

# The last path segment looks like 'id1060318873'; strip the leading 'id'.
parsed = urlparse(podcast_url)
podcast_id = parsed.path.split('/')[-1][2:]

# The lookup response is JSON; the feed URL is in the first result.
with urlopen(ITUNES_URL + podcast_id) as response:
    data = json.loads(response.read().decode('utf-8'))
feed = data['results'][0]['feedUrl']
print(feed)

Here's a script/module I made, that makes use of this: https://gist.github.com/theychx/f9fad123bef27bebac665847c7884cd9

I searched for a long time for a way to deconstruct the iTunes podcast feed. It's a plist containing metadata, one item of which is the RSS feed URL. My blog post "How to subscribe to iTunes podcasts on Android" has links to PHP and JavaScript code that extracts the RSS feed URL from an individual iTunes link.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow