Question

I have a list of torrent info_hashes. For each info_hash, I have a list of trackers that correspond with that info_hash.

What I would like to do is scrape each tracker in the list to get the seeder/leecher/completed count. However, I'd rather not write this myself, as I'm sure this code has already been implemented elsewhere.

Does anyone know of a Python library that can scrape both http:// and udp:// trackers?

I have been using libtorrent for other parts of this project. However, it can only scrape a tracker from a valid torrent_handle, and I don't want to add these info_hashes to a libtorrent session just to scrape the tracker, because it will start downloading the files, which I don't want.

Solution

I also didn't want to use libtorrent because it is quite inefficient for this: I want to be able to query a tracker for multiple info_hashes at once instead of one at a time.
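To illustrate the "multiple info_hashes per query" idea for http:// trackers, here is a minimal sketch of the common HTTP scrape convention: the last path segment of the announce URL has "announce" swapped for "scrape", and one info_hash parameter is appended per torrent. The function names (scrape_url, bdecode, parse_http_scrape) and the tracker.example host are my own placeholders, not taken from any library, and not every tracker supports scrape or multi-hash requests.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def scrape_url(announce_url, info_hashes):
    """Build one scrape URL covering several hex-encoded info_hashes."""
    parts = urlsplit(announce_url)
    head, _, last = parts.path.rpartition("/")
    if not last.startswith("announce"):
        raise ValueError("tracker URL does not follow the scrape convention")
    path = head + "/scrape" + last[len("announce"):]
    # Each info_hash is sent as its raw 20 bytes, percent-encoded.
    params = ["info_hash=" + quote(bytes.fromhex(h), safe="") for h in info_hashes]
    query = "&".join(([parts.query] if parts.query else []) + params)
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


def bdecode(data, i=0):
    """Decode one bencoded value at data[i]; return (value, next_index)."""
    c = data[i:i + 1]
    if c == b"i":  # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c in (b"l", b"d"):  # list or dict: read items until the closing 'e'
        items = []
        i += 1
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        if c == b"l":
            return items, i + 1
        return dict(zip(items[0::2], items[1::2])), i + 1
    colon = data.index(b":", i)  # byte string: <length>:<bytes>
    start = colon + 1
    length = int(data[i:colon])
    return data[start:start + length], start + length


def parse_http_scrape(body):
    """Map hex info_hash -> seeder/leecher/completed counts."""
    decoded, _ = bdecode(body)
    stats = {}
    for raw_hash, counts in decoded.get(b"files", {}).items():
        stats[raw_hash.hex()] = {
            "seeders": counts.get(b"complete", 0),
            "completed": counts.get(b"downloaded", 0),
            "leechers": counts.get(b"incomplete", 0),
        }
    return stats
```

You would fetch the URL with something like urllib.request.urlopen(scrape_url(announce, hashes), timeout=10).read() and pass the body to parse_http_scrape. Error handling (a "failure reason" key, trackers that reject scrape) is left out of this sketch.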

I ended up writing my own Python HTTP/UDP tracker scraping code; see here: https://github.com/erindru/m2t/blob/master/m2t/scraper.py (improvements most welcome!)
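For readers who only want the gist of the udp:// side, below is a rough sketch of a UDP tracker scrape as I understand the UDP tracker protocol (BEP 15): a connect request yields a connection_id, which is then used in a scrape request carrying several info_hashes. This is not the code from the linked scraper.py; the function names are placeholders of mine, and retries, IPv6, and tracker error responses (action 3) are not handled.

```python
import random
import socket
import struct

PROTOCOL_ID = 0x41727101980  # magic constant that opens a connect request
ACTION_CONNECT, ACTION_SCRAPE = 0, 2


def connect_request(transaction_id):
    return struct.pack(">QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)


def parse_connect_response(data, transaction_id):
    action, tid, connection_id = struct.unpack_from(">IIQ", data)
    if action != ACTION_CONNECT or tid != transaction_id:
        raise ValueError("unexpected connect response")
    return connection_id


def scrape_request(connection_id, transaction_id, info_hashes):
    header = struct.pack(">QII", connection_id, ACTION_SCRAPE, transaction_id)
    # The raw 20-byte hashes follow the header back to back.
    return header + b"".join(bytes.fromhex(h) for h in info_hashes)


def parse_scrape_response(data, transaction_id, info_hashes):
    action, tid = struct.unpack_from(">II", data)
    if action != ACTION_SCRAPE or tid != transaction_id:
        raise ValueError("unexpected scrape response")
    stats = {}
    # One 12-byte record per hash, in the same order as the request.
    for n, h in enumerate(info_hashes):
        seeders, completed, leechers = struct.unpack_from(">III", data, 8 + 12 * n)
        stats[h] = {"seeders": seeders, "completed": completed, "leechers": leechers}
    return stats


def scrape_udp(host, port, info_hashes, timeout=5.0):
    """Scrape one udp:// tracker for several hex info_hashes in one exchange."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        tid = random.getrandbits(32)
        sock.sendto(connect_request(tid), (host, port))
        connection_id = parse_connect_response(sock.recv(2048), tid)
        tid = random.getrandbits(32)
        sock.sendto(scrape_request(connection_id, tid, info_hashes), (host, port))
        return parse_scrape_response(sock.recv(2048), tid, info_hashes)
```

Since the whole request has to fit in one UDP packet, a long list of info_hashes should be split into batches (a few dozen per request is a reasonable assumption) and scrape_udp called once per batch.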

OTHER TIPS

This is not directly an answer to your question, but a suggestion of how you could use libtorrent.

You can add the info-hash in a paused, non-auto-managed state (controlled by the flags in add_torrent_params). In that case libtorrent won't start downloading it.

Keep in mind that libtorrent does not (yet) support scraping the DHT.

Licensed under: CC-BY-SA with attribution