I have a list of torrent info_hashes. For each info_hash, I have a list of trackers that correspond with that info_hash.
What I would like to do is scrape each tracker in the list to get the seeder/leecher/completed counts. However, I'd rather not write this myself, as I'm sure this code has been implemented elsewhere.
Does anyone know of a Python library that can scrape http:// and udp:// trackers?
I have been using libtorrent for other parts of this project, but it can only scrape a tracker from a valid torrent_handle. I don't want to add these info_hashes to a libtorrent session just to scrape the tracker, because it will start downloading the files, which I don't want.
I also didn't want to use libtorrent because it is quite inefficient for this: I want to query a tracker for multiple info_hashes at once instead of one at a time.
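For context, the HTTP scrape convention does allow batching: the scrape URL is usually derived by replacing "announce" with "scrape" in the last path segment of the announce URL, and the info_hash parameter can be repeated. Below is a minimal sketch using only the standard library. The function names (scrape_url, bdecode, parse_scrape) and the example tracker URL are made up for illustration, and not every tracker supports scrape or multi-hash requests.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def scrape_url(announce_url, info_hashes):
    """Build a scrape URL covering several raw 20-byte info_hashes."""
    parts = urlsplit(announce_url)
    head, sep, tail = parts.path.rpartition("/")
    # The convention only applies when the last segment starts with "announce".
    if not tail.startswith("announce"):
        raise ValueError("tracker does not follow the scrape convention")
    path = head + sep + "scrape" + tail[len("announce"):]
    # Each hash is percent-encoded as raw bytes, not as a hex string.
    query = "&".join("info_hash=" + quote(h, safe="") for h in info_hashes)
    if parts.query:
        query = parts.query + "&" + query
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


def bdecode(data, i=0):
    """Decode one bencoded value at data[i]; return (value, next_index)."""
    kind = data[i:i + 1]
    if kind == b"i":  # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if kind in (b"l", b"d"):  # list or dict: items until the closing "e"
        items = []
        i += 1
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        if kind == b"l":
            return items, i + 1
        return dict(zip(items[0::2], items[1::2])), i + 1
    colon = data.index(b":", i)  # byte string: <length>:<bytes>
    start = colon + 1
    length = int(data[i:colon])
    return data[start:start + length], start + length


def parse_scrape(body):
    """Map each info_hash in a scrape response to its three counters."""
    decoded, _ = bdecode(body)
    return {
        info_hash: {
            "seeders": stats.get(b"complete"),
            "leechers": stats.get(b"incomplete"),
            "completed": stats.get(b"downloaded"),
        }
        for info_hash, stats in decoded[b"files"].items()
    }
```

The resulting URL can be fetched with urllib.request.urlopen and the response body passed to parse_scrape. Some trackers omit individual counters, which is why the sketch uses .get and may return None for them.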
I ended up writing my own Python HTTP/UDP tracker scraping code; see here: https://github.com/erindru/m2t/blob/master/m2t/scraper.py (improvements most welcome!)
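For readers who want to see what the UDP side involves, here is a minimal sketch of a scrape exchange following the UDP tracker protocol (BEP 15). It is not the code from the linked scraper.py. The function names are invented for illustration, and the sketch assumes IPv4, sends each packet once without the retransmission schedule the protocol recommends, and does not decode tracker error messages.

```python
import os
import socket
import struct

PROTOCOL_ID = 0x41727101980  # fixed magic value for connect requests
ACTION_CONNECT = 0
ACTION_SCRAPE = 2
MAX_HASHES = 74  # roughly what fits in one scrape datagram


def connect_request(tid):
    """Connect packet: 64-bit protocol id, 32-bit action, 32-bit transaction id."""
    return struct.pack(">QII", PROTOCOL_ID, ACTION_CONNECT, tid)


def parse_connect_response(data, tid):
    """Return the connection id from a connect response."""
    if len(data) < 16:
        raise ValueError("connect response too short")
    action, rtid, conn_id = struct.unpack_from(">IIQ", data)
    if action != ACTION_CONNECT or rtid != tid:
        raise ValueError("unexpected connect response")
    return conn_id


def scrape_request(conn_id, tid, info_hashes):
    """Scrape packet: 16-byte header followed by raw 20-byte info_hashes."""
    if not 0 < len(info_hashes) <= MAX_HASHES:
        raise ValueError("need between 1 and %d info_hashes" % MAX_HASHES)
    if any(len(h) != 20 for h in info_hashes):
        raise ValueError("info_hashes must be 20 raw bytes each")
    return struct.pack(">QII", conn_id, ACTION_SCRAPE, tid) + b"".join(info_hashes)


def parse_scrape_response(data, tid, info_hashes):
    """Counters come back in the same order as the hashes that were sent."""
    if len(data) < 8 + 12 * len(info_hashes):
        raise ValueError("scrape response too short")
    action, rtid = struct.unpack_from(">II", data)
    if action != ACTION_SCRAPE or rtid != tid:
        raise ValueError("unexpected scrape response")
    result = {}
    for n, info_hash in enumerate(info_hashes):
        seeders, completed, leechers = struct.unpack_from(">III", data, 8 + 12 * n)
        result[info_hash] = {
            "seeders": seeders,
            "completed": completed,
            "leechers": leechers,
        }
    return result


def new_transaction_id():
    """Random 32-bit id used to match a response to its request."""
    return int.from_bytes(os.urandom(4), "big")


def scrape_udp(host, port, info_hashes, timeout=5.0):
    """Scrape one UDP tracker for several info_hashes in a single request."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.connect((host, port))
        tid = new_transaction_id()
        sock.send(connect_request(tid))
        conn_id = parse_connect_response(sock.recv(2048), tid)
        tid = new_transaction_id()
        sock.send(scrape_request(conn_id, tid, info_hashes))
        return parse_scrape_response(sock.recv(2048), tid, info_hashes)
```

A call such as scrape_udp(host, port, hashes) raises socket.timeout if the tracker does not answer, so a caller with a long tracker list would probably want to catch that and move on to the next tracker.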
This is not directly an answer to your question, but a suggestion of how you could use libtorrent.
You can add the info-hash in a paused, non-auto-managed state (controlled by the flags in add_torrent_params); in that case libtorrent won't start downloading it.
Keep in mind that libtorrent does not (yet) support scraping the DHT.