Can BitTorrent peers handle seeding large numbers of idle torrents?

Tags:

bigdata

I'm considering using BitTorrent for a large data dissemination problem where the data source is petascale and users will want up to several terabytes. Some details:

Number of torrents potentially in the millions
Torrent sizes ranging from 100 MB to 100 GB
A stable set of clusters around the world capable of acting as seeders each holding a large subset of the total torrents (say 60% on average)
A relatively small number of simultaneous users (less than 100) wanting to download on average a few terabytes of data.

I expect the number of active torrents to be small compared to the total available, but quality of service is important, so there must be several seeders for each torrent or some mechanism for launching new seeders.

My question is: can BitTorrent clients handle seeding huge numbers of torrents, most of which are idle? Would I need to stripe torrents across the seeders in a cluster, or could each node seed all the torrents it has access to? Which client would do the best job? Are there any tools for managing clusters of seeders?

I am assuming that trackers can be made to scale to this level.


asked Jul 24 '11 20:07

Stephen

2 Answers

There are 2 main problems:

Each torrent (typically) needs to announce to a tracker periodically, which might end up using a significant amount of bandwidth.
The BitTorrent client itself needs to be written in a way that scales to a large number of torrents.

As for the tracker traffic, let's assume you have 1 million torrents. The typical re-announce interval is 30 minutes, but some trackers set it to 1 hour. Let's assume your tracker uses 1-hour announce intervals. You will have to make 1 million GET requests per hour. Say each request is 400 bytes up and 100 bytes down (assuming most responses will not contain any peers); that's about 111 kB/s up and 28 kB/s down, constantly. That's not so bad, but keep in mind that TCP requires an extra round trip to establish each connection, so that's roughly another 40 bytes down and 40 bytes up per announce.
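The arithmetic above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `announce_bandwidth` is made up for illustration, and the byte counts are the rough estimates from this answer, not measurements.

```python
def announce_bandwidth(num_torrents, interval_s, bytes_up, bytes_down):
    """Return (up, down) in bytes per second needed to keep every torrent announced."""
    announces_per_second = num_torrents / interval_s
    return announces_per_second * bytes_up, announces_per_second * bytes_down


# Figures from the answer: 1 million torrents, 1-hour interval, 400 B up / 100 B down.
up, down = announce_bandwidth(1_000_000, 3600, 400, 100)
print(f"HTTP announces: {up / 1000:.0f} kB/s up, {down / 1000:.0f} kB/s down")

# Same load, adding the ~40 B each way attributed to TCP connection setup.
up_tcp, down_tcp = announce_bandwidth(1_000_000, 3600, 400 + 40, 100 + 40)
print(f"With TCP setup: {up_tcp / 1000:.0f} kB/s up, {down_tcp / 1000:.0f} kB/s down")
```

Halving the interval to 30 minutes doubles every figure, since the load is linear in the announce rate.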

This can be mitigated by using only UDP trackers. You would then need just a single connect message, and you can reuse the connection ID for each announce. Each announce message would be about 100 bytes, and the response would be a bit more compact as well; let's assume 60 bytes. That would get you about 28 kB/s up and 17 kB/s down, just to keep the torrents announced. For this you would need a client with decent UDP tracker support (one that caches the connection ID, for instance).
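To make the "about 100 bytes" figure concrete, here is a minimal sketch that packs the connect and announce requests using the field layout from the UDP tracker protocol specification (BEP 15). The helper names, the placeholder connection ID, and the `-XX0001-` peer ID prefix are assumptions for illustration; no packets are actually sent. Note that the specification limits how long a connection ID may be reused, so a real client has to refresh it periodically.

```python
import os
import struct

PROTOCOL_ID = 0x41727101980  # magic constant defined by BEP 15
ACTION_CONNECT, ACTION_ANNOUNCE = 0, 1


def build_connect_request(transaction_id):
    """16-byte connect request: protocol_id, action, transaction_id."""
    return struct.pack(">QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)


def build_announce_request(connection_id, transaction_id, info_hash, peer_id, port, key=0):
    """98-byte announce request for a seed (left = 0, no event, default IP)."""
    return struct.pack(
        ">QII20s20sQQQIIIiH",
        connection_id, ACTION_ANNOUNCE, transaction_id,
        info_hash, peer_id,
        0,   # downloaded
        0,   # left (a seed has nothing left to download)
        0,   # uploaded
        0,   # event: none
        0,   # IP address: let the tracker use the source address
        key,
        -1,  # num_want: tracker default
        port,
    )


# Placeholder values; a real connection ID comes from the tracker's connect response.
connection_id = 0x0123456789ABCDEF
peer_id = b"-XX0001-" + os.urandom(12)

# One cached connection ID is reused for several announces.
for _ in range(3):
    tid = int.from_bytes(os.urandom(4), "big")
    packet = build_announce_request(connection_id, tid, os.urandom(20), peer_id, 6881)
    print(len(packet), "byte announce payload")
```

These sizes are UDP payloads only; IP and UDP headers add a little on top.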

Not too bad, assuming that's insignificant compared to the actual data your seeds would send.

However, you don't necessarily need to stripe your torrents across separate data centers; you could also use an HTTP server to seed the torrents. All major BitTorrent clients support HTTP seeding, and you wouldn't have to worry about announcing to the tracker (the URL is burned into the .torrent itself).
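As a rough illustration of how the URL ends up inside the .torrent, the sketch below builds a single-file metainfo dictionary with a `url-list` key, which is the key used by the GetRight-style web seeding extension (BEP 19). The `bencode` and `make_torrent` helpers, the tracker URL, and the web seed URL are all hypothetical; a production setup would normally use an existing torrent creation tool.

```python
import hashlib


def bencode(value):
    """Minimal bencoder for ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys must be byte strings in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_torrent(name, data, piece_length, tracker_url, web_seed_urls):
    """Return (torrent_bytes, info_hash_hex) for a single-file torrent with web seeds."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {"name": name, "length": len(data), "piece length": piece_length, "pieces": pieces}
    meta = {"announce": tracker_url, "info": info, "url-list": web_seed_urls}
    return bencode(meta), hashlib.sha1(bencode(info)).hexdigest()


torrent, info_hash = make_torrent(
    "sample.bin",
    b"\x00" * 100_000,                           # stand-in payload
    32 * 1024,
    "http://tracker.example.invalid/announce",   # hypothetical tracker
    ["http://data.example.invalid/sample.bin"],  # hypothetical web seed
)
print(info_hash, len(torrent), "bytes of metainfo")
```

Because the web seed URL lives outside the `info` dictionary, it can be changed without altering the info-hash.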

As for a client that scales well with many torrents, I don't know for sure; I haven't done any measurements. It should be fairly straightforward to generate a million random torrents and try to load them up.
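One possible way to generate such a test set is sketched below. It writes synthetic single-file .torrent files whose piece hashes are random bytes, so there is no matching data on disk: this only exercises how a client copes with loading and holding many torrents, not actual seeding. The function names, file naming scheme, and tracker URL are assumptions for illustration.

```python
import os
import tempfile


def bstr(b):
    """Bencode a byte string."""
    return str(len(b)).encode() + b":" + b


def random_torrent(index, tracker_url, piece_length=2**18, num_pieces=4):
    """Build metainfo for a fake single-file torrent with random piece hashes."""
    name = f"synthetic-{index:07d}.bin".encode()
    pieces = os.urandom(20 * num_pieces)  # random SHA-1-sized hashes, no real data
    # Keys are emitted in sorted order, as bencoding requires.
    info = (
        b"d"
        + bstr(b"length") + b"i%de" % (piece_length * num_pieces)
        + bstr(b"name") + bstr(name)
        + bstr(b"piece length") + b"i%de" % piece_length
        + bstr(b"pieces") + bstr(pieces)
        + b"e"
    )
    return b"d" + bstr(b"announce") + bstr(tracker_url.encode()) + bstr(b"info") + info + b"e"


def write_random_torrents(directory, count, tracker_url):
    """Write `count` synthetic .torrent files into `directory`."""
    for i in range(count):
        path = os.path.join(directory, f"synthetic-{i:07d}.torrent")
        with open(path, "wb") as f:
            f.write(random_torrent(i, tracker_url))
    return count


# Small demo run; raise the count to scale the experiment up.
with tempfile.TemporaryDirectory() as tmp:
    n = write_random_torrents(tmp, 1000, "http://tracker.example.invalid/announce")
    print(n, "torrents written to", tmp)
```

A million files in a single directory can be slow on some file systems, so sharding into subdirectories may be worth considering at that scale.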

I have done some optimization work in libtorrent (Rasterbar) to make it scale well with many torrents, though I haven't tried millions.

I've written a blog post on this topic, here.


answered Sep 28 '22 01:09

Arvid

You may be looking for Hekate. It's in, at best, pre-alpha right now, but it's quite nearly what you're describing.

answered Sep 28 '22 01:09

Aranjedeath
