
Can BitTorrent peers handle seeding large numbers of idle torrents?

I'm considering using BitTorrent for a large data dissemination problem where the data source is petascale and users will want up to several terabytes each. Some details:

  • Number of torrents potentially in the millions
  • Torrent sizes ranging from 100 MB to 100 GB
  • A stable set of clusters around the world capable of acting as seeders, each holding a large subset of the total torrents (say 60% on average)
  • A relatively small number of simultaneous users (less than 100) wanting to download on average a few terabytes of data.

I expect the number of active torrents to be small compared to the total available, but quality of service is important, so there must be several seeders for each torrent or some mechanism for launching new seeders.

My question is: can BitTorrent clients handle seeding huge numbers of torrents, most of which are idle? Would I need to stripe torrents across the seeders in a cluster, or could each node seed all the torrents it has access to? Which client would do the best job? Are there any tools for managing clusters of seeders?

I am assuming that trackers can be made to scale to this level.

Stephen asked Jul 24 '11 20:07



2 Answers

There are 2 main problems:

  1. Each torrent (typically) needs to announce to a tracker periodically, which might end up using a significant amount of bandwidth.
  2. The BitTorrent client itself needs to be written in a way that scales with a large number of torrents.

As for the tracker traffic, let's assume you have 1 million torrents. The typical re-announce interval is 30 minutes, but some trackers set it to 1 hour. Let's assume your tracker uses 1-hour announce intervals (with 30-minute intervals the figures below double). You will have to make 1 million GET requests per hour. Say each request is 400 bytes up and 100 bytes down (assuming most responses will not contain any peers); that's about 111 kB/s up and 28 kB/s down, constantly. That's not so bad, but keep in mind that TCP requires an extra round-trip to establish each connection, so that's another 40 bytes down and 40 bytes up per announce.
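The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the estimate; the function name `announce_bandwidth` is made up for illustration, and the byte counts are the assumptions stated in the answer, not measurements.

```python
def announce_bandwidth(torrents, interval_s, bytes_up, bytes_down):
    """Return (up, down) in bytes per second needed to keep every torrent announced."""
    announces_per_second = torrents / interval_s
    return announces_per_second * bytes_up, announces_per_second * bytes_down


# Assumed figures from the answer: 1 million torrents, 1-hour interval,
# 400 bytes per HTTP request and 100 bytes per response.
up, down = announce_bandwidth(1_000_000, 3600, 400, 100)
print(f"HTTP announces: {up / 1000:.0f} kB/s up, {down / 1000:.0f} kB/s down")

# Same estimate with the extra 40 bytes each way for TCP connection setup.
up_tcp, down_tcp = announce_bandwidth(1_000_000, 3600, 440, 140)
print(f"With TCP setup: {up_tcp / 1000:.0f} kB/s up, {down_tcp / 1000:.0f} kB/s down")
```

Changing `interval_s` to 1800 shows the cost of a 30-minute interval.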

This can be mitigated by using only UDP trackers. Then you would only need a single connect message, and you can reuse the connection ID for each announce. Each announce message would then be about 100 bytes, and the returned message would be a bit more compact as well; let's assume 60 bytes. That would get you 28 kB/s up and about 17 kB/s down, just to keep the torrents announced. For this you would need a client with decent UDP tracker support (one that caches the connection ID, for instance).

Not too bad, assuming that's insignificant compared to the actual data your seeds would send.
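To make the UDP message sizes concrete, here is a sketch of the connect and announce requests as laid out in the UDP tracker protocol (BEP 15). It only builds the packets and sends nothing; the helper names, the peer ID and the port are placeholders chosen for illustration. Note that BEP 15 limits how long a connection ID stays valid, so a client has to refresh it periodically rather than keep one forever.

```python
import struct

PROTOCOL_ID = 0x41727101980  # magic constant from BEP 15
ACTION_CONNECT = 0
ACTION_ANNOUNCE = 1


def connect_request(transaction_id):
    # protocol_id (8) + action (4) + transaction_id (4) = 16 bytes
    return struct.pack(">QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)


def announce_request(connection_id, transaction_id, info_hash, peer_id,
                     downloaded=0, left=0, uploaded=0, event=0,
                     key=0, num_want=-1, port=6881):
    # connection_id comes from the tracker's connect response; a client
    # can cache it and reuse it for many announces while it is valid.
    return struct.pack(
        ">QII20s20sQQQIIIiH",
        connection_id, ACTION_ANNOUNCE, transaction_id,
        info_hash, peer_id,
        downloaded, left, uploaded,
        event,
        0,          # IP address, 0 lets the tracker use the sender's address
        key, num_want, port,
    )


# Placeholder values, not real torrent data.
packet = announce_request(
    connection_id=0x1122334455667788,
    transaction_id=1,
    info_hash=b"\x00" * 20,
    peer_id=b"-XX0000-" + b"0" * 12,
)
print(len(connect_request(1)), len(packet))
```

The announce request comes to 98 bytes before UDP and IP headers, which is in line with the 100-byte figure assumed above.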

However, you don't necessarily need to stripe your torrents across separate data centers; you could also use an HTTP server to seed the torrents. All major BitTorrent clients support HTTP seeding, and you wouldn't have to worry about announcing to the tracker (the URL is embedded in the .torrent file itself).

As for a client that scales well with many torrents, I don't know for sure; I haven't done any measurements. It should be fairly straightforward to just generate a million random torrents and try to load them up.

I have done some optimization work in libtorrent (Rasterbar) to make it scale well with many torrents, though I haven't tried millions.

I've written a blog post on this topic, here.

Arvid answered Sep 28 '22 01:09


You may be looking for Hekate. It's in, at best, pre-alpha right now, but it's quite nearly what you're describing.

Aranjedeath answered Sep 28 '22 01:09