
BitTorrent implementation in Java && need some info on swarm behaviour [duplicate]

I'm developing a BitTorrent client in Java. I know there are plenty of libraries out there, but I can't help it; I want my own. Anyway, I noticed some weird behaviour, and maybe you guys know something that I'm missing:

  • About 80% of all peers I try to connect to result in unsuccessful connections (either a socket timeout or "can't connect" errors). The list of peers is, of course, the one received from the trackers. I also picked some IPs at random and pinged them; the ping is usually successful.
  • When I do connect:
    • 50% drop the connection after the Handshake,
    • on 30% I noticed a weird behaviour: I receive the Handshake, I receive the BitField (they have all pieces), I get bombarded with 20+ Have messages (I checked the piece indexes; they were already set in the BitField), and then they drop the connection.

(For all statistics, figures are not precise.)

Some BitTorrent questions:

UPDATE #4: I'm striking out some questions because I consider them answered.

  • this was the '80% failed connect rate question': What could be the reason for my 80% failed-to-connect rate? This can't be just bad luck, in the sense that every client I tried to connect to had no more room for me. I'm listening on 6881, but I also tested with other ports. Yesterday I had great success: a bunch of connections were accepted (same code, with a few changes in the past week) and Piece messages started flowing, so my code is not totally useless.

  • Do torrent clients send, before closing, a last message to the tracker with event=stopped so that it updates its internal database and stops handing out useless peer info in its responses? Or is it just that they should? It really seems I'm receiving dead peers.

  • Is the order of the received peers of any importance? Maybe it reflects percentage of completion, or is it really random?
  • Also, every now and then I receive a peer with port 0, which makes my Socket constructor throw an exception. What does port 0 mean? Can I contact it on any port?
  • Can my PeerId (that I send in the Handshake or when I announce myself to the tracker) influence whether the torrent clients I'm trying to communicate with will continue a started connection? Meaning, what if I lie and say that I'm an Azureus client by using '-AZ2060-' as my ID?
  • this was the 'piece availability scaring off peers question': Does my piece availability scare off peers? I'm trying to connect, and I send an empty bitfield (I have no pieces, [length: 1][Id = 5][payload: {}]); it seems that they send a bitfield, I send a bitfield (some send Have messages like crazy), they realise I'm poor, and they drop me. Some drop the connection right after the handshake. (How rude.)
  • Is there a benefit of not using the classic port interval: 6881 - 6889?
  • this was the 'list of bad peers question': Do torrent clients keep internally a list of bad peers (like a black list)? Sometimes after finding a nice peer, I continually used its info in my tests, but only 1 in 3 connections was accepted. Sometimes 10 minutes had to pass before I had a successful connection again.
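For the port 0 question, one defensive option is to drop such entries while decoding the tracker response, before any Socket is created. The following is a minimal sketch, assuming the tracker returns the compact peer format (6 bytes per peer: 4 bytes of IPv4 address followed by a 2-byte big-endian port); `CompactPeers` is a hypothetical helper name, not part of any library.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: decodes a compact peer list and skips port 0 entries.
final class CompactPeers {
    static List<InetSocketAddress> parse(byte[] peers) {
        if (peers.length % 6 != 0) {
            throw new IllegalArgumentException("compact peer list length must be a multiple of 6");
        }
        List<InetSocketAddress> result = new ArrayList<>();
        for (int i = 0; i < peers.length; i += 6) {
            // The port is the last two bytes of each entry, big-endian and unsigned.
            int port = ((peers[i + 4] & 0xFF) << 8) | (peers[i + 5] & 0xFF);
            if (port == 0) {
                continue; // nothing to connect to, so skip it instead of failing later
            }
            byte[] ip = { peers[i], peers[i + 1], peers[i + 2], peers[i + 3] };
            try {
                result.add(new InetSocketAddress(InetAddress.getByAddress(ip), port));
            } catch (UnknownHostException e) {
                // Cannot happen for a 4-byte address; rethrown unchecked just in case.
                throw new IllegalStateException(e);
            }
        }
        return result;
    }
}
```

Filtering at decode time keeps the connection code free of special cases; whether a port 0 entry should also be logged is left open here.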

UPDATE #1: It seems that connections with μTorrent clients follow the aforementioned pattern (BITFIELD, HAVE bombardment, close connection). I tested locally with a bunch of BitTorrent clients (μTorrent, BitTorrent, Vuze, BitComet, Deluge) and only noticed this pattern with μTorrent. With the others, communication was fine (HS, BITFIELD, UNCHOKE & happy piece sharing). Now, μTorrent is probably the most popular BitTorrent client (6 of the 8 connections I started were to μTorrent), so… any ideas?

UPDATE #2: As for keeping a "bad list", it does seem so (and it actually makes sense to do so). For example, with μTorrent I noticed the following no-connect intervals: 30s, 1min, 1min30s, 2min, and so on. By "no-connect" I mean that, after the previous connection ended, no new connection was accepted for x seconds.
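If remote clients refuse reconnects for growing intervals, the connecting side can mirror that with its own per-peer wait. This is a minimal sketch, assuming a linear 30-second step (taken from the intervals observed above) and an arbitrary 10-minute cap; `ReconnectBackoff` and its method names are hypothetical.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: tracks failed attempts per peer and suggests a wait time.
final class ReconnectBackoff {
    private static final Duration STEP = Duration.ofSeconds(30); // from the observed 30 s increments
    private static final Duration MAX = Duration.ofMinutes(10);  // arbitrary cap, an assumption

    private final Map<String, Integer> failures = new HashMap<>();

    /** Records a refused or dropped connection and returns how long to wait before retrying. */
    Duration onFailure(String peerKey) {
        int count = failures.merge(peerKey, 1, Integer::sum);
        Duration wait = STEP.multipliedBy(count);
        return wait.compareTo(MAX) > 0 ? MAX : wait;
    }

    /** Forgets the failure history once a connection to the peer succeeds. */
    void onSuccess(String peerKey) {
        failures.remove(peerKey);
    }
}
```

The peer key could be "ip:port" or just the IP; since the remote side appears to match on IP, keying on the IP alone may be the closer fit.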

UPDATE #3: That HAVE message bombardment might have been the so-called "lazy bitfield" (I ran a couple of tests: none of the pieces mentioned in the HAVE messages were set in the BITFIELD). I see that μTorrent and BitTorrent use this approach.
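The lazy bitfield idea can be sketched as splitting the set of owned pieces in two: a reduced set that goes into the BITFIELD message, and a few withheld pieces that are announced right afterwards with HAVE messages. The sketch below assumes pieces are tracked in a `java.util.BitSet`; `LazyBitfield` and the `withheld` parameter are hypothetical, and real clients may pick the withheld pieces differently.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: splits owned pieces into a BITFIELD part and a HAVE part.
final class LazyBitfield {
    final BitSet advertised;     // pieces to put in the BITFIELD message
    final List<Integer> viaHave; // pieces to announce afterwards, one HAVE each

    private LazyBitfield(BitSet advertised, List<Integer> viaHave) {
        this.advertised = advertised;
        this.viaHave = viaHave;
    }

    static LazyBitfield split(BitSet owned, int withheld, Random rnd) {
        List<Integer> indexes = new ArrayList<>();
        for (int i = owned.nextSetBit(0); i >= 0; i = owned.nextSetBit(i + 1)) {
            indexes.add(i);
        }
        Collections.shuffle(indexes, rnd);
        int count = Math.max(0, Math.min(withheld, indexes.size()));
        List<Integer> viaHave = new ArrayList<>(indexes.subList(0, count));
        BitSet advertised = (BitSet) owned.clone();
        for (int index : viaHave) {
            advertised.clear(index); // withheld pieces are missing from the bitfield
        }
        return new LazyBitfield(advertised, viaHave);
    }
}
```

On the receiving side this means a HAVE for a piece that was not set in the BITFIELD is normal and should simply be merged into the peer's piece set.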

Another conclusion: some clients are stricter about the BitTorrent specs and will close the connection if you break a rule. For example, I noticed with BitTorrent and BitTornado that if you send a bitfield message but have no pieces, they will close the connection (no pieces = empty bitfield, but the specs say "It is optional, and need not be sent if a client has no pieces"), while others close the connection if you send any type of message before they send an UNCHOKE message (not even INTERESTED is tolerated).
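One way to stay on the safe side with strict clients is to skip the bitfield message entirely when no pieces are owned, and otherwise to send a payload that covers every piece of the torrent (one bit per piece, the high bit of the first byte being piece 0, spare bits zero). The sketch below assumes that layout; `BitfieldMessage` is a hypothetical helper. Whether a zero-length payload such as `[length: 1][Id = 5][payload: {}]` is what triggers the disconnects described above is an assumption, not something confirmed here.

```java
import java.nio.ByteBuffer;
import java.util.BitSet;
import java.util.Optional;

// Hypothetical helper: encodes a BITFIELD message, or nothing when no pieces are owned.
final class BitfieldMessage {
    static final byte ID = 5;

    static Optional<byte[]> encode(BitSet have, int pieceCount) {
        if (pieceCount <= 0 || have.isEmpty()) {
            return Optional.empty(); // the message is optional when the client has no pieces
        }
        int payloadLength = (pieceCount + 7) / 8; // one bit per piece, rounded up to whole bytes
        byte[] payload = new byte[payloadLength];
        for (int i = have.nextSetBit(0); i >= 0 && i < pieceCount; i = have.nextSetBit(i + 1)) {
            payload[i / 8] |= (byte) (0x80 >> (i % 8)); // piece 0 is the high bit of byte 0
        }
        byte[] message = ByteBuffer.allocate(4 + 1 + payloadLength)
                .putInt(1 + payloadLength) // length prefix counts the id byte plus the payload
                .put(ID)
                .put(payload)
                .array();
        return Optional.of(message);
    }
}
```

For a 10-piece torrent with pieces 0 and 9, this yields a 7-byte message: a length prefix of 3, the id 5, and the payload bytes 0x80 0x40.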

UPDATE #4: Since I'm mostly interested in the first question (What could be the reason for my 80% failed-to-connect rate? The struck-through questions are more than probably linked to it), here are some explanations of why connections were sometimes unsuccessful:

1) If I start a connection with a peer shortly after stopping a previous connection (by stop I mean closing the socket), the peer on the other side won't know until its next read/write.

Details: - I noticed this a bunch of times, and it is most obvious after finishing a download. If I close the connection, the peer won't realise it until it tries to send a new KEEP_ALIVE (~2 minutes). But if I close during a REQUEST-PIECE exchange, the peer will realise pretty fast. In the first scenario, after closing the connection I am still present in the uTorrent peer tab. If I look at the logger tab, it realises that I am gone after about 2 minutes.

2) It seems that uTorrent sees my BITFIELD message as corrupted (and obviously should close the connection after receiving it). This doesn't always happen; also, I checked and rechecked, the message is OK, and with other BT clients there were no such problems.

Details: - If I look at the uTorrent logger tab, it displays "Disconnected: Bad packet" right after I send the bitfield. - I'm planning to try an implementation of lazy bitfield; maybe I can escape this that way (I also see that the majority of BT clients do this).

3) (More than probably linked to #1.) When uTorrent doesn't allow me to reconnect, I see in the logger tab: "Disconnect: already have equal connection (dropped extra connection)". Currently I choose a random local port when initiating a new connection (I saw this implemented in the majority of BT clients), but this doesn't trick it; it still sees that I'm a peer already present in its "peer list" (it probably matches on IP). But in 30% of the tests, in the same scenario, it does allow me to reconnect :) I have no explanation why yet.

4) One more thing: it seems that the listener for incoming connections is still alive after you close a torrent in uTorrent (by close I mean right click + stop). This means that I can still start a connection and send a HANDSHAKE, after which I'm disconnected (it doesn't HANDSHAKE back). The message in the uTorrent logger is "Disconnect: No such torrent: 80FF40A75A3B907C0869B798781D97938CE146AE", the long string being my info hash. I've seen this while testing with other BT clients too.
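The "No such torrent" case follows from the handshake layout: the info hash travels inside the handshake, so the remote side can only look the torrent up after reading it. A minimal sketch of building the 68-byte handshake, assuming the standard layout (1 length byte, the 19-byte string "BitTorrent protocol", 8 reserved bytes, 20-byte info hash, 20-byte peer ID); `Handshake` and `hexToBytes` are hypothetical helper names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the fixed-size BitTorrent handshake.
final class Handshake {
    private static final byte[] PSTR = "BitTorrent protocol".getBytes(StandardCharsets.US_ASCII);

    static byte[] build(byte[] infoHash, byte[] peerId) {
        if (infoHash.length != 20 || peerId.length != 20) {
            throw new IllegalArgumentException("info hash and peer id must be 20 bytes each");
        }
        return ByteBuffer.allocate(49 + PSTR.length)
                .put((byte) PSTR.length) // protocol string length (19)
                .put(PSTR)
                .put(new byte[8])        // reserved bytes, all zero in this sketch
                .put(infoHash)
                .put(peerId)
                .array();
    }

    /** Converts a hex string such as the one shown in the uTorrent log into raw bytes. */
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```

With this layout the info hash sits at offsets 28 to 47 of the handshake, which is a handy place to look when comparing a packet capture against the hash the remote client reports.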

Some more info:

  • scenarios with uTorrent of type full-upload/partial-upload & full-download are successful; those of type partial-download not so much, probably due to #2
  • with uTorrent I still get the BITFIELD + HAVE bombardment + close connection pattern, and as far as I remember the same "Disconnected: Bad packet" message in the logger tab, probably due to #2
  • besides uTorrent, I've tested with BitTorrent, BitTornado, BitComet, qBittorrent, FlashGet (communication was OK) and with Vuze, FrostWire, Shareaza (with these guys, it was super OK)
  • not all clients behave the same. For example, FlashGet & uTorrent (& BitComet?) don't unchoke until you send INTERESTED, while others seem to unchoke right after BITFIELD. So I'm planning to treat clients differently somehow (I really think this is necessary), probably by guessing their name from the peer ID (there are only 2 naming conventions) and starting from there. I already have something implemented; this is how I know that I connected to a uTorrent client.
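Guessing the client from the peer ID can be sketched for the Azureus-style convention, where the ID starts with '-', a two-letter client code, four version characters and another '-' (as in the '-AZ2060-' example above). The helper below is hypothetical and only extracts the two-letter code; the other naming convention is not handled and yields an empty result.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper: extracts the client code from an Azureus-style peer ID.
final class PeerIdParser {
    static Optional<String> azureusStyleClientCode(byte[] peerId) {
        if (peerId.length != 20 || peerId[0] != '-' || peerId[7] != '-') {
            return Optional.empty(); // not Azureus-style
        }
        String code = new String(peerId, 1, 2, StandardCharsets.US_ASCII);
        for (int i = 0; i < code.length(); i++) {
            if (!Character.isLetter(code.charAt(i))) {
                return Optional.empty(); // client codes are expected to be letters
            }
        }
        return Optional.of(code);
    }
}
```

The returned code can then be looked up in a table maintained by the client (for example "AZ" for Azureus) to select per-client behaviour.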
pulancheck1988 asked Mar 20 '13 15:03




1 Answer

OK, I have an answer for you, but I must warn you that I have never written a BitTorrent client myself, so some answers might not be 100% accurate; everything here comes from my understanding of the global view of how BitTorrent works. I apologise if I waste your time, but I still think my answer can teach you something about the core of what you're asking.

• What could be the reason for my 80% failed-to-connect rate?

It's hard to give one linear explanation, but the BitTorrent philosophy is tit-for-tat: if you have nothing to give, you get nothing back. There are exceptions: if you have just started downloading, you might get a "donation" to start with; or the other side is a dedicated seeding machine, in which case it might still check whether you are a giver or just a taker; or many peers are currently downloading this torrent; or (fill in your own idea). So there are many mechanisms, some of them quite smart, that keep the swarm agile and efficient. While some of them can be traced to your machine, most cannot even be monitored from it, let alone controlled by it.

• Do torrent clients send, before closing, a last message to the tracker with event=stopped so that it updates its internal database and stops handing out useless peer info in its responses? Or is it just that they should? It really seems I'm receiving dead peers.

  • It depends on the client code: some might do that, some might not. (keep reading)

• Is the order of the received peers of any importance? Maybe it reflects percentage of completion, or is it really random?

  • It depends on the server code: some might do that, some might not. (keep reading)

A note on those two "keep reading" remarks: keep in mind that in a P2P network there is no authority that strictly binds clients, or even servers, to uphold the protocol to the letter. Even if the protocol states that something should be done, that does not mean every client will implement it, or react the same way when it happens or fails to happen.
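For the event=stopped question, a client that wants to behave well can send one last announce when it shuts a torrent down. The sketch below only builds the request URL, assuming the usual HTTP tracker query parameters (info_hash, peer_id, port, uploaded, downloaded, left, event); `TrackerAnnounce` is a hypothetical helper and `http://tracker.example/announce` in the usage note is a placeholder, not a real tracker.

```java
// Hypothetical helper: builds the final "stopped" announce URL for an HTTP tracker.
final class TrackerAnnounce {
    /** Percent-encodes raw bytes, leaving only unreserved ASCII characters as they are. */
    static String urlEncode(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            int v = b & 0xFF;
            boolean unreserved = (v >= '0' && v <= '9') || (v >= 'a' && v <= 'z')
                    || (v >= 'A' && v <= 'Z') || v == '.' || v == '-' || v == '_' || v == '~';
            if (unreserved) {
                sb.append((char) v);
            } else {
                sb.append('%').append(String.format("%02X", v));
            }
        }
        return sb.toString();
    }

    static String stoppedUrl(String announceUrl, byte[] infoHash, byte[] peerId,
                             int port, long uploaded, long downloaded, long left) {
        String separator = announceUrl.contains("?") ? "&" : "?";
        return announceUrl + separator
                + "info_hash=" + urlEncode(infoHash)
                + "&peer_id=" + urlEncode(peerId)
                + "&port=" + port
                + "&uploaded=" + uploaded
                + "&downloaded=" + downloaded
                + "&left=" + left
                + "&event=stopped";
    }
}
```

Calling `TrackerAnnounce.stoppedUrl("http://tracker.example/announce", infoHash, peerId, 6881, up, down, left)` gives the URL to request on shutdown. Since nothing forces other clients to do the same, dead entries in the peer list still have to be expected.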

• Also, every now and then I receive a peer with port 0, which makes my Socket constructor throw an exception. What does port 0 mean? Can I contact it on any port?

  • Port 0 is kind of a wildcard, but only when you bind a local socket: the OS then picks the next available port for you (some say the next available port above 1023, but I never tested that). It is not a port you can connect to on a remote host, so a peer announced with port 0 is best skipped.

• Can my PeerId (that I send in the Handshake or when I announce myself to the tracker) influence whether the torrent clients I'm trying to communicate with will continue a started connection? Meaning, what if I lie and say that I'm an Azureus client by using '-AZ2060-' as my ID?

It will think you are Azureus, and if other Azureus clients favour connections to fellow Azureus clients based on that (and that's a big if), you will benefit from it.

• Does my piece availability scare off peers? I'm trying to connect, and I send an empty bitfield (I have no pieces, [length: 1][Id = 5][payload: {}]); it seems that they send a bitfield, I send a bitfield (some send Have messages like crazy), they realise I'm poor, and they drop me. Some drop the connection right after the handshake. (How rude.)

  • Possibly.

• Is there a benefit of not using the classic port interval: 6881 - 6889?

  • I don't think so, except maybe confusing your ISP.

• Do torrent clients keep internally a list of bad peers (like a black list)? Sometimes after finding a nice peer, I continually used its info in my tests, but only 1 in 3 connections was accepted. Sometimes 10 minutes had to pass before I had a successful connection again.

  • Depends on the client code.

Summary

It's a jungle out there: everyone can write their own logic as long as they send the correct protocol commands. Your questions focus on the logical behaviour of clients, but there is no common ground, as you have probably understood by now. This is also the beauty of BitTorrent and probably the main reason for its success.

G.Y answered Oct 17 '22 01:10
