How does a node join a Distributed Hash Table (DHT) cluster?

Question

I'm trying to learn about the Distributed Hash Table (DHT) paradigm, as it fits into a P2P or fully distributed computing architecture. From a theoretical standpoint, once a cluster is established, it makes a fair amount of sense how it manages to swarm data and distribute work.

The most interesting part to me is that the architecture never requires any kind of centralized controller or coordinator (no single point of failure). However, I'm still struggling to understand the practical execution of the concept, particularly how a cluster is formed. If it's a fully distributed system, how does a node know how to 'join' the already established cluster?

In a simplistic example:

Say I'm creating a P2P application based on the DHT model.
The application is distributed across the Internet (i.e. the clients are not on the same network), and any public client may connect to the cluster.
A client connected to the cluster can see some (but not necessarily all) of the other clients in the cluster.
A client that isn't connected doesn't have any addresses or names of clients in the cluster.

So how would a new client 'connect' if there isn't any centralized server to act as a beacon, or to serve as the means of introducing the new client to the cluster?

Martin · Accepted Answer

This is a problem I covered as part of my dissertation, and I never found a solution I was happy with. The problem is that you need some kind of information about just one of the other peers before joining the network; getting that first address is the hard bit.

A few ideas I came up with:

Encourage peers to publish their addresses, so that publicly accessible lists of known IPs build up.
Run several "well known" bootstrap peers.
Brute-force the address space.

The last option is the only truly decentralised approach. A combination of the three is likely to be best.
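A minimal sketch of how the first two ideas might be combined: walk through several address sources in order of preference and stop at the first peer that answers. Everything named here (`find_first_peer`, `tcp_probe`, the example addresses and port) is hypothetical and only for illustration. It also assumes peers accept TCP connections, whereas many real DHTs speak UDP and would need a protocol-level ping instead.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]  # (host, port)


def tcp_probe(addr: Address, timeout: float = 2.0) -> bool:
    """Assumed liveness check: can we open a TCP connection to the peer?"""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False


def find_first_peer(
    sources: Iterable[Iterable[Address]],
    probe: Callable[[Address], bool] = tcp_probe,
) -> Optional[Address]:
    """Try each source in order; return the first address that answers."""
    seen = set()
    for source in sources:
        for addr in source:
            if addr in seen:
                continue  # do not probe the same address twice
            seen.add(addr)
            if probe(addr):
                return addr
    return None  # nothing answered; the caller has to fall back further


# Hypothetical sources, using addresses from the documentation ranges.
published_list = [("198.51.100.7", 6881)]
well_known_peers = [("192.0.2.1", 6881), ("192.0.2.2", 6881)]
```

Calling `find_first_peer([published_list, well_known_peers])` would try the published addresses first and only then the well-known bootstrap peers. A brute-force scan of the address space could be appended as a last-resort source, since any iterable of addresses works.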

Once you're bootstrapped into a network, re-establishing a connection after disconnecting is not hard: simply save the addresses of a couple of thousand nodes in the network that have already been long-lived. At least one of them will very likely still be online next time.
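As a rough illustration of that caching idea, the sketch below keeps the nodes that have been known the longest and writes them to a JSON file for the next start-up. The function names, the file layout, and the use of "time first seen" as a stand-in for "long-lived" are all assumptions made for the example, not part of any particular DHT implementation.

```python
import json
import time
from pathlib import Path
from typing import Dict, List, Optional


def save_long_lived_peers(
    first_seen: Dict[str, float],
    path: str,
    limit: int = 2000,
    now: Optional[float] = None,
) -> List[str]:
    """Persist up to `limit` peers, longest-known first.

    `first_seen` maps "host:port" to the Unix time we first saw that node.
    """
    now = time.time() if now is None else now
    # Earliest first-seen timestamp means the longest observed lifetime.
    ranked = sorted(first_seen.items(), key=lambda item: item[1])[:limit]
    records = [{"addr": addr, "age_s": now - ts} for addr, ts in ranked]
    Path(path).write_text(json.dumps(records))
    return [record["addr"] for record in records]


def load_cached_peers(path: str) -> List[str]:
    """Return cached addresses, or an empty list on a first-ever start."""
    cache = Path(path)
    if not cache.exists():
        return []
    return [record["addr"] for record in json.loads(cache.read_text())]
```

On restart, the addresses from `load_cached_peers` would be tried before any public bootstrap list; an empty result means the node has to bootstrap from scratch.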

Rohit Keswani · Answer

As far as I can tell, you could create a proxy server for the network of DHT nodes, with shadow (backup) servers for that proxy to provide reliability.

Any new node trying to join the DHT network talks to the proxy, and the proxy lets it into the DHT network, which is otherwise entirely P2P.

This way only the proxy server has to be public, and all other DHT nodes can keep their IPs private.

This might be a hindrance to you, since the application is distributed across the Internet, but you can always talk via the proxy.
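To make the proxy idea a little more concrete, here is a small in-memory sketch of the introduction step only: a joining node gives its address and receives a random sample of existing members to contact directly. The `Introducer` class and its `join` method are hypothetical, and networking, authentication, and the shadow servers are deliberately left out.

```python
import random
from typing import List, Optional


class Introducer:
    """Toy stand-in for the public proxy that introduces new nodes."""

    def __init__(self, sample_size: int = 8,
                 rng: Optional[random.Random] = None) -> None:
        self.sample_size = sample_size
        self.members: List[str] = []
        self.rng = rng or random.Random()

    def join(self, addr: str) -> List[str]:
        """Register `addr` and return some existing members to contact."""
        others = [m for m in self.members if m != addr]
        k = min(self.sample_size, len(others))
        introductions = self.rng.sample(others, k)
        if addr not in self.members:
            self.members.append(addr)
        return introductions
```

The very first node to join gets an empty list back, since there is nobody to introduce it to. After the introduction, traffic between nodes would no longer need to pass through the proxy.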

Tags:

architecture

theory

distributed-computing

p2p

David Elner
