@guillaumemichel guillaumemichel commented Mar 18, 2025

Long overdue

Fixes #345

Checklist

  • Update links to libp2p/kad-dht specs to the new ipfs/kad-dht specs.
  • Add LaTeX or MathML support

github-actions bot commented Mar 18, 2025

🚀 Build Preview on IPFS ready

@guillaumemichel guillaumemichel force-pushed the dht branch 2 times, most recently from de8a611 to 9215300 Compare March 18, 2025 16:47
Comment on lines 262 to 265
DHT Servers SHOULD NOT return their own Peer ID in responses to `FIND_NODE`
queries. However, they MUST include information about the requester, if and
only if the requester is a DHT Server in its routing table and it is among the
`k` closest nodes to the target key.
Contributor Author

In most cases, returning information about self (DHT Server) or the requester (DHT Client) is useless for the request, since the DHT Client already knows about both. The main argument for not including these addresses is to save bytes on the wire.

I don't see a use case where it would be useful that the DHT Server sends information about itself, since listen addresses and supported protocols should be exchanged using libp2p identify.

In some cases, it could be useful for the client (if it is a DHT Server) to know whether or not it is included in the Server's routing table, or which addresses are advertised. Information about the requester will essentially be present when a node is looking up itself when refreshing its routing table, and having this information provides a guarantee that the peer is actually routable. In a network that is large enough, it is very unlikely that the requester (being a DHT Server) will be among the k closest nodes to a key (other than self) being looked up.

Alternatively, it is possible to get this information by starting a fresh DHT Client and querying for the initial peer, but that is a bit cumbersome.

also see libp2p/specs#535

Contributor Author

Now I think that including the DHT server is mostly useless and takes space on the wire. It isn't useful when looking for one specific key, or the X closest peers to a key.

The only use case I can see is when DHT servers try to check whether they are included in another DHT server's routing table, and what information is stored about them. This would be useful only for crawlers gathering statistics on the network. The exact same information can be retrieved using another peer id.

Hence I would suggest that the closestPeers field should contain neither self (DHT server) nor the requester's information. If self and/or the requester are among the k closest peers to the requested key in the DHT server's routing table, then they should be replaced by the next closest peers, so that the response always contains k peers (if the network is large enough).

It means that the current implementations may not be compliant with this spec change, but it should be easy to address.
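
A minimal sketch of that selection rule, to make the "replace by the next closest peer" part concrete. It is not taken from go-libp2p-kad-dht: `closestPeers`, `kadKey` and `xorDistance` are hypothetical names, peer IDs are plain strings, and the keyspace position is assumed to be the SHA-256 of the ID. Filtering self and the requester before truncating to k is what lets the next closest peers fill the gap.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"sort"
)

// kadKey maps an ID to a 256-bit keyspace position (assumption: SHA-256 of the ID).
func kadKey(id string) [32]byte { return sha256.Sum256([]byte(id)) }

// xorDistance returns the bytewise XOR of two keyspace positions.
func xorDistance(a, b [32]byte) []byte {
	d := make([]byte, len(a))
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// closestPeers returns up to k peers from routingTable, ordered by XOR
// distance to target. Self and the requester are dropped before truncation,
// so the next closest peers take their place.
func closestPeers(routingTable []string, target, self, requester string, k int) []string {
	t := kadKey(target)
	candidates := make([]string, 0, len(routingTable))
	for _, p := range routingTable {
		if p == self || p == requester {
			continue // never returned, even when among the k closest
		}
		candidates = append(candidates, p)
	}
	// Hashes are recomputed on each comparison; fine for a sketch.
	sort.Slice(candidates, func(i, j int) bool {
		di := xorDistance(kadKey(candidates[i]), t)
		dj := xorDistance(kadKey(candidates[j]), t)
		return bytes.Compare(di, dj) < 0
	})
	if len(candidates) > k {
		candidates = candidates[:k]
	}
	return candidates
}

func main() {
	// The requester looks itself up, as it would when refreshing its routing table.
	rt := []string{"peerA", "peerB", "peerC", "peerD", "requester"}
	fmt.Println(closestPeers(rt, "requester", "self", "requester", 3))
}
```

If fewer than k other peers are known, the response is simply shorter, which matches the "if the network is large enough" caveat.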

WDYT @achingbrain ?

Related issues:

Member

I think this makes sense.

Is there an exception here if the server has the specifically requested ID in its peer store (e.g. to allow finding multiaddrs for non-DHT servers)? What if the specifically requested ID is the server's or the requester's own ID?

Contributor Author

Good point! In this case, I suggest the DHT server returns k+1 peers: the k closest DHT servers and the peer record matching the requested peer id (unless it corresponds to a DHT server).
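
Roughly, as a sketch under stated assumptions: `peerRecord`, `withExactMatch` and the map-based peerstore are hypothetical stand-ins, not the go-libp2p-kad-dht types. The k closest DHT servers are assumed to be computed already; the requested peer's record is appended only when it is known and not already in that list.

```go
package main

import "fmt"

// peerRecord is a hypothetical stand-in for a peer ID plus its known addresses.
type peerRecord struct {
	ID    string
	Addrs []string
}

// withExactMatch returns closest plus the record for target when the peerstore
// knows target and it is not already listed (i.e. it is not one of the returned
// DHT servers). The result has at most len(closest)+1 entries.
// Not handled here: a target equal to the server's or the requester's own ID.
func withExactMatch(closest []peerRecord, peerstore map[string]peerRecord, target string) []peerRecord {
	for _, p := range closest {
		if p.ID == target {
			return closest // target is a DHT server already in the response
		}
	}
	rec, known := peerstore[target]
	if !known {
		return closest
	}
	// Copy so the caller's slice is never modified.
	out := make([]peerRecord, 0, len(closest)+1)
	out = append(out, closest...)
	return append(out, rec)
}

func main() {
	closest := []peerRecord{{ID: "serverA"}, {ID: "serverB"}}
	peerstore := map[string]peerRecord{
		"clientX": {ID: "clientX", Addrs: []string{"/ip4/192.0.2.1/tcp/4001"}},
	}
	fmt.Println(len(withExactMatch(closest, peerstore, "clientX"))) // k+1 entries
}
```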

Member

@lidel lidel Sep 8, 2025

@guillaumemichel iiuc, the suggestion in this thread is to make this change:

Suggested change
- DHT Servers SHOULD NOT return their own Peer ID in responses to `FIND_NODE`
- queries. However, they MUST include information about the requester, if and
- only if the requester is a DHT Server in its routing table and it is among the
- `k` closest nodes to the target key.
+ DHT Servers MUST NOT include the requester's Peer ID in responses to `FIND_NODE`
+ queries. DHT Servers SHOULD NOT include their own Peer ID in responses.
+ When the requester or self would be among the `k` closest nodes, the server
+ SHOULD return the next closest peer(s) to maintain a response of `k` peers
+ when possible.
+ Special case: When a specific Peer ID is requested and that peer is known but
+ is not the queried DHT Server, the server SHOULD include that peer's record even if it's
+ not among the `k` closest DHT servers.

Sgtm. go-libp2p-kad-dht remains technically compliant because it implements the MUST, but I wanted to flag that it does not (afaik) do the SHOULDs.

iiuc, the current go-libp2p-kad-dht behavior:

  1. ✅ Already excludes requester - The code has an `if targetPid != from` check
  2. ❌ Does NOT exclude self - afaik there is no explicit check to exclude the DHT server's own ID?
  3. ❌ No special handling for non-DHT peers - doesn't distinguish between DHT servers and clients in responses?

Should we approve spec clarification here + fix reference implementation of kad-dht in go and js for consistency?

Contributor Author

Tried to clarify in 23d4b91.

> iiuc, the current go-libp2p-kad-dht behavior:
>
> 1. ✅ Already excludes requester - The code has if targetPid != from check

That is correct.

> 2. ❌ Does NOT exclude self - afaik there is no explicit check to exclude the DHT server's own ID?

Self is never returned by the routing table when looking for closest peers. See error if it happens here. The only situation where a provider record could be returned for self, is if self's Peer ID is directly requested.

> 3. ❌ No special handling for non-DHT peers - doesn't distinguish between DHT servers and clients in responses?

The protobuf message only has a single field to return peers. We cannot modify it, since it would be a breaking change. The only DHT client that can ever be returned in a response is if the requested Peer ID is a perfect match of a peer that is in the peerstore, but isn't a DHT Server (in the routing table). See here.

> Should we approve spec clarification here + fix reference implementation of kad-dht in go and js for consistency?

libp2p/go-libp2p-kad-dht#1158

@guillaumemichel guillaumemichel marked this pull request as ready for review March 19, 2025 10:29
lidel added 3 commits March 24, 2025 23:17
will do for now, we should support this in generator at some point
Member

@lidel lidel left a comment

Thank you @guillaumemichel for documenting this ❤️

Made a first pass and pushed small cosmetics (fetch my changes) + some questions / suggestions in comments inline.

I think it's ready for wider feedback and, if there are no concerns, for landing in a few weeks.

@lidel lidel changed the title kad: adding DHT spec Add IPFS Kademlia DHT Specification Mar 24, 2025
Contributor

@aschmahmann aschmahmann left a comment

Thanks for starting here, it's much needed 🙏. Most of my comments fall into:

  1. What should be here vs libp2p kad spec
  2. If we've covered all the things we need to here

lidel and others added 4 commits September 8, 2025 22:30
replace 'recommended' with appropriate SHOULD/MAY terms throughout
the document for clarity and consistency with RFC 2119 conventions

#497 (comment)
require TCP+Yamux as MUST, QUIC as SHOULD. require both TLS and Noise
for DHT servers to ensure maximum interoperability. both go-libp2p and
js-libp2p support both security protocols by default.

#497 (comment)
Successfully merging this pull request may close these issues.

IPFS DHT Specification is missing
5 participants