Gossipsub v2.0 spec: Lower or zero duplicates by lazy mesh propagation #653

ppopth · 2024-12-14T14:53:13Z

This PR supersedes #652

This extension allows lazy propagation to mesh peers, which reduces the number of duplicates in the network (possibly to zero) at the cost of higher latency.

Instead of sending each message to mesh peers right away, the router tosses a coin to decide whether to send it lazily or eagerly.

If it decides to send eagerly, it just forwards the message right away.

If it decides to send lazily, it sends IANNOUNCE instead and waits for INEED before sending the actual message.

Notice that if the probability is configured to be 1, each node is guaranteed to receive exactly one copy of each message, which means no duplicates.
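
A minimal sketch of the coin toss described above, in Python for illustration only. The function name `plan_forwarding`, the parameter `p_lazy`, and the choice to toss one coin per mesh peer are assumptions, not taken from the spec; the review thread expresses the same idea through the `D_announce` parameter instead of a probability.

```python
import random


def plan_forwarding(mesh_peers, p_lazy, rng=random):
    """Return {peer: "IANNOUNCE" | "MESSAGE"} for one message.

    Hypothetical helper: p_lazy is the probability of lazy forwarding.
    """
    plan = {}
    for peer in mesh_peers:
        # rng.random() is in [0.0, 1.0), so p_lazy=1 is always lazy
        # and p_lazy=0 is always eager.
        lazy = rng.random() < p_lazy
        plan[peer] = "IANNOUNCE" if lazy else "MESSAGE"
    return plan
```

With `p_lazy=1.0` every mesh peer gets only an announcement, which corresponds to the zero-duplicates case; with `p_lazy=0.0` the behaviour is plain eager forwarding.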

Authored by: @ppopth, @nisdas, @chirag-parmar

I made this PR as a draft first just to gain some visibility. We need to do simulations and further analysis to compare it with Gossipsub v1.2

Our next step would be to implement it in go-libp2p-pubsub and do simulations

Gossipsub v2.0 lets you reduce the number of duplicates, possibly to zero, at the cost of higher latency. It works by probabilistically deciding whether to forward a message eagerly or lazily to mesh peers. If it decides to send eagerly, it just forwards the message right away. If it decides to send lazily, it sends IANNOUNCE instead and waits for INEED before sending the actual message. Notice that if the probability is configured to be 1, each node is guaranteed to receive exactly one copy of each message, which means no duplicates.

vyzo · 2024-12-14T21:56:35Z

pubsub/gossipsub/gossipsub-v2.0.md

+## Future Improvements
+
+- Penalize peers that don't send the message in time, after sending `INEED`.
+- Let publishers just send the full content of messages to mesh peers, rather than `IANNOUNCE`, because no one has really seen the message before. This saves one RTT, but it will kill anonymity so we are not sure yet to do it.


I think publishers should just flood publish the message, as we currently do.

We have allowed that as long as D_announce is less than D
https://github.com/libp2p/specs/pull/653/files#diff-f85861c1fe2084ec5cd59445f67d62b9bfd20e6eb0d879801833af26ebf8c107R139

However, we stop it if the protocol becomes fully announcement based

vyzo · 2024-12-14T21:57:21Z

pubsub/gossipsub/gossipsub-v2.0.md

+message ControlMessage {
+    // messages from v1.2
+    repeated ControlIAnnounce iannounce = 6;
+    repeated ControlINeed ineed = 7;


can't we just do this with IWANT?
In fact we could just send IHAVEs instead of using INEED.

The reasoning was that this would function very differently from IHAVE/IWANT, which is emitted periodically, whereas message announcements are meant to always be immediate and to be used primarily for message propagation.

uhm ok, fair enough.

We also need to consider the interplay of IDONTWANT and IANNOUNCE as well.

We can just say that if we received IDONTWANT before, we won't send IANNOUNCE.
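
The rule suggested here could be sketched as a simple filter. This is only an illustration; `announce_targets` and `idontwant_from` are hypothetical names, and the sketch assumes the router tracks, per message ID, which peers have already sent IDONTWANT.

```python
def announce_targets(mesh_peers, idontwant_from):
    """Mesh peers that should still receive IANNOUNCE for one message.

    idontwant_from: set of peers that already sent IDONTWANT for this
    message ID (assumed to be tracked elsewhere by the router).
    """
    return [peer for peer in mesh_peers if peer not in idontwant_from]
```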

vyzo · 2024-12-14T21:59:35Z

In general I like the direction of where this is going.

However, it raises the question: Do we need the new control messages? We could do it by reusing IHAVE/IWANT.

ufarooqstatus · 2024-12-16T08:42:06Z

IMO, we may need to consider a few issues related to the duplicate-count problem in GossipSub:

- Typically, a peer is unaware that it is already receiving a message and may generate MANY IWANT requests (GossipSub v1.1 allows that).
- The same applies to IDONTWANT messages. A peer already receiving a message can only send IDONTWANTs once it finishes downloading that message. During this window (usually several hundred milliseconds), other mesh peers start relaying to that peer.

The same can happen to INEED messages. A peer is already downloading a message, and it sends another INEED request.

ppopth · 2024-12-16T10:01:15Z

> The same can happen to INEED messages. A peer is already downloading a message, and it sends another INEED request.

I would like to note that this can happen only when D_announce < D (allowing some eager forwarding).

If D_announce = D (every forwarding is lazy), if I'm already downloading a message and I'm about to send another INEED, it means that I'm about to send INEED because the timeout occurs for the peer sending me the message, so that peer is deemed misbehaving.

ppopth · 2024-12-16T10:04:32Z

> Do we need the new control messages? We could do it by reusing IHAVE/IWANT.

We have thought about that, and our reasoning was:

- The logic of IANNOUNCE/INEED is very different from IHAVE/IWANT (potential penalty for not sending the message after receiving INEED).
- IANNOUNCE/INEED is supposed to contain only one message ID rather than a list.
- It's easier to distinguish between the two.

ufarooqstatus · 2024-12-16T10:17:05Z

> If D_announce = D (every forwarding is lazy), if I'm already downloading a message and I'm about to send another INEED, it means that I'm about to send INEED because the timeout occurs for the peer sending me the message, so that peer is deemed misbehaving.

It depends on the message size and situation:
Early receivers will get around 'D' INEED and many IWANT requests.
If a peer tries to respond to all these requests for a large message, it might miss the 400ms deadline. Perhaps the deadline can be inferred from the message size.
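
One way the "deadline inferred from message size" idea could look is sketched here. Everything except the 400 ms figure is an assumption: the function name, the assumed upload rate, and the premise that the requester knows the message size before sending INEED (the thread does not say that IANNOUNCE carries a size).

```python
def ineed_timeout(size_bytes, base_timeout_s=0.4, assumed_upload_bps=10_000_000):
    """Hypothetical size-aware INEED deadline, in seconds.

    base_timeout_s:      the fixed 400 ms deadline mentioned in the thread.
    assumed_upload_bps:  illustrative upload rate (bits per second) that the
                         responding peer is assumed to sustain.
    """
    transfer_s = size_bytes * 8 / assumed_upload_bps
    return base_timeout_s + transfer_s
```

Under these illustrative numbers, a 1.25 MB message would get about one extra second on top of the base deadline.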

nisdas · 2024-12-16T10:34:29Z

@ufarooqstatus

> The same can happen to INEED messages. A peer is already downloading a message, and it sends another INEED request.

We would queue the multiple announcements we receive and only send INEED messages to one peer at a time. In the event the first peer was unable to send us the full message within the timeout, we would then send the INEED to the next peer who sent us the announcement, and so on.

After the router sends INEED, it will time out if it doesn't receive the message back in time, as indicated by timeout. If the timeout happens, the router will pop acache[msgid] and send INEED to the next peer. If it still times out, keep going with next peers until the cache runs out of peers.
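
The queueing behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the spec's algorithm: the class and method names are hypothetical, only `acache` comes from the quoted text, and timers, validation, and the seen-message cache are left out.

```python
from collections import deque


class AnnounceCache:
    """Hypothetical sketch: ask one announcing peer at a time via INEED."""

    def __init__(self):
        self.acache = {}   # msgid -> deque of announcing peers not yet asked
        self.pending = {}  # msgid -> peer we are currently waiting on

    def on_iannounce(self, msgid, peer):
        """Record an IANNOUNCE; return the peer to send INEED to, or None."""
        if msgid in self.pending:
            # Already waiting on someone: queue this peer as a fallback.
            self.acache.setdefault(msgid, deque()).append(peer)
            return None
        self.pending[msgid] = peer
        return peer

    def on_timeout(self, msgid):
        """Pending peer missed the deadline; return the next peer, or None."""
        queue = self.acache.get(msgid)
        if not queue:
            # The cache ran out of peers for this message.
            self.pending.pop(msgid, None)
            self.acache.pop(msgid, None)
            return None
        next_peer = queue.popleft()
        self.pending[msgid] = next_peer
        return next_peer

    def on_message(self, msgid):
        """Full message arrived: stop asking anyone else for it."""
        self.pending.pop(msgid, None)
        self.acache.pop(msgid, None)
```

A real router would also consult its seen-message cache before reacting to a late IANNOUNCE, so that it does not request a message it already has.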

ppopth · 2024-12-16T11:53:20Z

> it might miss the 400ms deadline. Perhaps the deadline can be inferred from the message size.

Yeah, you have to configure the timeout carefully.

ufarooqstatus · 2024-12-16T14:18:28Z

> We would queue the multiple announcements we receive and only send INEED messages one peer at a time. In the event the first peer was unable to send us the full message within the timeout, we would then send the INEED to the next peer who sent us the announcement and so on.

Yes, the peer responding to 'INEEDs+IWANTs' can get overwhelmed, and may require much higher time for responding to these requests (depending on the message size).

vyzo · 2024-12-16T16:07:31Z

Another thing we could consider is the size of the messages; maybe small messages should always be eagerly forwarded and large messages could be just announced.
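
A sketch of what such a size-based rule might look like. The function name and the 16 KiB threshold are purely illustrative assumptions; the thread does not propose a concrete cut-off.

```python
def forwarding_mode(size_bytes, announce_threshold_bytes=16 * 1024):
    """Hypothetical policy: eager for small messages, announce large ones."""
    if size_bytes >= announce_threshold_bytes:
        return "IANNOUNCE"
    return "MESSAGE"
```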

nisdas · 2024-12-17T04:34:18Z

> Yes, the peer responding to 'INEEDs+IWANTs' can get overwhelmed, and may require much higher time for responding to these requests (depending on the message size).

Responding to INEED would be bounded by your degree, so we would only be providing data to our mesh peers that is actually useful for them. Currently we eagerly forward to them anyway, so the worst case, where all D mesh peers respond with INEED, is the same as the status quo for every message forwarded as of Gossipsub v1.2.

AgeManning · 2025-03-20T07:35:31Z

Hi all.
I've been chatting with various parties. In the interest of moving forward, I wanted to compile some reservations I've had about this PR and context as to why I've proposed an intermediate step (#664)

As an overview, I like the idea of reducing the eagerly sent messages in the mesh. I'm in favor of anything that improves the gossipsub protocol. However it is hard to tell what improves the protocol and what doesn't. Simulations often fall short of real world networks (we saw this with the recent IDONTWANT addition).

Here are some small reservations I have about this specific PR:

- Extra Control Messages - I think we can achieve the same result without the extra control messages, by using the IHAVE/IWANT system. In that case, I don't see the need for a major version bump; this could be a minor version bump, as ultimately the only thing we would be changing is a probabilistic parameter that decides whether to send the message or an IHAVE when forwarding.
- Gossipsub 1.1 Scoring - I think we would be changing Gossipsub 1.1 scoring. The first message delivery factor and the mesh_deliveries parameter expect timely messages sent by mesh peers. I think these won't work as expected for peers that send IANNOUNCE messages rather than the message itself.

The main concern I have around this PR is the actual benefits we will see. This PR adds d_announce, which gives us current Gossipsub when set to 0 and, when set to d_high (the current PR says at most D, but I think it should say at most d_high), gives us maximal bandwidth savings but high latency (it adds 1 RTT to every message hop).
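
To make the "1 RTT per hop" cost concrete, here is a back-of-the-envelope sketch. It assumes a symmetric one-way delay per hop and ignores transmission time, validation, and congestion, so it only models the uncongested case; the function name and parameters are hypothetical.

```python
def path_latency(hops, one_way_delay_s, lazy):
    """Rough propagation latency over a path, in seconds.

    Eager hop: the message travels once (1 one-way delay).
    Lazy hop:  IANNOUNCE, then INEED, then the message (3 one-way delays),
               which is one extra RTT compared with the eager hop.
    """
    per_hop = 3 * one_way_delay_s if lazy else one_way_delay_s
    return hops * per_hop
```

For example, with 4 hops and a 50 ms one-way delay, this simple model gives roughly 0.2 s for fully eager forwarding and roughly 0.6 s for fully lazy forwarding.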

In terms of latency, adding an RTT should be strictly worse than just sending the message. As this PR says:

> which reduces the number of duplicates lower or to zero trading off with more latency

I've seen simulations where this works in a congested network. When the network is not congested, this strategy is strictly worse (for message latency). As @nisdas points out (https://ethresear.ch/t/doubling-the-blob-count-with-gossipsub-v2-0/21893/6):

> We re-ran simulations with much lower message sizes. The first 3 charts show the message arrival times with a size of 1kb. As it can be seen here D_announce=0 has significantly lower latency vs D_announce=8. It takes roughly a fourth of the time to reach the whole network compared to D_announce=8 . It does show in the absence of congestion, only pushing is strictly better.

It should probably be made clear that this PR and the d_announce parameter work favorably in congested networks, but potentially negatively (at least in terms of latency) in non-congested networks.

More specific to Ethereum (where the motivation of this PR originates), there are some topics where we do want to trade latency for bandwidth, which this PR was specifically tested for (the PeerDAS topics). But there are others where we probably don't, e.g. the block topic. If we set d_announce to something non-zero that favours a congested network, it may increase the block latency on the block topic, which (depending on timing) may not be congested.

I don't want to be the person who brings up issues and delays progress on innovative research (i.e., this PR). So to solve the reservations I've listed here, I've proposed an intermediate step in (#664).

I think it can help by:

- Decoupling this proposal from applying to all topics - This would allow us to use it for the PeerDAS topics without affecting the block topic (for example).
- If it works for congested networks but performs worse for non-congested topics - It can be applied in those situations specifically.
- We can revert/remove/change the design parameters of this more easily if it doesn't work as we expect on real world networks (as was the case for IDONTWANT).
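
As a rough illustration of the per-topic decoupling argued for here, a configuration could look something like the sketch below. The names `TopicParams`, `TOPIC_PARAMS`, and `params_for`, the topic keys, and the numeric values are all hypothetical and do not come from this PR or from #664; the bound `d_announce <= d` follows the current PR text rather than the `d_high` suggestion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicParams:
    d: int           # mesh degree for the topic
    d_announce: int  # mesh peers that get IANNOUNCE instead of the message

    def __post_init__(self):
        if not 0 <= self.d_announce <= self.d:
            raise ValueError("d_announce must be between 0 and d")


# Hypothetical topic names and values, for illustration only.
TOPIC_PARAMS = {
    "block": TopicParams(d=8, d_announce=0),        # latency-sensitive: eager
    "data_column": TopicParams(d=8, d_announce=8),  # bandwidth-heavy: lazy
}
DEFAULT_PARAMS = TopicParams(d=8, d_announce=0)


def params_for(topic):
    """Look up per-topic parameters, falling back to eager forwarding."""
    return TOPIC_PARAMS.get(topic, DEFAULT_PARAMS)
```

Defaulting unknown topics to `d_announce=0` keeps today's behaviour unless a topic explicitly opts in to lazy forwarding.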

ppopth and others added 3 commits December 14, 2024 02:37

Add in Announcement Degree Clarification

23e6603

Update table of content

0f18010

vyzo reviewed Dec 14, 2024

ppopth force-pushed the gossipsub-v2 branch from ef12789 to 5104660 Compare December 15, 2024 09:50

Handling IANNOUNCE only for trustworthy peers

18580e7

ppopth force-pushed the gossipsub-v2 branch from 5104660 to 18580e7 Compare December 15, 2024 09:54

Publishing using fanout peers

4371b3f

Handle bad message ID

56f7930

ppopth mentioned this pull request Dec 17, 2024

Gossipsub v2.0 libp2p/go-libp2p-pubsub#587

Open

7 tasks

ppopth added 2 commits December 17, 2024 20:52

fixup! Gossipsub v2.0 spec

23cd5f6

Clear acache before validation to stop INEED fast

ba3d165

AgeManning mentioned this pull request Mar 19, 2025

Generalised Gossipsub #664

Draft

ppopth mentioned this pull request Apr 2, 2025

Announcesub spec: pubsub protocol with no duplicates #652

Closed
