Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve peerguard #863

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

mintyleaf
Copy link
Contributor

@mintyleaf mintyleaf commented Feb 26, 2025

Hello there!

For our LocalAI P2P experiments we need to have an ability to authenticate the machines in the P2P network and invalidate them if needed as well.

I have tested the peerguard(-ian) capabilities, and found out that autoclean is broken (or i didn't get it), docs have wrong flag "--peerguardian vs. --peerguard" and PEERGATE_AUTH had wrong quotes, which leads to error in marshaling it as json.

The autoclean part is now kicks the peers that is not connected to MessageHub as it written in the docs and comments, as well it is now aware of self peerID, since ListPeers() didn't returns the self peerID, and we have constant kicking result, where trustzone data becomes unconsistent.
Also, the Authenticate now stores the pubkey which peer used to authenticate, so that way if we delete the pubkey - that user will be kicked from the trustzone.

There are also little testing script i used, which i include for now, since i want to know if i can miss something in the way of testing.

UPD:
Since we don't have a proper ledger with PoW, long-term block storage with ability to retrieve the full chain and just swapping current block if it matches verifier - i had to do some tricks:

  • for now, the trusted peer ID list can be 'hardened' in the VPN config, instead of doing so in genesis block
  • we can specify which buckets are write protected in the VPN config too
  • if trusted peer ID list in config is not empty - the non-matching peers for that list didn't listen to anything, but trusted ones from the start
  • the non-matching peers also didn't verify the blocks at all, just eating what trusted peer provides, so that way they can join later and didn't get desync
  • each storage write is now checked against trusted peer ids and protected keys list, skipping updates with unauthorized write to protected keys (for example, trustzoneAuth and trustzone)
  • there are new -p flag for generating protobuf serialized -> base64 encoded key, which can be used with EDGEVPNPRIVKEY env variable to make fixed peerID somewhat easier to handle, instead of using of caching into the file

I know, that this is going against decentralized VPN idea, but as part of LocalAI project with need of one-to-many authorized access current implementation of edgevpn is incapable to provide secure authorization mechanism without either complex PoW and proper ledger implementations, or without tricks to create some peers absolutely authoritarian

This PR is a draft, and i'm waiting for the feedback!
Thanks in advance!

Screen.Recording.2025-03-23.at.15.17.07.mov.mp4

@codecov-commenter
Copy link

codecov-commenter commented Feb 26, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 20.00000% with 56 lines in your changes missing coverage. Please review.

Project coverage is 31.46%. Comparing base (689c89a) to head (326a9e7).
Report is 679 commits behind head on master.

Files with missing lines Patch % Lines
pkg/trustzone/peerguardian.go 17.64% 42 Missing ⚠️
pkg/node/connection.go 0.00% 10 Missing and 1 partial ⚠️
pkg/node/node.go 50.00% 2 Missing and 1 partial ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #863      +/-   ##
==========================================
- Coverage   34.32%   31.46%   -2.87%     
==========================================
  Files          26       30       +4     
  Lines        2057     2333     +276     
==========================================
+ Hits          706      734      +28     
- Misses       1249     1494     +245     
- Partials      102      105       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mintyleaf mintyleaf marked this pull request as draft February 26, 2025 01:54
@mintyleaf
Copy link
Contributor Author

hello, @mudler!
i'm going to create a PR with porting of that features support into the LocalAI soon
i see that as client is generating keypair, send it over secure channel to some admin, then public key is put through API route into LocalAI's edgevpn ledger
client is simply providing the privkey through env into LocalAI and launches that with edgevpn token, in which we manually added the trusted peer id of LocalAI federated head, and trustzoneAuth/trustzone keys as protected

i want to have your feedback on that topic before that

thanks in advance!

@@ -0,0 +1,19 @@
otp:
dht:
interval: 360
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is a leftover, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, there are also testing script, and other leftovers for demo purposes out of the box
that ofc will be cleaned if the main direction of that pr resolved, so further cleaning is upcoming

@@ -253,6 +254,21 @@ func (e *Node) handleEvents(ctx context.Context, inputChannel chan *hub.Message,
continue
}

// If we have enabled trusted arbiter peers
if len(e.config.TrustedPeerIDS) > 0 && e.host.ID().String() != m.SenderID {
Copy link
Owner

@mudler mudler Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would make sense to be consistent here and instantiate the PeerGater with static trusted peer IDs coming from the config.

For instance, the gater constructor could take as arguments a set of trustedPeerIDs coming from the config

func NewPeerGater(relaxed bool) *PeerGater {

and Gate()

func (pg *PeerGater) Gate(n *node.Node, p peer.ID) bool {

could check both in trustedDB (coming from the blockchain) and the static ones (that you are adding) coming from the constructor

Copy link
Contributor Author

@mintyleaf mintyleaf Mar 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have reviewed that now more deeply, as i have returned to edgevpn code while doing LocalAI PR init commit - the thing is, the configured initial trust IDs should work as a switch for authoritative/passive mode, where we either gating messages based on trustDB in authoritative mode, or gating all messages that came not from the initial trusted IDs in passive mode, disabling peergater further, as it don't have priority in that mode

The peergater most likely remains the same, and stays in the scope of working only for blockchain trustDB

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, but this sounds like implementing a whitelisting mechanism which is very much similar to the PeerGater. To keep code concise and clean we could or either expand the PeerGater module to accept a static ids and act it as a whitelister, or have a separate component dedicated to whitelisting (which would be even more clean).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think i will expand peergater then, as for now whitelisting is effective only in pair with peergater

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yet, the ProtectedStoreKeys is relying at TrustedPeerIDs, so most likely the separated whitelisting component is needed

@@ -123,6 +124,12 @@ func (e *Node) Start(ctx context.Context) error {
return err
}

if len(e.config.TrustedPeerIDS) > 0 && !slices.Contains(e.config.TrustedPeerIDS, e.host.ID().String()) {
ledger.SkipVerify()
Copy link
Owner

@mudler mudler Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why skipping verification of blockchain blocks? this smells a bit off because from what I can see it would be used just to skip blockchain checks. These are more for integrity rather then security (it gates for example nodes to update blockchain versions which are older then what the current node has): by removing that check any node could post a older version of the blockchain, or a different blockchain coming from a different genesis block, and all the nodes skipping verification would accept that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first naive approach was the usage of genesis block for store authored peers from the start, as well, as the list of peer ids for protected keys
Later i have found that we don't really have a way to have everyone synced later, if genesis block differs

So i came with the approach of storing that data in the config, which is encoded as string token later, which providing a secure way to have that data from the start

Yet, later, if authored node is running through API mode, i noticed that healthcheck didn't start by default, so the later joined nodes always fails to sync, since the indexes are dramatically out of sync
Later i found that basically any way later joined node will fail to sync with the main blockchain view, because its hash never match, since it storing self view with unique healthchecks first, and in terms of indexes it will never get enough updates from self side to have index match later

So if the authoritarian trusted nodes base is present from the start - they will guarantee the proper view at blockchain, and every regular client most likely need to accept the blocks from them without any checks (since they filter out any other, than trusted nodes messages at top level)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I would then remove these bits as they are replaced by having static nodes via config, no?

The other way around I see it, the nodes having this set should periodically try to reconcile to the ledger the information if missing.

Also: I think you found something missing there, in api mode only healthcheck should run regularly to keep everything update, that probably requires a fix on its own

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, either we need to have a reconciliation mechanism implemented, still having priority over trusted nodes vs. block validation
Either we having a full passive mode, like i did, which is more simple to implement

We can still verify blocks for everyone, if we haven't defined the trusted nodes, so i would not remove that bits if i understand you correctly, because edgevpn still can be used as fully decentralized non-authoritarian VPN this way (yet, i see that later joined peers never go in sync that way, but for now that is needed only for healthchecks and other simple metadata)

About healthchecks in API mode - they can be enabled through flag --enable-healthchecks, so that is looking more like on purpose for me

@mudler
Copy link
Owner

mudler commented Mar 26, 2025

hey @mintyleaf - sorry for coming late at this but just saw the PR (please mention me to have a review, I am a bit overwhelmed by GH notifications!)

The direction looks good, just few nit/open points , thanks!

@mintyleaf
Copy link
Contributor Author

mintyleaf commented Mar 29, 2025

I have added "autotrust" network service, which checks and puts the node into trustDB if its id matches with configured initial trusted ids

And also disabled key check for autocleanup mode for such trusted nodes

The idea is other nodes don't have a trusted node public key (and they don't really need to authenticate it in trustzone)
So the public/private keypair of such nodes can be used as additional layer of authorization when adding public worker node keys in LocalAI

@mudler, what do you think about it?

l.Lock()
if block.Index > l.blockchain.Len() {
if l.skipVerify || block.Index > l.blockchain.Len() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure - why do we need still skipVerify? otherwise changes looks good to me!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As i mentioned before, we need to have nodes as passive listeners, and in that mode they listen only from trusted peers

Yeah, either we need to have a reconciliation mechanism implemented, still having priority over trusted nodes vs. block validation Either we having a full passive mode, like i did, which is more simple to implement

We can still verify blocks for everyone, if we haven't defined the trusted nodes, so i would not remove that bits if i understand you correctly, because edgevpn still can be used as fully decentralized non-authoritarian VPN this way (yet, i see that later joined peers never go in sync that way, but for now that is needed only for healthchecks and other simple metadata)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants