Conversation

@ovidiutirla (Contributor) commented Jun 25, 2024

Add EnqueueTimeTracker and CIDDeletionTracker structures to manage enqueuing times and track CID deletion marks. This feature will be used by Operator Managing CIDs.

EnqueueTimeTracker is used only for metrics: it measures the duration from the moment a CID reconciliation is enqueued until the reconciliation is completed.
CIDDeletionTracker is used to handle CID deletion when a CID is no longer needed; it is used only by the operator. The deletion tracker allows us to implement a cidDeleteDelay, which is the delay before another CID event is enqueued for reconciliation after the CID is marked for deletion. This is required for simultaneous CID management by both cilium-operator and cilium-agent: without the delay, the operator might immediately clean up CIDs created by the agent, before the agent can finish CEP creation. The deletion tracker is not used in cilium-agent; it exists only to enforce the deletion delay on the operator side.
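
As a rough illustration only (this is a sketch, not the exact code in this PR; constructor and accessor names such as NewEnqueueTimeTracker, GetAndReset and MarkedTime are placeholders), the two trackers could look roughly like this:

import (
	"sync"
	"time"
)

// EnqueueTimeTracker records when a CID reconciliation was enqueued, so that
// the operator can report an enqueue-to-completion latency metric.
type EnqueueTimeTracker struct {
	mu       sync.Mutex
	enqueued map[string]time.Time
}

func NewEnqueueTimeTracker() *EnqueueTimeTracker {
	return &EnqueueTimeTracker{enqueued: make(map[string]time.Time)}
}

// Track stores the enqueue time for a CID, keeping the earliest time if the
// CID is enqueued again before being reconciled.
func (t *EnqueueTimeTracker) Track(cid string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, ok := t.enqueued[cid]; !ok {
		t.enqueued[cid] = time.Now()
	}
}

// GetAndReset returns the recorded enqueue time for a CID and clears it; it is
// meant to be called when the CID reconciliation completes.
func (t *EnqueueTimeTracker) GetAndReset(cid string) (time.Time, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	ts, ok := t.enqueued[cid]
	delete(t.enqueued, cid)
	return ts, ok
}

// CIDDeletionTracker remembers which CIDs the operator has marked for
// deletion, so that the actual delete can be postponed by cidDeleteDelay.
type CIDDeletionTracker struct {
	mu     sync.Mutex
	marked map[string]time.Time
}

func NewCIDDeletionTracker() *CIDDeletionTracker {
	return &CIDDeletionTracker{marked: make(map[string]time.Time)}
}

// Mark records that a CID is a candidate for deletion.
func (t *CIDDeletionTracker) Mark(cid string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.marked[cid] = time.Now()
}

// Unmark removes the deletion mark, e.g. when the CID becomes used again.
func (t *CIDDeletionTracker) Unmark(cid string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.marked, cid)
}

// MarkedTime reports whether the CID is marked for deletion and since when.
func (t *CIDDeletionTracker) MarkedTime(cid string) (time.Time, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	ts, ok := t.marked[cid]
	return ts, ok
}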

Related CFP #27752

Draft full implementation: #33204

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 25, 2024
@ovidiutirla ovidiutirla force-pushed the feature/op-id-cid-cache branch from 0557b1c to b2c6736 on June 25, 2024 13:06
@ovidiutirla ovidiutirla marked this pull request as ready for review June 25, 2024 13:22
@ovidiutirla ovidiutirla requested review from a team as code owners June 25, 2024 13:22
@ovidiutirla ovidiutirla requested review from asauber and pippolo84 June 25, 2024 13:22
@ovidiutirla ovidiutirla force-pushed the feature/op-id-cid-cache branch from b2c6736 to 67b62bc on June 25, 2024 13:25
@ovidiutirla ovidiutirla force-pushed the feature/op-id-cid-cache branch from 67b62bc to 0d299d2 on June 25, 2024 19:22
@ovidiutirla (Contributor, Author) commented:

/test

@pchaigno pchaigno added the release-note/misc This PR makes changes that have no direct user impact. label Jun 25, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 25, 2024
@ovidiutirla ovidiutirla force-pushed the feature/op-id-cid-cache branch from 0d299d2 to 81ddfbb on June 26, 2024 09:18
@pchaigno pchaigno requested a review from asauber June 27, 2024 14:09
@joestringer (Member) commented:

Could you update the description to describe how this time tracker fits into the design, i.e. expand the following sentence to explain not just what it does but how it integrates into the broader solution?

This feature will be used by Operator Managing CIDs.

@ovidiutirla (Contributor, Author) commented Jun 27, 2024

I assume Andrew had a similar concern with CIDDeletionTracker. It shouldn't impact network policy (NP) enforcement, since NP enforcement happens at the agent level, while the tracker is only used to add a configurable delay to CID clean-up.

Hey @joestringer, thanks for the feedback. I added the following to the PR description:

EnqueueTimeTracker is used only for metrics: it measures the duration from the moment a CID reconciliation is enqueued until the reconciliation is completed.
CIDDeletionTracker is used to handle CID deletion when a CID is no longer needed; it is used only by the operator. The deletion tracker allows us to implement a cidDeleteDelay, which is the delay before another CID event is enqueued for reconciliation after the CID is marked for deletion. This is required for simultaneous CID management by both cilium-operator and cilium-agent: without the delay, the operator might immediately clean up CIDs created by the agent, before the agent can finish CEP creation. The deletion tracker is not used in cilium-agent; it exists only to enforce the deletion delay on the operator side.

@joestringer (Member) commented:

@ovidiutirla I still don't quite follow.

EnqueueTimeTracker is used only for metrics: it measures the duration from the moment a CID reconciliation is enqueued until the reconciliation is completed.

Which metrics? What does it mean to reconcile a CID?

@asauber (Member) commented Jun 27, 2024

In the draft implementation, there is a 30-second delay added to any CID deletion-via-operator here https://github.com/cilium/cilium/pull/33204/files#diff-a5b01710162b5dcb0c820096bd611ded7a00e14a8a60ebc382466ce30d9436caR249-R253

To me, this appears to introduce an [at least] 30-second delay between the operator first processing a delete event and the policy of that delete event being enforced by agents. I do see the point that even if strong distributed concurrency were in place for CIDs, that would need to be coupled somehow with CEP operations.

@asauber (Member) commented Jun 27, 2024

Apologies for the spam. The relevant diff is not linkable on GitHub because the diff is collapsed by default.

In the draft implementation, there is a 30-second delay added to any CID deletion-via-operator here https://github.com/cilium/cilium/pull/33204/files#diff-a5b01710162b5dcb0c820096bd611ded7a00e14a8a60ebc382466ce30d9436caR249-R253

if !isMarked {
	r.cidDeletionTracker.Mark(cidName)
	r.queueOps.enqueueCIDReconciliation(cidResourceKey(cidName), cidDeleteDelay)
	return nil
}

@ovidiutirla (Contributor, Author) commented:

The cidDeletionTracker is marked in handleCIDDeletion (handleCIDDeletion marks the CID, or deletes it if it is already marked).
handleCIDDeletion is called only when:

  • The CID only exists in the watcher's store and it isn't used (we could rewrite those if statements to improve clarity, though):
cidIsUsed := r.cidIsUsedInPods(cidName) || r.cidIsUsedInCEPOrCES(cidName)
if !existsInDesiredState {
	if cidIsUsed {
		return nil
	}
	r.cidCreateLock.Lock()
	defer r.cidCreateLock.Unlock()
	return r.handleCIDDeletion(cidName)
}
if !cidIsUsed {
	if existsInStore {
		r.cidCreateLock.Lock()
		defer r.cidCreateLock.Unlock()
		return r.handleCIDDeletion(cidName)
	}
	r.desiredCIDState.Remove(cidName)
	return nil
}

So time.Now() is only used to enforce the 30s delay for the CID deletion; we are not relying on the exact system time, we are just enforcing the delay. But indeed, once a CID is no longer used we mark it for deletion, and only after the delay do we propagate the deletion (of the unused CID) to all agents.
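
To make that concrete, a minimal sketch of how the delay check could look inside handleCIDDeletion (MarkedTime and deleteCID are hypothetical names used only for illustration; the draft PR may structure this differently):

markedAt, isMarked := r.cidDeletionTracker.MarkedTime(cidName)
if !isMarked {
	// First pass: mark the CID and re-enqueue it to be reconciled again
	// after cidDeleteDelay.
	r.cidDeletionTracker.Mark(cidName)
	r.queueOps.enqueueCIDReconciliation(cidResourceKey(cidName), cidDeleteDelay)
	return nil
}
if remaining := cidDeleteDelay - time.Since(markedAt); remaining > 0 {
	// The delay has not fully elapsed yet; wait for the remainder.
	r.queueOps.enqueueCIDReconciliation(cidResourceKey(cidName), remaining)
	return nil
}
// The delay elapsed and the CID is still unused: propagate the deletion.
return r.deleteCID(cidName)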

Similarly, in the current implementation, we keep the CID for ~30 minutes (2 * defaults.KVstoreLeaseTTL) until we delete it:

if !igc.heartbeatStore.isAlive(identity.Name) {
	ts, ok := identity.Annotations[identitybackend.HeartBeatAnnotation]
	if !ok {
		log.WithFields(logrus.Fields{
			logfields.Identity: identity.Name,
			logfields.K8sUID:   identity.UID,
		}).Info("Marking identity for later deletion")
		// Deep copy so we get a version we are allowed to update
		identity = identity.DeepCopy()
		if identity.Annotations == nil {
			identity.Annotations = make(map[string]string)
		}
		identity.Annotations[identitybackend.HeartBeatAnnotation] = timeNow.Format(time.RFC3339Nano)
		if err := igc.updateIdentity(ctx, identity); err != nil {
			log.WithError(err).
				WithField(logfields.Identity, identity).
				Error("Marking identity for later deletion")
			return err
		}
		continue
	}
	log.WithFields(logrus.Fields{
		logfields.Identity: identity,
	}).Debugf("Deleting unused identity; marked for deletion at %s", ts)
	err := igc.deleteIdentity(ctx, identity)
	if err != nil {
I might not have a full understanding of your concerns; I'm just wondering if this brought some clarity on how we plan to use it. I'm happy to walk you through our approach, or if you have better ideas on how to handle this part, we are happy to adapt our solution.

By "reconcile a CID" I mean that we ensure the desired state for the CID is reached:

// reconcileCID ensures that the desired state for the CID is reached, by
// comparing the CID in desired state cache and watcher's store and doing one of
// the following:
// 1. Nothing - If CID doesn't exist in both desired state cache and watcher's
// store.
// 2. Deletes CID - If CID only exists in the watcher's store, and it isn't used.
// 3. Creates CID - If CID only exists in the desired state cache.
// 4. Updates CID - If CIDs in the desired state cache and watcher's store are
// not the same.
func (r *reconciler) reconcileCID(cidResourceKey resource.Key) error {
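
For illustration, the four cases could map to code roughly as follows (a sketch only; the accessors are simplified and helpers such as createCID and updateCIDIfChanged are hypothetical):

func (r *reconciler) reconcileCID(key resource.Key) error {
	cidName := key.Name
	desired, existsInDesiredState := r.desiredCIDState.Lookup(cidName) // simplified accessor
	_, existsInStore := r.cidStore.GetByKey(cidName)                   // simplified accessor
	cidIsUsed := r.cidIsUsedInPods(cidName) || r.cidIsUsedInCEPOrCES(cidName)

	switch {
	case !existsInDesiredState && !existsInStore:
		return nil // 1. nothing to do
	case !existsInDesiredState && existsInStore:
		if cidIsUsed {
			return nil
		}
		return r.handleCIDDeletion(cidName) // 2. delete an unused CID
	case existsInDesiredState && !existsInStore:
		return r.createCID(cidName, desired) // 3. create the missing CID
	default:
		return r.updateCIDIfChanged(cidName, desired) // 4. update if the CIDs differ
	}
}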

@pippolo84 (Member) left a comment:

Thanks!

I've left some questions around some changes that were previously introduced and reviewed in #30649.
Also, some suggestions around the structure of the unit tests.

Comment on lines 338 to 340
// CIDDeletionTracker tracks which CIDs are marked for deletion.
// This is required for simultaneous CID management
// by both cilium-operator and cilium-agent.
Member:

Is it possible to expand this comment to explain why this is required? It would be nice to detail the expected sequence of events that shows the tracker is needed when both the operator and the agent manage CIDs.

Contributor (Author):

Updated the docs

Member:

I'm sorry but I'm not following the comment.

AFAIU we need the delay (and thus the marking and deletion-tracking) to avoid scenarios where the agent has created the CID but has not yet updated the CEP with the CID. If the operator sees the new CID and no CEP actually using it, it might be too aggressive and delete the CID. This is supposed to happen more frequently in case of high pod churn (I guess because churn increases the delay between the operator receiving the CID creation event and the operator receiving the CEP update event).
Is my understanding correct? Either way, I suggest rephrasing the comment to be clearer and more descriptive.

Contributor (Author):

It addresses two main concerns:

  1. It facilitates CID reuse in scenarios with high pod churn: we do not delete the CID, so we are able to reuse it.
  2. It ensures correct CID association in scenarios where the agent rapidly receives create and delete events, as spelled out step by step below.
    e.g. we have one CID used by one pod, and we churn this pod (the agent quickly receives create and delete CID events). If the agent receives the CID delete and schedules the CEP deletion, and then a new create event arrives and sees that the CEP already exists, the scheduled deletion still removes the CEP.
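
Roughly, the sequence for that second scenario is:

  1. A pod using CID X is deleted and an equivalent pod is created shortly after (pod churn), so the agent quickly receives a delete and a create event for CID X.
  2. On the delete event, the agent schedules the deletion of the corresponding CEP.
  3. The create event then arrives; the agent sees the CEP already exists and does nothing.
  4. The previously scheduled deletion fires and removes the CEP, breaking the CID association for the new pod.

The deletion delay keeps CID X around long enough for the churned pod to be re-associated with it, avoiding that interleaving.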

Contributor (Author):

@pippolo84, added more details in the comment. I think once we have the full implementation it will provide enough context.

Contributor:

I also consider the delayed deletion is useful in high pod churn scenarios as we do not have to delete and re-create the same CID over and over..

sure, but that's already implemented in the current GC.

@ovidiutirla (Contributor, Author) commented Jul 5, 2024:

Updated the docs.

@dlapcevic do you have any thoughts on the existing GC versus the one proposed as part of Operator Managing CIDs?

Contributor:

I will try to sum it up; let me know if my understanding is correct, so @joestringer and @dlapcevic don't have to read all of that conversation 😅

We are considering how GC behaves in two different cases:
(1) During the transition period from cilium-agent creating CIDs to operator creating CIDs when both operator and agent can create CIDs
(2) How GC behaves once operators is fully managing CIDs

Let's think about the current GC and how it would behave:
(1) if a CID was ever used, it will be marked for deletion with an annotation after 30 minutes of not being used, and deleted after another 15 minutes - this already addresses your concern with high-churn pods.
If a CID was never used by any CEP, it will be marked for deletion and deleted after 15 minutes.
Sounds reasonable both for identities created by the operator and the agent.
CIDs created by the agent and not yet used by any CEPs won't be deleted immediately.

(2) It will still work the same, but adding an annotation mark for deletion won't make much sense as the agent won't be able to remove it anymore.
So the proposed switch from the deletion-mark annotation to the DeletionTracker is a nice optimization in terms of API calls, but it is not necessary for correctness. It should be fairly easy to implement in the current GC, though.

It could probably even be enabled instantly (deletion-mark annotation -> DeletionTracker) when switching from the agent creating CIDs to the operator creating CIDs during migration, and the current GC would still behave correctly in both cases (1) and (2); it just would not react to the deletion-mark annotation, and would instead use a fixed delay when waiting on the CEP.

Of course the intervals/timeouts (30/15 minutes) can already be changed IIRC; I was just using the defaults as examples.

So the question is, what would be the purpose of implementing a new GC, instead of implementing this improvement to the current GC, which seems significantly easier?

Contributor:

In my opinion, GC is part of CID management, and there is no reason for it to be separated from the CID controller. The CID controller needs to know the desired state of CIDs anyway (when CIDs are used and when they are not), and adding deletion to it is not a complex effort.

Besides that, I see a few benefits:

  1. Performance and scalability
    a) Smaller number of calls to kube-apiserver without adding and removing the mark. Remember: every update needs to be sent to all nodes.
    b) Processing all CIDs at once (on each GC run) instead of continuously creates a spike of requests that can significantly affect the KCP and cilium-operator's k8s client rate limiting.
  2. Ease of use
    No need to configure GC.
  3. Security
    Remove write permission for CIDs from cilium-agent.
    When cilium-agent is not managing CIDs, it shouldn’t update CIDs, so it won’t need write permission to CIDs.

We had issues related to these points.
We also had issues where cilium-operator had connection instability with kube-apiserver, so it was restarting every 15-20 minutes (or the leader was changing); in that case no CIDs were cleaned up and we eventually hit the 65k CID limit.

@joestringer (Member) commented Jul 9, 2024:

I think that this topic would be a lot more approachable with a more incremental approach. Use the current GC first, propose improvements to the current GC implementation and push on that independently. Then propose the PR that refactors/removes the current GC implementation to merge it into this CID controller. Each of those steps would be self-contained, provide value, and could be individually assessed for their benefits and drawbacks.

If we were looking at a PR that was titled "Improve cilium-operator Identity garbage collection" with the PR description containing the text in the post immediately above, then I think we would be in a much better position to critique the design and implementation, and debate how it affects the implementation.

Given how critical these operations are and how nuanced they can be, I don't think we're interested in having multiple implementations in the tree at the same time. Not unless there's something critical to the design that I'm missing which makes the garbage collection improvements inherently different.

EDIT: Let me adjust a bit on the multiple-GC point: if we want to trial and incrementally roll out two implementations, then it could be useful to have a swappable implementation where you define which GC algorithm to use based on a flag. But for that, ideally we would structure it so that the GC aspect fits into the broader code in a consistent way, and the only difference is the details of the GC which are swapped in/out based on the flag. If that's what you're going for, I think it at least provides us a path from the current implementation to allowing the new implementation, enabling it by default, deprecating the old one, then removing the old one. However, if we have sufficient testing, I don't think we necessarily need to go through all of that; we could just stick to one implementation that is incrementally improved/changed.
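
Purely as an illustration of the swappable-GC idea (the interface, the stub types and the flag below are hypothetical), the wiring could look like:

import "context"

// identityGC is the common interface both GC implementations would satisfy.
type identityGC interface {
	Start(ctx context.Context) error
	Stop()
}

// Stubs standing in for the existing heartbeat-based GC and a future
// controller-integrated GC.
type legacyHeartbeatGC struct{}

func (g *legacyHeartbeatGC) Start(ctx context.Context) error { return nil }
func (g *legacyHeartbeatGC) Stop()                           {}

type controllerIntegratedGC struct{}

func (g *controllerIntegratedGC) Start(ctx context.Context) error { return nil }
func (g *controllerIntegratedGC) Stop()                           {}

// newIdentityGC picks an implementation based on a flag, so the rest of the
// operator code is unaware of which GC algorithm is running.
func newIdentityGC(useControllerGC bool) identityGC {
	if useControllerGC {
		return &controllerIntegratedGC{}
	}
	return &legacyHeartbeatGC{}
}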

@ovidiutirla ovidiutirla force-pushed the feature/op-id-cid-cache branch from 81ddfbb to 37db6ca on July 3, 2024 14:15
@ovidiutirla

This comment was marked as outdated.

@ovidiutirla ovidiutirla force-pushed the feature/op-id-cid-cache branch 4 times, most recently from ac7ac99 to c63e252 on July 4, 2024 11:57
ovidiutirla

This comment was marked as outdated.

@ovidiutirla ovidiutirla force-pushed the feature/op-id-cid-cache branch from c63e252 to 04e5547 on July 4, 2024 12:01
@ovidiutirla ovidiutirla requested a review from pippolo84 July 4, 2024 12:06
@pchaigno pchaigno enabled auto-merge July 4, 2024 12:15
auto-merge was automatically disabled July 4, 2024 16:08

Head branch was pushed to by a user without write access

@ovidiutirla ovidiutirla force-pushed the feature/op-id-cid-cache branch from 04e5547 to d8673e0 on July 4, 2024 16:08
@sypakine (Contributor) commented Jul 4, 2024

note: I had been commenting on a previous commit version: ac7ac99

I'm not a github expert, so not sure if there is a way to reflect them in this commit...

@pippolo84 (Member) left a comment:

Thanks for the follow-up.
Left some additional comments in the previously opened conversations.

@ovidiutirla (Contributor, Author) commented:

note: I had been commenting on a previous commit version: ac7ac99

I'm not a github expert, so not sure if there is a way to reflect them in this commit...

Thanks to both of you for the feedback!
I split it into a new commit and addressed your changes, Mark.

@ovidiutirla ovidiutirla force-pushed the feature/op-id-cid-cache branch 5 times, most recently from 9134c80 to 3524a4c on July 4, 2024 22:40
@ovidiutirla ovidiutirla requested a review from pippolo84 July 4, 2024 22:40
The field will be used mainly by operator managing CIDs.
Related cilium#27752

Signed-off-by: Ovidiu Tirla <[email protected]>
Replaces the existing logger with slog

Signed-off-by: Ovidiu Tirla <[email protected]>
@ovidiutirla ovidiutirla force-pushed the feature/op-id-cid-cache branch from 3524a4c to 8d420ba on July 5, 2024 11:58
@dlapcevic (Contributor) left a comment:

lgtm

Add EnqueueTimeTracker and CIDDeletionTracker structures
to manage enqueuing times and track CID deletion marks.

Signed-off-by: Ovidiu Tirla <[email protected]>
@ovidiutirla ovidiutirla force-pushed the feature/op-id-cid-cache branch from 8d420ba to 9b2e980 on July 5, 2024 14:17
@joestringer joestringer added the dont-merge/discussion A discussion is ongoing and should be resolved before merging, regardless of reviews & tests status. label Jul 9, 2024
@ovidiutirla (Contributor, Author) commented:

Closing this in favor of:

@ovidiutirla ovidiutirla closed this Aug 1, 2024