operator/ciliumidentity: Add CID time trackers #33380
Conversation
Force-pushed from 0557b1c to b2c6736.
Force-pushed from b2c6736 to 67b62bc.
Force-pushed from 67b62bc to 0d299d2.
/test
Force-pushed from 0d299d2 to 81ddfbb.
Could you update the description to describe how this time tracker fits into the design, i.e. expand the following sentence to explain not just what it does, but how it integrates into the broader solution?
I assume Andrew had a similar concern with CIDDeletionTracker. It shouldn't impact network policy (NP) enforcement, as NP enforcement is done at the agent level, while the tracker is used to add a configurable delay for CID clean-up. Hey @joestringer, thanks for the feedback. Added the following to the PR description:
@ovidiutirla I still don't quite follow.
Which metrics? What does it mean to reconcile a CID?
In the draft implementation, there is a 30-second delay added to any CID deletion via the operator, here: https://github.com/cilium/cilium/pull/33204/files#diff-a5b01710162b5dcb0c820096bd611ded7a00e14a8a60ebc382466ce30d9436caR249-R253. To me, this appears to introduce an [at least] 30-second delay between the operator first processing a delete event and the policy of that delete event being enforced by agents. I do see the point that even if strong distributed concurrency were in place for CIDs, it would need to be coupled somehow with CEP operations.
Apologies for spam. The relevant diff is not linkable on GitHub due to the diff being collapsed by default:
(cilium/operator/pkg/ciliumidentity/reconciler.go, lines 249 to 253 at 05e0ebe)
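For readers following along, the effect of that diff is roughly the following. This is a hedged paraphrase, not the actual reconciler code: cidDeleteDelay comes from the PR description, while shouldDelete and the surrounding wiring are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// cidDeleteDelay mirrors the ~30s deletion delay discussed above
// (configurable in the draft; the default used here is an assumption).
const cidDeleteDelay = 30 * time.Second

// shouldDelete reports whether an unused CID may be deleted now, given
// when it was marked for deletion. The operator would re-enqueue the CID
// until the delay has elapsed, giving the agent time to attach a CEP.
func shouldDelete(markedAt, now time.Time) bool {
	return now.Sub(markedAt) >= cidDeleteDelay
}

func main() {
	markedAt := time.Now()
	fmt.Println(shouldDelete(markedAt, markedAt.Add(10*time.Second))) // false: still within the delay
	fmt.Println(shouldDelete(markedAt, markedAt.Add(31*time.Second))) // true: delay elapsed, safe to delete
}
```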
So the delay applies before the operator actually cleans up a CID, not before agents see the deletion. Similarly, in the current implementation, we keep the CID for ~30 minutes (cilium/operator/identitygc/crd_gc.go, lines 113 to 143 at 05e0ebe).
I might not have a full understanding of your concerns; just wondering if this brought some clarity on how we plan to use it? I'm happy to walk you through our approach, or if you have better ideas on how to handle this part, we are happy to adapt our solution. By "reconcile a CID" I mean that we ensure the desired state for the CID is reached (cilium/operator/pkg/ciliumidentity/reconciler.go, lines 134 to 144 at 05e0ebe).
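As a toy illustration of "reconcile" in this sense, a CID's desired state could be phrased as: the CID should exist exactly when at least one CEP uses it. All names below are hypothetical, not the actual reconciler API linked above.

```go
package main

import "fmt"

// reconcileCID sketches "ensure the desired state is reached" for one CID:
// create it if CEPs need it, mark it for delayed deletion if nothing uses
// it, otherwise leave it alone. Purely illustrative names.
func reconcileCID(cidExists bool, usedByCEPs int) string {
	switch {
	case usedByCEPs > 0 && !cidExists:
		return "create CID"
	case usedByCEPs == 0 && cidExists:
		return "mark CID for delayed deletion"
	default:
		return "nothing to do"
	}
}

func main() {
	fmt.Println(reconcileCID(false, 2)) // create CID
	fmt.Println(reconcileCID(true, 0))  // mark CID for delayed deletion
}
```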
pippolo84
left a comment
Thanks!
I've left some questions around some changes that were previously introduced and reviewed in #30649.
Also, some suggestions around the structure of the unit tests.
operator/pkg/ciliumidentity/cache.go (outdated)
// CIDDeletionTracker tracks which CIDs are marked for deletion.
// This is required for simultaneous CID management
// by both cilium-operator and cilium-agent.
Is it possible to expand this comment explaining why this is required? It would be nice to detail the expected sequence of events that proves the tracker is needed when both the operator and the agent manage CIDs.
Updated the docs
I'm sorry, but I'm not following the comment.
AFAIU we need the delay (and thus the marking and deletion-tracking) to avoid scenarios where the agent has created the CID but has not yet updated the CEP with it. If the operator sees the new CID and no CEP actually using it, it might be too aggressive and delete the CID. This is supposed to happen more frequently in case of high pod churn (I guess because this increases the delay between the operator receiving the CID creation event and the operator receiving the CEP update event).
Is my understanding correct? Either way, I suggest rephrasing the comment to be clearer and more descriptive.
It addresses two main concerns:
- It facilitates CID reuse in scenarios with high pod churn: we do not delete the CID, so we are able to re-use it.
- It ensures correct CID association in scenarios where the agent rapidly receives create and delete events.
For example: we have one CID used by one pod, and we churn this pod (the agent quickly receives create and delete CID events). If the agent receives the CID delete and schedules the CEP deletion, and then a new create event comes in and sees that the CEP already exists, the scheduled deletion removes the CEP out from under it (see the sketch below).
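To make the mechanics concrete, here is a minimal sketch of what a deletion tracker along these lines could look like. Only the type name CIDDeletionTracker comes from the PR; the method names and map layout are assumptions.

```go
package ciliumidentity

import (
	"sync"
	"time"
)

// CIDDeletionTracker records when each CID was marked for deletion so the
// operator can delay the actual delete and un-mark CIDs that come back
// into use (sketch; the real implementation may differ).
type CIDDeletionTracker struct {
	mu     sync.Mutex
	marked map[string]time.Time // CID name -> time it was marked
}

func NewCIDDeletionTracker() *CIDDeletionTracker {
	return &CIDDeletionTracker{marked: make(map[string]time.Time)}
}

// Mark records the CID as a deletion candidate, keeping the original
// mark time if it is already marked.
func (t *CIDDeletionTracker) Mark(cid string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, ok := t.marked[cid]; !ok {
		t.marked[cid] = time.Now()
	}
}

// Unmark clears the mark, e.g. when a CEP starts using the CID again,
// which is what enables CID reuse under pod churn.
func (t *CIDDeletionTracker) Unmark(cid string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.marked, cid)
}

// MarkedTime returns when the CID was marked, if it is marked at all.
func (t *CIDDeletionTracker) MarkedTime(cid string) (time.Time, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	at, ok := t.marked[cid]
	return at, ok
}
```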
@pippolo84, added more details in the comment. I think once we have the full implementation it will provide enough context.
I also consider the delayed deletion useful in high pod-churn scenarios, as we do not have to delete and re-create the same CID over and over.
Sure, but that's already implemented in the current GC.
Updated the docs.
@dlapcevic do you have any thoughts on the existing GC versus the one proposed for the operator managing CIDs?
I will try to sum it up; let me know if my understanding is correct, so @joestringer and @dlapcevic don't have to read all of that conversation 😅
We are considering how GC behaves in two different cases:
(1) During the transition period from cilium-agent creating CIDs to the operator creating CIDs, when both the operator and the agent can create CIDs
(2) Once the operator is fully managing CIDs
Let's think about the current GC and how it would behave:
(1) If a CID was ever used, it will be marked for deletion (via annotation) after 30 minutes of not being used, and deleted after another 15 minutes; this already addresses your concern with high-churn pods.
If a CID was never used by any CEP, it will be marked for deletion and deleted after 15 minutes.
That sounds reasonable both for identities created by the operator and for those created by the agent.
CIDs created by the agent and not yet used by any CEPs won't be deleted immediately.
(2) It will still work the same, but adding a deletion-mark annotation won't make much sense, as the agent won't be able to remove it anymore.
So the proposed switch from the deletion-mark annotation to the DeletionTracker is a nice optimization in terms of API calls, but it is not necessary for correctness. It should be fairly easy to implement in the current GC, though.
It could probably even be enabled instantly (deletion-mark annotation -> DeletionTracker) when switching from the agent creating CIDs to the operator creating CIDs during migration, and the current GC would still behave correctly in both cases (1) and (2); it would just no longer react to the deletion-mark annotation and would instead use a fixed delay to wait on the CEP.
Of course, the intervals/timeouts (30/15 minutes) can already be changed IIRC; I was just using the defaults as examples.
So the question is: what would be the purpose of implementing a new GC, instead of implementing this improvement in the current GC, which seems significantly easier?
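For reference, the mark-then-delete behavior of the current GC described in (1) can be condensed into a small decision function. This is an illustrative model using the 30/15-minute defaults mentioned above, not the actual crd_gc.go code.

```go
package main

import (
	"fmt"
	"time"
)

// Defaults used as examples above; both are configurable in practice.
const (
	heartbeatTimeout = 30 * time.Minute // unused this long -> mark for deletion
	deletionDelay    = 15 * time.Minute // marked this long -> actually delete
)

type cid struct {
	lastUsed time.Time
	markedAt *time.Time // non-nil once annotated as marked for deletion
}

// gcStep models what one GC pass does with a single CID.
func gcStep(c *cid, now time.Time) string {
	switch {
	case c.markedAt == nil && now.Sub(c.lastUsed) >= heartbeatTimeout:
		c.markedAt = &now
		return "annotate: mark for deletion"
	case c.markedAt != nil && now.Sub(*c.markedAt) >= deletionDelay:
		return "delete CID"
	default:
		return "keep"
	}
}

func main() {
	now := time.Now()
	c := &cid{lastUsed: now.Add(-31 * time.Minute)}
	fmt.Println(gcStep(c, now))                     // annotate: mark for deletion
	fmt.Println(gcStep(c, now.Add(16*time.Minute))) // delete CID
}
```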
In my opinion, GC is part of CID management, and there is no reason for it to be separated from the CID controller. The CID controller needs to know the desired state of CIDs anyway, including when CIDs are used and when not, and adding deletion to it is not a complex effort.
Besides that, I see a few benefits:
- Performance and scalability:
  a) A smaller number of calls to kube-apiserver, without adding and removing the mark. Remember: every update needs to be sent to all nodes.
  b) Processing all CIDs at one time (each GC run), instead of continuously, creates a spike of requests that can significantly affect the KCP and cilium-operator's k8s client rate limiting.
- Ease of use: No need to configure GC.
- Security: Remove write permission for CIDs from cilium-agent. When cilium-agent is not managing CIDs, it shouldn't update CIDs, so it won't need write permission to CIDs.
We had issues related to these points.
We also had issues where cilium-operator had connection instability with kube-apiserver, so it was restarting every 15-20 minutes (or the leader changed); in such cases we ended up having no CIDs cleaned up and eventually hit the 65k CID limit.
I think that this topic would be a lot more approachable with a more incremental approach. Use the current GC first, propose improvements to the current GC implementation and push on that independently. Then propose the PR that refactors/removes the current GC implementation to merge it into this CID controller. Each of those steps would be self-contained, provide value, and could be individually assessed for their benefits and drawbacks.
If we were looking at a PR that was titled "Improve cilium-operator Identity garbage collection" with the PR description containing the text in the post immediately above, then I think we would be in a much better position to critique the design and implementation, and debate how it affects the implementation.
Given how critical these operations are and how nuanced they can be, I don't think we're interested in having multiple implementations in the tree at the same time. Not unless there's something critical to the design that I'm missing which makes the garbage collection improvements inherently different.
EDIT: Let me adjust a bit on multiple GCs: if we want to trial and incrementally roll out two implementations, then it could be useful to have a swappable implementation where you define which GC algorithm to use based on a flag. But for that, ideally we would structure it so that the GC aspect fits into the broader code in a consistent way, and the only difference is the GC details that are swapped in/out based on the flag. If that's what you're going for, I think it at least provides us a path from the current implementation to allowing the new implementation, enabling it by default, deprecating the old one, and then removing the old one. However, if we have sufficient testing, I don't think we necessarily need to go through all of that; we could just stick to one implementation that is incrementally improved/changed.
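A minimal shape for that flag-swappable approach could look like the following. The flag name, interface, and both implementations are hypothetical, sketched only to show how the GC details could be swapped behind a stable surface.

```go
package main

import (
	"flag"
	"fmt"
)

// identityGC is the stable surface both GC variants share, so only the
// algorithm behind it is swapped (all names here are hypothetical).
type identityGC interface {
	Run() error
}

type legacyGC struct{} // stands in for the current annotation-based GC

func (legacyGC) Run() error { fmt.Println("legacy GC run"); return nil }

type controllerGC struct{} // stands in for GC folded into the CID controller

func (controllerGC) Run() error { fmt.Println("controller GC run"); return nil }

func main() {
	mode := flag.String("identity-gc-mode", "legacy", "GC algorithm to use (hypothetical flag)")
	flag.Parse()

	var gc identityGC
	if *mode == "controller" {
		gc = controllerGC{}
	} else {
		gc = legacyGC{}
	}
	_ = gc.Run()
}
```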
Force-pushed from 81ddfbb to 37db6ca.
Force-pushed from ac7ac99 to c63e252.
Force-pushed from c63e252 to 04e5547.
Head branch was pushed to by a user without write access
Force-pushed from 04e5547 to d8673e0.
Note: I had been commenting on a previous commit version (ac7ac99). I'm not a GitHub expert, so I'm not sure if there is a way to reflect those comments on this commit...
pippolo84
left a comment
Thanks for the follow-up.
Left some additional comments in the previously opened conversations.
Thanks to both of you for the feedback!
Force-pushed from 9134c80 to 3524a4c.
The field will be used mainly by the operator managing CIDs. Related: cilium#27752. Signed-off-by: Ovidiu Tirla <[email protected]>
Replaces the existing logger with slog. Signed-off-by: Ovidiu Tirla <[email protected]>
Force-pushed from 3524a4c to 8d420ba.
dlapcevic
left a comment
lgtm
Add EnqueueTimeTracker and CIDDeletionTracker structures to manage enqueuing times and track CID deletion marks. Signed-off-by: Ovidiu Tirla <[email protected]>
Force-pushed from 8d420ba to 9b2e980.
Closing this in favor of:
Add EnqueueTimeTracker and CIDDeletionTracker structures to manage enqueuing times and track CID deletion marks. This feature will be used by the operator managing CIDs. EnqueueTimeTracker is only used for metrics, to measure the duration from enqueuing the reconciliation until the CID reconciliation is completed. CIDDeletionTracker is used to handle CID deletion when the CID is no longer needed; this is only used by the operator. The deletion tracker allows us to implement a cidDeleteDelay, which is the delay before enqueuing another CID event to be reconciled after the CID is marked for deletion. This is required for simultaneous CID management by both cilium-operator and cilium-agent. Without the delay, the operator might immediately clean up CIDs created by the agent, before the agent can finish CEP creation. The deletion tracker is not used in cilium-agent and is only used to enforce the delay for deletion by the operator.
Related CFP: #27752
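Based on that description, a minimal sketch of EnqueueTimeTracker could look as follows; the method names and the keep-earliest behavior are assumptions, with only the type's purpose taken from the PR text.

```go
package ciliumidentity

import (
	"sync"
	"time"
)

// EnqueueTimeTracker remembers when each CID was enqueued so that the
// latency from enqueue to completed reconciliation can be exported as a
// metric (sketch only; the real API may differ).
type EnqueueTimeTracker struct {
	mu       sync.Mutex
	enqueued map[string]time.Time
}

func NewEnqueueTimeTracker() *EnqueueTimeTracker {
	return &EnqueueTimeTracker{enqueued: make(map[string]time.Time)}
}

// Track records the enqueue time, keeping the earliest one if the CID is
// enqueued again before it has been reconciled.
func (t *EnqueueTimeTracker) Track(cid string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, ok := t.enqueued[cid]; !ok {
		t.enqueued[cid] = time.Now()
	}
}

// GetAndReset returns the time elapsed since enqueue and clears the entry;
// the caller would feed the duration into a metric once reconciliation of
// the CID completes.
func (t *EnqueueTimeTracker) GetAndReset(cid string) (time.Duration, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	at, ok := t.enqueued[cid]
	if !ok {
		return 0, false
	}
	delete(t.enqueued, cid)
	return time.Since(at), true
}
```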
Draft full implementation: #33204