[DONTMERGE]feat: add vmss and vm etag support for track1 SDK #7698
base: master
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mainred. The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.

Hi @mainred. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

/ok-to-test
We are going to migrate the vmss client to the track2 SDK very soon, so why do we want to add etag support for track1?
```go
	EtagMismatchErrorTag = "EtagMismatchError"
)

type EtagMismatchError struct {
```
Do we really need this error? E.g. for LB, we are checking the HTTP status code:

```go
// Invalidate the cache because ETAG precondition mismatch.
if rerr.HTTPStatusCode == http.StatusPreconditionFailed {
	klog.V(3).Infof("LoadBalancer cache for %s is cleanup because of http.StatusPreconditionFailed", ptr.Deref(lb.Name, ""))
	_ = az.lbCache.Delete(*lb.Name)
}
```
This is not required anymore.

I originally introduced this error because the errors from UpdateVMs are aggregated, so it is a bit difficult to tell whether an etag-mismatch error happened. But if we have an EtagMismatchError type and use errors.As, which internally iterates the errors wrapped by the retry error, we can easily find an etag mismatch inside the aggregated error. That's also why I updated the RawError of the retry error here:

```go
	// RawError indicates the raw error from API.
	// It's beneficial to errors.Is() or errors.As() to check the real error type.
	RawError error
```

We can remove EtagMismatchError if we decide to invalidate the cache anyway without checking the exact error.
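To make the errors.As approach concrete, here is a minimal, self-contained Go sketch, not the PR's actual code; the type name and the wrapping shape are illustrative. It shows how a typed etag-mismatch error can be recovered from a wrapped error chain:

```go
package main

import (
	"errors"
	"fmt"
)

// EtagMismatchError is a hypothetical typed error for etag precondition failures.
type EtagMismatchError struct {
	Resource string
}

func (e *EtagMismatchError) Error() string {
	return fmt.Sprintf("etag mismatch while updating %s", e.Resource)
}

func main() {
	// Wrap the typed error the way an aggregated update error might; the PR's
	// retry error keeps the original error in RawError for the same reason.
	raw := &EtagMismatchError{Resource: "vmss-0"}
	wrapped := fmt.Errorf("UpdateVMs failed: %w", raw)

	// errors.As walks the wrapped chain and finds the typed error.
	var em *EtagMismatchError
	if errors.As(wrapped, &em) {
		fmt.Println("etag mismatch detected; invalidate the cache for", em.Resource)
	}
}
```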
- Can we redirect all StatusPreconditionFailed status codes to an etag-mismatch error?
- That's how we decide a retry error, by `if errors.As(err, &re)`.

BTW, because our retry error is not api.RetryError, even if we have a RetryAfter duration, all errors including retry errors go to rate-limited retry; we may need to fix this to honor RetryAfter?
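As a rough illustration of the suggestion above, the sketch below (with hypothetical names, not the actual client code) maps any StatusPreconditionFailed response to a single sentinel etag-mismatch error in one place, so callers can detect it with errors.Is or errors.As regardless of how the error is later wrapped:

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

var errEtagMismatch = errors.New("etag precondition failed")

// retryError mimics the shape of a retry-style error: it keeps the HTTP
// status code and wraps the raw API error so errors.Is/As still work.
type retryError struct {
	HTTPStatusCode int
	RawError       error
}

func (r *retryError) Error() string { return r.RawError.Error() }
func (r *retryError) Unwrap() error { return r.RawError }

// classify wraps any 412 response into errEtagMismatch at a single point.
func classify(statusCode int, apiErr error) error {
	if statusCode == http.StatusPreconditionFailed {
		apiErr = fmt.Errorf("%w: %v", errEtagMismatch, apiErr)
	}
	return &retryError{HTTPStatusCode: statusCode, RawError: apiErr}
}

func main() {
	err := classify(http.StatusPreconditionFailed, errors.New("PUT vmss failed"))
	if errors.Is(err, errEtagMismatch) {
		fmt.Println("etag mismatch: invalidate the VMSS cache and retry")
	}
}
```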
Test Scenarios

To make the etag mismatch happen reliably, we hack the code to add a sleep before updating the vmss and vm, and PUT the vmss or vm through the API in the meantime.

- create lb-type service
  - update vmss with etag mismatch
  - update vm with etag mismatch
- delete lb-type service
  - update vmss with etag mismatch
  - update vm with etag mismatch
- reset vmss and vm
  - reset vmss and vm, since updating a vmss or vm with no changes will not update the etag
References
PR needs rebase.
@mainred: The following tests failed, say /retest to rerun all failed tests:

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
What type of PR is this?
/kind feature
What this PR does / why we need it:
CRP etag support has been enabled since API version 2023-09-01, but we want to use 2024-03-01, which the AKS RP is updating to.
This may be a bug: we always invalidate the vmss vms' cache after updating them, both when adding and when deleting a backendpool, since the vmss vms are updated anyway, but we forgot to invalidate the vmss cache.
NOTE:
Putting the etag header in the PUT/PATCH API call can add extra effort when updating a resource, since any update to the resource changes the etag version even when the change is unrelated to the resource fields cloud provider azure cares about, such as tags (see the If-Match sketch after this note).
This PR does not touch vm availabilitySet, since the vm availabilitySet model and API do not support etag so far, and we use vm availabilitySet to list vms for the cache. AttachDisk and DetachDisk in pkg/provider/azure_controller_vmss.go are also not covered, as they lack e2e tests, but we can consider them as a second phase.
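For context, this is roughly the HTTP mechanism etag support relies on: send the cached etag in an If-Match header so the service rejects the PUT with 412 Precondition Failed when the resource has changed underneath us. The sketch uses plain net/http and a made-up URL for illustration, not the track1 SDK client:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// putWithEtag issues a conditional PUT: the server only applies the update if
// its current etag still matches the one we cached when reading the resource.
func putWithEtag(client *http.Client, url, etag string, body []byte) (*http.Response, error) {
	req, err := http.NewRequest(http.MethodPut, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("If-Match", etag)
	return client.Do(req)
}

func main() {
	resp, err := putWithEtag(http.DefaultClient, "https://example.invalid/vmss/vmss-0", `"abc123"`, []byte(`{}`))
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	if resp.StatusCode == http.StatusPreconditionFailed {
		fmt.Println("etag changed since we read the resource; refresh the cache and retry")
	}
}
```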
Test report
#7698 (comment)
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: