HADOOP-17377: ABFS: MsiTokenProvider doesn't retry HTTP 429/410 from the Instance Metadata Service #5273
base: trunk
...adoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/ExponentialRetryPolicy.java
hadoop-tools/hadoop-azure/pom.xml
  <dependency>
    <groupId>org.mockito</groupId>
    <artifactId>mockito-core</artifactId>
    <version>4.11.0</version>
Sorry, hadoop-project defines the version, and does so through properties. Revert this.
again, cut this now; the version in hadoop project is the one you now expect
taken
hadoop-tools/hadoop-azure/pom.xml
  <dependency>
    <groupId>org.mockito</groupId>
    <artifactId>mockito-inline</artifactId>
    <version>4.11.0</version>
if this is new to hadoop, declare it in hadoop-project/pom.xml, with versions and exclusions, then declare here without those
removed dependency
...ls/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/oauth2/AzureADAuthenticator.java
...adoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/ExponentialRetryPolicy.java
   * https://learn.microsoft.com/en-us/azure/active-directory/
   * managed-identities-azure-resources/how-to-use-vm-token#error-handling
   */
  private static final int HTTP_TOO_MANY_REQUESTS = 429;
make public and refer from tests, maybe put in a different file for this
taken
public final class ITestAbfsMsiTokenProvider
    extends AbstractAbfsIntegrationTest {

  private static final int HTTP_TOO_MANY_REQUESTS = 429;
refer to the value in the src/main code
taken
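For illustration only, the two review requests above (make the constant public, and have the test refer to the src/main value) could be sketched roughly as below. The class names `HttpStatusCodes` and `MsiRetryTestSketch` are hypothetical and are not taken from the PR; the point is only that the status code is declared once and the test side reads it instead of redeclaring 429. `java.net.HttpURLConnection` already provides `HTTP_GONE` (410) but has no constant for 429, which is why a dedicated one is needed.

```java
// Hypothetical holder for the shared constant; the class name is an assumption.
final class HttpStatusCodes {
  /** HTTP 429: the service is throttling the caller. */
  public static final int HTTP_TOO_MANY_REQUESTS = 429;

  private HttpStatusCodes() {
    // constants only, no instances
  }
}

// Hypothetical test-side code: it refers to the shared value rather than
// declaring its own private copy of 429.
final class MsiRetryTestSketch {
  private MsiRetryTestSketch() {
  }

  static boolean isThrottled(int statusCode) {
    return statusCode == HttpStatusCodes.HTTP_TOO_MANY_REQUESTS;
  }
}
```

With a single declaration, a later change to the constant (or its move to another file) cannot leave the test asserting against a stale value.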
...ools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsMsiTokenProvider.java
HADOOP-17377: Add retry for HTTP 429 and HTTP 410 apache#5273
@anmolanmol1234 this still needs those (minor) changes; otherwise it is ready to merge.
Oh dear, this is a full Mockito update now, which is always a PITA.
How about you split out the Mockito update into its own JIRA, "update mockito to 4.11.0" (and apply the comments I've made on the changes), then make the ABFS change depend on it? That way the broader change is visible on its own.
...test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpcMultiDestination.java
      //Since we enabled (deferred) cgroup controller mounting, no interactions
      //should have occurred, with this mock
-     verifyZeroInteractions(privilegedOperationExecutorMock);
+     Mockito.verifyNoInteractions(privilegedOperationExecutorMock);
use static import for consistency with the others.
...p/yarn/server/nodemanager/containermanager/linux/resources/gpu/TestGpuResourceAllocator.java
I'll go with whatever @saxenapranav thinks here...we have seen this ourselves and need a fix. However, that PR to update mockito bounced, so either
|
The Mockito upgrade was needed in this PR to mock static methods. Would it be fine to remove that test method? If not, I will attempt the Mockito upgrade, including the shaded client.
I will update this change as an iteration of this PR, but will need some time for the Mockito upgrade PR.
💔 -1 overall
This message was automatically generated. |
LGTM. Tested for our use case, where throttling caused constant 429 errors; this PR fixed the problem by retrying properly.
Does this depend on mockito-inline to compile? Is this part of the Mockito upgrade, or is there a way to cope without it?
    public static final int DEFAULT_AZURE_OAUTH_TOKEN_FETCH_RETRY_MIN_BACKOFF_INTERVAL = 0;
    public static final int DEFAULT_AZURE_OAUTH_TOKEN_FETCH_RETRY_MAX_BACKOFF_INTERVAL = SIXTY_SECONDS;
-   public static final int DEFAULT_AZURE_OAUTH_TOKEN_FETCH_RETRY_DELTA_BACKOFF = 2;
+   public static final int DEFAULT_AZURE_OAUTH_TOKEN_FETCH_RETRY_DELTA_BACKOFF = 2 * 1000;
use 2_000
taken
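To make the effect of the delta change concrete, here is a rough sketch of how a minimum of 0, a maximum of sixty seconds and a delta of `2_000` might combine into retry intervals. It assumes all three values are in milliseconds and uses a plain doubling formula without random jitter; the class `BackoffSketch` and its method are hypothetical, and the real `ExponentialRetryPolicy` may compute intervals differently. `2_000` is the same `int` value as `2 * 1000`; the underscore is only a readability aid in Java numeric literals.

```java
// Hypothetical, deterministic backoff sketch; not the ExponentialRetryPolicy code.
final class BackoffSketch {
  static final int MIN_BACKOFF_MS = 0;
  // Assumption: SIXTY_SECONDS in the diff is expressed in milliseconds.
  static final int MAX_BACKOFF_MS = 60 * 1_000;
  // Same value as 2 * 1000, written with an underscore separator.
  static final int DELTA_BACKOFF_MS = 2_000;

  private BackoffSketch() {
  }

  /**
   * Interval before the given retry (1-based): the delta doubles on each
   * retry, is added to the minimum, and is capped at the maximum.
   */
  static long retryIntervalMs(int retryCount) {
    double increment = Math.pow(2, retryCount - 1) * DELTA_BACKOFF_MS;
    return Math.round(Math.min(MIN_BACKOFF_MS + increment, MAX_BACKOFF_MS));
  }
}
```

Under these assumptions the first retries wait 2 s, 4 s and 8 s, and the sixth and later retries are capped at 60 s. With the old delta of 2 the same formula would yield waits of only a few milliseconds, which would explain the unit correction in the diff.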
hadoop-tools/hadoop-azure/pom.xml
    <scope>test</scope>
  </dependency>

  <dependency>
Is this needed? It's not in the base project pom.
I would rather this PR didn't need that Mockito upgrade, as Mockito upgrades are always a painful piece of work that never gets backported.
removed dependency
|
@anujmodi2021 can you revisit this so I can get it in? We do appear to have been using it internally since December 2023, so I'm happy it works.
|
@steveloughran will backport the PR for merge and make the necessary test changes |
|
sorry, commenting on wrong PR. will cut. |
+1
LGTM.
Some comments need to be addressed, rest good to merge.
💔 -1 overall
This message was automatically generated. |
|
Thanks @anmolanmol1234 for refreshing this. The spotbugs and javadoc warnings are due to https://issues.apache.org/jira/browse/HADOOP-19731
ABFS: MsiTokenProvider doesn't retry HTTP 429 from the Instance Metadata Service
This PR resolves the issue above by enabling retries for HTTP error codes 429 and 410, based on https://learn.microsoft.com/en-in/azure/virtual-machines/linux/instance-metadata-service?tabs=windows#errors-and-debugging and https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/how-to-use-vm-token#error-handling
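
As a rough illustration of the behaviour described above, the sketch below treats HTTP 429 and HTTP 410 as recoverable and repeats the request until another status is returned or the attempts run out. All names here (`MsiRetrySketch`, `isRecoverable`, `callWithRetry`) are hypothetical, the fixed delay stands in for whatever backoff the real retry policy applies, and this is not the code from the PR.

```java
import java.net.HttpURLConnection;
import java.util.function.IntSupplier;

// Hypothetical sketch of "retry on 429 and 410"; not the PR's implementation.
final class MsiRetrySketch {
  // java.net.HttpURLConnection has no constant for 429, so it is declared here.
  static final int HTTP_TOO_MANY_REQUESTS = 429;

  private MsiRetrySketch() {
  }

  /** True for the two status codes this PR adds to the retryable set. */
  static boolean isRecoverable(int statusCode) {
    return statusCode == HTTP_TOO_MANY_REQUESTS
        || statusCode == HttpURLConnection.HTTP_GONE;
  }

  /**
   * Invokes the request, repeating it while the status is recoverable and
   * fewer than maxAttempts calls have been made. Returns the last status seen.
   */
  static int callWithRetry(IntSupplier request, int maxAttempts, long delayMs) {
    int status = request.getAsInt();
    for (int attempt = 1; attempt < maxAttempts && isRecoverable(status); attempt++) {
      try {
        Thread.sleep(delayMs); // assumption: fixed delay instead of exponential backoff
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt and stop retrying
        return status;
      }
      status = request.getAsInt();
    }
    return status;
  }
}
```

Because the request is passed in as an `IntSupplier`, a test can feed it a scripted sequence of status codes (for example two 429 responses followed by a 200) without contacting the Instance Metadata Service.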