
[PECOBLR-587] Azure Service Principal Credential Provider #621


Open
wants to merge 10 commits into main
Conversation

jprakash-db (Contributor) commented Jun 30, 2025

Description

This pull request introduces support for Azure Service Principal (SP) M2M authentication in the PySQL Connector.

Key Changes

  • AzureServicePrincipalCredentialProvider: the credential provider that obtains tokens using Azure service principal credentials (see the usage sketch after this list)
  • ClientCredentialsTokenSource: manages the token lifecycle for the OAuth flow where credentials are obtained with grant_type: client_credentials
  • DatabricksHttpClient: a new common HTTP client that unifies HTTP logic across the connector and ensures standard client-level behaviour
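
For illustration, end-to-end usage might look like the sketch below. This is not code from the PR: the auth_type string and the exact keyword-argument handling are assumptions based on the parameters the new provider accepts.

    from databricks import sql

    # Hypothetical usage sketch; the auth_type value and kwarg handling are assumptions.
    with sql.connect(
        server_hostname="adb-1234567890123456.7.azuredatabricks.net",
        http_path="/sql/1.0/warehouses/<warehouse-id>",
        auth_type="azure-sp-m2m",  # assumed value for AuthType.AZURE_SP_M2M
        oauth_client_id="<entra-application-client-id>",
        oauth_client_secret="<entra-client-secret>",
        azure_tenant_id="<entra-tenant-id>",
        azure_workspace_resource_id="<azure-workspace-resource-id>",
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            print(cursor.fetchall())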

New dependencies

  • Added the PyJWT library, which is required for JWT parsing (a minimal parsing sketch follows)
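
To illustrate why a JWT library is needed here: the token-expiry check boils down to reading the exp claim from the access token. A minimal sketch, not the PR's actual code:

    import time
    import jwt  # PyJWT

    def seconds_until_expiry(access_token: str) -> float:
        # Reading the exp claim does not require verifying the token signature.
        claims = jwt.decode(access_token, options={"verify_signature": False})
        return claims["exp"] - time.time()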

Tests

Expanded unit tests to cover:

  • AzureServicePrincipalCredentialProvider
  • ClientCredentialsTokenSource

Manual Testing

  • Tested the Azure M2M SP flow by creating a service principal in Azure Entra ID, mapping it to a workspace service principal, and then running queries against a Databricks workspace using the Azure Entra ID credentials

github-actions bot commented Jul 1, 2025

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

jprakash-db marked this pull request as ready for review July 3, 2025 05:55
jprakash-db changed the title from "Azure Service Principal Credential Provider" to "[PECOBLR-587] Azure Service Principal Credential Provider" on Jul 3, 2025

[tool.poetry.extras]
pyarrow = ["pyarrow"]

[tool.poetry.dev-dependencies]
[tool.poetry.group.dev.dependencies]
Contributor

does the poetry update have to go in the same PR?


if auth_type == AuthType.AZURE_SP_M2M.value:
pass
else:
Contributor

why not just do if auth_type != AuthType.AZURE_SP_M2M.value:

oauth_redirect_port_range=[kwargs["oauth_redirect_port"]]
if kwargs.get("oauth_client_id") and kwargs.get("oauth_redirect_port")
if client_id and kwargs.get("oauth_redirect_port")
Contributor

this is a behaviour change from before, i.e. earlier we required client_id to come in from kwargs and now it is OK even if we derive it from get_client_id_and_redirect_port; is this intended?


# Private API: this is an evolving interface and it will change in the future.
# Please do not depend on it in your applications.
from databricks.sql.experimental.oauth_persistence import OAuthToken, OAuthPersistence

logger = logging.getLogger(__name__)
Contributor

I don't think this is being used; can we add logging?

return app_id

# default databricks resource id
return "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"
Contributor

can we add these IDs as constants at the top so they're in one place?
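
As a rough illustration of that suggestion (the names and the mapping below are hypothetical, not the PR's code):

    # Hypothetical module-level constants; entries other than the default ID are placeholders.
    AZURE_DATABRICKS_LOGIN_APP_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"  # default Databricks resource ID
    AZURE_LOGIN_APP_IDS = {
        ".azuredatabricks.net": AZURE_DATABRICKS_LOGIN_APP_ID,
    }

    def get_azure_login_app_id(hostname: str) -> str:
        # Return the cloud-specific login app ID, falling back to the default.
        for suffix, app_id in AZURE_LOGIN_APP_IDS.items():
            if hostname.endswith(suffix):
                return app_id
        return AZURE_DATABRICKS_LOGIN_APP_ID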


@abstractmethod
def refresh(self) -> Token:
pass
Contributor

can we add a comment here noting that this duplicates code in the SDK (with a code pointer) and that in the long term we should try to unify?

self.token = self.refresh()
return self.token

def refresh(self) -> Token:
Contributor

how is the refresh mechanism being handled for the existing credential providers? is there opportunity to dedup/reuse?
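
For context, a generic client-credentials refresh (not the PR's exact implementation; the names here are assumptions) usually reduces to a single POST against the token endpoint:

    import time
    from dataclasses import dataclass

    import requests

    @dataclass
    class Token:
        access_token: str
        token_type: str
        expires_at: float

    def refresh_client_credentials(token_url: str, client_id: str, client_secret: str, scope: str) -> Token:
        # Standard OAuth2 client_credentials grant: exchange the client ID/secret for a bearer token.
        response = requests.post(
            token_url,
            data={
                "grant_type": "client_credentials",
                "client_id": client_id,
                "client_secret": client_secret,
                "scope": scope,
            },
        )
        response.raise_for_status()
        payload = response.json()
        return Token(
            access_token=payload["access_token"],
            token_type=payload.get("token_type", "Bearer"),
            expires_at=time.time() + payload.get("expires_in", 3600),
        )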


# Singleton class for common Http Client
class DatabricksHttpClient:
## TODO: Unify all the http clients in the PySQL Connector
Contributor

why can't this be done right now? can we not use the existing http client?
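
For readers unfamiliar with the pattern under discussion, a session-backed singleton HTTP client typically looks like the sketch below (illustrative only; the PR's actual class may differ):

    import threading

    import requests
    from requests.adapters import HTTPAdapter
    from urllib3.util.retry import Retry

    class SingletonHttpClient:
        # One shared requests.Session with retries, handed out via get_instance().
        _instance = None
        _lock = threading.Lock()

        def __init__(self):
            self.session = requests.Session()
            retries = Retry(total=3, backoff_factor=0.5, status_forcelist=[429, 502, 503, 504])
            adapter = HTTPAdapter(max_retries=retries)
            self.session.mount("https://", adapter)
            self.session.mount("http://", adapter)

        @classmethod
        def get_instance(cls) -> "SingletonHttpClient":
            # Double-checked locking so concurrent callers share a single session.
            if cls._instance is None:
                with cls._lock:
                    if cls._instance is None:
                        cls._instance = cls()
            return cls._instance

        def execute(self, method: str, url: str, **kwargs) -> requests.Response:
            return self.session.request(method, url, **kwargs)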

vikrantpuppala requested a review from Copilot July 4, 2025 05:28
Copilot AI left a comment

Pull Request Overview

This PR adds support for Azure Service Principal M2M authentication to the PySQL Connector by introducing a shared HTTP client, a client-credentials token source, and a new credentials provider for Azure SP.

  • Introduce DatabricksHttpClient for unified HTTP logic
  • Add Token and ClientCredentialsTokenSource to manage OAuth client-credentials flow
  • Implement AzureServicePrincipalCredentialProvider and wire it through get_auth_provider

Reviewed Changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.

Summary per file:
  • tests/unit/test_thrift_field_ids.py: Formatting cleanup and consistent quote style
  • tests/unit/test_auth.py: Added tests for ClientCredentialsTokenSource and SP credential provider, JWT fixtures
  • src/databricks/sql/common/http.py: New singleton DatabricksHttpClient with retry logic
  • src/databricks/sql/auth/oauth.py: Introduced Token, RefreshableTokenSource, and ClientCredentialsTokenSource
  • src/databricks/sql/auth/common.py: Extended AuthType and helper for mapping Azure login app IDs
  • src/databricks/sql/auth/authenticators.py: Added AzureServicePrincipalCredentialProvider
  • src/databricks/sql/auth/auth.py: Updated ClientContext, get_auth_provider, and auth provider resolution for SP
  • pyproject.toml: Added pyjwt, moved dev dependencies under [tool.poetry.group.dev.dependencies]

Comment on lines +61 to +62
with self.session.request(method.value, url, **kwargs) as response:
return response
Copilot AI Jul 4, 2025

The requests.Response object is not a context manager. Remove the with and instead assign response = self.session.request(...) before returning it.

Suggested change
with self.session.request(method.value, url, **kwargs) as response:
return response
response = self.session.request(method.value, url, **kwargs)
return response


return exp_time and (exp_time - buffer_time) <= current_time
except Exception as e:
logger.error("Failed to decode token: %s", e)
return e
Copilot AI Jul 4, 2025

is_expired should return a boolean, not an exception. Consider returning True on decode failure or re-raising the error to avoid returning a non-boolean value.

Suggested change
return e
return True


from databricks.sql.auth.auth import (
AccessTokenAuthProvider,
AuthProvider,
ExternalAuthProvider,
AuthType,
)
import time
from datetime import datetime, timedelta
Copilot AI Jul 4, 2025

[nitpick] The imports datetime and timedelta are not used in this test file. Consider removing them to keep imports clean.

Suggested change
from datetime import datetime, timedelta


redirect_port_range = kwargs.get("oauth_redirect_port_range")

if auth_type == AuthType.AZURE_SP_M2M.value:
pass
Copilot AI Jul 4, 2025

[nitpick] The pass under the Azure SP M2M branch appears to be a placeholder. Implement the handling for this branch or remove the pass if it is no longer needed.

Suggested change
pass
return ExternalAuthProvider(
AzureServicePrincipalCredentialProvider(
hostname=normalize_host_name(hostname),
oauth_client_id=kwargs.get("oauth_client_id"),
oauth_client_secret=kwargs.get("oauth_client_secret"),
azure_tenant_id=kwargs.get("azure_tenant_id"),
azure_workspace_resource_id=kwargs.get("azure_workspace_resource_id"),
)
)

