Entra Authentication Support (credential handler and/or password callback) #230

@PvH-SPC

Feature Request: Microsoft Entra (Azure AD) authentication for Azure Blob and Postgres connectors

Pathway version: 0.30.1
Affected APIs: pw.persistence.Backend.azure(...), pw.io.postgres.write(...), pw.io.postgres.read(...)

Summary

Add support for TokenCredential-based authentication (Azure Identity SDK) to Pathway's Azure Blob persistence backend and Postgres connectors, so that workloads running with Microsoft Entra (Azure AD) managed identity can authenticate without static secrets.

Use case

We run Pathway on AKS with workload-identity-enabled service accounts. Every other Azure-facing component in our stack (Microsoft Graph, Key Vault, Azure Monitor / Application Insights, Azure Storage SDK, etc.) authenticates via DefaultAzureCredential — there are no static long-lived secrets in our Kubernetes Secret resources, environment variables, or CI/CD pipelines.

Pathway's Azure Blob persistence backend and Postgres connectors are now the only components in our deployment that still require static credentials:

  • Blob backend: requires the storage AccountKey (HMAC shared-key auth)
  • Postgres connectors: require a static password in the connection string

This blocks two important Azure security postures:

  1. "No static secrets" Azure Policy compliance — many enterprise tenants mandate this via Azure Policy / Azure Security Benchmark. Pathway becomes a blocker for adoption.
  2. Entra-only authentication on Azure Database for PostgreSQL Flexible Server — when the server is configured for Entra-only auth, the "password" is a short-lived OAuth bearer token (~24h lifetime) that must be refreshed periodically. Pathway has no mechanism to do this today.

Current API (for reference)

# pathway/persistence.py
Backend.azure(
    root_path: str,
    account: str,
    password: str,         # ← static storage AccountKey (HMAC shared-key)
    container: str,
)

# pathway/io/postgres.py
pw.io.postgres.write(
    table=...,
    postgres_settings={
        "host": "...",
        "user": "...",
        "password": "...", # ← static password baked into the connection string
        ...
    },
    table_name="...",
    ...
)

Both APIs assume the credential value is provided once at process startup and remains valid for the lifetime of the run. Neither supports rotation, refresh, or callback-based credential resolution.

Proposed APIs

We see two reasonable shapes; either would unblock us. Option A is more idiomatic for the Azure SDK ecosystem; Option B is the smallest API surface change.

Option A — accept a TokenCredential (preferred)

from azure.identity import DefaultAzureCredential

# Blob backend: replace `password=` with a credential
pw.persistence.Backend.azure(
    root_path="pathway_state",
    account="myaccount",
    container="mycontainer",
    credential=DefaultAzureCredential(),
)

# Postgres: accept a credential + scope, refresh tokens transparently
pw.io.postgres.write(
    table=...,
    postgres_settings={"host": "...", "user": "managed-identity-name", ...},
    credential=DefaultAzureCredential(),
    credential_scope="https://ossrdbms-aad.database.windows.net/.default",
    table_name="...",
)

Pathway would call credential.get_token(scope) internally whenever it needs to (re)open a connection. The Rust azure_storage crate Pathway already uses accepts token credentials, and with tokio-postgres the token can be passed as the connection password on each (re)connect, so this should be feasible without replacing transport layers.
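
To make the refresh behaviour concrete, here is a minimal sketch of the caching we have in mind. It is not Pathway code: RefreshingToken, FakeCredential, and the 300-second margin are hypothetical names and values chosen for illustration, and AccessToken only mirrors the two fields of azure.core.credentials.AccessToken so the example runs without the Azure SDK.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


@dataclass(frozen=True)
class AccessToken:
    """Same two fields as azure.core.credentials.AccessToken."""
    token: str
    expires_on: int  # expiry as a Unix timestamp in seconds


class TokenCredential(Protocol):
    def get_token(self, *scopes: str) -> AccessToken: ...


class RefreshingToken:
    """Hypothetical helper: hands out a token with at least `margin_seconds` of validity left."""

    def __init__(self, credential: TokenCredential, scope: str,
                 margin_seconds: float = 300.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._credential = credential
        self._scope = scope
        self._margin = margin_seconds
        self._clock = clock
        self._cached: Optional[AccessToken] = None

    def current(self) -> str:
        # Ask the credential again only when nothing is cached or expiry is near.
        remaining = None if self._cached is None else self._cached.expires_on - self._clock()
        if remaining is None or remaining < self._margin:
            self._cached = self._credential.get_token(self._scope)
        return self._cached.token


class FakeCredential:
    """Stand-in for DefaultAzureCredential so the sketch runs without Azure access."""

    def __init__(self, lifetime_seconds: int, clock: Callable[[], float]) -> None:
        self.calls = 0
        self._lifetime = lifetime_seconds
        self._clock = clock

    def get_token(self, *scopes: str) -> AccessToken:
        self.calls += 1
        return AccessToken(f"token-{self.calls}", int(self._clock()) + self._lifetime)


fake_now = [1_000.0]  # controllable clock for the demonstration
credential = FakeCredential(lifetime_seconds=3600, clock=lambda: fake_now[0])
pg_token = RefreshingToken(
    credential,
    "https://ossrdbms-aad.database.windows.net/.default",
    clock=lambda: fake_now[0],
)
first = pg_token.current()   # fetches a token from the credential
second = pg_token.current()  # served from the cache
fake_now[0] += 3_400         # 200 s of validity left, below the 300 s margin
third = pg_token.current()   # fetches a replacement token
```

In a real connector, current() would be called each time a connection is (re)opened, and the returned string would be used as the Postgres password.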

Option B — accept a password-provider callback

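from azure.identity import DefaultAzureCredential

# Build the credential once; constructing it on every call repeats credential discovery
_credential = DefaultAzureCredential()

def get_pg_password() -> str:
    return _credential.get_token(
        "https://ossrdbms-aad.database.windows.net/.default"
    ).token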

pw.io.postgres.write(
    table=...,
    postgres_settings={"host": "...", "user": "...", ...},
    password_provider=get_pg_password,  # ← called on each (re)connect
    table_name="...",
)

For the blob backend, the equivalent would be a key_provider: Callable[[], str] parameter. This option is lower-magic: it does not require Pathway to depend on azure.identity, and it gives users full control over the token lifecycle.
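
As a rough illustration of the "called on each (re)connect" contract, the connector side could look something like the sketch below. The names connect_with_provider, open_connection, and max_attempts are hypothetical and not part of Pathway; the fake connection function stands in for a real Postgres driver.

```python
from typing import Callable, Optional, TypeVar

Conn = TypeVar("Conn")


def connect_with_provider(
    open_connection: Callable[[str], Conn],
    password_provider: Callable[[], str],
    max_attempts: int = 3,
) -> Conn:
    """Open a connection, asking the provider for a secret on every attempt."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        password = password_provider()  # never cached by the connector
        try:
            return open_connection(password)
        except ConnectionError as exc:
            last_error = exc
    raise ConnectionError(f"gave up after {max_attempts} attempts") from last_error


# Demonstration with a fake driver: the first token is rejected, the second works.
tokens = iter(["expired-token", "fresh-token"])


def fake_open(password: str) -> str:
    if password != "fresh-token":
        raise ConnectionError("authentication failed")
    return f"connected with {password}"


result = connect_with_provider(fake_open, lambda: next(tokens))
```

Because the provider is invoked per attempt rather than once at startup, an expired token is replaced on the next reconnect without restarting the process.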

Why this matters

| Concern | Today | With this feature |
|---|---|---|
| Static AccountKey in K8s Secret | Required | Eliminated |
| Postgres password rotation | Coordinated restart | Transparent |
| Azure Policy compliance | Pathway blocks adoption | Compliant |
| Entra-only Postgres support | Not possible | Native |
| Audit story | Multiple identities (managed identity + service-principal-with-key) | One identity per pod |

Workaround we use today

For the blob backend only:

  1. Store the AccountKey as a secret in Azure Key Vault (one-time, manual rotation)
  2. The pod fetches it at startup using DefaultAzureCredential + azure-keyvault-secrets
  3. The pod passes the in-memory key to Backend.azure(password=key, ...)

This works because the AccountKey is long-lived. It does not generalize to Postgres with Entra-only auth, because tokens expire every ~24 hours and Pathway has no way to refresh without restarting the process.
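
For anyone who wants to reproduce the blob workaround, the startup code is roughly the following sketch. The helper names, the vault URL, and the secret name are placeholders of ours; SecretClient.get_secret(...).value is the azure-keyvault-secrets call we rely on. The Azure imports are kept inside build_secret_client so the rest can be exercised without the SDK installed.

```python
def build_secret_client(vault_url: str):
    """Create a Key Vault client authenticated with the pod's managed identity."""
    # Imported lazily so load_account_key can be tested without the Azure SDK.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    return SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())


def load_account_key(secret_client, secret_name: str) -> str:
    """Fetch the storage AccountKey and fail loudly if the secret is empty."""
    secret = secret_client.get_secret(secret_name)
    if not secret.value:
        raise RuntimeError(f"Key Vault secret {secret_name!r} has no value")
    return secret.value


# Usage at pod startup (placeholder vault URL and secret name):
#
#   client = build_secret_client("https://<your-vault>.vault.azure.net")
#   key = load_account_key(client, "pathway-storage-account-key")
#   backend = pw.persistence.Backend.azure(
#       root_path="pathway_state",
#       account="myaccount",
#       password=key,
#       container="mycontainer",
#   )
```

The key only ever lives in process memory, but it is still a long-lived shared key, which is why we consider this a stopgap rather than a fix.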

Environment

  • Pathway 0.30.1
  • AKS (Kubernetes 1.29) with workload identity
  • Azure Storage Blob (RA-GRS)
  • Azure Database for PostgreSQL Flexible Server
  • Python 3.11

Acceptance criteria

A reasonable shipping bar for us would be:

  • pw.persistence.Backend.azure(...) accepts a credential (Option A) or key provider (Option B) in addition to password=
  • pw.io.postgres.write(...) and pw.io.postgres.read(...) accept a credential + scope (Option A) or password provider (Option B)
  • Token refresh is transparent — Pathway calls the credential / provider when (re)establishing connections, not just at startup
  • Existing password= / postgres_settings["password"] paths continue to work (backward compatible)
  • Documentation example showing AKS managed-identity setup end-to-end


Labels: enhancement