@justin2004
Contributor

if you have multiple users with varying access to the underlying data stores (trino, postgres, etc.) it can be necessary to allow ontop to impersonate each user running queries.

currently in a properties file if you do something like:

jdbc:trino://sometrino.server:8443/some-catalog/some-schema?SSL=true&accessToken=BEARER_TOKEN_HERE

it allows ontop to initialize BUT then ontop can't impersonate the user that runs the SPARQL query.

this PR lets you set your properties like the example above, but if a query arrives at ontop with an HTTP Authorization: bearer BEARER_TOKEN_HERE header, ontop removes the accessToken from the JDBC URL and attaches the Authorization header to the outgoing SQL query instead (thereby allowing ontop to run SQL as user X).
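A minimal sketch of the rewriting step described above, written in Python only to illustrate the logic (the PR itself targets ontop's codebase, and this is not its code). The function names `extract_bearer_token`, `strip_access_token`, and `resolve_connection` are hypothetical, and the sketch assumes the token is passed as a plain `accessToken` query parameter as in the example URL.

```python
from typing import Optional, Tuple

# Query parameter carrying the startup token in the example JDBC URL above.
TOKEN_PARAM = "accessToken"


def extract_bearer_token(authorization_header: Optional[str]) -> Optional[str]:
    """Return the token from a 'Bearer <token>' header value, or None."""
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.strip().partition(" ")
    token = token.strip()
    # The scheme name is matched case-insensitively ("bearer" or "Bearer").
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def strip_access_token(jdbc_url: str) -> str:
    """Drop the accessToken parameter from the URL, keeping all other parameters."""
    base, sep, query = jdbc_url.partition("?")
    if not sep:
        return jdbc_url
    kept = [p for p in query.split("&") if p and p.split("=", 1)[0] != TOKEN_PARAM]
    return base + ("?" + "&".join(kept) if kept else "")


def resolve_connection(
    jdbc_url: str, authorization_header: Optional[str]
) -> Tuple[str, Optional[str]]:
    """Return (url to connect with, per-request token to forward or None)."""
    token = extract_bearer_token(authorization_header)
    if token is None:
        # No usable header: fall back to the URL exactly as configured.
        return jdbc_url, None
    return strip_access_token(jdbc_url), token
```

When no bearer header is present the configured URL is returned unchanged, which matches the behaviour described above: ontop can still initialize with the token from the properties file.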

this approach is specific to trino with OAUTH authentication so i don't expect it to be merged as is but perhaps it could lead to a more general user impersonation solution.

use bearer token in header and strip out the bearer token in the
jdbc url if one is present in the header
@bcogrel
Member

bcogrel commented Jul 26, 2025

Hi @justin2004, thank you for opening this discussion on user impersonation with a concrete implementation. It is indeed an important topic and currently Ontop is missing the right abstractions to handle it.

Your PR handles the particular case of using the JWT authentication of Trino, where the JWT token is directly passed to Ontop as a bearer token in the HTTP Authorization header and Ontop directly passes it to Trino without doing any authentication on its side. So here the user queries the Ontop SPARQL endpoint with the credentials for another service, Trino.

I foresee that a general solution for user impersonation should also support:

  • Having a middleware/API platform above the Ontop endpoints (that's what we do at Ontopic)
  • Letting Ontop be aware of the user identity to apply access control and/or adapt the SQL query according to the corresponding permissions (see Extract user, group and role information from HTTP headers #753)
  • Exchanging OAuth 2 tokens as the audiences of the Ontop service and the database may be different
  • Reusing JDBC connections for a given user
  • Passing the token to the data source according to its DB-specific conventions (no standard here)
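To make the connection-reuse point in the list above concrete, here is a rough sketch of a per-user connection cache, in Python for illustration only. Everything in it is an assumption: `PerUserConnectionCache` and the `open_connection` callback are hypothetical names, not part of Ontop, and a real implementation would also need to close evicted connections and handle token expiry.

```python
import hashlib
from collections import OrderedDict
from typing import Callable, Generic, TypeVar

C = TypeVar("C")


class PerUserConnectionCache(Generic[C]):
    """Keeps one connection per bearer token, evicting the least recently used."""

    def __init__(self, open_connection: Callable[[str], C], max_users: int = 32) -> None:
        # open_connection is a hypothetical factory: token -> connection.
        self._open = open_connection
        self._max = max_users
        self._by_user: "OrderedDict[str, C]" = OrderedDict()

    @staticmethod
    def _key(token: str) -> str:
        # Hash the token so the raw credential is not kept as a dictionary key.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def get(self, token: str) -> C:
        key = self._key(token)
        if key in self._by_user:
            # Cache hit: mark as most recently used and reuse the connection.
            self._by_user.move_to_end(key)
            return self._by_user[key]
        conn = self._open(token)
        self._by_user[key] = conn
        if len(self._by_user) > self._max:
            # Evict the least recently used entry; real code would close it too.
            self._by_user.popitem(last=False)
        return conn
```

Keying on the token rather than on a user name is a simplification; if tokens are exchanged or refreshed per request, a stable user identifier would be a better key.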

Interestingly, this capability is about to arrive in the Postgres world, with PG 18. It is common with Databricks and Snowflake.

I think the QueryContext object should be used and extended to carry the token around with the query.

An alternative architecture would be to use Ontop as a query reformulation service and then have the client send the generated SQL query directly to Trino, if the latter is directly reachable by the client. However, at the moment, the generated SQL query doesn't perfectly match the SPARQL query as an extra post-processing step is needed. We are planning to add this capability next year.

@bcogrel bcogrel marked this pull request as draft July 26, 2025 10:01
@giovannidegani

How about adding support for that to the Databricks connector?

@justin2004
Contributor Author

@giovannidegani what kind of http header do you need to pass through to databricks?
