Reduce database calls for evaluation of client roles #46224
pruivo wants to merge 3 commits into keycloak:main
Closes keycloak#43726 Signed-off-by: Pedro Ruivo <[email protected]>
8c7ee47 to 3aca30b
Signed-off-by: Pedro Ruivo <[email protected]>
Signed-off-by: Alexander Schwartz <[email protected]>
ahus1 left a comment
Thank you for the pull request. Please see below for some comments.
    static {
        var metrics = new CaffeineStatsCounter(Metrics.globalRegistry, "role.name.cache");
        var cache = Caffeine.newBuilder()
                .maximumSize(10000)
                // do not keep entries forever, as the combination of realm and client roles might lead to different matches eventually
                .expireAfterWrite(Duration.ofHours(1))
                .recordStats(() -> metrics)
                .<RoleCacheKey, RoleCacheValue>build();
        metrics.registerSizeMetric(cache);
        ROLE_NAME_FROM_STRING_CACHE = cache;
    }
Could you please try to move some of this to the DefaultAlternativeLookupProvider?
You can get hold of the session using KeycloakSessionUtil.getKeycloakSession() ... this would make sure that there is no leaking of information between session factories.
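To illustrate the concern about leaking between session factories, here is a minimal sketch using only the JDK. The `Factory` class and its `resolve` method are hypothetical stand-ins, not Keycloak types; the point is only that a cache held in an instance field is scoped to its owner, whereas a `static` cache is shared by every factory in the JVM.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class PerFactoryCacheSketch {

    /** Hypothetical stand-in for a session factory; not a Keycloak type. */
    static final class Factory {
        // Instance field: each factory owns its own cache, so entries
        // resolved for one factory are never visible to another.
        private final Map<String, String> roleCache = new ConcurrentHashMap<>();

        String resolve(String roleName, Function<String, String> loader) {
            // The loader is only invoked on a cache miss.
            return roleCache.computeIfAbsent(roleName, loader);
        }
    }

    public static void main(String[] args) {
        Factory a = new Factory();
        Factory b = new Factory();
        // Same key, different factories: each one runs its own loader.
        System.out.println(a.resolve("app.admin", n -> "role-from-A"));
        System.out.println(b.resolve("app.admin", n -> "role-from-B"));
    }
}
```

Had `roleCache` been declared `static`, the second call would have returned the value loaded by the first factory.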
    var clients = realm.searchClientByClientIdStream(clientIdToTest, null, null)
            .collect(Collectors.toMap(ClientModel::getClientId, Function.identity()));
I had a look, and it will do a full table scan on the clients table, as it runs a LIKE operator with % at both the beginning and the end.
A future enhancement might be for this method to allow a % only at the end, so that it could use an index range scan. Given that the results are now cached, this might be acceptable; still, one would need to weigh the different lookups.
Sometimes client IDs are URLs (used in OIDC federation), so one could have a lot of clients starting with the same prefix before the dot, and then one would pull a lot of data from the database.
I would like to be conservative here and would prefer the original code that loops over the dots as the response time would depend only on the number of dots in the role name, and not on the total number of clients in the database.
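As a rough illustration of the "loop over the dots" approach, the sketch below tries each dot as the split point between client ID and role name, using one exact-match lookup per dot and a final realm-role lookup. It is not the Keycloak implementation: the in-memory maps, the `resolve` method, and the `lookups` counter are hypothetical stand-ins for the database calls.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class DotSplitLookupSketch {
    // Hypothetical in-memory stand-ins for the clients table and the realm roles.
    static final Map<String, Set<String>> CLIENT_ROLES = Map.of(
            "my.client", Set.of("admin"),
            "https://example.org/app", Set.of("viewer"));
    static final Set<String> REALM_ROLES = Set.of("offline.access");

    // Counts simulated client and realm-role lookups for the last resolve() call.
    static int lookups = 0;

    static Optional<String> resolve(String name) {
        lookups = 0;
        int dot = name.lastIndexOf('.');
        while (dot >= 0) {
            lookups++; // one exact-match client lookup per candidate split
            String clientId = name.substring(0, dot);
            Set<String> roles = CLIENT_ROLES.get(clientId);
            if (roles != null) {
                String role = name.substring(dot + 1);
                return roles.contains(role)
                        ? Optional.of("client:" + clientId + "/" + role)
                        : Optional.empty();
            }
            dot = name.lastIndexOf('.', dot - 1); // try the next dot to the left
        }
        lookups++; // no client matched: fall back to a realm-role lookup
        return REALM_ROLES.contains(name) ? Optional.of("realm:" + name) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(resolve("my.client.admin") + " after " + lookups + " lookup(s)");
        System.out.println(resolve("offline.access") + " after " + lookups + " lookup(s)");
    }
}
```

In this sketch the worst case for a name with n dots is n client lookups plus one realm-role lookup, regardless of how many clients exist.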
        }
    }

    private record CachedRole(RoleModel cachedRole) {}
Could we go without the CachedRole record as a simplification?
    if (role != null) {
        return new CachedRole(role);
    }
@ahus1, this null check (and the one below) will introduce unnecessary invalidations.
The search for the client ID will return the same client, and it will invoke client.getRole() again.
The role might have been removed from the client, therefore I thought it would be good to invalidate it.
It is Friday and getting late, maybe I'm getting tired. Happy to discuss it more on Monday.
Closes #43726
The previous algorithm made n+1 database calls, where n is the number of dots in the client/role name. This PR changes that to a flat 2 database calls.