74 changes: 63 additions & 11 deletions docs/en/operations/external-authenticators/tokens.md
@@ -29,7 +29,7 @@ To use token-based authentication, add `token_processors` section to `config.xml
Its contents are different for different token processor types.

**Common parameters**
- `type` -- type of token processor. Supported values: "jwt_static_key", "jwt_static_jwks", "jwt_dynamic_jwks", "azure", "openid". Mandatory. Case-insensitive.
- `type` -- type of token processor. Supported values: `jwt_static_key`, `jwt_static_jwks`, `jwt_dynamic_jwks`, `entra` (`azure` is accepted as a back-compat alias and resolves to the same `entra` processor — see the [Entra](#entra) section), `openid`. Mandatory. Case-insensitive.
- `token_cache_lifetime` -- maximum lifetime of cached token (in seconds). Optional, default: 3600.
- `username_claim` -- name of claim (field) that will be treated as ClickHouse username. Optional, default: "sub".
- `groups_claim` -- name of claim (field) that contains list of groups user belongs to. This claim will be looked up in the token itself (in case token is a valid JWT, e.g. in Keycloak) or in response from `/userinfo`. Optional, default: "groups".
@@ -129,22 +129,61 @@ For JWKS-based validators (`jwt_static_jwks` and `jwt_dynamic_jwks`), RS* and ES
- `allow_no_expiration` - If `true`, tokens without the `exp` (expiration) claim are accepted. Otherwise they are rejected. Optional, default: `false`.


## Processors with external providers
## IdP-specific presets and generic external providers

Some tokens cannot be decoded and validated locally. External service is needed in this case. "Azure" and "OpenID" (a generic type) are supported now.
This section covers two related kinds of processor: per-IdP convenience presets built on top of the generic JWT processors (currently `entra`), and the generic `openid` processor that talks to an arbitrary OIDC-compliant identity provider.

### Entra (Microsoft Entra ID, pure OIDC) {#entra}

`<type>entra</type>` is a preset for Microsoft Entra ID built on top of `jwt_dynamic_jwks`. Tokens are validated **locally** against Entra's per-tenant JWKS — no Microsoft Graph call, no userinfo round trip, no OIDC discovery fetch. `username_claim` and `groups_claim` are read directly from the JWT payload. Use this when the access token's `aud` is your own app (registered via Entra's *Expose an API* blade), not `https://graph.microsoft.com`.

:::note Migrating from the legacy `azure` processor
`<type>azure</type>` is now an **alias** for `<type>entra</type>` — at config-parse time the type string is rewritten and the rest of the pipeline is identical. The previous `azure` implementation (which round-tripped every token through Microsoft Graph's `/oidc/userinfo` and `/v1.0/me/memberOf` endpoints) has been removed entirely.

For operators upgrading: an `<type>azure</type>` block that previously had no other parameters will now fail to load with `'tenant_id' must be specified for 'entra' processor`. To migrate, add `<tenant_id>` (and ideally `<expected_audience>`) and make sure your application is configured to mint tokens whose `aud` is your own app, not Microsoft Graph. The setup recipe lives in `docs/entra-setup-draft.md`.
:::

Minimum configuration — only `tenant_id` is required; all other parameters have sensible defaults:

### Azure
```xml
<clickhouse>
<token_processors>
<azure_processor>
<type>azure</type>
</azure_processor>
<entra_prod>
<type>entra</type>
<tenant_id>aaaabbbb-0000-cccc-1111-dddd2222eeee</tenant_id>
</entra_prod>
</token_processors>
</clickhouse>
```

No additional parameters are required.
Example with common overrides (audience binding to a specific app, Entra-flavored username/groups claims):

```xml
<entra_prod>
<type>entra</type>
<tenant_id>aaaabbbb-0000-cccc-1111-dddd2222eeee</tenant_id>
<expected_audience>api://clickhouse</expected_audience>
<username_claim>preferred_username</username_claim>
<groups_claim>roles</groups_claim>
</entra_prod>
```

**Parameters:**

- `tenant_id` — Microsoft Entra tenant identifier (a GUID, or an `*.onmicrosoft.com` domain). **Mandatory.** Multi-tenant aliases (`common`, `organizations`, `consumers`) are rejected because `JwksJwtProcessor` does exact-match issuer validation.

All remaining parameters are optional:

- `jwks_uri` — Override for the JWKS endpoint. Default: `https://login.microsoftonline.com/{tenant_id}/discovery/v2.0/keys`. Override only for sovereign clouds (`login.microsoftonline.us`, `login.partner.microsoftonline.cn`).
- `expected_issuer` — Expected value of the `iss` claim. Default: `https://login.microsoftonline.com/{tenant_id}/v2.0` (derived from `tenant_id`). Override for v1.0 tokens (`https://sts.windows.net/{tenant_id}/`) or sovereign clouds.
- `expected_audience` — Expected value of the `aud` claim, normally your app's Application ID URI (e.g. `api://clickhouse`) or client ID. If unset, no audience check is performed (any signature-valid token from the tenant will authenticate); a warning is logged at startup so the gap is visible.
- `username_claim` — JWT claim to use as the ClickHouse username. Default: `sub`. Common Entra alternatives: `preferred_username`, `upn`, `oid`.
- `groups_claim` — JWT claim that carries the array of group identifiers. Default: `groups`. Set to `roles` if you use App Roles in Entra instead of security-group claims.
- `expected_typ`, `verifier_leeway`, `jwks_cache_lifetime`, `claims`, `allow_no_expiration`, `token_cache_lifetime` — Same as for `jwt_dynamic_jwks`.
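
The defaulting rules in the parameter list above can be sketched as follows. This is only an illustration of how `jwks_uri` and `expected_issuer` could be derived from `tenant_id`, not the code from this PR: `EntraEndpoints`, `isMultiTenantAlias`, and `resolveEntraEndpoints` are hypothetical names, the case-sensitive alias comparison is an assumption, and only the first error message is taken from the text above.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper type (not from the ClickHouse sources): the two
// endpoints that the `entra` preset derives from `tenant_id`.
struct EntraEndpoints
{
    std::string jwks_uri;
    std::string expected_issuer;
};

// Multi-tenant aliases cannot work with exact-match issuer validation,
// so they are rejected up front. Case-sensitive comparison is an assumption.
inline bool isMultiTenantAlias(const std::string & tenant_id)
{
    return tenant_id == "common" || tenant_id == "organizations" || tenant_id == "consumers";
}

// Applies the documented defaults unless an explicit override is given.
inline EntraEndpoints resolveEntraEndpoints(
    const std::string & tenant_id,
    const std::optional<std::string> & jwks_uri_override = std::nullopt,
    const std::optional<std::string> & expected_issuer_override = std::nullopt)
{
    if (tenant_id.empty())
        throw std::invalid_argument("'tenant_id' must be specified for 'entra' processor");
    if (isMultiTenantAlias(tenant_id))
        throw std::invalid_argument("multi-tenant alias '" + tenant_id + "' is not allowed as 'tenant_id'"); // illustrative message

    const std::string base = "https://login.microsoftonline.com/" + tenant_id;
    return EntraEndpoints{
        jwks_uri_override.value_or(base + "/discovery/v2.0/keys"),
        expected_issuer_override.value_or(base + "/v2.0")};
}
```

For a v1.0 token setup, a caller would pass `https://sts.windows.net/{tenant_id}/` as the issuer override and leave `jwks_uri` at its default.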

:::note
The `groups` claim must be enabled in the app registration's manifest (`"groupMembershipClaims": "ApplicationGroup"` is recommended) and exposed in access tokens via `optionalClaims.accessToken`. Group identifiers in the token are object IDs (GUIDs) by default; map them to ClickHouse roles via the user-directory's `roles_mapping` block (see [Identity Provider as an External User Directory](#idp-external-user-directory)).
:::

### OpenID
```xml
@@ -212,7 +251,7 @@ Example (goes into `users.xml`):
Here, the JWT payload must contain `["view-profile"]` on path `resource_access.account.roles`, otherwise authentication will not succeed even with a valid JWT.

:::note
Per-user `claims` are enforced only when the token is a JWT (validated by a JWT processor such as `jwt_static_key` or `jwt_dynamic_jwks`). When the user authenticates with an opaque (access) token (e.g. via Azure, OpenID, or Google token processors), claims are not checked and authentication succeeds if the token is otherwise valid.
Per-user `claims` are enforced only when the token is a JWT (validated by a JWT processor such as `jwt_static_key`, `jwt_dynamic_jwks`, or `entra`). When the user authenticates with an opaque (access) token (e.g. via OpenID or Google token processors), claims are not checked and authentication succeeds if the token is otherwise valid.
:::

```
@@ -256,6 +295,16 @@ All this implies that the SQL-driven [Access Control and Account Management](/do
<token_test_role_1 />
</common_roles>
<default_profile>my_profile</default_profile>
<roles_mapping>
<map>
<from>8a1b2c3d-4e5f-6789-abcd-ef0123456789</from>
<to>ch_admin</to>
</map>
<map>
<from>9f8e7d6c-5b4a-3210-fedc-ba0987654321</from>
<to>ch_analyst</to>
</map>
</roles_mapping>
<roles_filter>
\bclickhouse-[a-zA-Z0-9]+\b
</roles_filter>
@@ -274,5 +323,8 @@ For now, no more than one `token` section can be defined inside `user_directorie
- `processor` — Name of one of processors defined in `token_processors` config section described above. This parameter is mandatory and cannot be empty.
- `common_roles` — Section with a list of locally defined roles that will be assigned to each user retrieved from the IdP. Optional.
- `default_profile` — Name of a locally defined settings profile that will be assigned to each user retrieved from the IdP. If the profile does not exist, a warning will be logged and the user will be created without a profile. Optional.
- `roles_filter` — Regex string for groups filtering. Only groups matching this regex will be mapped to roles. Optional.
- `roles_transform` — Sed-style transform pattern to apply to group names before mapping to roles. Format: `s/pattern/replacement/flags`. The `g` flag applies the replacement globally (all occurrences). Example: `s/-/_/g` converts `clickhouse-grp-dba` to `clickhouse_grp_dba`. Optional.
- `roles_mapping` — Explicit map from incoming group identifier (e.g. an Entra security-group object ID) to a ClickHouse role name. Each entry is a `<map>` element with `<from>` and `<to>` children. Applied **before** `roles_filter` and `roles_transform`; groups absent from the map pass through unchanged, so the filter stage can be used to drop unmapped entries. Optional.
- `roles_filter` — Regular expression used to filter groups. Only groups whose name (after `roles_mapping` is applied) fully matches this regex are kept; all other groups are dropped. Optional.
- `roles_transform` — Sed-style transform pattern applied to the group names that survive `roles_mapping` and `roles_filter`; the result is used as the ClickHouse role name. Format: `s/pattern/replacement/flags`. The `g` flag applies the replacement globally (all occurrences); without it, only the first occurrence is replaced. Example: `s/-/_/g` converts `clickhouse-grp-dba` to `clickhouse_grp_dba`. Optional.
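
To make the `s/pattern/replacement/flags` format concrete, here is a minimal parsing sketch. It is not the parser used in `TokenAccessStorage.cpp`: `SedTransform` and `parseSedTransform` are hypothetical names, and treating `\/` as an escaped delimiter and rejecting any flag other than `g` are assumptions made for the example.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical result type: the three pieces of a sed-style expression.
struct SedTransform
{
    std::string pattern;
    std::string replacement;
    bool global = false;
};

// Splits "s/pattern/replacement/flags" on unescaped '/' characters.
// Returns std::nullopt when the expression is malformed.
inline std::optional<SedTransform> parseSedTransform(const std::string & expr)
{
    if (expr.size() < 4 || expr[0] != 's' || expr[1] != '/')
        return std::nullopt;

    std::vector<std::string> parts(1);
    for (std::size_t i = 2; i < expr.size(); ++i)
    {
        if (expr[i] == '\\' && i + 1 < expr.size() && expr[i + 1] == '/')
        {
            parts.back() += '/'; // assumption: "\/" is a literal slash
            ++i;
        }
        else if (expr[i] == '/')
            parts.emplace_back(); // unescaped delimiter starts the next part
        else
            parts.back() += expr[i];
    }

    // Exactly pattern, replacement, and (possibly empty) flags are expected.
    if (parts.size() != 3 || parts[0].empty())
        return std::nullopt;

    SedTransform result{parts[0], parts[1], false};
    for (char flag : parts[2])
    {
        if (flag == 'g')
            result.global = true;
        else
            return std::nullopt; // assumption: unknown flags are an error
    }
    return result;
}
```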

The three stages run in this order: `roles_mapping` → `roles_filter` → `roles_transform`. Stages are independent and any of them may be omitted.
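
The stage order can be illustrated with a small standalone sketch. It mirrors the loop in `authenticateImpl` only in spirit: `RolePipeline` and `resolveRoles` are hypothetical names, and `std::regex` stands in for RE2 so that the example needs nothing beyond the standard library (regex and replacement syntax differ between the two).

```cpp
#include <map>
#include <optional>
#include <regex>
#include <set>
#include <string>
#include <vector>

// Hypothetical container for the three optional stages.
struct RolePipeline
{
    std::map<std::string, std::string> mapping;   // roles_mapping: from -> to
    std::optional<std::regex> filter;             // roles_filter, full match required
    std::optional<std::regex> transform_pattern;  // pattern part of roles_transform
    std::string transform_replacement;            // replacement part of roles_transform
    bool transform_global = false;                // true when the `g` flag is set
};

// Runs mapping, then filter, then transform for every incoming group.
inline std::set<std::string> resolveRoles(const RolePipeline & pipeline, const std::vector<std::string> & groups)
{
    std::set<std::string> roles;
    for (const auto & group : groups)
    {
        std::string name = group;

        // Stage 1: explicit mapping; groups absent from the map pass through unchanged.
        if (auto it = pipeline.mapping.find(group); it != pipeline.mapping.end())
            name = it->second;

        // Stage 2: drop names that do not fully match the filter.
        if (pipeline.filter && !std::regex_match(name, *pipeline.filter))
            continue;

        // Stage 3: sed-style replacement, first occurrence only unless global.
        if (pipeline.transform_pattern)
        {
            const auto flags = pipeline.transform_global
                ? std::regex_constants::format_default
                : std::regex_constants::format_first_only;
            name = std::regex_replace(name, *pipeline.transform_pattern, pipeline.transform_replacement, flags);
        }

        roles.insert(name);
    }
    return roles;
}
```

With a mapping from an Entra object ID to `clickhouse-admin`, a filter of `clickhouse-[a-zA-Z0-9]+`, and the transform `s/-/_/g`, the mapped group becomes `clickhouse_admin`, while an unmapped object ID fails the filter and is dropped.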
114 changes: 77 additions & 37 deletions src/Access/TokenAccessStorage.cpp
@@ -180,6 +180,38 @@ TokenAccessStorage::TokenAccessStorage(const String & storage_name_, AccessContr
roles_transform_global = parsed.global;
}

/// Explicit `roles_mapping` entries are read as a list of <map><from>X</from><to>Y</to></map>
/// children. The mapping rewrites incoming group names BEFORE `roles_filter` / `roles_transform`,
/// so each subsequent stage operates on the mapped value. Groups not listed here pass through
/// to filter/transform unchanged.
if (config.has(prefix_str + "roles_mapping"))
{
Poco::Util::AbstractConfiguration::Keys map_keys;
config.keys(prefix_str + "roles_mapping", map_keys);

for (const auto & key : map_keys)
{
const String entry_prefix = prefix_str + "roles_mapping." + key;
if (!config.has(entry_prefix + ".from") || !config.has(entry_prefix + ".to"))
throw Exception(ErrorCodes::BAD_ARGUMENTS,
"roles_mapping entry '{}' must contain both 'from' and 'to' subelements", key);

const String from = config.getString(entry_prefix + ".from");
const String to = config.getString(entry_prefix + ".to");

if (from.empty())
throw Exception(ErrorCodes::BAD_ARGUMENTS, "roles_mapping entry '{}': 'from' must not be empty", key);
if (to.empty())
throw Exception(ErrorCodes::BAD_ARGUMENTS, "roles_mapping entry '{}': 'to' must not be empty", key);

auto [it, inserted] = roles_mapping.emplace(from, to);
if (!inserted)
throw Exception(ErrorCodes::BAD_ARGUMENTS,
"roles_mapping has duplicate 'from' value '{}' (already mapped to '{}', cannot remap to '{}')",
from, it->second, to);
}
}

provider_name = config.getString(prefix_str + "processor");
if (provider_name.empty())
throw Exception(ErrorCodes::BAD_ARGUMENTS, "'processor' must be specified for Token user directory");
@@ -593,51 +625,59 @@ std::optional<AuthResult> TokenAccessStorage::authenticateImpl(
if (!isAddressAllowed(*user, address))
throwAddressNotAllowed(address);

/// Pipeline: incoming group --(roles_mapping)--> mapped name --(roles_filter)--> kept/dropped --(roles_transform)--> CH role name.
/// Each stage is independent and optional; groups absent from `roles_mapping` pass through unchanged.
std::set<String> external_roles;
if (roles_filter.has_value())

/// Defensive: a broken filter regex must NEVER fall through to the permissive
/// "grant everything that survives the rest of the pipeline" branch. Parse-time
/// validation in the constructor already rejects invalid patterns; this guard
/// preserves the invariant in case any future code path constructs the filter
/// without the parse-time check (e.g. config reload).
if (roles_filter.has_value() && !roles_filter->ok())
{
/// Defensive: a broken regex must NEVER cause a fall-through to the
/// permissive "grant all groups" branch. Parse-time validation in the
/// constructor already rejects invalid patterns; this guard ensures the
/// invariant still holds if any future code path constructs the filter
/// without the parse-time check (e.g. config reload).
if (!roles_filter->ok())
{
LOG_ERROR(getLogger(),
"{}: Configured 'roles_filter' is invalid ('{}'); refusing to map any "
"external roles for user '{}' to avoid granting all token groups.",
getStorageName(), roles_filter->error(), credentials.getUserName());
}
else
{
LOG_TRACE(getLogger(), "{}: External role filter found, applying only matching groups", getStorageName());
for (const auto & group: token_credentials.getGroups()) {
if (RE2::FullMatch(group, roles_filter.value()))
{
String transformed_group = group;
if (roles_transform_pattern.has_value() && roles_transform_replacement.has_value())
{
transformed_group = applyTransform(group, roles_transform_pattern.value(), roles_transform_replacement.value(), roles_transform_global);
LOG_TRACE(getLogger(), "{}: Transformed group '{}' to '{}'", getStorageName(), group, transformed_group);
}
external_roles.insert(transformed_group);
LOG_TRACE(getLogger(), "{}: Granted role (group) {} to user", getStorageName(), transformed_group);
}
}
}
LOG_ERROR(getLogger(),
"{}: Configured 'roles_filter' is invalid ('{}'); refusing to map any "
"external roles for user '{}' to avoid granting all token groups.",
getStorageName(), roles_filter->error(), credentials.getUserName());
}
else
{
LOG_TRACE(getLogger(), "{}: No external role filtering set, applying all available groups", getStorageName());
for (const auto & group: token_credentials.getGroups())
const bool has_filter = roles_filter.has_value();
const bool has_transform = roles_transform_pattern.has_value() && roles_transform_replacement.has_value();

for (const auto & group : token_credentials.getGroups())
{
String transformed_group = group;
if (roles_transform_pattern.has_value() && roles_transform_replacement.has_value())
String name = group;

if (!roles_mapping.empty())
{
transformed_group = applyTransform(group, roles_transform_pattern.value(), roles_transform_replacement.value(), roles_transform_global);
LOG_TRACE(getLogger(), "{}: Transformed group '{}' to '{}'", getStorageName(), group, transformed_group);
const auto it = roles_mapping.find(group);
if (it != roles_mapping.end())
{
name = it->second;
LOG_TRACE(getLogger(), "{}: Mapped group '{}' to '{}'", getStorageName(), group, name);
}
}
external_roles.insert(transformed_group);

if (has_filter && !RE2::FullMatch(name, roles_filter.value()))
{
LOG_TRACE(getLogger(), "{}: Group '{}' (after mapping) did not match roles_filter, skipping", getStorageName(), name);
continue;
}

if (has_transform)
{
String transformed = applyTransform(name, roles_transform_pattern.value(), roles_transform_replacement.value(), roles_transform_global);
if (transformed != name)
{
LOG_TRACE(getLogger(), "{}: Transformed '{}' to '{}'", getStorageName(), name, transformed);
name = std::move(transformed);
}
}

external_roles.insert(name);
LOG_TRACE(getLogger(), "{}: Granted role (group) {} to user", getStorageName(), name);
}
}

4 changes: 4 additions & 0 deletions src/Access/TokenAccessStorage.h
@@ -49,6 +49,10 @@ class TokenAccessStorage : public IAccessStorage
const String & prefix;

String provider_name;
/// Explicit mapping from incoming group (e.g. Entra group object ID) to a ClickHouse role name.
/// Applied BEFORE `roles_filter` and `roles_transform`. Groups absent from this map pass through
/// unchanged, so the filter stage can be used to drop unmapped entries.
std::map<String, String> roles_mapping;
std::optional<re2::RE2> roles_filter = std::nullopt;
/// `roles_transform` regex compiled once at construction. Storing the
/// compiled `re2::RE2` (instead of the pattern string) avoids per-call
15 changes: 0 additions & 15 deletions src/Access/TokenProcessors.h
@@ -194,21 +194,6 @@ class GoogleTokenProcessor : public ITokenProcessor
const String expected_audience;
};

class AzureTokenProcessor : public ITokenProcessor
{
public:
AzureTokenProcessor(const String & processor_name_,
UInt64 token_cache_lifetime_,
const String & username_claim_,
const String & groups_claim_,
const String & expected_audience_);

bool resolveAndValidate(TokenCredentials & credentials) const override;

private:
const String expected_audience;
};

class OpenIdTokenProcessor : public ITokenProcessor
{
public: