Support mapping to Postgres JSON type #230
Merged
serprex
reviewed
May 6, 2026
serprex
reviewed
May 6, 2026
serprex
approved these changes
May 6, 2026
In addition to the JSONB type mapping, add support for mapping the Postgres JSON type to the ClickHouse JSON type. It already worked for the JSON driver but hadn't been tested; adding it to the binary driver required a bit more work.

Mainly, that work consists of making the binary type handling functions aware of the Postgres types used in columns. By default, the binary driver supports a short list of types that map to ClickHouse types, but allows for other types by using the functions in `convert.c` to convert between compatible types. These conversions require casts. And while Postgres supports casts between JSON and JSONB, they're `COERCION_PATH_COERCEVIAIO` casts, which `convert.c` doesn't handle.

We could consider adding support for these conversions, but since that requires using the `_in` and `_out` functions to do the conversion, and since for JSON all we really need is the text version, it seems unnecessarily wasteful in terms of CPU cycles and memory.

Instead, refactor things so that the binary conversion functions are aware not only of the explicitly supported type mappings provided by `get_corr_postgres_type()`, but also of the types used for the columns in the foreign tables. Then teach `get_corr_postgres_type()` to inspect these values in the context of the ClickHouse JSON type and allow both.

For fetches (selects), on the first call to `ch_binary_read_row()`, populate `coltypes` with the actual column types and teach `make_datum()` to examine it when considering JSON types. This requires that the `List` of attributes be passed to `ch_binary_read_row()`, and also that `coltypes` be initialized with zero values. Ideally `coltypes` would be populated by an earlier hook, before fetching starts, but this will do for now.
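To make the fetch-side type selection described above concrete, here is a minimal, self-contained C sketch. It is not the extension's code: the names `sketch_corr_postgres_type()`, `sketch_fill_coltypes()`, and the `ch_type` enum are hypothetical, the `Oid` typedef and OID constants are stand-ins for the Postgres headers (the values match the `pg_type` catalog), and falling back to JSONB when the column is not declared as JSON is an assumption.

```c
#include <stddef.h>

typedef unsigned int Oid;   /* stand-in for the Postgres Oid typedef */

#define InvalidOid 0
#define TEXTOID    25
#define JSONOID    114
#define JSONBOID   3802

/* Hypothetical tag for a ClickHouse column type (illustration only). */
typedef enum { CH_STRING, CH_JSON } ch_type;

/*
 * Choose the Postgres type for a ClickHouse column. For CH_JSON, honor a
 * column declared as json; anything else gets jsonb (assumed default).
 */
Oid
sketch_corr_postgres_type(ch_type chtype, Oid declared)
{
    switch (chtype)
    {
        case CH_JSON:
            return declared == JSONOID ? JSONOID : JSONBOID;
        case CH_STRING:
            return TEXTOID;
    }
    return InvalidOid;
}

/*
 * Fill in zero-initialized coltypes entries from the declared column
 * types. Entries that are already set are left untouched, so calling
 * this on every row only does work the first time.
 */
void
sketch_fill_coltypes(Oid *coltypes, const Oid *declared, size_t ncols)
{
    for (size_t i = 0; i < ncols; i++)
    {
        if (coltypes[i] == InvalidOid)
            coltypes[i] = declared[i];
    }
}
```

The zero-initialization requirement mentioned above is what lets the sketch use `InvalidOid` as the "not yet populated" marker.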
For updates (inserts), on the other hand, create a new `List` and populate it with the requisite types in `clickhouseBeginForeignInsert` and `clickhousePlanForeignModify` (not sure why it needs both, but it follows the pattern of setting column attribute numbers, which are also in both places). Then pass it as a new argument to the `prepare_insert` function, now with an updated signature (the HTTP version currently ignores the new argument). Then fetch the types from it in `ch_binary_prepare_insert()` to pass to `get_corr_postgres_type()`.

Teach `init_output_convert_state()` not to try to convert between `JSON` and `JSONB` types, and add comments to explain the type management.

Finally, update the function and operator pushdown functions to support the `->` and `->>` operators for JSON, as well as the `json_extract_path_text()` and `json_extract_path()` functions. Rename the relevant constants to `JSON_*` instead of `JSONB_*`, since they now handle both and `JSON_*` is the more generic name.

Expand the JSON tests to cover all the same patterns for JSON that were previously covered for JSONB, for both the binary and http drivers. This requires additional alternate expected output files, because earlier ClickHouse versions did not support JSON and return varying error messages about it.

Document the support for JSON, as well as the operators and functions. In fact, the `jsonb_()` functions weren't previously documented, so add them, too. Also fix the reversed descriptions of the `->` and `->>` operators and mention that they work for both JSON and JSONB.

While at it, update the exception raised when `column_append()` can't handle a specific Postgres type to emit the name of the unsupported type.
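The rule given to `init_output_convert_state()` above, not attempting a conversion between `JSON` and `JSONB`, can be sketched as a small predicate. This is only an illustration under stated assumptions: `sketch_is_json_family()` and `sketch_needs_cast()` are hypothetical names, the `Oid` typedef and constants stand in for the Postgres headers, and the real function may structure the check differently.

```c
#include <stdbool.h>

typedef unsigned int Oid;   /* stand-in for the Postgres Oid typedef */

#define TEXTOID  25
#define JSONOID  114
#define JSONBOID 3802

/* True for either of the two Postgres JSON types. */
bool
sketch_is_json_family(Oid type)
{
    return type == JSONOID || type == JSONBOID;
}

/*
 * Decide whether a cast lookup is needed to convert from one type to
 * another. Identical types need none, and a JSON/JSONB pair is skipped
 * because both are handled through their text form rather than a cast.
 */
bool
sketch_needs_cast(Oid from, Oid to)
{
    if (from == to)
        return false;
    if (sketch_is_json_family(from) && sketch_is_json_family(to))
        return false;
    return true;
}
```

Skipping the pair up front avoids the `COERCION_PATH_COERCEVIAIO` path entirely, which matches the reasoning in the description that only the text version of the JSON value is needed.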