The IN operator now uses exact value semantics for the Bool type: only 0 and 1 in the set match Bool values. Previously, numeric values greater than 255 in the IN set were incorrectly clamped to true, so SELECT CAST(1, 'Bool') IN (256) returned 1. It now correctly returns 0. #93115 (Ashrith Bandla).
Fixed NOT operator precedence to match the SQL standard: NOT now binds looser than IS NULL, BETWEEN, LIKE, and arithmetic operators. For example, NOT (x) IS NULL is now parsed as NOT (x IS NULL) instead of (NOT x) IS NULL. This may change the result of queries that relied on the previous (non-standard) behavior. #97680 (Alexey Milovidov).
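To see why the two parses can give different results, here is a minimal Python sketch of SQL three-valued logic, with None standing in for NULL. The helper names (sql_not, is_null, new_parse, old_parse) are illustrative only and are not ClickHouse functions.

```python
# None models SQL NULL. All helper names here are hypothetical.

def sql_not(v):
    """SQL NOT under three-valued logic: NOT NULL stays NULL."""
    return None if v is None else (0 if v else 1)

def is_null(v):
    """SQL IS NULL: always yields 0 or 1, never NULL."""
    return 1 if v is None else 0

def new_parse(x):
    # Standard parse: NOT (x IS NULL)
    return sql_not(is_null(x))

def old_parse(x):
    # Previous, non-standard parse: (NOT x) IS NULL
    return is_null(sql_not(x))

for x in (None, 0, 1):
    print(repr(x), new_parse(x), old_parse(x))
```

In this model the two parses disagree for NULL as well as for non-NULL inputs, which is why affected queries are worth reviewing.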
SELECT is no longer allowed as a bareword identifier in a WITH expression list element. #101059 (Aruj Bansal).
The IN operator now rejects lossy Decimal conversions inside composite types (Tuple, Array, Map), making its behavior consistent with top-level scalar comparisons. Previously, precision checks were enforced only for top-level scalar values: for example, CAST('33.3', 'Decimal64(1)') IN (33.33) correctly returned 0, but CAST(['33.3'], 'Array(Decimal64(1))') IN ([33.33]) incorrectly returned 1. Both cases now correctly return 0. #101812 (Nihal Z. Miaji).
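The precision check described above can be sketched in Python with the decimal module. This is only an illustration of the idea, not ClickHouse's implementation; the function names are hypothetical, and literals are passed as strings to avoid binary floating-point noise.

```python
from decimal import Decimal

def fits_scale(value: str, scale: int) -> bool:
    """True if `value` is exactly representable with `scale` fractional digits."""
    d = Decimal(value)
    return d == d.quantize(Decimal(1).scaleb(-scale))

def in_set(column_value: str, candidates, scale: int) -> bool:
    """Scalar IN: a candidate matches only if converting it loses nothing."""
    return any(
        fits_scale(c, scale) and Decimal(c) == Decimal(column_value)
        for c in candidates
    )

def array_in_set(column_array, candidate_arrays, scale: int) -> bool:
    """Array IN: apply the same lossless check to every element."""
    return any(
        len(c) == len(column_array)
        and all(in_set(v, [e], scale) for v, e in zip(column_array, c))
        for c in candidate_arrays
    )

print(in_set("33.3", ["33.33"], 1))              # scalar case
print(array_in_set(["33.3"], [["33.33"]], 1))    # nested case, same rule
```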
Data type serialization versions are now propagated to nested data types. For example, the String serialization version with_size_stream was previously applied only to top-level String columns and Tuple elements; it is now applied to any String type inside any nested type such as Array, Map, Variant, or JSON. This is controlled by the MergeTree setting propagate_types_serialization_versions_to_nested_types, which is now enabled by default. After this change, newly created data parts cannot be read by older versions, but old parts can be read on the new version. Upgrades are safe, but downgrades are not. #94859 (Pavel Kruglov).
Fixed endianness of QBit binary serialization to use little-endian format. #101378 (Raufs Dunamalijevs).
Corrected the metadata of normal projections so that projections with multi-column sorting keys are properly recognized. #91352 (Amos Bird).
Fixed skip index files not respecting the replace_long_file_name_to_hash setting, which caused "File name too long" errors and broken index reads for indices with long names. Skip index filenames are now hashed when they exceed max_file_name_length, similar to column files. This is backward compatible (new servers read old parts), but downgrading (or old servers during a rolling upgrade) may cause long-named indices to be ignored. #97128 (Raúl Marín).
Removed the hypothesis skip index type. It was an obscure, experimental feature with limited practical use. Creating tables with INDEX ... TYPE hypothesis now produces an error. #96874 (Alexey Milovidov).
Changed the default value of mysql_datatypes_support_level from empty to decimal,datetime64,date2Date32, enabling proper mapping of MySQL DATE to Date32, DECIMAL/NUMERIC to Decimal, and DATETIME/TIMESTAMP with precision to DateTime64 by default. Previously, MySQL DATE columns were mapped to Date, which cannot represent dates before 1970-01-01, causing data corruption. #97716 (Alexey Milovidov).
Async insert is now enabled by default, so ClickHouse batches all small inserts by default. This is governed by the compatibility setting: setting compatibility to a version lower than 26.2 restores the previous default of false. Async inserts can be turned on or off at several levels: in user profiles in the config, for the session, for the query, or for the MergeTree table. #97590 (Sema Checherinda).
mergeTreeAnalyzeIndexes{,UUID} now accepts an array of part names instead of a regexp, since regexps are slow (experimental feature). #98474 (Azat Khuzhin).
The H3 library has been updated to v4, which improves the precision of length, area, and other metric calculations. This change is backward incompatible because the new results differ from previous ones. #100348 (Alexey Milovidov).
Changed how the Merge table handles virtual columns. If the underlying table contains _table or _database, these columns are read from storage; otherwise they are filled after the read step using the expression step. #101742 (Mikhail Artemenko).
Added the naturalSortKey(s) function for natural sort keys. #90322 (Nazarii Piontko).
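As a rough illustration of what a natural sort key does in general, the Python sketch below splits a string into text and digit runs so that numeric runs compare as numbers. Whether naturalSortKey produces exactly this ordering is an assumption; natural_key is a hypothetical helper, not the ClickHouse function.

```python
import re

def natural_key(s: str):
    """Build a key in which digit runs compare numerically, text runs lexically."""
    key = []
    # With one capturing group, re.split alternates: text, digits, text, ...
    for i, part in enumerate(re.split(r"([0-9]+)", s)):
        if part == "":
            continue
        key.append((1, int(part)) if i % 2 else (0, part))
    return key

names = ["file10.txt", "file2.txt", "file1.txt"]
print(sorted(names))                   # plain lexical order
print(sorted(names, key=natural_key))  # natural order
```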
Added the arrayAutocorrelation(arr [, max_lag]) function, which computes the normalized autocorrelation of a numeric array for each lag. Supports integer, float, and decimal array types. #94776 (Wenyu Chen).
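For readers unfamiliar with the term, a common definition of normalized autocorrelation is sketched below in Python: the lagged sum of products of mean-centered values, divided by the lag-0 sum. The exact normalization, the lag range, and the handling of constant arrays in arrayAutocorrelation are assumptions here, not taken from the ClickHouse source.

```python
def autocorrelation(values, max_lag=None):
    """Normalized autocorrelation for lags 0 .. max_lag-1 (assumed convention)."""
    n = len(values)
    if n == 0:
        return []
    if max_lag is None or max_lag > n:
        max_lag = n
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    result = []
    for lag in range(max_lag):
        num = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        # A constant array has zero variance; NaN is an assumed convention.
        result.append(num / denom if denom else float("nan"))
    return result

print(autocorrelation([1, 2, 3, 4, 5], max_lag=3))
```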
The has() function is now supported for the JSON type to check path existence, similar to Map. #96927 (DQ).
Added the mergeTreeTextIndex(database, table, index) table function, which reads data directly from a text index. It can be used for introspection or for performing aggregations on top of text index data. #97003 (Anton Popov).
Added the caseFoldUTF8 and removeDiacriticsUTF8 functions for Unicode case folding and diacritical mark removal. #98973 (George Larionov).
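The two operations can be illustrated with Python's standard library. This is a sketch of the general Unicode techniques (case folding, and stripping combining marks after canonical decomposition), not a statement about how the ClickHouse functions are implemented or which edge cases they cover.

```python
import unicodedata

def case_fold(s: str) -> str:
    """Unicode case folding, e.g. for caseless comparison."""
    return s.casefold()

def remove_diacritics(s: str) -> str:
    """Decompose, drop combining marks, then recompose."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)

print(case_fold("Stra\u00dfe"))                           # sharp s folds to "ss"
print(remove_diacritics("Cr\u00e8me br\u00fbl\u00e9e"))
```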
The printf function now supports non-constant format strings, allowing a different format pattern per row based on column values. #98991 (Yash).
Added the highlight() function, which wraps occurrences of search terms in a text string with HTML tags (default <em>/</em>). Supports ASCII case-insensitive matching, automatic merging of overlapping matches, and custom open/close tags. #99131 (Peng).
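The behavior described above (ASCII case-insensitive matching, merging of overlapping matches, custom tags) can be sketched in Python as follows. The signature and the treatment of merely adjacent matches are assumptions; this is not the ClickHouse implementation.

```python
def ascii_lower(s: str) -> str:
    """Lowercase ASCII letters only, so string offsets stay unchanged."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)

def highlight(text, terms, open_tag="<em>", close_tag="</em>"):
    haystack = ascii_lower(text)
    spans = []
    for term in terms:
        needle = ascii_lower(term)
        if not needle:
            continue
        start = haystack.find(needle)
        while start != -1:
            spans.append((start, start + len(needle)))
            start = haystack.find(needle, start + 1)
    spans.sort()
    merged = []
    for start, end in spans:
        # Merge strictly overlapping spans; adjacent spans stay separate (assumption).
        if merged and start < merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    out, pos = [], 0
    for start, end in merged:
        out.append(text[pos:start])
        out.append(open_tag + text[start:end] + close_tag)
        pos = end
    out.append(text[pos:])
    return "".join(out)

print(highlight("Hello World", ["hello"]))
```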
Added the normalizeUTF8NFKCCasefold string function for NFKC_Casefold Unicode normalization, which combines NFKC normalization with case folding. #99276 (George Larionov).
Added the unicode_word tokenizer for full-text indexes and the tokens function. It splits text using Unicode word boundary rules: ASCII words are formed with connector characters (underscore, colon, dot, single quote), while non-ASCII Unicode characters become single-character tokens. Configurable stop words default to common CJK punctuation. #99357 (Amos Bird).
Added the arrayTranspose function, which takes a two-dimensional array (matrix) and transposes it. #101214 (Vitaly Baranov).
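For reference, transposing a two-dimensional array swaps rows and columns. A minimal Python equivalent is shown below; rejecting ragged input is an assumption about error handling, not documented behavior of arrayTranspose.

```python
def transpose(matrix):
    """Swap rows and columns of a rectangular list of lists."""
    if len({len(row) for row in matrix}) > 1:
        raise ValueError("all rows must have the same length")
    return [list(column) for column in zip(*matrix)]

print(transpose([[1, 2, 3], [4, 5, 6]]))
```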
Added the hasPhrase function (alias matchPhrase) for phrase search (contiguous sequences of tokens). Search is brute-force and not yet supported by the text index. #101997 (Elmi Ahmadov).
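A brute-force phrase search can be sketched in Python as a scan for the phrase tokens appearing next to each other, in order. The tokenizer (ASCII alphanumeric runs, case-sensitive) and the result for an empty phrase are assumptions made for this example, not the behavior of hasPhrase.

```python
import re

def tokens(text: str):
    """Assumed tokenizer: runs of ASCII letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)

def has_phrase(text: str, phrase: str) -> bool:
    hay, needle = tokens(text), tokens(phrase)
    if not needle:
        return False  # assumption: an empty phrase never matches
    width = len(needle)
    # Slide a window of the phrase length over the text tokens.
    return any(hay[i:i + width] == needle for i in range(len(hay) - width + 1))

print(has_phrase("the quick brown fox", "quick brown"))
print(has_phrase("the quick brown fox", "quick fox"))
```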
Added dotProduct as a supported distance function for vector similarity indexes. Sort direction is now validated for all distance functions to prevent silently incorrect results. #102254 (Renzo).
The stem function can now stem all words/tokens in String, FixedString, Array([Fixed]String), Nullable, LowCardinality, and Const columns. #99137 (Jimmy Aguilar Mena).
The stem function is now non-experimental (previously, allow_experimental_nlp_functions had to be enabled). #102399 (Jimmy Aguilar Mena).
Added the JSONAllValues function, which returns all values from a JSON column as Array(String), with values serialized in text representation and ordered by their path names. Added text index support for the JSONAllValues expression on JSON columns: when a text index is created on JSONAllValues(json_column), it is automatically used to filter queries on JSON subcolumns (e.g. json_column.key1 = 'value'). #100730 (Anton Popov).
Added PostgreSQL-compatible units to the EXTRACT operator: EPOCH, DOW, DOY, ISODOW, ISOYEAR, WEEK, CENTURY, DECADE, and MILLENNIUM. Also fixed EXTRACT(WEEK FROM date), which previously threw an error. #100274 (Alexey Milovidov).
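As a reference for what these units mean, the Python sketch below follows the PostgreSQL definitions for dates in the Common Era (DOW counts from Sunday = 0, ISODOW from Monday = 1, WEEK and ISOYEAR follow ISO 8601). It is assumed, not verified, that ClickHouse matches these definitions in every edge case; extract is a hypothetical helper.

```python
from datetime import date

def extract(unit: str, d: date) -> int:
    """PostgreSQL-style EXTRACT for a date (years >= 1 only)."""
    iso_year, iso_week, iso_weekday = d.isocalendar()
    units = {
        "EPOCH": (d - date(1970, 1, 1)).days * 86400,
        "DOW": iso_weekday % 7,        # Sunday = 0 ... Saturday = 6
        "ISODOW": iso_weekday,         # Monday = 1 ... Sunday = 7
        "DOY": d.timetuple().tm_yday,  # 1 ... 366
        "WEEK": iso_week,              # ISO 8601 week number
        "ISOYEAR": iso_year,           # ISO 8601 week-numbering year
        "DECADE": d.year // 10,
        "CENTURY": (d.year + 99) // 100,
        "MILLENNIUM": (d.year + 999) // 1000,
    }
    return units[unit.upper()]

d = date(2023, 1, 1)  # a Sunday that belongs to ISO year 2022
print({u: extract(u, d) for u in ("DOW", "ISODOW", "WEEK", "ISOYEAR")})
```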
Added parser-level syntactic sugar for the SQL standard OVERLAY function syntax. The overlay function already exists; this adds support for the keyword-based form using PLACING, FROM, and FOR as separators. #101681 (Desel72).
Date and Date32 values can now be added to Time and Time64 values using the + operator, producing a DateTime or DateTime64 result. For example, SELECT toDate('2024-01-15') + toTime('14:30:25') returns 2024-01-15 14:30:25. The result is computed in the session timezone, and out-of-range results are handled according to the date_time_overflow_behavior setting. #102421 (Nihal Z. Miaji).
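The same combination can be expressed in Python for comparison. This sketch ignores time zones and overflow handling, both of which ClickHouse applies according to the settings mentioned above.

```python
from datetime import date, datetime, time

def add_date_and_time(d: date, t: time) -> datetime:
    """Combine a calendar date and a time of day into one timestamp."""
    return datetime.combine(d, t)

print(add_date_and_time(date(2024, 1, 15), time(14, 30, 25)))
```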
When a parameter has a Nullable type and is not specified, its value is now assumed to be NULL. #93869 (Vikash Kumar).
ClickHouse can now prune entire data parts in SELECT queries based on min/max statistics. #94140 (zoomxi).
Added support for Materialized CTEs, which evaluate CTEs only once during query execution and store their results in temporary tables. #94849 (Dmitry Novik).
Certain functions can now be used without parentheses in SQL. #95949 (Aly Kafoury).
Added the combined subcolumn syntax json.@path for the JSON type. It returns the literal value as Dynamic if the path holds a scalar, the sub-object as Dynamic if the path holds a nested object, or NULL if the path is absent. #98788 (Pavel Kruglov).
Added the SOME keyword for subquery expressions. It behaves identically to ANY. #99842 (Artem Kytkin).
Added support for SET TIME ZONE 'tz' as an alias for SET session_timezone. #99883 (phulv94).
Added support for the SQL standard VALUES clause as a table expression in FROM, e.g. SELECT * FROM (VALUES (1, 'a'), (2, 'b')) AS t(id, val). #100143 (Desel72).
Added support for SQL-standard compound interval literals with TO range qualifiers, e.g. INTERVAL '1:30' HOUR TO MINUTE, internally decomposed into sums of intervals. #100453 (Desel72).
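The decomposition into a sum of single-unit intervals can be sketched in Python. The sketch handles only colon-separated HOUR/MINUTE/SECOND ranges; the names decompose and to_timedelta are hypothetical, and the actual parser covers more qualifiers than shown here.

```python
from datetime import timedelta

_UNITS = ["HOUR", "MINUTE", "SECOND"]
_KWARGS = {"HOUR": "hours", "MINUTE": "minutes", "SECOND": "seconds"}

def decompose(literal: str, start_unit: str, end_unit: str):
    """Split a literal such as '1:30' into (amount, unit) components."""
    units = _UNITS[_UNITS.index(start_unit): _UNITS.index(end_unit) + 1]
    parts = literal.split(":")
    if len(parts) != len(units):
        raise ValueError("literal does not match the range qualifier")
    return [(int(p), u) for p, u in zip(parts, units)]

def to_timedelta(components) -> timedelta:
    """Sum the single-unit intervals."""
    return sum((timedelta(**{_KWARGS[u]: n}) for n, u in components), timedelta())

parts = decompose("1:30", "HOUR", "MINUTE")
print(parts, to_timedelta(parts))
```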
Added the SYSTEM FLUSH OBJECT STORAGE QUEUE db.table PATH 'x' command for ordered and unordered modes. #100709 (Bharat Nallan).
Added incremental read support for the Paimon table engine with Keeper-backed snapshot progress tracking, including targeted snapshot delta reads via paimon_target_snapshot_id. #93655 (XiaoBinMu).
Added support for auxiliary ZooKeeper in DatabaseReplicated. #95590 (RinChanNOW).
Added support for reading Iceberg v3 nanosecond timestamp types (timestamp_ns and timestamptz_ns), mapped to DateTime64(9) and DateTime64(9, 'UTC'). This enables ClickHouse to read Iceberg data with these types in the schema; it does not constitute full Iceberg v3 support, nor writing Iceberg v3 with these types. #97132 (Dmitry Kovalev).
Added ALTER TABLE ... EXECUTE expire_snapshots('<timestamp>') for Iceberg tables. #97904 (murphy-4o).
Added a new SLRU cache for Parquet metadata to improve read performance by removing the need to re-download files just to read metadata. #98140 (Grant Holly).
Added MergeTree skip index support for JSON columns using JSONAllPaths with bloom_filter, tokenbf_v1, ngrambf_v1, and text (inverted) index types, enabling granule skipping based on the set of JSON paths present in each granule. #98886 (Pavel Kruglov).
Added the commit_order projection index, which reorganizes data in insertion order. #99004 (Mikhail Artemenko).
Added bucketed serialization for Map columns in MergeTree (map_serialization_version = 'with_buckets'). Keys are split into hash-based buckets so that reading a single key (m['key']) only reads one bucket instead of the entire column, providing a 2–49x speedup for single-key lookups depending on map size. The number of buckets and the bucketing strategy can be controlled by the new MergeTree settings map_serialization_version, max_buckets_in_map, map_buckets_strategy, map_buckets_coefficient, and map_buckets_min_avg_size. The setting map_serialization_version_for_zero_level_parts keeps basic serialization for inserts to avoid write overhead while merged parts use buckets. Skip indexes on mapKeys/mapValues are supported with bucketed serialization, and optimize_functions_to_subcolumns rewrites m['key'] to a per-key subcolumn read when buckets are available. #99200 (Pavel Kruglov).
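The idea behind hash-based buckets can be illustrated with a small Python model: each key is assigned to one bucket by a hash, so a single-key lookup touches only that bucket. The hash function (CRC32), the in-memory layout, and the function names are assumptions made for this example and do not reflect the on-disk format.

```python
import zlib

def bucket_of(key: str, num_buckets: int) -> int:
    """Deterministically assign a key to a bucket (CRC32 is an arbitrary choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets

def write_buckets(mapping: dict, num_buckets: int):
    """Split one map into `num_buckets` smaller maps."""
    buckets = [dict() for _ in range(num_buckets)]
    for key, value in mapping.items():
        buckets[bucket_of(key, num_buckets)][key] = value
    return buckets

def read_key(buckets, key: str):
    """Look up one key by reading only the bucket it hashes to."""
    return buckets[bucket_of(key, len(buckets))].get(key)

row = {f"key{i}": i for i in range(1000)}
buckets = write_buckets(row, 8)
print(read_key(buckets, "key42"), max(len(b) for b in buckets))
```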
Implemented new behavior of max_insert_block_size_rows, max_insert_block_size_bytes, min_insert_block_size_rows, and min_insert_block_size_bytes in squashing, governed by the compatibility setting use_strict_insert_block_limits. #94207 (Kirill Kopnev).
Added the table_readonly MergeTree setting to mark tables as read-only, preventing inserts and modifications. #97652 (Alexey Milovidov).
Added the use_partition_pruning setting (alias use_partition_key). Set it to false to disable partition pruning based on the partition key. #97888 (Nihal Z. Miaji).
Each type=http entry in <protocols> can now specify a custom <handlers> key pointing to a separate <http_handlers_*> config section, enabling different HTTP routing rules per port. #98414 (Amos Bird).
Added two MergeTree settings — replicated_fetches_min_part_level and replicated_fetches_min_part_level_timeout_seconds — that allow replicas to skip fetching freshly-inserted (unmerged) parts from peers, reducing replication overhead during heavy ingestion. #98625 (tanner-bruce).
Added the restore_access_entities_with_current_grants server setting. When enabled, restored users/roles have their grants limited to what the restoring user is allowed to grant (same semantics as GRANT CURRENT GRANTS), instead of failing with ACCESS_DENIED. #98795 (pufit).
Added the max_skip_unavailable_shards_num and max_skip_unavailable_shards_ratio settings to limit how many shards can be silently skipped when skip_unavailable_shards is enabled. If the number or ratio of unavailable shards exceeds the configured threshold, an exception is thrown instead of returning silently incomplete results. #99369 (Alexey Milovidov).
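The threshold check can be pictured with the Python sketch below. Using None to mean "no limit" and raising RuntimeError are conventions of this example only; the actual defaults and exception type in ClickHouse are not stated in the entry above.

```python
def check_skipped_shards(total, unavailable, max_num=None, max_ratio=None):
    """Return the number of skipped shards, or raise if a threshold is exceeded."""
    if max_num is not None and unavailable > max_num:
        raise RuntimeError(f"{unavailable} unavailable shards exceed limit {max_num}")
    if max_ratio is not None and total > 0 and unavailable / total > max_ratio:
        raise RuntimeError(
            f"unavailable ratio {unavailable / total:.2f} exceeds limit {max_ratio}"
        )
    return unavailable

print(check_skipped_shards(total=10, unavailable=2, max_num=3, max_ratio=0.5))
```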
Implemented quotas by normalized query hash to protect public ClickHouse services from abuse. NORMALIZED_QUERY_HASH is supported as a quota key type: it creates a separate quota bucket per unique normalized query, so CREATE QUOTA q KEYED BY normalized_query_hash tracks each distinct query independently. QUERIES_PER_NORMALIZED_HASH is supported as a quota resource type: it limits the maximum executions of any single normalized query within an interval, so MAX queries_per_normalized_hash = 100 prevents any one query pattern from running more than 100 times per interval. #99586 (Alexey Milovidov).
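A per-pattern limit of this kind can be modeled in Python. The normalization rules below (literals replaced by ?, whitespace collapsed) are a simplification chosen for this sketch and are not ClickHouse's normalization algorithm; interval resets are not modeled, and all names are hypothetical.

```python
import hashlib
import re
from collections import Counter

def normalize(query: str) -> str:
    """Replace string and numeric literals with '?', collapse whitespace."""
    q = re.sub(r"'(?:[^'\\]|\\.)*'", "?", query)
    q = re.sub(r"\b\d+(?:\.\d+)?\b", "?", q)
    return re.sub(r"\s+", " ", q).strip()

def query_hash(query: str) -> str:
    return hashlib.sha256(normalize(query).encode("utf-8")).hexdigest()

class NormalizedHashQuota:
    """Allow at most `max_per_hash` executions of any one query pattern."""

    def __init__(self, max_per_hash: int):
        self.max_per_hash = max_per_hash
        self.counts = Counter()

    def try_run(self, query: str) -> bool:
        key = query_hash(query)
        if self.counts[key] >= self.max_per_hash:
            return False
        self.counts[key] += 1
        return True

quota = NormalizedHashQuota(max_per_hash=2)
for i in (1, 2, 3):
    print(i, quota.try_run(f"SELECT * FROM t WHERE id = {i}"))
```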
The auto_statistics_types MergeTree setting now defaults to minmax, uniq, so minmax and uniq statistics are created automatically for all suitable columns in new tables. materialize_statistics_on_insert now defaults to false, so statistics are built during merges rather than at insert time, reducing insert overhead. Set materialize_statistics_on_insert = 1 to restore the old behavior. #101275 (Han Fei).
Added the prefer_dependency_replica refresh setting for materialized view dependency chains to reduce missing data from cross-replica replication lag. #101591 (Seva Potapov).
Added the per-server LDAP config option <follow_referrals> (default false) to control whether the LDAP client follows referrals. Disabling referral chasing avoids timeouts and hangs when searching from an Active Directory domain-root base DN, and produces less log noise. #96765 (paf91).
Added system.histogram_metric_log, a new system table that periodically snapshots all histogram metrics (e.g. S3/Azure latencies, Keeper request processing stage durations). The value column of system.histogram_metrics is now Float64 for greater flexibility and compatibility with the Prometheus data model. #103046 (Miсhael Stetsyuk).
Added the ALP floating-point compression codec (without ALP_rd fallback for non-compressible doubles). #91362 (Nazarii Piontko).
Added experimental lazy type hints for JSON columns. When enabled via allow_experimental_json_lazy_type_hints, ALTER TABLE ... MODIFY COLUMN json JSON(path TypeName) that only adds or modifies type hints completes instantly as a metadata-only operation, without rewriting historical data. Type hints are applied at query time for old parts and materialized during INSERTs and background merges. #97412 (tanner-bruce).
Added support for external SQL dialects using the polyglot library. #99496 (Alexey Milovidov).
Added the query_plan_optimize_join_order_randomize setting, which randomizes statistics used for join reordering, useful for testing. #100643 (Vladimir Cherkasov).
Added AI function support to ClickHouse, allowing users to call OpenAI and Anthropic endpoints using SQL. aiGenerate is included as the first such function. #100831 (George Larionov).
Added the AI functions aiClassify, aiExtract, and aiTranslate for utilizing LLM APIs in ClickHouse. #100832 (George Larionov).
Added support for data lakes on top of Azure with Delta Kernel. #102202 (Smita Kulkarni).
Optimized granule skipping for pointInPolygon for large polygons and fixed pointInPolygon index analysis throwing during primary key pruning. #91633 (Nihal Z. Miaji).
optimize_read_in_order is now respected when reading projections. #95885 (Andrey Zvonov).
Slightly optimized reading from the text index dictionary, improving the overall performance of text index analysis. #97519 (Anton Popov).
Improved performance of text index analysis for queries with combined conditions involving both indexed and non-indexed columns. Previously, early-exit optimization during index analysis was incorrectly disabled in such cases. #98096 (Anton Popov).
Improved the performance of queries with constant expressions that generate very long arrays or maps. #98287 (Alexey Milovidov).
Fixed key condition analysis for DateTime64 primary keys compared with integer constants, which previously resulted in no granule pruning. #98410 (Amos Bird).
The setting optimize_syntax_fuse_functions is now enabled by default. #98424 (Alexey Milovidov).
Partition pruning is now allowed when the predicate contains any comparison operator (=, <, >, !=) and the partition key is wrapped in a deterministic function chain (e.g. with PARTITION BY x, predicates like cityHash64(x) % 5 > 2, toYYYYMM(x) < 2026, toYYYYMM(x) = 2026, or toYYYYMM(x) != 2026 will all use the partition key for pruning). #98432 (Nihal Z. Miaji).
Read-in-order optimization and primary-key pruning are now allowed when the CAST target type is Nullable and the conversion is monotonic; for example, with PRIMARY KEY x, ClickHouse can use read-in-order optimization for ORDER BY x::Nullable(UInt64) and primary-key pruning for predicates such as WHERE x::Nullable(UInt64) > 500000. #98482 (Nihal Z. Miaji).
Index pruning and filter pushdown are now allowed when an integral column is compared with a float literal; for example, predicates like WHERE x < 10.5 can now use the primary key for pruning, and filters such as prime < 1e9 or number < 1e5 are now pushed down for the primes() and numbers() table functions instead of causing unbounded execution. #98516 (Nihal Z. Miaji).
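One way to see why such predicates are usable for pruning: for an integer column, a comparison against a float literal is equivalent to a comparison against a nearby integer. The Python sketch below shows that rewrite; it illustrates the arithmetic only and is not a description of how ClickHouse performs the analysis internally.

```python
import math

def integer_bound(op: str, literal: float):
    """Rewrite `x <op> literal` into an equivalent inclusive integer bound."""
    if op == "<":
        return ("<=", math.ceil(literal) - 1)
    if op == "<=":
        return ("<=", math.floor(literal))
    if op == ">":
        return (">=", math.floor(literal) + 1)
    if op == ">=":
        return (">=", math.ceil(literal))
    raise ValueError(f"unsupported operator: {op}")

print(integer_bound("<", 10.5))  # x < 10.5 for integer x
print(integer_bound("<", 1e5))   # number < 1e5 gives a finite upper bound
```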
Queries filtering on MergeTree primary key columns with regex alternations over literal strings, such as ^(abc-1|abc-2), can now use primary key pruning when the alternatives share a common prefix. #98988 (Yash).
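The common-prefix idea can be sketched in Python: when every alternative starts with the same literal prefix, matching keys must fall in the key range covered by that prefix. The helper names are hypothetical, and the sketch only handles the simple ^(a|b|...) shape with plain literal alternatives.

```python
import os
import re

def alternation_prefix(pattern: str):
    """Return the shared literal prefix of ^(lit1|lit2|...), or None."""
    m = re.fullmatch(r"\^\(([^()\[\]\\.*+?{}^$]+)\)", pattern)
    if not m:
        return None
    prefix = os.path.commonprefix(m.group(1).split("|"))
    return prefix or None

def prefix_range(prefix: str):
    """Half-open key range [low, high) containing every string with this prefix."""
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)

p = alternation_prefix("^(abc-1|abc-2)")
print(p, prefix_range(p))
```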
Added support for the read_in_order_use_virtual_row optimization for reverse-order reads. #99198 (Vladimir Cherkasov).
Faster discontinuous queries for LowCardinality columns with a single dictionary. #99285 (Ivan Babrou).
Fixed negative scaling for short queries with aggregation on machines with many cores. When a query reads few marks, the pipeline no longer expands to max_threads after aggregation, avoiding overhead from mostly-empty streams. #99493 (Alexey Milovidov).
Improved the performance of queries with parallel replicas by correctly selecting the reading task size. #99801 (Nikita Taranov).
Removed redundant trailing ORDER BY elements once all GROUP BY keys are covered in the ORDER BY prefix. #100157 (Alexey Milovidov).
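The reasoning is that after GROUP BY each combination of keys appears in exactly one row, so once the ORDER BY prefix contains all of the keys, later elements can never break a tie. A minimal Python sketch of the trimming rule follows, treating expressions as plain strings; trim_order_by is a hypothetical helper, not the optimizer's code.

```python
def trim_order_by(group_by_keys, order_by):
    """Drop ORDER BY elements that follow a prefix covering all GROUP BY keys."""
    if not group_by_keys:
        return list(order_by)  # conservative: leave queries without keys untouched
    remaining = set(group_by_keys)
    for i, expr in enumerate(order_by):
        remaining.discard(expr)
        if not remaining:
            return list(order_by[: i + 1])
    return list(order_by)

print(trim_order_by(["a", "b"], ["a", "b", "count()"]))
print(trim_order_by(["a", "b"], ["a", "count()", "b"]))
```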
Optimized queries by pushing the LIMIT clause down into UNION ALL. #100364 (Alexey Milovidov).
Added JIT compilation support for String and FixedString column comparisons in ORDER BY, improving merge-phase sort performance by 6–17% for string-heavy sort keys. #100577 (Raúl Marín).
When read_in_order_use_virtual_row is enabled together with the new read_in_order_use_virtual_row_per_block setting, virtual row boundary information is now emitted after each block read from MergeTree, allowing the merge to reprioritize sources mid-stream for parts whose data is fully filtered out by WHERE/PREWHERE/JOIN. #100603 (Vladimir Cherkasov).
Better parallelization of queries with simple views (over an underlying MergeTree table) executed with parallel replicas. #100815 (Igor Nikonov).
Added support for parallel replicas over simple views (including eligible UNION ALL views over MergeTree tables) when parallel_replicas_allow_view_over_mergetree=1. This parallelizes the view's outer query instead of the inner one, increasing query parallelization across nodes. #100958 (Igor Nikonov).
Optimized reading in order of the primary key for full_sorting_merge when filters with IN are present in the query plan. #101261 (Nikita Taranov).
Improved performance of INSERT VALUES for Map, Array, and Tuple columns when values are passed as escaped strings (e.g. '{\'key\':1}'), avoiding unnecessary fallback to the SQL expression parser. #102119 (Joanna Hulboj).
Optimized the avgWeighted aggregate function by using local accumulators instead of per-row store-forwarding through aggregate state, improving performance by up to 27% for Nullable inputs. #98793 (Antonio Andelic).
Sped up var*Stable and stddev*Stable functions for Float64 columns by devirtualizing the inner loop. This enables compiler optimizations (FMA/registers) that alter floating-point results at the ULP level. #99460 (Riyane El Qoqui).
base58Encode now automatically uses the optimized Firedancer base58 encoder for 32- and 64-byte inputs, and base58Decode can use the optimized decoder when the decoded result is 32 or 64 bytes (requested explicitly, e.g. base58Decode('...', 32)). #99461 (Joanna Hulboj).
Optimized the cutURLParameter function by eliminating unnecessary memory shifts. #100218 (Nikita Semenov).
Faster float-to-string conversion for large integer values by extending the itoa fast path with dragonbox-compatible rounding. #100649 (Raúl Marín).
Replaced dragonbox with zmij for 1.5x–3x faster float-to-string conversion. #100650 (Raúl Marín).
Faster Int128/UInt128 to string conversion by replacing software division with Barrett reduction and unrolling the conversion loop. #100671 (Raúl Marín).
Reduced lock contention during read-only operations on ReplicatedMergeTree tables with finished mutations. #95771 (Eduard Karacharov).
Iceberg can now asynchronously pre-populate metadata into the cache, enabled by setting iceberg_metadata_async_prefetch_period_ms at table creation. SELECT queries from Iceberg tables can now specify the iceberg_metadata_staleness_ms parameter, which allows ClickHouse to rely on the cached version of the metadata if it is fresher than the specified staleness; otherwise the remote Iceberg catalog is queried for the latest metadata. This can eliminate calls to the Iceberg catalog during request processing, bringing a visible performance gain. #96191 (Arsen Muk).
S3Queue ordered mode now uses the S3 ListObjectsV2 StartAfter parameter to avoid re-listing the full prefix history, reducing ListObjects calls. #96370 (Venkata Vineel).
Improved the performance of data lakes. In previous versions, reading from object storage did not resize the pipeline to the number of processing threads. #99548 (Alexey Milovidov).
Allowed prefetching when reading a remote file through the userspace page cache. #99919 (Alexey Milovidov).
Fixed a significant INSERT performance regression when deduplicate_insert=enable (default since 26.2) by deferring data hash computation from squashing to the sink and using batch column hashing via updateHashWithValueRange, reducing overhead from ~2.5s to ~0.5s for 5M rows with 22 columns. #101494 (Sema Checherinda).
For sync inserts, the original block (needed for deduplication) can now be omitted to save memory. #96661 (Sema Checherinda).
Improved performance and reduced memory usage for parallel window functions in certain scenarios, and for arrayFold workloads with large arrays. This can also reduce page-fault pressure and improve stability under tight memory limits for affected queries. #98892 (filimonov).
Aggregate projections are now correctly supported in views. #88798 (Amos Bird).
optimize_aggregators_of_group_by_keys now correctly optimizes aggregate functions in GROUPING SETS queries. #93935 (Xiaozhe Yu).
Added support for converting OUTER to INNER join optimization with join_use_nulls. #95968 (Vladimir Cherkasov).
Extended cast_keep_nullable to work with Dynamic/JSON types. When set, casting NULL from types that can be Nullable returns NULL; otherwise NULL throws a CANNOT_INSERT_NULL_IN_ORDINARY_COLUMN error. #96504 (Seva Potapov).
Added information about deferred filters as a separate item in EXPLAIN query output (when using row policies/PREWHERE with FINAL). #97374 (Yarik Briukhovetskyi).
Added support for parenthesized table join expressions in the FROM clause, e.g. SELECT * FROM (t1 CROSS JOIN t2). #97650 (Alexey Milovidov).
Data skipping indexes are now applied during distributed index analysis. #97767 (Azat Khuzhin).
The sorting key of a table can now be an expression like toDate(time), and decisions about not deferring such expressions are made when they are part of filters. #98237 (Yarik Briukhovetskyi).
The join order optimizer now infers transitive equi-join predicates from existing join conditions. For example, given A.x = B.x AND B.x = C.x, the equivalence A.x = C.x is recognized, allowing the optimizer to consider direct joins between transitively-connected tables. This can improve plan quality for star and snowflake schemas. The feature is controlled by the new enable_join_transitive_predicates setting (off by default). #98479 (Alexander Gololobov).
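Inferring transitive equalities amounts to grouping columns into equivalence classes. The Python sketch below does this with a small union-find structure; it illustrates the general technique and is not the optimizer's actual code, and infer_equalities is a hypothetical name.

```python
from itertools import combinations

def infer_equalities(predicates):
    """Given pairs (a, b) meaning a = b, return the implied pairs not stated."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in predicates:
        parent[find(a)] = find(b)

    classes = {}
    for column in list(parent):
        classes.setdefault(find(column), []).append(column)

    known = {frozenset(p) for p in predicates}
    inferred = set()
    for members in classes.values():
        for a, b in combinations(sorted(members), 2):
            if frozenset((a, b)) not in known:
                inferred.add((a, b))
    return inferred

print(infer_equalities([("A.x", "B.x"), ("B.x", "C.x")]))
```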
When apply_row_policy_after_final or apply_prewhere_after_final is enabled, compound AND conditions in row policies and PREWHERE are now decomposed to extract sorting-key atoms for primary key index analysis. Previously, if a deferred filter contained a mix of sorting-key and non-sorting-key predicates (e.g. x > 1 AND y != 'foo'), the entire expression was excluded from index analysis. #98513 (Yarik Briukhovetskyi).
Only "alive" (connectable) replicas now participate in distributed index analysis. #98521 (Azat Khuzhin).
Capped the number of nodes used in the automatic parallel replicas heuristic to the actual number of nodes in the cluster (instead of only the max_parallel_replicas setting). #98668 (Nikita Taranov).
Implemented hedged requests and asynchronous reading for distributed index analysis. #98724 (Azat Khuzhin).
Added support for the SAMPLE clause in distributed index analysis. #98931 (Azat Khuzhin).
Added monotonicity support for multiply, enabling primary key pruning for key * constant expressions. #98983 (Amos Bird).
Analyzer error messages no longer dump all columns of a table (which could produce 150KB+ exceptions). Column lists are now capped at 10 entries. #99002 (Yash).
Column statistics from sub-queries with joins are now properly returned so the parent query can use them for join reordering. #99096 (Alexander Gololobov).
ALTER TABLE MODIFY COLUMN x TTL ... is now allowed without specifying the column type. #99208 (Nikolay Degterinsky).
Improved EXPLAIN PLAN pretty=1 output: print top-level query output columns, show join relation labels/symbols with estimated result rows and locality, and include per-step output columns for join/source steps. #99462 (Kirill Kopnev).
Fixed merge() table function failing with an UNKNOWN_IDENTIFIER error when querying columns not present in all underlying distributed/remote tables. #99833 (Alexey Milovidov).
ConditionSelectivityEstimator is no longer used in the optimizer when the condition is not in CNF (for example where a = 0, which is a single condition where statistics cannot help). #100110 (Han Fei).
Applied distributed_index_analysis_min_indexes_bytes_to_activate after partition pruning. #100477 (Azat Khuzhin).
EXPLAIN PIPELINE now supports the distributed=1 setting to include remote pipeline fragments. #100513 (Nikita Taranov).
use_partition_pruning = 0 now also disables MinMax index pruning and count optimization on partition key columns, in addition to disabling pruning based on partition keys. #100904 (Nihal Z. Miaji).
EXPLAIN [PLAN] pretty=1 now prints expressions in a human-readable format. #100927 (Kirill Kopnev).
accurateCastOrNull and accurateCastOrDefault now support Tuple target types, including nested Tuples with Nullable elements. Previously these functions rejected Tuple targets because Tuple could not be inside Nullable. #100942 (Nihal Z. Miaji).
You can now use a trailing comma in the WITH clause before a SELECT query. #101093 (Aruj Bansal).
Implemented lazy column materialization for ReplacingMergeTree with FINAL when the predicate is selective enough. #101647 (Nikolai Kochetov).
Improved default selectivity estimates for conditions without statistics to match industry-standard heuristics: 0.1 for LIKE condition selectivity, 0.33 for unknown condition selectivity. #101653 (Alexander Gololobov).
The VIEW subquery is now inlined in the query tree, allowing more optimizations to be applied to the VIEW. #100830 (Dmitry Novik).
You can now use native JSON/Object input for JSONExtract functions. #96711 (Fisnik Kastrati).
Introduced the tokensForLikePattern SQL function, which tokenizes LIKE patterns while respecting wildcard semantics: % and _ are treated as wildcards, escaped wildcards (\%, \_) are treated as literals, and tokens adjacent to unescaped wildcards are discarded. #97872 (Amos Bird).
Implemented the toDaysInMonth() function, which returns the number of days in the month of the specified date. #99227 (Vitaly Baranov).
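For comparison, the same quantity can be computed in Python with the standard calendar module, which accounts for leap years. This is an equivalent calculation for illustration, not the ClickHouse implementation.

```python
import calendar
from datetime import date

def days_in_month(d: date) -> int:
    """Number of days in the month that contains `d`."""
    return calendar.monthrange(d.year, d.month)[1]

print(days_in_month(date(2024, 2, 10)), days_in_month(date(2023, 2, 10)))
```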
Added parser-level syntactic sugar for the SQL standard OVERLAY function syntax. The overlay function already exists; this adds support for the keyword-based form using PLACING, FROM, and FOR as separators. #101681 (Desel72).
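The SQL standard semantics of OVERLAY(s PLACING r FROM start FOR length) can be sketched in Python as below, with a 1-based start and a length that defaults to the length of the replacement. The sketch works on Python characters; how ClickHouse's overlay treats bytes versus code points, or out-of-range positions, is not covered here.

```python
def overlay(s: str, replacement: str, start: int, length=None) -> str:
    """Replace `length` characters of `s`, beginning at 1-based `start`."""
    if start < 1:
        raise ValueError("start is 1-based in this sketch")
    if length is None:
        length = len(replacement)  # SQL default when FOR is omitted
    head = s[: start - 1]
    tail = s[start - 1 + length:]
    return head + replacement + tail

# OVERLAY('Txxxxas' PLACING 'hom' FROM 2 FOR 4)
print(overlay("Txxxxas", "hom", 2, 4))
```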
The ngrams function now rejects invalid ngram lengths. For example, SELECT ngrams('abc', 0) now returns an error. #101922 (Robert Schulze).
Added support for the array tokenizer in the LIKE optimization. #102880 (Elmi Ahmadov).
Added a dedicated cleanup thread for MergeTree to prevent cleanup delays under heavy merge load. #91574 (Amos Bird).
Tables with DELETE TTL rules can now use the vertical merge algorithm. #97332 (murphy-4o).
Marks of secondary indexes are now prewarmed when the prewarm_mark_cache setting is enabled (loaded into the index mark cache during data part fetches and table startup). #97772 (Anton Popov).
Text indexes can now be built on Nullable([Fixed]String) and Array(Nullable([Fixed]String)) columns. #98118 (Jimmy Aguilar Mena).
Background merges are now cancelled early in DROP DATABASE for ordinary shared merge tree. #98161 (Shaohua Wang).
Added the finalize_projection_parts_synchronously setting to allow synchronous finalization of projection parts during INSERT, reducing peak memory usage for tables with many projections while preserving the existing async behavior by default. #98228 (Amos Bird).
Added the share_nested_offsets MergeTree setting (default true). When set to false, Array columns with dotted names (e.g. n.a, n.b) are treated as independent columns instead of sharing offset files and validating equal array sizes as part of legacy Nested semantics. #98416 (Amos Bird).
Optimized TRUNCATE DATABASE TABLES LIKE by pre-cancelling merges in parallel. #98597 (Shaohua Wang).
Reduced lock contention in MergeTreeBackgroundExecutor by releasing task resources without acquiring the lock. #98604 (Dmitry Novik).
TRUNCATE DATABASE now responds to query cancellation. #98828 (Shaohua Wang).
The statistics cache is now restarted after a merge tree setting is changed. #98520 (Han Fei).
Added the compress_per_column_in_compact_parts MergeTree setting to control how compressed blocks are organized within Compact parts. When true (default, preserving current behavior), each column starts a new compressed block, allowing selective decompression. When false, all columns within a granule are packed into the same compressed block, improving compression ratio and read performance for workloads that always read all columns. #101114 (Amos Bird).
The text index is now GA and stays enabled regardless of the compatibility setting, preventing unexpected disabling during backup restores or when running in compatibility mode. #101518 (Nikita Fomichev).
Added support for the text index built on mapValues(map) with the IN operator. #99286 (Anton Popov).
Improved analysis of multiple text indexes used simultaneously in the same query. The use_skip_indexes_on_data_read setting is no longer required for the direct read optimization of text indexes. #102255 (Anton Popov).
Added text index analysis support for the hasPhrase function via the HINT mode. #102438 (Elmi Ahmadov).
MinMax column statistics now store the minimum and maximum values as Field (typed) instead of Float64. The serialized format includes the column type name alongside the values. The statistics file version is bumped to V2; files written by older versions require re-materialization (ALTER TABLE ... MATERIALIZE STATISTICS ALL). #100605 (Han Fei).
Rebuilt entities in IcebergManifestFile to make code maintenance less error-prone (also fixes some manifest file caching issues). #98231 (Daniil Ivanik).
Enhanced Iceberg ALTER TABLE ... EXECUTE expire_snapshots(...) with richer argument support. #99130 (murphy-4o).
Changed the interface for Iceberg inserts with the catalog. Deprecated the storage_catalog_type, storage_aws_access_key_id, and similar settings. #100334 (Konstantin Vedernikov).
Fixed inconsistent path handling in Iceberg caused by mixed usage of storage paths and metadata paths; enforced that Iceberg tables write a table location that is either a URL or an absolute path; added a fallback for counting file sizes in Azure; handled version-hint.txt in a manner compatible with Spark; introduced type-level abstractions to make path types harder to confuse; and fixed usage of position deletes. #100420 (Daniil Ivanik).
Avoid scanning the whole remote data lake catalog for "Maybe you meant ..." table hints when show_data_lake_catalogs_in_system_tables is disabled. #100452 (Alsu Giliazova).
Object information used for parsing Iceberg data files now contains the number of file rows and the file size in bytes parsed from the manifest file. #100645 (Daniil Ivanik).
Added a {_schema_hash} placeholder for the S3 table engine that inserts a hash of the table's column definitions into the S3 path. #98265 (Miсhael Stetsyuk).
Users can now specify multiple authentication methods in users_xml configuration. #91998 (Flip-Liquid).
You can now set internal_replication settings for a cluster created by the Replicated database. #97228 (Pervakov Grigorii).
Added the allow_nullable_tuple_in_extracted_subcolumns setting, which controls whether extracted Tuple(...) subcolumns from Tuple, Variant, Dynamic, and JSON are returned as Nullable(Tuple(...)) (NULL for missing rows) or as Tuple(...) (default tuple values for missing rows). Disabled by default; can only be changed by restarting the server. #97299 (Nihal Z. Miaji).
type_json_allow_duplicated_key_with_literal_and_nested_object is now enabled by default, avoiding errors about duplicated keys when parsing JSON like {"a" : 42, "a" : {"b" : 42}} that ClickHouse may produce from original data {"a" : 42, "a.b" : 42}. #97423 (Pavel Kruglov).
Column statistics are now GA. The setting allow_experimental_statistics (default false) is obsoleted in favor of allow_statistics (default true); allow_statistics_optimize is promoted from beta to GA; ClickHouse creates minmax and uniq statistics for new columns (MergeTree setting auto_statistics_types); and to avoid slower INSERTs, materialize_statistics_on_insert is now disabled by default. #97487 (Han Fei).
Clarified the relationship between enable_parallel_replicas and automatic_parallel_replicas_mode: a query can use parallel replicas only if enable_parallel_replicas > 0. With automatic_parallel_replicas_mode=1, the decision is made during planning based on collected statistics; with automatic_parallel_replicas_mode=0, parallel replicas are used for all supported queries regardless of statistics. Distributed insert-select with parallel replicas always behaves as if automatic_parallel_replicas_mode=0. #97517 (Nikita Taranov).
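The rules in the entry above can be read as a small decision function. The following Python sketch is only a model of the described behavior, not ClickHouse code: the function name and the boolean inputs `query_is_supported` and `statistics_favor_parallel_replicas` are hypothetical, and only modes 0 and 1 are modelled.

```python
def uses_parallel_replicas(
    enable_parallel_replicas: int,
    automatic_parallel_replicas_mode: int,
    query_is_supported: bool,
    statistics_favor_parallel_replicas: bool,
    is_distributed_insert_select: bool = False,
) -> bool:
    """Model of the documented relationship between the two settings."""
    # Parallel replicas are never used unless explicitly enabled.
    if enable_parallel_replicas <= 0:
        return False
    # Assumption of this sketch: unsupported queries never use parallel replicas.
    if not query_is_supported:
        return False
    # Distributed insert-select always behaves as if the mode were 0.
    mode = 0 if is_distributed_insert_select else automatic_parallel_replicas_mode
    if mode == 0:
        # Used for all supported queries, regardless of statistics.
        return True
    if mode == 1:
        # Decided during planning, based on collected statistics.
        return statistics_favor_parallel_replicas
    raise ValueError("only modes 0 and 1 are modelled in this sketch")
```

For instance, `uses_parallel_replicas(1, 1, True, False)` is `False` in this model, while the same call with `is_distributed_insert_select=True` is `True`.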
Added the access_control_improvements.disallow_config_defined_profiles_for_sql_defined_users setting (allowed by default) that disallows using config-defined settings profiles (except for the default profile) for SQL-defined users. #98662 (Alexander Tokmakov).
Added a setting to control type mismatch behavior for Variant and Dynamic (throw or return null). #99085 (Bharat Nallan).
Changed the default stderr_reaction from throw to log_last for executable UDFs. UDFs that write warnings to stderr no longer fail when the exit code is 0, and exit code exceptions now include stderr content. #99232 (Xu Jia).
Added the input_format_column_name_matching_mode setting, which allows different case sensitivities for input formats. #99346 (manerone).
Auto statistics are no longer enabled for system tables, as they rarely benefit from them. #102862 (Han Fei).
find_super_nodes no longer gets stuck traversing the children of the first super node, making it possible to find more than one. #97819 (pufit).
Mark a ZooKeeper session as expired immediately when finalization starts, instead of waiting for the send thread to exit. This allows other threads to establish a new session without delay. #99102 (Raúl Marín).
Skip stale Keeper requests for sessions that have already disconnected, avoiding unnecessary Raft round-trips. The number of tracked finished sessions is capped by the max_finished_sessions_cache_size coordination setting. #99246 (Antonio Andelic).
Prevent the Keeper mntr command from getting stuck due to lock contention. #99472 (Antonio Andelic).
Reduced lock contention in the Keeper dispatcher by invoking callbacks and dispatching read requests outside the mutex scope, and added profiled lock guards for observability. #99751 (Antonio Andelic).
Added the output_format_trim_fixed_string setting to strip trailing null bytes from FixedString values in text output formats. #97558 (NeedmeFordev).
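As a rough illustration of what trimming means here: a FixedString(N) value shorter than N bytes is stored padded with null bytes, and the setting strips that trailing padding in text output. The Python sketch below only models this idea; the helper names are hypothetical and this is not how ClickHouse implements the setting.

```python
def to_fixed_string(value: bytes, n: int) -> bytes:
    """Pad a value with null bytes up to n bytes, as FixedString(n) storage does."""
    if len(value) > n:
        raise ValueError("value is longer than the fixed width")
    return value.ljust(n, b"\x00")


def trim_fixed_string(stored: bytes) -> bytes:
    """Strip trailing null bytes, modelling the trimmed text output."""
    return stored.rstrip(b"\x00")


stored = to_fixed_string(b"abc", 5)
trimmed = trim_fixed_string(stored)
```

Note that in this model, trailing null bytes that were part of the original value cannot be told apart from padding, so they are stripped as well.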
Made it possible to parse GeoParquet files that contain different Geo types in the same column. #97851 (Mark Needham).
Deserialization of binary AggregateFunction states now requires consuming the full input. If extra redundant trailing bytes are present, ClickHouse throws an exception instead of accepting malformed state data. #98786 (Nihal Z. Miaji).
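The "consume the full input" rule can be sketched as follows. The state layout used here (an 8-byte little-endian count followed by that many 8-byte signed integers) is purely hypothetical and chosen for illustration; real AggregateFunction states have function-specific formats.

```python
import struct


def read_state(buf: bytes) -> list[int]:
    """Parse a hypothetical state and reject any unconsumed trailing bytes."""
    if len(buf) < 8:
        raise ValueError("truncated state: missing element count")
    (count,) = struct.unpack_from("<Q", buf, 0)
    end = 8 + 8 * count
    if len(buf) < end:
        raise ValueError("truncated state: missing elements")
    values = list(struct.unpack_from(f"<{count}q", buf, 8))
    # The strict check: the parser must have consumed every byte of the input.
    if end != len(buf):
        raise ValueError(f"{len(buf) - end} unexpected trailing byte(s) after state")
    return values
```

In this sketch, a well-formed buffer parses normally, while the same buffer with one extra byte appended raises `ValueError` instead of being silently accepted.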
Users can now write ClickHouse interval data types to the Arrow format. #99519 (Peter Nguyen).
Added native support for importing and exporting UUID data types in Arrow and Parquet formats, with automatic logical inference for top-level UUIDs and support for explicit schema hints for nested UUIDs. #99521 (Ivan).
Tolerate missing padding at the end of the last block of Parquet files. #99857 (Seva Potapov).
Exporting UUIDs to Parquet via the Arrow encoder now includes the correct UUID type annotation, eliminating the need to manually cast FixedString(16) data when reading the files back. #100150 (Ivan).
Added native support for importing Apache Arrow StringView and BinaryView data types into ClickHouse String columns, improving compatibility for Arrow-based ingestion. #100762 (Ivan).
Added support for Nullable(Tuple) in the Arrow, ArrowStream, ORC, and legacy Parquet formats. #101272 (Nihal Z. Miaji).
Added sslmode to the allowed keys for PostgreSQL dictionary sources, making it possible to configure SSL mode for PostgreSQL dictionary connections (e.g. for AWS RDS, which enforces SSL by default). #98014 (mcalfin).
Avoid dropping named collections that are dependencies of dictionary sources. #98127 (Pablo Marcos).
Added support for Map and JSON/Object types as dictionary attributes. Dictionaries can now store and retrieve complex types including Map(String, String), Map(String, Array(String)), JSON, and Nullable(JSON) in both FLAT and HASHED layouts. #98627 (yanglongwei).
SYSTEM RELOAD DICTIONARIES now reloads dictionaries in topological order so that dictionaries sourcing from other dictionaries see fresh data after reload. #98356 (Alexey Milovidov).
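To see why topological order matters, consider dictionaries whose sources are other dictionaries: each one should be reloaded only after everything it reads from. The Python sketch below uses the standard-library `graphlib` module; the dictionary names and the `reload_order` helper are hypothetical, and this is an illustration of the ordering idea rather than the server's implementation.

```python
from graphlib import TopologicalSorter


def reload_order(sources: dict[str, set[str]]) -> list[str]:
    """Return a reload order in which every dictionary follows its sources.

    `sources` maps a dictionary name to the set of dictionaries it reads from.
    """
    return list(TopologicalSorter(sources).static_order())


# Hypothetical dependency chain: shops reads cities, cities reads regions.
order = reload_order({
    "regions": set(),
    "cities": {"regions"},
    "shops": {"cities", "regions"},
})
```

For this chain the only valid order is `regions`, `cities`, `shops`, so `shops` always sees freshly reloaded data from both of its sources.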
Cache dictionaries no longer take an exclusive lock in hasKeys, reducing lock contention by using a shared lock for cache reads. #100796 (liuguangliang).
ACCESS_DENIED hints no longer reveal column names unless the user can show all required columns; database/table names remain visible in the hint. #91067 (filimonov).
The cluster config is now reloaded only when the IPs of the local server's hostname change, rather than when the IPs of any host change. #93726 (Zhigao Hong).
Disable AI SQL generation (?? command) in the embedded client (SSH and WebSocket protocols) to prevent access to the server's environment variables. #100290 (Alexey Milovidov).
Fixed a missing column when a non-standard identifier alias was used in a JOIN with the old analyzer. #95679 (Zhigao Hong).
Fixed an incorrect result when the grace_hash algorithm was used with non-equi joins and the left block could not be processed completely because of size constraints on the join result. #97866 (János Benjamin Antal).
Fixed a logical error with analyzer_compatibility_join_using_top_level_identifier and ARRAY JOIN. #98179 (Vladimir Cherkasov).
Fixed a LOGICAL_ERROR exception when a RIGHT JOIN wrapped in a CROSS JOIN was swapped by the query_plan_join_swap_table optimization in the legacy join step code path. #98279 (Alexey Milovidov).
Fixed a "Pipeline stuck" exception in full_sorting_merge joins caused by a deadlock in PingPongProcessor when the FilterBySetOnTheFly optimization created a circular dependency with MergeJoinTransform. #98454 (Alexey Milovidov).
Fixed a logical error exception when CROSS JOIN was used together with INNER JOIN USING. #98459 (Alexey Milovidov).
Fixed a LOGICAL_ERROR exception when arrayJoin was used in a filter expression with OUTER JOIN and join_use_nulls enabled. #98464 (Alexey Milovidov).
Fixed a BAD_GET exception and incorrect query results when a non-boolean expression (e.g. sin(col)) was used in both WHERE and SELECT with a JOIN, due to filter push-down optimization corrupting shared DAG nodes. #98681 (Alexey Milovidov).
Fixed a LOGICAL_ERROR "Replica decided to read in Default mode, not in WithOrder" when using read_in_order_through_join with parallel replicas. #98685 (Alexey Milovidov).
Fixed wrong results returned by LEFT ANTI JOIN with multiple join key columns when enable_join_runtime_filters=1 (the default). #98871 (Alexander Gololobov).
Fixed a bug in the query_plan_convert_any_join_to_semi_or_anti_join optimization that returned an incorrect result for unmatched rows. #99112 (Yarik Briukhovetskyi).
Fixed an exception in functions operating on ColumnReplicated with unreferenced rows produced by JOIN. #99564 (Hechem Selmi).
Fixed a rare case where a join with reordering could produce a wrong result. #100790 (Yarik Briukhovetskyi).
Fixed wrong results when a JOIN with shard-by-PK optimization used the query condition cache and some parts were filtered out by cached conditions. #100926 (Groene AI).
Fixed a NOT_FOUND_COLUMN_IN_BLOCK exception when using ARRAY JOIN with JOIN USING and analyzer_compatibility_join_using_top_level_identifier enabled. #101507 (Vladimir Cherkasov).
Fixed incorrect row ordering in queries that use ORDER BY with the grace_hash join algorithm, which could produce silently incorrect output. #102036 (János Benjamin Antal).
Fixed a LOGICAL_ERROR (Unexpected size of index type) that could occur in RIGHT JOIN and FULL JOIN queries when max_bytes_in_join was configured. #102042 (Jimmy Aguilar Mena).
Fixed how an Alias table target is saved as a DDL dependency when not fully qualified: it is now saved with the Alias table database instead of the session database. #95175 (Enric Calabuig).
Fixed a "Block structure mismatch in stream" error caused by unnecessary columns returned from Lazy materialization. #96682 (Nikolai Kochetov).
Fixed an exception in tuple comparison involving Nothing type elements (e.g. comparing with NULL tuple elements) when used with GROUPING SETS and ORDER BY. #97509 (Alexey Milovidov).
Fixed a "Context has expired" exception for correlated subqueries containing table functions like url(). #97544 (Alexey Milovidov).
Fixed exceptions and incorrect behavior in optimize_syntax_fuse_functions with aggregate projections, Date types, and column name preservation. #97545 (Alexey Milovidov).
Removed an incorrect replaceRegexpOne-to-extract query rewrite that produced wrong results when the regexp did not match; also fixed an exception when replaceRegexpOne was used with GROUP BY ... WITH CUBE and group_by_use_nulls=1. #97546 (Alexey Milovidov).
Fixed LOGICAL_ERROR exceptions caused by LowCardinality inside compound types (Variant, Dynamic, Tuple) in concatWithSeparator, format, IN subqueries, GLOBAL IN, and joins with runtime filters. #97831 (Raúl Marín).
Fixed a LOGICAL_ERROR exception "Chunk info was not set for chunk in MergingAggregatedTransform" when using ARRAY JOIN with the merge() table function over multiple Distributed tables combined with GROUP BY. #97838 (Raúl Marín).
Fixed a bug where it was not possible to use CTEs with distributed insert-selects. #97889 (Yarik Briukhovetskyi).
Fixed a segfault in query plan optimization when converting an outer join to an inner join with arrayJoin in a filter expression. #98147 (Alexey Milovidov).
Fixed a LOGICAL_ERROR "Trying to execute PLACEHOLDER action" when correlated columns from outer queries were referenced inside lambda functions such as arrayMap. #98285 (Alexey Milovidov).
Fixed a logical error exception in caseWithExpression when the CASE expression involved materialize(NULL) or other Nullable(Nothing) arguments. #98290 (Alexey Milovidov).
Fixed a bad cast exception when filtering the _table virtual column in the merge table function. #98291 (Alexey Milovidov).
Fixed an exception when ORDER BY ... WITH FILL was used together with LIMIT BY. #98361 (Alexey Milovidov).
Fixed an exception "Column ... query tree node does not have valid source node" when joining a Merge table (wrapping a Distributed table) with another table. #98376 (Alexey Milovidov).
Fixed an exception "Column identifier is already registered" when count_distinct_optimization was used with a QUALIFY clause. #98433 (Alexey Milovidov).
Fixed an exception "cannot be inside Nullable type" when using IN/NOT IN with LowCardinality column arguments (e.g. a NOT IN (b) where a is LowCardinality(String)). #98443 (Alexey Milovidov).
Fixed a LOGICAL_ERROR exception "Projection cannot increase the number of rows in a block" when merging parts with a TTL that deletes all rows and an aggregate projection with a constant GROUP BY key. #98458 (Alexey Milovidov).
Fixed an exception in DISTINCT queries when using aggregate projections and materialize caused LowCardinality type differences between the query and the projection. #98462 (Alexey Milovidov).
Fixed a logical error exception "Replica decided to read in WithOrder mode, not in ReverseOrder" when using parallel replicas with optimize_aggregation_in_order. #98467 (Alexey Milovidov).
Fixed an unexpected result with read_in_order_use_virtual_row and monotonic functions. #98514 (Vladimir Cherkasov).
Fixed a LOGICAL_ERROR "Not-ready Set is passed as the second argument for function 'in'" when using PREWHERE with an IN subquery on MergeTree tables. #98522 (Alexey Milovidov).
Fixed an exception "Sorting column wasn't found in the ActionsDAG's outputs" when query_plan_convert_join_to_in was enabled with query_plan_merge_expressions = 0. #98526 (Alexey Milovidov).
Fixed a LOGICAL_ERROR when an Identifier was empty after parameter substitution. #98530 (Pervakov Grigorii).
Fixed distributed index analysis with expressions (not just columns) in the primary key, which led to zero filtering of redundant granules on remote replicas. #98561 (Azat Khuzhin).
Fixed a logical error "TABLE_FUNCTION is not allowed in expression context" when a table function with an alias appeared multiple times in the same query scope (e.g. in both PREWHERE and QUALIFY clauses). #98557 (Alexey Milovidov).
Fixed an exception "Bad cast from type DB::TableFunctionNode to DB::QueryNode" when using the input table function as an argument of remote. #98694 (Alexey Milovidov).
Fixed an exception in LogicalExpressionOptimizerPass when a boolean function in an equals comparison returned a Variant type. #98712 (Alexey Milovidov).
Fixed a UNKNOWN_IDENTIFIER exception when querying the merge() table function or Merge engine over tables with JSON columns that have different parameters and ALIAS columns referencing JSON sub-paths, with the new analyzer enabled. #98753 (Pavel Kruglov).
Fixed optimize_skip_unused_shards optimization with the analyzer when a Distributed storage was used in a View. #98754 (Nikolai Kochetov).
Fixed a LOGICAL_ERROR exception (Block structure mismatch in removeUnusedColumns) that could occur with FINAL + PREWHERE + a constant WHERE expression + column-independent aggregates like count(). #98778 (Alexey Milovidov).
Fixed an exception "Inconsistent table names" when using the view() table function containing JOINs inside another JOIN (only with the old analyzer). #98809 (Alexey Milovidov).
Fixed a "RPNBuilderFunctionTreeNode has A arguments, attempted to get argument at index B" LOGICAL_ERROR. #98900 (Azat Khuzhin).
Fixed an exception "Scalar doesn't exist" when querying a remote shard with optimize_const_name_size set and enable_scalar_subquery_optimization = 0. #98979 (andriibeee).
Fixed NOT_FOUND_COLUMN_IN_BLOCK for some queries with GROUP BY and expressions that include inverse dictionary lookup, Date/DateTime conversion comparisons, and tuple comparisons. #98980 (Nihal Z. Miaji).
Fixed a bug where EXISTS would ignore LIMIT and OFFSET clauses in subqueries, causing incorrect results when the subquery returned no rows due to an offset or a zero limit. #99005 (andriibeee).
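The corrected semantics can be modelled in a few lines: EXISTS is true only if at least one row survives after OFFSET and LIMIT have been applied. This Python sketch treats the subquery result as a plain list; the `exists` helper is hypothetical and only illustrates the rule.

```python
def exists(rows: list, limit=None, offset: int = 0) -> bool:
    """Model of EXISTS over a subquery with optional LIMIT and OFFSET."""
    # Apply OFFSET first, then LIMIT, exactly as the subquery would.
    window = rows[offset:] if limit is None else rows[offset:offset + limit]
    return len(window) > 0
```

With three rows in the subquery, `exists(rows, limit=0)` and `exists(rows, offset=3)` are both `False` in this model, which corresponds to the cases the fix addresses.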
Fixed a "Block structure mismatch" exception when filter push-down optimization encountered an AND expression that short-circuited to a constant with GROUPING SETS. #99010 (Alexey Milovidov).
Fixed a LOGICAL_ERROR when querying a table that has both a ROW POLICY and an ALIAS column using dictGet, caused by premature access to the table expression during ALIAS column resolution in the new analyzer. #99065 (Peng).
Skip unnecessary extra index analysis when read-in-order optimization is applied. #99084 (Vladimir Cherkasov).
Fixed access checks during InverseDictionaryLookupPass by checking access only once before running the optimization pass. #99210 (Mikhail Artemenko).
Fixed optimize_skip_unused_shards with the new analyzer when a Distributed table was used inside an IN subquery. #99436 (Nikolai Kochetov).
Fixed a heap-use-after-free in INTERSECT/EXCEPT when the query produced duplicate column names. #99471 (Alexey Milovidov).
Fixed a logical error in ALTER TABLE ... DROP PART when a typed query parameter was used for the part name. #99489 (Alexey Milovidov).
Fixed a std::length_error exception when querying empty system tables with Pretty format via the HTTP interface. #99541 (Alexey Milovidov).
Fixed performance degradation in the analyzer by pruning unused columns from ARRAY JOIN. #99587 (Dmitry Novik).
Fixed an exception (Bad get: has Tuple, actual type String) in ConditionSelectivityEstimator when a query used IN with a single scalar query parameter (e.g. WHERE col IN ({p:String})) on a table that has column statistics and use_statistics enabled. #99614 (Ilya Yatsishin).
Fixed a "Block structure mismatch" exception in queries with a HAVING clause where the filter expression contained both an aggregate wrapped in a NULL-producing function and materialize(0). #99915 (Alexey Milovidov).
Fixed a logical error with a correlated subquery within an untuple argument. #99917 (Vladimir Cherkasov).
Fixed an "Inconsistent AST formatting" exception for ALTER TABLE ... MODIFY QUERY with nested subqueries containing SETTINGS when the ALTER itself also had SETTINGS. #99938 (Nikita Mikhaylov).
Fixed a LOGICAL_ERROR exception "Not-ready Set" in the IN function during query plan optimization with convertAnyJoinToSemiOrAntiJoin. #99939 (Alexey Milovidov).
Fixed a LOGICAL_ERROR exception "Unexpected node type for table expression ... Actual IDENTIFIER" when a scalar subquery was used inside an unresolved table function argument. #100014 (Alexey Milovidov).
Fixed a quadratic number of queries being run when distributed_index_analysis was used with predicates containing IN subqueries. #100287 (Anton Popov).
Fixed a "Block structure mismatch" exception when using GROUP BY ... WITH TOTALS HAVING combined with UNION DISTINCT and nullable expressions. #100293 (Alexey Milovidov).
Fixed an "Inconsistent AST formatting" exception in debug builds when using GROUP BY CUBE(...) WITH ROLLUP or similar combinations. #100376 (Alexey Milovidov).
Fixed an exception when creating a view with column aliases and SELECT * or EXCEPT/INTERSECT queries. #100386 (Alexey Milovidov).
Fixed a TOO_MANY_ROWS exception for SELECT count() queries with max_rows_to_read / force_primary_key when data was split across multiple parts with non-aligned granule boundaries. #100408 (Alexey Milovidov).
Fixed NOT_FOUND_COLUMN_IN_BLOCK cases where the projection SELECT part had columns that did not exist in the original SELECT part of the query. #100623 (Yarik Briukhovetskyi).
Allow passing a sharding key to the cluster() and clusterAllReplicas() table functions when using a table function as the source (e.g. cluster('name', view(...), sharding_key)). #100665 (Sergey Veletskiy).
Fixed incorrect AggregateFunction argument types in optimized trivial count, which caused a NUMBER_OF_ARGUMENTS_DOESNT_MATCH exception when querying expressions like count(v0 + v1) on distributed tables. #100794 (YjyJeff).
Fixed a logical error "Invalid action query tree node" when using INTERSECT ALL / UNION ALL with constant-folded expressions. #100977 (Alexey Milovidov).
Fixed an incorrect UNKNOWN_IDENTIFIER error when the same alias was used for multiple expressions in SELECT; the correct MULTIPLE_EXPRESSIONS_FOR_ALIAS error is now reported. #101040 (Alexey Milovidov).
Fixed an exception in DirectJoinMergeTreeEntity when pipeline blocks contained ColumnConst columns merged with regular columns. #101046 (Alexey Milovidov).
Fixed a spurious space in CTE column alias formatting (WITH t (a, b) → WITH t(a, b)). #101049 (Alexey Milovidov).
Fixed remote/cluster table functions failing with nested table functions like merge when the analyzer was enabled. #101055 (Alexey Milovidov).
Fixed OFFSET being applied twice in distributed queries when prefer_localhost_replica=1, producing fewer rows than expected. #101071 (Nihal Z. Miaji).
Fixed an "Illegal type Decimal64 of start parameter" error for timeseries aggregate functions when using serialize_query_plan=1 with parallel replicas. #101083 (Groene AI).
Fixed wrong query results when a large integer constant (e.g. 256, 2147483648) was used as a boolean predicate in a WHERE clause with AND on MergeTree tables. #101287 (Groene AI).
Fixed a crash with "Logical error: Reading from materialized CTE before materialization" when a scalar subquery referenced a chain of dependent materialized CTEs. #101305 (Groene AI).
Fixed an exception when casting a string with trailing data to the empty Tuple() type. #102011 (Alexey Milovidov).
Fixed a logical error when parsing an incorrect empty tuple string. #102289 (Nihal Z. Miaji).
Fixed incorrect aggregation results (duplicate rows) when using optimize_aggregation_in_order=1 with GROUP BY columns ordered differently from the table's sorting key. #102299 (Groene AI).
Fixed an exception in getStructureOfRemoteTable when the local shard returned empty columns due to concurrent DDL. #102604 (Alexey Milovidov).
Optimized row policy OR-chains to IN in the new analyzer. #102915 (Azat Khuzhin).
Fixed wrong results returned by WHERE x AND toNullable(N) on MergeTree tables when N is an integer wider than UInt8 (e.g. 256, 65535, 2147483648, or any negative integer). #103077 (Groene AI).
Fixed a logical error when attaching a part in MergeTree if there were several chained renames between detaching and attaching. #96351 (Alexey Milovidov).
Fixed non-deterministic uncompressed_hash computation for Compact MergeTree parts when multiple compression codecs are used, which could cause incorrect deduplication behavior. #97522 (Alexey Milovidov).
Fixed MEMORY_LIMIT_EXCEEDED exceptions being incorrectly reported as CORRUPTED_DATA during SummingMergeTree and CoalescingMergeTree merges. #97537 (János Benjamin Antal).
Fixed hasPartitionId incorrectly returning false when another partition with a higher partition ID exists in the data part set. #97748 (Mikhail Artemenko).
Fixed a mutation after a lightweight update with secondary indices. #98044 (Raúl Marín).
Fixed incorrect results of FINAL queries when mixing primary key and non-primary-key skip indexes. #98097 (Raúl Marín).
Fixed partition pruning incorrectly skipping partitions that contain rows that should "win" during FINAL deduplication when the partition key columns are not covered by the sorting key. #98242 (Yarik Briukhovetskyi).
Fixed a potential deadlock when two concurrent MOVE PARTITION operations work with the same pair of tables in opposite directions. #98264 (Alexey Milovidov).
Fixed a LOGICAL_ERROR exception "Invalid binary search result in MergeTreeSetIndex" triggered by toDate conversion on key columns with data crossing the 65535 boundary. #98276 (Alexey Milovidov).
Fixed sporadic deduplication failure where re-inserts were incorrectly deduplicated due to inconsistent cleanup ordering between the blocks/ and deduplication_hashes/ ZooKeeper directories. #98293 (Alexey Milovidov).
Fixed MATERIALIZE INDEX and MATERIALIZE PROJECTION mutations getting stuck when the index or projection is dropped before the mutation finishes. #98369 (Alexey Milovidov).
Fixed incorrect partition pruning results after merging parts with Nullable partition key columns, caused by wrong min-max index bounds. #98405 (Amos Bird).
Fixed a LOGICAL_ERROR exception in renameAndCommitEmptyParts that could occur when TRUNCATE TABLE ran concurrently with OPTIMIZE TABLE using MergeTree transactions. #98508 (Alexey Milovidov).
Fixed resurrection of outdated data parts caused by incorrect cleanup of empty covering parts. #98698 (Shaohua Wang).
Fixed detection of set skip index usefulness for predicates that contain an OR with false (e.g. or(x, 0)). #98776 (Azat Khuzhin).
Fixed skip indexes (and primary key conditions) not being applied for ALIAS columns when query plan expression merging is disabled (query_plan_merge_expressions = 0 or query_plan_enable_optimizations = 0). #98960 (Peng).
Fixed incorrect or insufficient pruning when startsWith, LIKE, or NOT LIKE were used with a FixedString column. Additionally, the FixedString-to-String cast function can now prune granules when wrapped around a key column. #99001 (Nihal Z. Miaji).
Fixed an exception when reading patch parts (lightweight updates) without the _part_offset column in the query plan. #99023 (Alexey Milovidov).
A query like SELECT * FROM table WHERE pk_id = '' where pk_id is a String primary key now correctly uses the primary key index for filtering granules. #99027 (Shankar Iyer).
Fixed an exception when creating a table with an EPHEMERAL column that has the same name as a virtual column (e.g. _part_offset). #99031 (Alexey Milovidov).
Fixed a LOGICAL_ERROR caused by a column order mismatch in patch parts. #99164 (Pablo Marcos).
Fixed ALTER TABLE UPDATE/DELETE failing with a "Missing columns" error when a table has a MATERIALIZED column whose expression depends on an EPHEMERAL column. #99281 (Yash).
Fixed compatibility when upgrading replicated tables with implicit minmax indices from 25.10 to newer versions. #99392 (Raúl Marín).
Fixed rare incorrect marking of a data part as broken and detaching it after a DETACH/ATTACH TABLE query. #99529 (Anton Popov).
Fixed a vertical merge rows_sources assertion failure when SYSTEM STOP/START MERGES toggles rapidly during a merge of a table with Dynamic columns. #99532 (Alexey Milovidov).
Fixed incorrect partition pruning for toWeek() that caused queries with WHERE toWeek(date, mode) = N to return empty results for weeks 49–52 on tables partitioned by toYYYYMM(date). #99542 (Takumi Hara).
Fixed a LOGICAL_ERROR when using ALTER TABLE ADD COLUMN to create an EPHEMERAL column with the same name as a virtual column (e.g. _part_offset). #99549 (Alexey Milovidov).
Fixed CLEAR COLUMN not rebuilding projections and not re-evaluating materialized columns that depend on the cleared column, which could cause exceptions or data corruption during subsequent merges. #99565 (Desel72).
A part with unknown projections is no longer marked as lost forever. #99623 (Sema Checherinda).
Fixed a dangling reference in injectRequiredColumns causing a crash during merge. #99679 (Tuan Pham Anh).
Fixed a logical error during merge of projection parts after CLEAR COLUMN on a table with projections and compact parts. #100068 (Pavel Kruglov).
Fixed a LOGICAL_ERROR "Stream ... not found" when inserting into a table with nested Array(JSON) columns in wide parts with optimize_on_insert=0. #100475 (Pavel Kruglov).
Evaluate engine arguments for StorageAlias before storing the definition, so that expressions like currentDatabase() are resolved to literals before being saved to the database. #100902 (Nikolay Degterinsky).
Fixed divide and intDiv returning ILLEGAL_DIVISION when used in filter expressions during index analysis in some cases. #100928 (Nihal Z. Miaji).
Fixed the minmax_count_projection and trivial COUNT(*) optimizations being permanently disabled after a lightweight delete, even after all parts with a lightweight delete mask were merged away. #101212 (Anton Popov).
Fixed materialize_skip_indexes_on_merge=false not suppressing text (full-text) indexes during merge. Previously, only non-text skip indexes (minmax, set, bloom_filter) were suppressed. #101932 (Groene AI).
Vector similarity index cache entries are now removed after part removal (e.g. during a merge), fixing entries never being evicted due to mismatched cache keys. #99575 (Seva Potapov).
Fixed a false LOGICAL_ERROR exception during filesystem cache dynamic resize due to a race condition in SLRU sub-queue promotion. #99850 (Alexey Milovidov).
Fixed a possible logical error during reading of Map subcolumns. #101641 (Pavel Kruglov).
Fixed getSubcolumnData to prioritize an exact subcolumn match over a prefix match, avoiding a possible crash. #101645 (Pavel Kruglov).
Fixed wrong extremes being used in a min-max index created on a JSON column, which led to wrong query results. #101918 (Pavel Kruglov).
Fixed an SLRU race bug in the filesystem cache (26.1+) that could lead to a space reservation logical error. #101991 (Kseniia Sumarokova).
Fixed a "Having zero bytes" logical error in the cache that occurred when a remote object was overwritten between list and read, which previously resulted in stale object metadata. #101219 (Kseniia Sumarokova).
Fixed a file_offset_of_buffer_end <= getFileSize() assertion failure when reading from Log or StripeLog tables on S3 object storage with concurrent writes. #100763 (Alexey Milovidov).
Fixed a LOGICAL_ERROR abort during SLRU filesystem cache dynamic resize caused by shared eviction statistics across sub-queues and an incorrect recovery path for failed candidates. #102396 (Antonio Andelic).
Fixed the minmax_count_projection and trivial COUNT(*) optimizations being permanently disabled after a lightweight delete. #102900 (Anton Popov).
Fixed a wrong result or exception during reading of subcolumns of ALIAS columns. #95408 (Pavel Kruglov).
Fixed the sumCount aggregate function being unable to read older serialized states after the introduction of Nullable(Tuple). #97502 (Nihal Z. Miaji).
Fixed a logical error about a missing stream during INSERT SELECT with JSON and buckets in shared data. #97523 (Pavel Kruglov).
Fixed possible crashes during reading of empty granules in advanced shared data in JSON. #97778 (Pavel Kruglov).
Fixed JIT miscompilation of the sign function for integer types wider than Int8: values outside the -128..127 range could produce an incorrect sign. #98012 (Alexey Milovidov).
Fixed the ProtobufList format not working with the Kafka engine due to read state not being reset between messages. #98151 (Alexey Milovidov).
Fixed silent data corruption when inserting a Parquet/Arrow Date column into an Enum column — the incompatible type conversion is now properly rejected instead of storing invalid enum values. #98364 (Alexey Milovidov).
Fixed an exception when reading an Arrow file with an Array column into a table with a Nested column. #98365 (Alexey Milovidov).
Fixed an exception when reading from Nullable(Tuple(...)) where a Tuple element name collides with the Nullable null subcolumn. #98372 (Alexey Milovidov).
Fixed incorrect Parquet Bool-to-FixedString conversion in the native V3 reader that produced raw bytes instead of a string representation. #98378 (Alexey Milovidov).
Fixed tryGetColumnDescription to filter subcolumns by parent column kind, consistent with other column lookup methods. #98391 (Alexey Milovidov).
Fixed column rollback in the Buffer engine when handling an exception during appending a new block. The old logic could lead to corrupted in-memory column state. #98551 (Pavel Kruglov).
Fixed an exception "Bad cast from type ColumnConst to ColumnDynamic" in null-safe comparison (<=> / IS NOT DISTINCT FROM) with const Dynamic or Variant columns and NULL. Also fixed IS DISTINCT FROM with Dynamic/Variant vs NULL always incorrectly returning 0. #98553 (Alexey Milovidov).
Fixed parseDateTimeBestEffort incorrectly parsing words starting with month/weekday prefixes. #98742 (Pavel Kruglov).
Fixed a reverseUTF8 exception on invalid (truncated) UTF-8 input. #98770 (Alexey Milovidov).
Fixed a LOGICAL_ERROR exception in financial functions (financialNetPresentValue, financialInternalRateOfReturn, etc.) when BFloat16 type arguments are passed. #98958 (Alexey Milovidov).
SummingMergeTree no longer sums Bool (and other domain type) columns; Bool values are kept as-is instead of being arithmetically summed. #98976 (Yash).
Fixed undefined behavior (null pointer dereference) when altering a version/sign/is_deleted column to EPHEMERAL or ALIAS in MergeTree engines. Such alterations are now properly rejected. #98985 (Alexey Milovidov).
Fixed parseDateTimeBestEffort incorrectly parsing words starting with month prefixes in DD-month-YYYY format. #99350 (Pavel Kruglov).
Fixed CHECK TABLE with sparse serialization inside a Tuple with Dynamic. #99351 (Pavel Kruglov).
Fixed a logical error "unordered_map::at: key not found" in the Avro output format when serializing Enum8/Enum16 columns with values not present in the enum definition. #99332 (Desel72).
Fixed confusing behavior when comparing Time[64] and DateTime[64] types: Time[64] values are now promoted to DateTime[64] by adding 1970-01-01 as the date part. #99267 (Yarik Briukhovetskyi).
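The promotion rule for Time[64] versus DateTime[64] comparisons can be illustrated with a minimal Python sketch. This is not ClickHouse code: the helper names are hypothetical, and Python's time type covers only a single day, so it is narrower than ClickHouse's Time.

```python
from datetime import date, datetime, time

# The date part attached during promotion, as described in the entry above.
EPOCH_DATE = date(1970, 1, 1)


def promote_time_to_datetime(t: time) -> datetime:
    """Attach 1970-01-01 as the date part so a time-of-day becomes a datetime."""
    return datetime.combine(EPOCH_DATE, t)


def time_equals_datetime(t: time, dt: datetime) -> bool:
    """Compare a time with a datetime after promoting the time (hypothetical helper)."""
    return promote_time_to_datetime(t) == dt
```

Under this rule a time of 12:30 equals 1970-01-01 12:30:00 but not 12:30:00 on any other date.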
Fixed \N (NULL) deserialization for the Variant type in Escaped/Raw text formats. #99648 (Pavel Kruglov).
Fixed undefined behavior in the Avro format reader when reading numeric values that overflow the target column type. Queries now fail on overflows instead of silently producing incorrect values. #99697 (asyablue22).
Fixed a false-positive abort in NativeReader when deserializing a Native format stream with a row-count mismatch: changed from LOGICAL_ERROR to INCORRECT_DATA. #99822 (Rahul Nair).
Fixed a process abort in Tuple column deserialization when the serialization kind in the binary stream is DETACHED. #99823 (Rahul Nair).
Fixed the aggregate_functions_null_for_empty setting to work with aggregate functions returning non-Nullable types such as Array or Map (e.g. groupArray, sumMap). #99839 (Alexey Milovidov).
Fixed a LOGICAL_ERROR exception in the midpoint function when called with mixed signed/unsigned integer types. #99867 (Alexey Milovidov).
Fixed a LOGICAL_ERROR exception in queries involving Dynamic columns with cross joins and runtime filters, caused by ColumnVariant::filter sharing variant column pointers instead of cloning them in the hasOnlyNulls optimization path. #100234 (Pavel Kruglov).
Fixed an array-of-variant bug that could reinterpret the data type upon calling arrayFirst/arrayLast. For example, Array(Variant(Date, Bool)) was converted to Bool when the actual underlying variant type was Date. #100255 (timothygk).
Fixed CSV and MsgPack formats not being able to parse Nullable(Tuple) properly. #100038 (Nihal Z. Miaji).
Fixed accurateCastOrDefault and to*OrDefault functions not preserving Const column type for constant inputs. #100132 (Alexey Milovidov).
Omitted query parameters with LowCardinality(Nullable(T)) type now correctly default to NULL, the same as Nullable(T). #100144 (Denys Melnyk).
Fixed signed integer overflow (undefined behavior) in toStartOfInterval when called with very large millisecond/microsecond interval values on high-precision DateTime64. #100156 (Alexey Milovidov).
Fixed join runtime filters on top of the Variant data type. #100182 (Dmitry Novik).
Fixed an exception when computing the common supertype for empty and non-empty tuples with use_variant_as_common_type enabled. #100699 (Antonio Andelic).
Fixed undefined behavior in positiveModulo when the unsigned divisor does not fit in the signed result type. #100705 (Raúl Marín).
Fixed undefined behavior (signed integer overflow) in toStartOfInterval when using Week, Quarter, or Year intervals with an origin argument and extreme interval values. #100817 (Raúl Marín).
Fixed If, Distinct, DistinctIf, and IfState aggregate function combinators with a Tuple return type and one or more Nullable arguments being unable to read older serialized states after the introduction of Nullable(Tuple). #100826 (Nihal Z. Miaji).
cast_keep_nullable, when enabled, no longer throws when casting a Dynamic null to a Variant. #100864 (Seva Potapov).
Fixed an exception in intDiv/intDivOrZero on arrays of nullable tuples. #100895 (Raúl Marín).
Fixed undefined behavior (signed integer overflow) in parseDateTimeBestEffort when parsing datetime strings with more than 18 fractional-second digits. #100948 (Vasily Chekalkin).
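One plausible reading of the 18-digit boundary in parseDateTimeBestEffort is sketched below. It assumes the fractional digits are accumulated in a signed 64-bit integer, which the entry does not state. The function name is hypothetical.

```python
INT64_MAX = 2**63 - 1  # largest value of a signed 64-bit integer


def fraction_fits_int64(digits: str) -> bool:
    """Return True if the digits, read as one integer, fit in a signed 64-bit value."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected ASCII decimal digits only")
    return int(digits) <= INT64_MAX
```

Every string of up to 18 digits fits, because 10**18 - 1 is below INT64_MAX. Some 19-digit strings, such as nineteen nines, do not.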
Fixed the sumCountOrDefault aggregate function with one or more Nullable arguments being unable to read older serialized states after the introduction of Nullable(Tuple). #101021 (Nihal Z. Miaji).
Fixed ALIAS columns with DateTime/DateTime64 types not applying timezone conversion when the declared timezone differed from the expression timezone. #101043 (Alexey Milovidov).
Strip Nullable from the result column in arrayIntersect and related functions to avoid a serialization/deserialization mismatch. #101569 (George Larionov).
Fixed positiveModulo(tuple, number) incorrectly dispatching to division instead of modulo. #101709 (ClickGap AI Bot).
Fixed incorrect output in formatDateTime with the %W formatter under certain non-default formatting settings. #101847 (Robert Schulze).
Fixed a type mismatch exception in transform when the default column is const on some blocks. #100616 (Pavel Kruglov).
min/max/argMin/argMax now treat NaN consistently with ORDER BY: NaN is always skipped (returned only when all values are NaN). Previously, results depended on NaN position in the data due to IEEE 754 unordered comparison semantics. #100448 (Raúl Marín).
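The NaN rule for min/max can be sketched in Python as follows. This illustrates the semantics only and is not the ClickHouse implementation. The function name and the empty-input handling are assumptions.

```python
import math
from typing import Iterable


def max_skipping_nan(values: Iterable[float]) -> float:
    """Maximum that skips NaN and returns NaN only when every value is NaN."""
    best = None
    seen_any = False
    for v in values:
        seen_any = True
        if math.isnan(v):
            continue  # NaN never competes with ordinary values
        if best is None or v > best:
            best = v
    if not seen_any:
        # Assumption for this sketch: empty input is treated as an error.
        raise ValueError("empty input")
    return math.nan if best is None else best
```

For comparison, Python's built-in max shows the position dependence described above: it returns nan for [nan, 1.0, 2.0] but 2.0 for [1.0, nan, 2.0], whereas the sketch returns 2.0 in both cases.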
Fixed wrong date data type inference in case of overflow after timezone adjustment. #102674 (Pavel Kruglov).
Fixed CASE with a Dynamic expression returning ELSE for all rows. #102684 (Pavel Kruglov).
Fixed flattened Dynamic type serialization with binary-encoded data types. #102692 (Pavel Kruglov).
Fixed cast_string_to_date_time_mode being ignored for CAST to Nullable(DateTime). #103035 (Pavel Kruglov).
Made the null representation in serialization of replicated and sparse columns respect settings (e.g. format_tsv_null_representation). #102888 (Hechem Selmi).
Fixed usage of the text index with other skip indexes. Previously, logical errors such as "Trying to get non-existing mark" could be thrown when a query filter used a text index and other regular skip indexes simultaneously. #98555 (Anton Popov).
Fixed rebuild of text indexes on merges with TTL. #99107 (Anton Popov).
Fixed too strict validation of the text index preprocessor. #99359 (Anton Popov).
Removed support for negated functions (notEquals, notLike, notIn) in text index analysis. These functions could never skip any granules, so analyzing the index for them only added overhead. #99393 (Anton Popov).
Fixed a NOT_FOUND_COLUMN_IN_BLOCK exception when a text index predicate (e.g. hasAllTokens) was referenced in both SELECT and WHERE clauses via an alias. #99504 (Anton Popov).
Fixed incorrect results when using hasAllTokens with OR across columns that have separate text indexes. #99505 (Anton Popov).
Fixed reading of the text index in a table with existing lightweight deletes and row policies. #99661 (Anton Popov).
Fixed analysis of predicates with the IN function by the text index with the preprocessor, and fixed a collision of searched tokens that could lead to incorrect results. #99755 (Anton Popov).
Fixed incorrect usage of the text index with startsWith and endsWith functions for tokenizers that do not support substring matching (array, splitByString). Previously, these tokenizers could silently produce wrong results. #100151 (Anton Popov).
Fixed a rare crash when a "Memory Limit" exception was thrown while building the text index. #100213 (Anton Popov).
Fixed "Not-ready Set is passed as the second argument" errors during analysis of a text index built on top of mapValues for predicates containing an IN clause. #100224 (Anton Popov).
Fixed a crash when using a text search index with an IN clause containing a tuple subquery, e.g. WHERE (id, str) IN (SELECT (id, str) FROM ...), or when the number of columns in the subquery did not match the tuple on the left side of IN. #100959 (Anton Popov).
The splitByString tokenizer now rejects empty separator strings. #101928 (Robert Schulze).
Fixed the sparseGrams tokenizer generating tokens longer than the provided max length (due to a hard-coded +2 in the implementation). #101934 (Elmi Ahmadov).
Fixed an exception when querying Merge or Distributed tables with a full-text index and combined filter conditions that mix has*Tokens with LIKE, while query_plan_direct_read_from_text_index is enabled. #101939 (Jimmy Aguilar Mena).
Fixed hasToken/hasTokenOrNull with separator-only needles (e.g. '()', '!!!') on columns with a text index: previously the index silently skipped all granules instead of throwing BAD_ARGUMENTS (for hasToken) or returning NULL (for hasTokenOrNull). #102544 (Jimmy Aguilar Mena).
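The following simplified Python sketch shows why a separator-only needle needs special handling in hasToken and hasTokenOrNull. It assumes tokens are runs of ASCII letters and digits, which may differ from the real tokenizers. ValueError stands in for BAD_ARGUMENTS, and all function names are hypothetical.

```python
import re
from typing import List, Optional


def split_tokens(text: str) -> List[str]:
    """Split on runs of non-alphanumeric ASCII characters (a simplifying assumption)."""
    return re.findall(r"[A-Za-z0-9]+", text)


def has_token(haystack: str, needle: str) -> bool:
    """Raise when the needle contains no token at all, instead of silently matching nothing."""
    if not split_tokens(needle):
        raise ValueError("needle consists only of separators")
    return needle in split_tokens(haystack)


def has_token_or_null(haystack: str, needle: str) -> Optional[bool]:
    """Same check, but return None (standing in for NULL) for a separator-only needle."""
    try:
        return has_token(haystack, needle)
    except ValueError:
        return None
```

A needle such as '()' yields no tokens, so an index lookup has nothing to search for. The sketch reports this explicitly rather than returning an empty result.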
Full text index settings (enable_full_text_index, allow_experimental_full_text_index, use_skip_indexes_on_data_read) are no longer disabled when the compatibility setting points to a version older than 26.1. #102422 (Nikita Fomichev).
Added support for ALIAS columns in the text index direct read optimization. #103037 (Anton Popov).
Fixed a logical race on snapshot version change in the DeltaLake table engine, and removed redundant heavy snapshot reloads. #96226 (Kseniia Sumarokova).
Fixed a DUPLICATE_COLUMN exception and silent NULLs when reading Delta Lake tables that use column mapping "name" mode with struct fields whose names contain dots. #98013 (Caio Ishizaka Costa).
Fixed a bug when using the Unity catalog on top of GCS. #98456 (Melvyn Peignon).
DataLakeCatalog now respects the server's http_forbid_headers configuration when validating the auth_header setting. #98827 (Michael Anastasakis).
Creating, attaching to, and updating local data lake tables outside user paths is now forbidden; the check is performed when creating and updating IDataLakeMetadata. #98936 (Daniil Ivanik).
Fixed Iceberg BigLake reads: ADC credentials are now forwarded to the GCS S3 client (fixing 403 errors), OAuth2 credentials are URL-encoded before sending, and namespace traversal no longer aborts on BigLake HTTP 400 responses. #98998 (Nikita Fomichev).
Fixed an out-of-bounds error when querying only virtual columns from an Iceberg table with Avro data. #99080 (alesapin).
Fixed a crash in ALTER TABLE ... REMOVE SETTINGS for the Iceberg table engine. #99108 (alesapin).
Fixed a very rare crash when an Iceberg table contains files of mixed format (ORC and Parquet). #99168 (alesapin).
Fixed a crash (null pointer dereference) when executing ALTER TABLE ... MODIFY COLUMN ... COMMENT on Iceberg tables. #99838 (Desel72).
ClickHouse now properly handles Spark-style tables (with a full absolute path for each file or a relative path to the common table path). #99935 (alesapin).
Fixed an exception when an Iceberg metadata file path setting contains a null byte. #100283 (Alexey Milovidov).
Fixed a copy-paste bug where delta_lake_snapshot_end_version set without delta_lake_snapshot_start_version was silently ignored instead of producing a BAD_ARGUMENTS error. #100454 (Mohammad Lareb Zafar).
Fixed a logical error in sorted Iceberg tables during mutations. #100499 (alesapin).
Fixed the Polaris catalog with Azure: since 25.12, the catalog incorrectly added the bucket at the beginning of the path. #100583 (Konstantin Vedernikov).
Fixed a Logical error: 'partitions_count > 0' exception when performing consecutive ALTER TABLE UPDATE on a partitioned Iceberg table. #101278 (Desel72).
Throw an error when delta_lake_snapshot_version or CDF version settings are used without DeltaKernel enabled, instead of silently returning wrong data. #101489 (Desel72).
Fixed an Iceberg INSERT retry loop failing when the table was created with iceberg_metadata_file_path and the target metadata version already existed. #101548 (Groene AI).
Fixed a server crash (LOGICAL_ERROR) when selecting from a materialized view backed by an IcebergLocal table engine. #101577 (Groene AI).
Fixed a crash in IcebergLocal ALTER TABLE ... UPDATE when using Avro format, caused by LowCardinality/Nullable wrapper types not being unwrapped before serialization. #102337 (Desel72).
Added the missing hostname column to system.delta_lake_metadata_log and system.iceberg_metadata_log. #102162 (Michael Russell).
Fixed a crash where querying files with a glob pattern (e.g. file('dir/**', 'LineAsString')) would throw an unhandled filesystem exception if the directory contained a dangling symlink. Dangling symlinks are now silently skipped. #98143 (Mark Andreev).
Fixed excessive memory usage (~514 MiB) during format auto-detection when reading non-Arrow data (e.g. JSON from url() or file() without an explicit format), caused by the ArrowStream reader misinterpreting the first bytes as a huge metadata length. #98893 (Konstantin Bogdanov).
Fixed a race condition that could cause a "ReadBuffer is canceled" exception in queries using urlCluster or similar cluster table functions. #98955 (Alexey Milovidov).
Fixed a misleading "inflate failed: buffer error" when reading non-existent compressed files via the url() table function with glob patterns. #99034 (Alexey Milovidov).
Fixed cases where numbers with leading zeros in a hive partitioning path caused errors. #99458 (Yarik Briukhovetskyi).
Fixed incorrect seek in AsynchronousReadBufferFromFileDescriptor with O_DIRECT. #99678 (Pavel Kruglov).
Fixed a bug where ClickHouse could skip files if the Content-Length header was missing in their HEAD request response (for example, because of decompressive transcoding in GCS). #99971 (Yarik Briukhovetskyi).
The server no longer fails to start when an Azure blob storage disk is configured but the endpoint is temporarily unreachable (e.g. DNS failure). #100701 (Raúl Marín).
Fixed a segfault in s3Cluster and distributed queries due to connection pool use-after-free. #100837 (Konstantin Bogdanov).
Fixed an exception escaping from the S3 Client::~Client destructor causing server termination. #101798 (Gagan Dhakrey).
Fixed the host and port not being logged correctly during reconnection. #102280 (Grant Holly).
Fixed a server exception in ClusterDiscovery when a static cluster (defined in config) temporarily had no live nodes. #102661 (Kseniia Sumarokova).
S3Queue: fixed a failed assert after a ZooKeeper connection loss but successful commit. #100210 (Kseniia Sumarokova).
Fixed the inverted description text for the alterable column in system.s3_queue_settings and system.azure_queue_settings — swapped the 0 and 1 meanings to match the actual code behavior. #101703 (ClickGap AI Bot).
Fixed the loop table function bypassing access checks: it called inner_storage->read() directly, skipping the interpreter layer where row policies, column-level grants, and other security checks are applied. Previously, this allowed a user restricted by row policies to read all rows via loop(table). #97682 (pufit).
Enforce READ ON FILE checks for scalar file() and DESCRIBE TABLE file(). #98115 (Nikolay Degterinsky).
Accept base64 credentials without padding in HTTP Basic Auth. Some HTTP clients omit trailing = padding in the Authorization: Basic header, which previously caused authentication failures. #98392 (Amos Bird).
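A minimal Python sketch of tolerating missing base64 padding in a Basic Authorization header, using only the standard library. It illustrates the idea and is not the server's implementation. The function name is hypothetical.

```python
import base64
from typing import Tuple


def decode_basic_credentials(header_value: str) -> Tuple[str, str]:
    """Decode 'Basic <base64>' into (user, password), restoring any omitted '=' padding."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    encoded = encoded.strip()
    # Base64 text length must be a multiple of 4; add back the padding some clients drop.
    padded = encoded + "=" * (-len(encoded) % 4)
    decoded = base64.b64decode(padded).decode("utf-8")
    user, sep, password = decoded.partition(":")
    if not sep:
        raise ValueError("credentials must have the form user:password")
    return user, password
```

For example, "Basic YWI6Yw" and "Basic YWI6Yw==" both decode to the user "ab" with the password "c".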
Fixed an RBAC bypass that allowed users to DESCRIBE any table via remote(), remoteSecure(), cluster(), or clusterAllReplicas() pointed at localhost, without requiring SHOW_COLUMNS privilege. #98669 (pufit).
Fixed an issue where system.grants omitted the regular expression parameters for URL and S3 grants in the access_object column. #98987 (DQ).
Fixed an RBAC bypass that allowed users to obtain table structure via DESCRIBE TABLE or CREATE TABLE AS on table functions (mysql(), postgresql(), sqlite(), arrowFlight(), jdbc(), odbc(), etc.) without the required source access privileges. For functions that infer schema from remote servers, this also allowed triggering outbound connections (SSRF) without authorization. #99122 (pufit).
Clamp settings constraints in the DDL worker for distributed DDL queries. #99317 (Pablo Marcos).
Fixed minor issues with TOTP authentication: the --one-time-password CLI option with an empty password, and validation of <digits> and <period> configuration values. #99322 (Vladimir Cherkasov).
Credentials in JDBC, ODBC, and NATS connection strings are now masked in query logs and SHOW CREATE output. For URI-style connection strings, only the password portion is masked. The nats_token setting is now also masked. #99344 (János Benjamin Antal).
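Masking only the password portion of a URI-style connection string could look roughly like the Python sketch below. The regular expression, the mask text, and the function name are assumptions for illustration, not ClickHouse's actual masking rules.

```python
import re

MASK = "[HIDDEN]"  # assumed placeholder text for this sketch

# scheme://user:password@...  (only the password group is replaced)
_URI_PASSWORD = re.compile(
    r"(?P<prefix>[A-Za-z][A-Za-z0-9+.\-]*://[^/@:\s]+:)"
    r"(?P<password>[^@\s]+)"
    r"(?P<suffix>@)"
)


def mask_uri_password(connection_string: str) -> str:
    """Replace the password in 'scheme://user:password@host' and keep everything else."""
    return _URI_PASSWORD.sub(
        lambda m: m.group("prefix") + MASK + m.group("suffix"),
        connection_string,
    )
```

Strings without a user:password@ part, such as nats://host:4222, are returned unchanged.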
Forbid reading Google credentials from a local file, which was insecure because it allowed reading other credentials if the file path was known. #99584 (Konstantin Vedernikov).
Fixed the MySQL dictionary source bypassing RemoteHostFilter for inline DDL params. #99720 (Shaohua Wang).
Fixed system.completions to correctly filter databases, tables, and columns by access rights in all grant combinations: per-table, per-db, and per-column revoke. #100432 (Shaohua Wang).
Sanitize query_id in the HTTP handler to prevent CRLF injection into response headers via the X-ClickHouse-Query-Id header. #100500 (Pablo Marcos).
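As a rough illustration of the CRLF injection problem in the X-ClickHouse-Query-Id header, the Python sketch below drops control characters from a value before it is placed in a response header. The entry does not describe how the server sanitizes query_id, so the filtering rule and function name here are assumptions.

```python
def sanitize_header_value(value: str) -> str:
    """Drop CR, LF, and other ASCII control characters from a header value."""
    return "".join(ch for ch in value if ch >= " " and ch != "\x7f")
```

With this filter, a query_id such as "abc\r\nSet-Cookie: x=1" can no longer end the header line and start a new one. It collapses into the single value "abcSet-Cookie: x=1".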
Some catalogs could show secrets in the SETTINGS section of SELECT * FROM system.databases. This is now prevented. #100800 (Konstantin Vedernikov).
Row policies were not propagated from sub-planners to the parent planner for logging, so they were missing from query_log for views, subqueries, and insert-selects. They are now kept inside QueryAccessInfo for logging. #101044 (Narasimha Pakeer).
Fixed a data race in the ZooKeeper client between the send thread and receive thread. #97887 (Pablo Marcos).
Fixed Keeper data loss after restart when using Azure Blob Storage with s3_plain metadata for log storage. #97987 (Antonio Andelic).
Set the Watch component for watch responses in aggregated_zookeeper_log instead of leaving it empty. #98202 (Antonio Andelic).
Fixed ClickHouse Keeper disconnecting Java ZooKeeper clients after an addWatch request. The Java client expects a 4-byte ErrorResponse body, but Keeper sent an empty body, causing EOFException and session disconnect. #98499 (Antonio Andelic).
Fixed zk_followers and zk_synced_followers Keeper metrics not decreasing when a follower goes down. Added new zk_learners and zk_synced_non_voting_followers metrics to the mntr command. #98504 (Antonio Andelic).
Fixed Keeper's secure raft port ignoring cipherList and dhParamsFile from the openSSL configuration. #98509 (Antonio Andelic).
Fixed misleading Keeper log messages like "Receiving request for session X took 9963 ms" where the reported time was actually spent waiting idle in poll() between heartbeats. #98510 (Antonio Andelic).
Fixed Keeper TCP connections preventing graceful server shutdown by not responding to the shutdown signal. #98525 (Alexey Milovidov).
Fixed a debug assertion in DDLWorker caused by a stale first_failed_task_name after a ZooKeeper entry was deleted during reinitialization recovery. #99099 (Antonio Andelic).
Fixed a bug in Keeper where a read request could get stuck (causing a session timeout) if a different unrelated session on the same server was closed at just the wrong moment. #99484 (Michael Kolupaev).
Fixed KeeperMap CREATE TABLE failing with "Cannot create metadata for table" when leftover ZooKeeper nodes from a pre-25.1 partial drop were missing the drop_lock_version node. #101623 (Antonio Andelic).
Fixed UDF registry loss when a ZooKeeper session expires during periodic refresh — all user-defined functions could become unavailable until a full refresh succeeds. #101891 (Nikita Fomichev).
Fixed a crash in UDF refresh caused by ZooKeeperRetriesControl retrying on a stale (expired) ZooKeeper session without renewing it. #102059 (Nikita Fomichev).
Fixed a crash in Kusto dialect functions bin(), bin_at(), extract(), and indexof() when empty arguments are provided. #95736 (NeedmeFordev).
Fixed a server crash/assert in mapContainsKey/mapContainsKeyLike with a tokenbf_v1 skip index. #97826 (Shankar Iyer).
Fixed a server crash (std::terminate) caused by an uncaught exception in the HTTP connection pool destructor when the connection group hard limit is reached under high concurrency. #97850 (Antonio Andelic).
Fixed out-of-bounds access in ColumnConst::getExtremes that could cause a crash when extremes = 1 is enabled. #98263 (Alexey Milovidov).
Validate corrupted data during DDSketch deserialization to prevent segfaults, exceptions, infinite loops, and OOM when reading corrupted quantilesDD aggregate function states. #98284 (Alexey Milovidov).
Fixed a rare exception in the pipeline executor, which could manifest as Received signal 6 (in debug builds), when pipeline expansion races with query cancellation. #98428 (Alexey Milovidov).
Fixed a null pointer dereference in dictGetOrDefault when the key argument is Nullable. #98460 (Alexey Milovidov).
Fixed a pipeline deadlock when using sort_overflow_mode = 'break' together with window functions. #98543 (Alexey Milovidov).
Fixed a null pointer dereference crash in system tables that could occur when tables changed in another thread between the snapshot taken in IDatabaseTablesIterator::table() and later iteration. #98792 (Grant Holly).
Fixed memory tracking drift caused by failed allocations not being rolled back, nallocx(0) undefined behavior, and an off-by-one in global peak tracking. Extended tracking to cover io_uring ring buffers. #98915 (Antonio Andelic).
Fixed a server crash (std::terminate) when executing ALTER TABLE ... DROP PART on a patch part after a schema change (e.g. ADD COLUMN). #99036 (Peng).
Fixed a server crash that could occur if a memory limit exceeded exception was thrown during a cached disk read. #99042 (Shankar Iyer).
Fixed a crash triggered by a memory limit exception thrown during patch part application. #99086 (Anton Popov).
Fixed a crash on use of a Buffer table with SAMPLE when the destination does not support it. #99141 (Kseniia Sumarokova).
Fixed a heap-use-after-free when a table is dropped concurrently with a running read query. #99483 (Alexey Milovidov).
Fixed a heap-buffer-overflow in CompressionCodecT64 and a process abort in CompressionCodecMultiple when decompressing malformed compressed data. The codecs now throw an exception instead of crashing. #99680 (Rahul Nair).
Fixed an infinite loop when reading Npy format files with negative shape dimensions. #99812 (Desel72).
Fixed a global-buffer-overflow in the CRC32 function on FixedString arguments when evaluated with zero rows during query plan header computation. #99835 (Alexey Milovidov).
Fixed an assertion failure in sipHash128Keyed (and similar keyed hash functions) when the data argument is a Map with array keys or other nested array types. #99921 (Alexey Milovidov).
Fixed an assertion failure when multiplying NumericIndexedVector aggregate states by an even integer constant, caused by self-XOR on aliased Roaring bitmaps in pointwiseAddInplace. #99976 (Desel72).
Fixed a nullptr dereference in the Parquet reader when the filter-in-decoder path encounters filtered-out pages. #99677 (Alexey Milovidov).
Fixed a local server crash when CREATE DICTIONARY has a definition with a list value containing a non-existing function. #100036 (Yakov Olkhovskiy).
Fixed a server crash (assertion failure) when using parametric aggregate functions with the Array combinator and NULL arguments, such as quantileIfArrayArray(0.5)([[NULL]], [[1]]). #100679 (nerve-bot).
Fixed a heap-buffer-overflow in usearch sorted_buffer_gt::insert() that could crash or silently corrupt memory during vector similarity search. #100537 (Dustin Healy).
Fixed a server crash (logical error "Unexpected return type from __topKFilter") when use_top_k_dynamic_filtering is enabled and the ORDER BY column has Dynamic or Variant type. #100742 (Groene AI).
Fixed a server crash when using the has() function with PREWHERE/WHERE on a Tuple key containing LowCardinality elements. #100760 (Groene AI).
Fixed a null pointer dereference segfault when loading dictionaries during server shutdown. #100839 (Miсhael Stetsyuk).
Fixed a buffer overflow in ULIDStringToDateTime when the input contains non-ASCII bytes. #100843 (Konstantin Bogdanov).
Fixed a crash (LOGICAL_ERROR) when querying a Merge table (or merge() table function) that wraps multiple tables including a Distributed table, with distributed_group_by_no_merge=1 enabled. #100859 (Groene AI).
Fixed a crash when building a polygon dictionary from a MergeTree table that uses sparse columns serialization. #100964 (Anton Popov).
Fixed a crash (Logical error: isConst/isSparse/isReplicated assertTypeEquality) in merge algorithms when lazy column replication (enable_lazy_columns_replication) produces ColumnReplicated columns that flow into merge-sort pipelines with late-arriving inputs. #101036 (Groene AI).
Fixed a server crash (SIGABRT) when using aggregate functions with the internal-only Null combinator (e.g. sumNull, avgNull) and aggregate_functions_null_for_empty = 1. #101147 (Groene AI).
Fixed a use-after-free in the filesystem cache write path that could cause reads from freed memory when logging completed file segments. #101161 (Groene AI).
Fixed a server crash with "Trying to attach external table to a ready set without explicit elements" when distributed index analysis encounters a GLOBAL IN predicate whose set was built without explicit elements. #101178 (Groene AI).
Fixed MAX/MIN aggregate functions on Decimal columns returning incorrect results when JIT compilation is enabled. #101203 (Raúl Marín).
Fixed a server crash (LOGICAL_ERROR: Bad cast from ColumnVector to ColumnLowCardinality) when querying a MergeTree table with ORDER BY CAST(lc_column, 'Type') where lc_column has a LowCardinality type. #101220 (Groene AI).
Fixed UB in mergeTreeAnalyzeIndexes() in case of an invalid optimizations argument. #101253 (Azat Khuzhin).
Fixed a server crash (Logical error: "Variant ... is empty") when a query reads both a Tuple column containing a Dynamic Map type and its subcolumn simultaneously. #101448 (Groene AI).
Fixed a LOGICAL_ERROR crash "Current component is empty" when querying system.part_moves_between_shards with enforce_keeper_component_tracking enabled. #101462 (Groene AI).
Fixed a segmentation fault in DataTypeDynamic::create() when the fuzzer generates a malformed Dynamic type AST. #101464 (Groene AI).
Fixed a crash (LOGICAL_ERROR: "ColumnUnique can't contain null values") when comparing a LowCardinality column with a Variant NULL constant while use_variant_default_implementation_for_comparisons is disabled. #101690 (Groene AI).
Added an empty-stream guard to Bzip2ReadBuffer so it returns EOF instead of throwing UNEXPECTED_END_OF_FILE when the inner stream is empty. #101691 (ClickGap AI Bot).
Fixed a use-after-free crash in the CPU lease scheduler when the wait timer outlives the worker thread whose ProfileEvents::Counters it references. #101761 (Antonio Andelic).
Fixed a use-after-scope in parallel deserialization of Object type dynamic paths, which could cause crashes when reading tables with many dynamic paths. #101823 (Antonio Andelic).
Fixed a SIGSEGV in MergeTreeDataPartWriterWide::cancel when a stream constructor throws during addStreams, leaving a null entry in column_streams. #101936 (Antonio Andelic).
Fixed a segmentation fault in mutations on materialized columns without an expression. #102342 (zoomxi).
Fixed a segfault (or LOGICAL_ERROR in debug builds) when reading Parquet files with bloom filter push-down enabled and WHERE clause equality/inequality conditions, caused by an out-of-bounds memory access in the Parquet prefetcher. #102385 (Groene AI).
Fixed an out-of-bounds read in string search functions (countSubstrings, position, etc.) when searching for a needle consisting entirely of null bytes. #102401 (Raúl Marín).
Fixed an out-of-bounds read in printf with a trailing %. #102472 (Raúl Marín).
Fixed a LOGICAL_ERROR crash "Unexpected number of rows in column subchunk" in the native Parquet V3 reader when reading nullable columns with a WHERE filter. #102628 (Groene AI).
Fixed a server crash (SIGSEGV) when reading Avro files with recursive schemas containing cyclic symbolic type references. Such schemas are now detected and rejected with a clear error message. #102853 (Groene AI).
Fixed a server crash (LOGICAL_ERROR assertion) when a function on a Variant column hits a memory limit or other non-type-conversion exception during result casting in FunctionVariantAdaptor. #102855 (Groene AI).
Fixed a server crash in debug/sanitizer builds when std::length_error is thrown during schema inference. #102859 (Groene AI).
Fixed a crash when using a view with a WHERE clause where the inner query produces columns with different types than the view metadata (e.g. Nullable from a LEFT JOIN with join_use_nulls). #102085 (Miсhael Stetsyuk).
Fixed a bug where explicit settings sent alongside compatibility in the same request could be silently ignored when their value matched the server default. #97078 (Raufs Dunamalijevs).
Fixed the client reporting NETWORK_ERROR instead of the actual parsing error (with the correct row number) when an INSERT with parallel parsing encounters invalid data. #97339 (Alexey Milovidov).
Fixed DROP DATABASE with database_atomic_wait_for_drop_and_detach_synchronously hanging indefinitely when the query is killed. #97586 (Alexey Milovidov).
Fixed KILL QUERY not being able to terminate queries stuck in WITH FILL generation, dictionary loading via dictGet, or ALTER DELETE with mutations_sync=1 on ReplicatedMergeTree. #97589 (Alexey Milovidov).
Fixed a logical error with a data masking policy query with ON CLUSTER. #97594 (Bharat Nallan).
Fixed incorrect partition pruning when using pre-epoch DateTime64 with the toDate() function. #97746 (Yarik Briukhovetskyi).
Fixed a "Cannot schedule a file" LOGICAL_ERROR on INSERT into Distributed due to a race between DROP and INSERT. #97822 (Azat Khuzhin).
Fixed a logical error "Bad cast from type DB::ColumnConst to DB::ColumnArray" in kql_array_sort_asc/kql_array_sort_desc when called with constant array arguments. #98251 (Alexey Milovidov).
The HTTP server now returns an error message in the response body for 400 Bad Request responses caused by malformed headers, instead of an empty body. #98268 (Alexey Milovidov).
Fixed wrong results with distributed index analysis (experimental feature) and the query condition cache. #98269 (Azat Khuzhin).
Fixed a MongoDB dictionary source failing with named collections. #98528 (Pablo Marcos).
Disallow dropping a column when its subcolumns are used in other columns' default/alias expressions, and use the analyzer for default expressions on ALTER DROP COLUMN. #98569 (Nikita Mikhaylov).
All DB::Exception from PocoHTTPClient::makeRequestInternalImpl are now non-retryable, including HTTP_CONNECTION_LIMIT_REACHED. #98598 (Sema Checherinda).
Fixed two bugs in JIT expression compilation: a copy-paste error in nativeCast type checking that made integer-to-integer and float-to-float cast branches unreachable, and an incorrect nullptr TargetMachine passed to LLVM PassBuilder. #98660 (Alexey Milovidov).
Fixed an exception "Inconsistent KeyCondition behavior" in debug builds when the primary key contains NaN float values, by making accurateLess and accurateEquals handle NaN consistently with ClickHouse sort order. #98964 (Alexey Milovidov).
Fixed windowFunnel with strict_deduplication returning an incorrect level when a duplicate event was encountered. #99003 (Yash).
system.trace_log entries for automatic reloads of ClickHouse dictionaries now have non-empty query IDs. #98784 (Miсhael Stetsyuk).
Fixed SYSTEM START REPLICATED VIEW not waking up the refresh task. #98797 (Pablo Marcos).
Fixed WITH FILL STALENESS producing extra filled rows when data is read in multiple chunks (e.g. with a small index_granularity). #98895 (Alexey Milovidov).
Fixed a crash in query obfuscation for identifiers starting with an uppercase letter, such as Ab. #101450 (Xuewei Wang).
Fixed ignoring of TABLE_UUID_MISMATCH for the non-analyzer code path. #99380 (Azat Khuzhin).
Fixed clickhouse format --obfuscate producing invalid SQL by obfuscating skip index types, compression codec names, database engine names, and dictionary layout/source definitions. #99260 (Raúl Marín).
Validate setting changes in create queries when the engine itself also supports settings. #99279 (János Benjamin Antal).
Fixed insert_deduplication_token being silently ignored for INSERT SELECT queries without ORDER BY ALL. Providing insert_deduplication_token is now sufficient to enable deduplication regardless of ORDER BY ALL. #99206 (Desel72).
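A minimal sketch of the kind of query this entry covers. The tables `src` and `dst` and the token value are hypothetical, and the sketch assumes `dst` uses an engine on which insert deduplication is enabled:

```sql
-- Hypothetical tables `src` and `dst`; the token value is arbitrary.
SET insert_deduplication_token = 'load-0001';

-- No ORDER BY ALL in the SELECT: per the entry above, the token alone
-- is now sufficient to enable deduplication for this INSERT SELECT.
INSERT INTO dst
SELECT * FROM src;

-- A retry with the same token is the case deduplication is meant to catch.
INSERT INTO dst
SELECT * FROM src;
```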
Delay processing until the server has finished loading all tables. #99700 (Seva Potapov).
Fixed parsing of shell-style quotes in arguments for the executable table function. #99794 (Nikita Semenov).
Fixed async insert queries reporting zero written_rows, read_rows, and result_rows in query_log and client output. #99879 (Sema Checherinda).
Fixed an INSERT with VALUES failing when the data was followed by a trailing SQL comment (-- or /* */) on the next line. The comment is now skipped instead of being parsed as another row. #100016 (Pratima Patel).
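For illustration, a query of the shape described above might look like the following. The table `t` is hypothetical and exists only for this sketch:

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE t (x UInt8) ENGINE = Memory;

-- The data is followed by a comment on the next line. Per the entry above,
-- the comment is now skipped rather than parsed as another row.
INSERT INTO t VALUES (1), (2)
-- trailing comment after the data
```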
Fixed an exception in arrayRemove when comparing tuples with NULL components. #100017 (Alexey Milovidov).
Correctly process negative values inside NumericIndexedVectorDataBSI. #100086 (Daniil Ivanik).
Fixed the primes table function not respecting max_rows_to_read properly in some cases, especially when offset and step are present. #100199 (Nihal Z. Miaji).
Users no longer see NATURAL CROSS JOIN when querying the AST of a natural join query that rewrites to a CROSS JOIN. #100223 (Peter Nguyen).
A few minor function changes: h3 functions now validate boundaries better; readWKB checks size limits (a new setting, max_wkb_geometry_elements); random generator functions limit the maximum iterations. #100270 (Alexey Milovidov).
Fixed an issue where cutURLParameter could incorrectly skip parameters when they appeared as substrings of other parameters. #100280 (Nikita Semenov).
Fixed a LOGICAL_ERROR exception in estimateCompressionRatio when the block_size_bytes parameter is extremely large. #100298 (Alexey Milovidov).
Fixed DROP TABLE hanging indefinitely on Kafka engine tables when consumers are stuck in a rebalance after a heartbeat error. #100388 (Alexey Milovidov).
Validate Npy format shape dimensions against file size and overflow limits to prevent denial of service from crafted .npy files. Also reject empty shapes and cap per-row memory to 2 GiB. #100625 (Raúl Marín).
Fixed session_timezone being ignored when parsing DateTime values during async inserts (TCP) and all inserts over HTTP. #100647 (Sema Checherinda).
Made StorageRabbitMQ::shutdown idempotent and added defensive null checks, since it is now called twice (in StreamingStorageRegistry and DatabaseCatalog). #100455 (Miсhael Stetsyuk).
Fixed a LOGICAL_ERROR exception when using accurateCastOrNull with the QBit target type. #100470 (Raufs Dunamalijevs).
Fixed LIMIT m OFFSET n WITH TIES syntax not working (equivalent to LIMIT n, m WITH TIES, which already worked). #100491 (Nihal Z. Miaji).
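As a sketch of the two spellings the entry describes as equivalent (here with m = 2 and n = 1; the query over `numbers(9)` is only an illustrative assumption):

```sql
-- LIMIT m OFFSET n WITH TIES: the form this fix makes work.
SELECT number % 3 AS k
FROM numbers(9)
ORDER BY k
LIMIT 2 OFFSET 1 WITH TIES;

-- LIMIT n, m WITH TIES: the form that already worked.
SELECT number % 3 AS k
FROM numbers(9)
ORDER BY k
LIMIT 1, 2 WITH TIES;
```

Both queries are expected to return the same rows.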
Fixed a segfault where SerializationInfoTuple::add did an assert_cast on a plain SerializationInfo reference while merging parts after ALTER MODIFY COLUMN changed a column from a non-Tuple type to a Tuple. #100509 (Miсhael Stetsyuk).
Fixed an exception "No set is registered for key" when using IN with Nullable(Tuple) columns that have named fields and LowCardinality elements. #100523 (Alexey Milovidov).
Fixed EXECUTE AS ignoring FORMAT and INTO OUTFILE clauses specified in the query. #100538 (pufit).
Fixed inconsistent AST formatting for SAMPLE with a query-level OFFSET. #100579 (Pavel Kruglov).
Fixed "Target table doesn't exist" errors for materialized views with inner tables during async startup, caused by incorrect startup dependency ordering. #100946 (Nikolay Degterinsky).
Fixed an exception in optimizeLazyMaterialization when a projection with PREWHERE is used with ORDER BY ... LIMIT. #101115 (Anton Popov).
Fixed an incorrect error message when calling intExp10 with a NaN argument: the message named intExp2 instead of intExp10. #101582 (Krishna Chaitanya).
Fixed allow_statistics=0 not blocking ALTER TABLE ADD STATISTICS and ALTER TABLE DROP STATISTICS. #101585 (Krishna Chaitanya).
Fixed CREATE TABLE ... AS merge() ignoring the explicit column list and instead auto-inferring columns from source tables, which caused NOT_FOUND_COLUMN_IN_BLOCK errors when merge_table_max_tables_to_look_for_schema_inference was low enough and source tables had different schemas. #101663 (Miсhael Stetsyuk).
Fixed RANGE_HASHED dictionary creation silently accepting a non-existent MAX range attribute and using the wrong type configuration when min and max range attributes had different types. #101732 (Yakov Olkhovskiy).
Fixed bugs in the arrayLevenshteinDistanceWeighted and arraySimilarity functions. #101767 (Mikhail f. Shiryaev).
Fixed a shouldPatchFunction false negative in SYSTEM INSTRUMENT ADD when the search string first appears inside a template argument of the demangled symbol name. #101885 (Pablo Marcos).
Fixed the system.codecs description for AES_256_GCM_SIV to report AES-256 instead of AES-128. #101917 (Jimmy Aguilar Mena).
Fixed undefined behavior when parsing native protocol query packets with invalid QueryProcessingStage values. #101972 (Raúl Marín).
Close the TCP connection when an exception occurs during initial query parsing, to prevent reading garbage from a desynchronized stream. #101989 (Raúl Marín).
Fixed cases where Time with negative values returned a wrong result on comparison with DateTime. #102056 (Yarik Briukhovetskyi).
Fixed missing spaces when formatting unlock snapshot. #102063 (Han Fei).
Fixed Alias tables without a target table failing to initialize on a fresh replica in Database Replicated. #102397 (Nikolay Degterinsky).
Plain INSERTs without materialized views no longer request excessive ConcurrencyControl slots and threads (max_threads instead of max_insert_threads), preventing CC slot starvation and thread count blowup on clusters with high INSERT throughput. #102961 (Sema Checherinda).
Reintroduced ArrowMemoryPool so that MEMORY_LIMIT_EXCEEDED can be thrown instead of running into a kernel OOM. #102999 (Azat Khuzhin).
Fixed an incorrect argument type reported in error messages of string search functions (e.g. locate, position) when arguments are passed in swapped order (locate(needle, haystack) with function_locate_has_mysql_compatible_argument_order = 1). #103102 (Alex Kuleshov).
Fixed waitForPause hanging indefinitely when disableFailPoint is called with no thread paused at the failpoint. #103119 (Shaohua Wang).