From 5b0ac809cc53b0c10201cc706358f99831df7c0b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= Date: Tue, 14 Apr 2026 17:59:00 +0200 Subject: [PATCH 01/11] Website: Add blog post for 24.0.0 --- _posts/2026-04-20-24.0.0-release.md | 131 ++++++++++++++++++++++++++++ 1 file changed, 131 insertions(+) create mode 100644 _posts/2026-04-20-24.0.0-release.md diff --git a/_posts/2026-04-20-24.0.0-release.md b/_posts/2026-04-20-24.0.0-release.md new file mode 100644 index 00000000000..18245df3b56 --- /dev/null +++ b/_posts/2026-04-20-24.0.0-release.md @@ -0,0 +1,131 @@ +--- +layout: post +title: "Apache Arrow 24.0.0 Release" +date: "2026-04-20 00:00:00" +author: pmc +categories: [release] +--- + + +The Apache Arrow team is pleased to announce the 24.0.0 release. This release +covers over 3 months of development work and includes [**XXX resolved +issues**][1] on [**YYY distinct commits**][2] from [**ZZZ distinct +contributors**][2]. See the [Install Page](https://arrow.apache.org/install/) to +learn how to get the libraries for your platform. + +The release notes below are not exhaustive and only expose selected highlights +of the release. Many other bugfixes and improvements have been made: we refer +you to the [complete changelog][3]. + +## Community + +We recently published our [Community Highlights for 2025](https://arrow.apache.org/blog/2026/03/19/arrow-2025-highlights/), check those out. + +Thanks everyone for your contributions and participation in the project! 
+ +## Arrow Flight RPC Notes + + +## C++ Notes + + +### Compute + + +### Format + + +### Parquet + + +#### Encryption + + +### Miscellaneous C++ changes + + +## Linux Packaging Notes + + +## MATLAB Notes + + +## Python Notes + +### Compatibility notes + + +### New features + + +### Other improvements + + +### Relevant bug fixes + + +## R Notes + +### Compatibility notes + + +### Relevant bug fixes + + +## Ruby and C GLib Notes + + +### Ruby + + +### C GLib + + +## Java, JavaScript, Go, .NET, Swift and Rust Notes + +The Java, JavaScript, Go, .NET, Swift and Rust projects have moved to separate +repositories outside the main Arrow [monorepo](https://github.com/apache/arrow). + +- For notes on the latest release of the [Java +implementation](https://github.com/apache/arrow-java), see the latest [Arrow +Java changelog][7]. +- For notes on the latest release of the [JavaScript +implementation](https://github.com/apache/arrow-js), see the latest [Arrow +JavaScript changelog][8]. +- For notes on the latest release of the [Rust + implementation](https://github.com/apache/arrow-rs) see the latest [Arrow Rust + changelog][5]. +- For notes on the latest release of the [Go +implementation](https://github.com/apache/arrow-go), see the latest [Arrow Go +changelog][6]. +- For notes on the latest release of the [.NET +implementation](https://github.com/apache/arrow-dotnet), see the latest [Arrow .NET changelog][9]. +- For notes on the latest release of the [Swift implementation](https://github.com/apache/arrow-swift), see the latest [Arrow Swift changelog][10]. 
+ +[1]: https://github.com/apache/arrow/milestone/72?closed=1 +[2]: {{ site.baseurl }}/release/24.0.0.html#contributors +[3]: {{ site.baseurl }}/release/24.0.0.html#changelog +[4]: {{ site.baseurl }}/docs/r/news/ +[5]: +[6]: +[7]: +[8]: +[9]: +[10]: From a82e141b6d1704dfe45ae86a987385595accb17c Mon Sep 17 00:00:00 2001 From: Antoine Pitrou Date: Thu, 16 Apr 2026 15:16:34 +0200 Subject: [PATCH 02/11] Add C++ notes --- _posts/2026-04-20-24.0.0-release.md | 31 ++++++++++++++++++++++++++--- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/_posts/2026-04-20-24.0.0-release.md b/_posts/2026-04-20-24.0.0-release.md index 18245df3b56..073c5050569 100644 --- a/_posts/2026-04-20-24.0.0-release.md +++ b/_posts/2026-04-20-24.0.0-release.md @@ -40,26 +40,51 @@ We recently published our [Community Highlights for 2025](https://arrow.apache.o Thanks everyone for your contributions and participation in the project! +## Format Notes + +We have written a project-wide [Security Model](https://arrow.apache.org/docs/dev/format/Security.html) +outlining what users should expect when dealing with Arrow data, especially coming +from untrusted sources (GH-48868). + ## Arrow Flight RPC Notes ## C++ Notes +In addition to the aforementioned project-wide Security Model, we have written +a specific [Security Model for Arrow C++](https://arrow.apache.org/docs/dev/cpp/security.html) +covering more concrete topics such as API usage and parameter validity (GH-49274). ### Compute +### Extension Types -### Format - +The canonical type [VariableShapeTensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#variable-shape-tensor) +was finally implemented (GH-38007). ### Parquet +**Breaking change:** The Arrow extension type name for Parquet Variant columns +used to be `parquet.variant` but has been changed to `arrow.parquet.variant` (GH-49081). 
+ -#### Encryption +While Parquet C++ could only read unencrypted bloom filters, it now supports +reading encrypted bloom filters as well (GH-48334). In addition, it can also +write bloom filters, though only unencrypted (GH-34785). +An ambitious rewrite of the bit-unpacking utilities and optimizations has led to +significant performance improvements on reading some Parquet columns, up to 50% +faster in some cases (GH-48277). This rewrite is described in more detail +in an [accompanying blog post](https://medium.com/@AntoineProuvost/faster-reads-for-apache-parquet-improving-integer-unpacking-f6e21ce49a85). + +The performance of reading DELTA_BINARY_PACKED-encoded integers has been improved +in some favorable cases (GH-49266). ### Miscellaneous C++ changes +We have migrated to C++20 `std::span`, removing our home-grown implementation +in `arrow::util::span` (GH-48588). + +A bunch of previously deprecated APIs have been removed (GH-49356). ## Linux Packaging Notes From cb2e3da55213baca0c3d6fdeb41979ba253d9c83 Mon Sep 17 00:00:00 2001 From: Sutou Kouhei Date: Fri, 17 Apr 2026 00:29:16 +0900 Subject: [PATCH 03/11] Add Ruby and C GLib notes --- _posts/2026-04-20-24.0.0-release.md | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/_posts/2026-04-20-24.0.0-release.md b/_posts/2026-04-20-24.0.0-release.md index 073c5050569..219beaa7a3d 100644 --- a/_posts/2026-04-20-24.0.0-release.md +++ b/_posts/2026-04-20-24.0.0-release.md @@ -116,12 +116,35 @@ A bunch of previously deprecated APIs have been removed (GH-49356). ## Ruby and C GLib Notes +* Fixed GC-related problems. +* `GArrowListArray`: Added support for returning offset buffer. +* `GArrowLargeListArray`: Added support for returning offset buffer. +* `GArrowUnionArray`: Added support for returning fields. +* Deprecated Feather features. ### Ruby +We've added a pure Ruby Apache Arrow writer implementation to the +`red-arrow-format` gem.
+ +We've marked the pure Ruby Apache Arrow reader implementation in the +`red-arrow-format` gem as stable because it passes integration tests +with other implementations. But it still has some missing features. + +The `red-arrow` gem: +* Add support for converting to raw Ruby objects of the following arrays: + * `Arrow::LargeBinaryArray` + * `Arrow::LargeUTF8Array` + * `Arrow::LargeListArray` + * `Arrow::FixedSizeListArray` + * `Arrow::DurationArray` + * `Arrow::DictionaryArray` with `Arrow::LargeBinaryArray` or + `Arrow::LargeUTF8Array` ### C GLib +No C GLib-only notes. + ## Java, JavaScript, Go, .NET, Swift and Rust Notes From 551fec9c260d015f57eb22363a9a41b0b67d156e Mon Sep 17 00:00:00 2001 From: Jonathan Keane Date: Sat, 18 Apr 2026 08:27:40 -0500 Subject: [PATCH 04/11] R news --- _posts/2026-04-20-24.0.0-release.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/_posts/2026-04-20-24.0.0-release.md b/_posts/2026-04-20-24.0.0-release.md index 219beaa7a3d..d2fd1063290 100644 --- a/_posts/2026-04-20-24.0.0-release.md +++ b/_posts/2026-04-20-24.0.0-release.md @@ -108,11 +108,20 @@ A bunch of previously deprecated APIs have been removed (GH-49356). ## R Notes +### New Features + +* A number of new `dplyr` bindings [GH-49533](https://github.com/apache/arrow/issues/49533) and [GH-49534](https://github.com/apache/arrow/issues/49534) +* Improvements in metadata handling [GH-33390](https://github.com/apache/arrow/issues/33390) and [GH-48712](https://github.com/apache/arrow/issues/48712) + + ### Compatibility notes +* Arrow no longer builds with GCS enabled on CRAN to avoid failures in their build systems. If you would like a full-featured build of Arrow, we recommend installing from R-universe; see [the Using cloud storage article in the docs](https://arrow.apache.org/docs/r/articles/fs.html) for more information.
[GH-49067](https://github.com/apache/arrow/issues/49067) + ### Relevant bug fixes +* `to_arrow()` now retains grouping [GH-40640](https://github.com/apache/arrow/issues/40640) ## Ruby and C GLib Notes From 4b7f2873f8ef5c7009822c05109ec38a0831e834 Mon Sep 17 00:00:00 2001 From: Jonathan Keane Date: Sun, 19 Apr 2026 08:17:24 -0500 Subject: [PATCH 05/11] Update _posts/2026-04-20-24.0.0-release.md Co-authored-by: Nic Crane --- _posts/2026-04-20-24.0.0-release.md | 1 - 1 file changed, 1 deletion(-) diff --git a/_posts/2026-04-20-24.0.0-release.md b/_posts/2026-04-20-24.0.0-release.md index d2fd1063290..9ec7bb7ceab 100644 --- a/_posts/2026-04-20-24.0.0-release.md +++ b/_posts/2026-04-20-24.0.0-release.md @@ -111,7 +111,6 @@ A bunch of previously deprecated APIs have been removed (GH-49356). ### New Features * A number of new `dplyr` bindings [GH-49533](https://github.com/apache/arrow/issues/49533) and [GH-49534](https://github.com/apache/arrow/issues/49534) -* Improvements in metadata handling [GH-33390](https://github.com/apache/arrow/issues/33390) and [GH-48712](https://github.com/apache/arrow/issues/48712) ### Compatibility notes From 7c0ece610c82510f45818ad8bbf8949859458871 Mon Sep 17 00:00:00 2001 From: Jonathan Keane Date: Sun, 19 Apr 2026 08:17:40 -0500 Subject: [PATCH 06/11] Update _posts/2026-04-20-24.0.0-release.md Co-authored-by: Nic Crane --- _posts/2026-04-20-24.0.0-release.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_posts/2026-04-20-24.0.0-release.md b/_posts/2026-04-20-24.0.0-release.md index 9ec7bb7ceab..cd791049fdc 100644 --- a/_posts/2026-04-20-24.0.0-release.md +++ b/_posts/2026-04-20-24.0.0-release.md @@ -110,7 +110,7 @@ A bunch of previously deprecated APIs have been removed (GH-49356). 
### New Features -* A number of new `dplyr` bindings [GH-49533](https://github.com/apache/arrow/issues/49533) and [GH-49534](https://github.com/apache/arrow/issues/49534) +* A number of new `dplyr` bindings [GH-49533](https://github.com/apache/arrow/issues/49533), [GH-49256](https://github.com/apache/arrow/issues/49256), [GH-49535](https://github.com/apache/arrow/issues/49535) and [GH-49534](https://github.com/apache/arrow/issues/49534) ### Compatibility notes From 35bfaf5e03f5762dba00f1c5608489aa589649d6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= Date: Mon, 20 Apr 2026 09:46:41 +0200 Subject: [PATCH 07/11] Update _posts/2026-04-20-24.0.0-release.md Co-authored-by: David Li --- _posts/2026-04-20-24.0.0-release.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/_posts/2026-04-20-24.0.0-release.md b/_posts/2026-04-20-24.0.0-release.md index cd791049fdc..2c353c52fe9 100644 --- a/_posts/2026-04-20-24.0.0-release.md +++ b/_posts/2026-04-20-24.0.0-release.md @@ -47,7 +47,9 @@ outlining what users should expect when dealing with Arrow data, especially comi from untrusted sources (GH-48868). ## Arrow Flight RPC Notes +The ODBC driver is still a work-in-progress. The driver now builds on Linux, but currently no builds are distributed (for any platform) ([GH-49463](https://github.com/apache/arrow/issues/49463)). +In C++, we have refactored serialization/deserialization to make low-level functionality accessible for advanced usage ([GH-49548](https://github.com/apache/arrow/issues/49548)). 
## C++ Notes From 1dc0873e8c1e083e24c3b74f56452aed484bc2b0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= Date: Mon, 20 Apr 2026 09:46:58 +0200 Subject: [PATCH 08/11] Update _posts/2026-04-20-24.0.0-release.md Co-authored-by: Alenka Frim --- _posts/2026-04-20-24.0.0-release.md | 32 +++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/_posts/2026-04-20-24.0.0-release.md b/_posts/2026-04-20-24.0.0-release.md index 2c353c52fe9..b1f56685fc9 100644 --- a/_posts/2026-04-20-24.0.0-release.md +++ b/_posts/2026-04-20-24.0.0-release.md @@ -95,6 +95,38 @@ A bunch of previously deprecated APIs have been removed (GH-49356). ## Python Notes +## Compatibility notes + +* `pyarrow.gandiva` is deprecated and will be removed in a future version [GH-49227](https://github.com/apache/arrow/issues/49227) + +## New features + +* Type annotations work is starting to be included ([GH-49102](https://github.com/apache/arrow/issues/49102) and + [GH-49452](https://github.com/apache/arrow/issues/49452)) +* Basic arithmetic on arrays and scalars is now supported [GH-32007](https://github.com/apache/arrow/issues/32007) +* Options to control writing of Parquet Bloom filters are added to `parquet.write_table` [GH-49376](https://github.com/apache/arrow/issues/49376) +* OpenTelemetry is enabled in PyArrow wheels [GH-49382](https://github.com/apache/arrow/issues/49382) +* AzureFileSystem is now included in the Windows wheels [GH-44655](https://github.com/apache/arrow/issues/44655) + +## Other improvements + +* Scikit-build-core is now used as a build system [GH-36411](https://github.com/apache/arrow/issues/36411) +* `UUID` objects are now inferred automatically in `pa.scalar()` and `pa.array()` without the need to + specify the type explicitly [GH-48241](https://github.com/apache/arrow/issues/48241) +* Constructing an extension array via `pa.array()` from a list of extension-type scalars is now supported + [GH-48470](https://github.com/apache/arrow/issues/48470) 
+* There have been some improvements in the documentation ([GH-49278](https://github.com/apache/arrow/issues/49278), + [GH-49269](https://github.com/apache/arrow/issues/49269) and [GH-28859](https://github.com/apache/arrow/issues/28859)) +* CSV and JSON options have improved repr/str methods [GH-47389](https://github.com/apache/arrow/issues/47389) + +## Relevant bug fixes + +* `SparseCOOTensor.__repr__` missing f-string prefix is now fixed [GH-49108](https://github.com/apache/arrow/issues/49108) +* Pickling `SubTreeFileSystem(base_path, AzureFileSystem(...))` is fixed [GH-49078](https://github.com/apache/arrow/issues/49078) +* Casting from `StringArray` to pandas 3.* when element is `None` is fixed [GH-49002](https://github.com/apache/arrow/issues/49002) +* Dictionary key order is now preserved when inferring struct type [GH-40053](https://github.com/apache/arrow/issues/40053) +* Duplicate csv header when table batches start with empty is now fixed [GH-36889](https://github.com/apache/arrow/issues/36889) + ### Compatibility notes From 43b5dad0a628d48af4d4e3c2f8b040869eacbce5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= Date: Mon, 20 Apr 2026 09:55:07 +0200 Subject: [PATCH 09/11] Some minor format fixes and add links for C++ GH issues --- _posts/2026-04-20-24.0.0-release.md | 40 ++++++++++++----------------- 1 file changed, 16 insertions(+), 24 deletions(-) diff --git a/_posts/2026-04-20-24.0.0-release.md b/_posts/2026-04-20-24.0.0-release.md index b1f56685fc9..e689e334043 100644 --- a/_posts/2026-04-20-24.0.0-release.md +++ b/_posts/2026-04-20-24.0.0-release.md @@ -44,9 +44,10 @@ Thanks everyone for your contributions and participation in the project! We have written a project-wide [Security Model](https://arrow.apache.org/docs/dev/format/Security.html) outlining what users should expect when dealing with Arrow data, especially coming -from untrusted sources (GH-48868). 
+from untrusted sources [GH-48868](https://github.com/apache/arrow/issues/48868). ## Arrow Flight RPC Notes + The ODBC driver is still a work-in-progress. The driver now builds on Linux, but currently no builds are distributed (for any platform) ([GH-49463](https://github.com/apache/arrow/issues/49463)). In C++, we have refactored serialization/deserialization to make low-level functionality accessible for advanced usage ([GH-49548](https://github.com/apache/arrow/issues/49548)). @@ -55,46 +56,50 @@ In C++, we have refactored serialization/deserialization to make low-level funct In addition to the aforementioned project-wide Security Model, we have written a specific [Security Model for Arrow C++](https://arrow.apache.org/docs/dev/cpp/security.html) -covering more concrete topics such as API usage and parameter validity (GH-49274). +covering more concrete topics such as API usage and parameter validity [GH-49274](https://github.com/apache/arrow/issues/49274). ### Compute ### Extension Types The canonical type [VariableShapeTensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#variable-shape-tensor) -was finally implemented (GH-38007). +was finally implemented [GH-38007](https://github.com/apache/arrow/issues/38007). ### Parquet **Breaking change:** The Arrow extension type name for Parquet Variant columns -used to be `parquet.variant` but has been changed to `arrow.parquet.variant` (GH-49081). +used to be `parquet.variant` but has been changed to `arrow.parquet.variant` [GH-49081](https://github.com/apache/arrow/issues/49081). While Parquet C++ could only read unencrypted bloom filters, it now supports -reading encrypted bloom filters as well (GH-48334). In addition, it can also -write bloom filters, though only unencrypted (GH-34785). +reading encrypted bloom filters as well [GH-48334](https://github.com/apache/arrow/issues/48334). 
In addition, it can also +write bloom filters, though only unencrypted [GH-34785](https://github.com/apache/arrow/issues/34785). An ambitious rewrite of the bit-unpacking utilities and optimizations has led to significant performance improvements on reading some Parquet columns, up to 50% -faster in some cases (GH-48277). This rewrite is described in more detail +faster in some cases [GH-48277](https://github.com/apache/arrow/issues/48277). This rewrite is described in more detail in an [accompanying blog post](https://medium.com/@AntoineProuvost/faster-reads-for-apache-parquet-improving-integer-unpacking-f6e21ce49a85). The performance of reading DELTA_BINARY_PACKED-encoded integers has been improved -in some favorable cases (GH-49266). +in some favorable cases [GH-49266](https://github.com/apache/arrow/issues/49266). ### Miscellaneous C++ changes We have migrated to C++20 `std::span`, removing our home-grown implementation -in `arrow::util::span` (GH-48588). +in `arrow::util::span` [GH-48588](https://github.com/apache/arrow/issues/48588). -A bunch of previously deprecated APIs have been removed (GH-49356). +A bunch of previously deprecated APIs have been removed [GH-49356](https://github.com/apache/arrow/issues/49356). ## Linux Packaging Notes +Added support for Ubuntu 26.04, the next LTS [GH-49341](https://github.com/apache/arrow/issues/49341) ## MATLAB Notes +No major notes for this release on MATLAB. + ## Python Notes + ## Compatibility notes * `pyarrow.gandiva` is deprecated and will be removed in a future version [GH-49227](https://github.com/apache/arrow/issues/49227) @@ -110,7 +115,7 @@ A bunch of previously deprecated APIs have been removed (GH-49356). 
## Other improvements -* Scikit-build-core is now used as a build system [GH-36411](https://github.com/apache/arrow/issues/36411) +* Scikit-build-core is now used as the PyArrow build system [GH-36411](https://github.com/apache/arrow/issues/36411) * `UUID` objects are now inferred automatically in `pa.scalar()` and `pa.array()` without the need to specify the type explicitly [GH-48241](https://github.com/apache/arrow/issues/48241) * Constructing an extension array via `pa.array()` from a list of extension-type scalars is now supported @@ -127,19 +132,6 @@ A bunch of previously deprecated APIs have been removed (GH-49356). * Dictionary key order is now preserved when inferring struct type [GH-40053](https://github.com/apache/arrow/issues/40053) * Duplicate csv header when table batches start with empty is now fixed [GH-36889](https://github.com/apache/arrow/issues/36889) - -### Compatibility notes - - -### New features - - -### Other improvements - - -### Relevant bug fixes - - ## R Notes ### New Features From b276dbb9851e703c4bb60fd987ea7d956ca6d2e4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= Date: Tue, 21 Apr 2026 13:08:49 +0200 Subject: [PATCH 10/11] Add number of commits, issues and contributors for the release --- _posts/2026-04-20-24.0.0-release.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_posts/2026-04-20-24.0.0-release.md b/_posts/2026-04-20-24.0.0-release.md index e689e334043..3cf59d37bf9 100644 --- a/_posts/2026-04-20-24.0.0-release.md +++ b/_posts/2026-04-20-24.0.0-release.md @@ -25,8 +25,8 @@ limitations under the License. --> The Apache Arrow team is pleased to announce the 24.0.0 release. This release -covers over 3 months of development work and includes [**XXX resolved -issues**][1] on [**YYY distinct commits**][2] from [**ZZZ distinct +covers over 3 months of development work and includes [**259 resolved +issues**][1] on [**325 distinct commits**][2] from [**57 distinct contributors**][2]. 
See the [Install Page](https://arrow.apache.org/install/) to learn how to get the libraries for your platform. From d885535c7263b7cb4b46441bada5269a71e79a86 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= Date: Tue, 21 Apr 2026 13:17:55 +0200 Subject: [PATCH 11/11] Update blog post date to today before publishing --- ...026-04-20-24.0.0-release.md => 2026-04-21-24.0.0-release.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename _posts/{2026-04-20-24.0.0-release.md => 2026-04-21-24.0.0-release.md} (99%) diff --git a/_posts/2026-04-20-24.0.0-release.md b/_posts/2026-04-21-24.0.0-release.md similarity index 99% rename from _posts/2026-04-20-24.0.0-release.md rename to _posts/2026-04-21-24.0.0-release.md index 3cf59d37bf9..cc9d989b1f6 100644 --- a/_posts/2026-04-20-24.0.0-release.md +++ b/_posts/2026-04-21-24.0.0-release.md @@ -1,7 +1,7 @@ --- layout: post title: "Apache Arrow 24.0.0 Release" -date: "2026-04-20 00:00:00" +date: "2026-04-21 00:00:00" author: pmc categories: [release] ---