diff --git a/_posts/2026-04-21-24.0.0-release.md b/_posts/2026-04-21-24.0.0-release.md
new file mode 100644
index 00000000000..cc9d989b1f6
--- /dev/null
+++ b/_posts/2026-04-21-24.0.0-release.md
@@ -0,0 +1,213 @@
+---
+layout: post
+title: "Apache Arrow 24.0.0 Release"
+date: "2026-04-21 00:00:00"
+author: pmc
+categories: [release]
+---
+
+
+The Apache Arrow team is pleased to announce the 24.0.0 release. This release
+covers over 3 months of development work and includes [**259 resolved
+issues**][1] on [**325 distinct commits**][2] from [**57 distinct
+contributors**][2]. See the [Install Page](https://arrow.apache.org/install/) to
+learn how to get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].
+
+## Community
+
+We recently published our [Community Highlights for 2025](https://arrow.apache.org/blog/2026/03/19/arrow-2025-highlights/); check them out.
+
+Thanks everyone for your contributions and participation in the project!
+
+## Format Notes
+
+We have written a project-wide [Security Model](https://arrow.apache.org/docs/dev/format/Security.html)
+outlining what users should expect when dealing with Arrow data, especially data coming
+from untrusted sources [GH-48868](https://github.com/apache/arrow/issues/48868).
+
+## Arrow Flight RPC Notes
+
+The ODBC driver is still a work in progress. The driver now builds on Linux, but currently no builds are distributed (for any platform) ([GH-49463](https://github.com/apache/arrow/issues/49463)).
+
+In C++, we have refactored serialization/deserialization to make low-level functionality accessible for advanced usage ([GH-49548](https://github.com/apache/arrow/issues/49548)).
+
+## C++ Notes
+
+In addition to the aforementioned project-wide Security Model, we have written
+a specific [Security Model for Arrow C++](https://arrow.apache.org/docs/dev/cpp/security.html)
+covering more concrete topics such as API usage and parameter validity [GH-49274](https://github.com/apache/arrow/issues/49274).
+
+### Compute
+
+### Extension Types
+
+The canonical type [VariableShapeTensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#variable-shape-tensor)
+was finally implemented [GH-38007](https://github.com/apache/arrow/issues/38007).
+
+### Parquet
+
+**Breaking change:** The Arrow extension type name for Parquet Variant columns
+used to be `parquet.variant` but has been changed to `arrow.parquet.variant` [GH-49081](https://github.com/apache/arrow/issues/49081).
+
+Parquet C++ could previously read only unencrypted bloom filters; it now supports
+reading encrypted bloom filters as well [GH-48334](https://github.com/apache/arrow/issues/48334). It can also
+write bloom filters, though only unencrypted ones [GH-34785](https://github.com/apache/arrow/issues/34785).
+
+An ambitious rewrite and optimization of the bit-unpacking utilities has led to
+significant performance improvements when reading some Parquet columns, up to 50%
+faster in some cases [GH-48277](https://github.com/apache/arrow/issues/48277). This rewrite is described in more detail
+in an [accompanying blog post](https://medium.com/@AntoineProuvost/faster-reads-for-apache-parquet-improving-integer-unpacking-f6e21ce49a85).
+
+The performance of reading DELTA_BINARY_PACKED-encoded integers has been improved
+in some favorable cases [GH-49266](https://github.com/apache/arrow/issues/49266).
+
+### Miscellaneous C++ changes
+
+We have migrated to C++20 `std::span`, removing our home-grown implementation
+in `arrow::util::span` [GH-48588](https://github.com/apache/arrow/issues/48588).
+
+A number of previously deprecated APIs have been removed [GH-49356](https://github.com/apache/arrow/issues/49356).
+
+## Linux Packaging Notes
+
+Added support for Ubuntu 26.04, the next LTS [GH-49341](https://github.com/apache/arrow/issues/49341).
+
+## MATLAB Notes
+
+No major notes for this release on MATLAB.
+
+
+## Python Notes
+
+### Compatibility notes
+
+* `pyarrow.gandiva` is deprecated and will be removed in a future version [GH-49227](https://github.com/apache/arrow/issues/49227)
+
+### New features
+
+* Work on type annotations is starting to be included ([GH-49102](https://github.com/apache/arrow/issues/49102) and
+  [GH-49452](https://github.com/apache/arrow/issues/49452))
+* Basic arithmetic on arrays and scalars is now supported [GH-32007](https://github.com/apache/arrow/issues/32007)
+* Options to control writing of Parquet Bloom filters have been added to `parquet.write_table` [GH-49376](https://github.com/apache/arrow/issues/49376)
+* OpenTelemetry is enabled in PyArrow wheels [GH-49382](https://github.com/apache/arrow/issues/49382)
+* AzureFileSystem is now included in the Windows wheels [GH-44655](https://github.com/apache/arrow/issues/44655)
+
+### Other improvements
+
+* Scikit-build-core is now used as the PyArrow build system [GH-36411](https://github.com/apache/arrow/issues/36411)
+* `UUID` objects are now inferred automatically in `pa.scalar()` and `pa.array()` without the need to
+  specify the type explicitly [GH-48241](https://github.com/apache/arrow/issues/48241)
+* Constructing an extension array via `pa.array()` from a list of extension-type scalars is now supported
+  [GH-48470](https://github.com/apache/arrow/issues/48470)
+* There have been some improvements in the documentation ([GH-49278](https://github.com/apache/arrow/issues/49278),
+  [GH-49269](https://github.com/apache/arrow/issues/49269) and [GH-28859](https://github.com/apache/arrow/issues/28859))
+* CSV and JSON options have improved repr/str methods
+  [GH-47389](https://github.com/apache/arrow/issues/47389)
+
+### Relevant bug fixes
+
+* A missing f-string prefix in `SparseCOOTensor.__repr__` is now fixed [GH-49108](https://github.com/apache/arrow/issues/49108)
+* Pickling `SubTreeFileSystem(base_path, AzureFileSystem(...))` is fixed [GH-49078](https://github.com/apache/arrow/issues/49078)
+* Casting from `StringArray` to pandas 3.* when an element is `None` is fixed [GH-49002](https://github.com/apache/arrow/issues/49002)
+* Dictionary key order is now preserved when inferring struct type [GH-40053](https://github.com/apache/arrow/issues/40053)
+* A duplicated CSV header when table batches start with an empty batch is now fixed [GH-36889](https://github.com/apache/arrow/issues/36889)
+
+## R Notes
+
+### New Features
+
+* A number of new `dplyr` bindings [GH-49533](https://github.com/apache/arrow/issues/49533), [GH-49256](https://github.com/apache/arrow/issues/49256), [GH-49535](https://github.com/apache/arrow/issues/49535) and [GH-49534](https://github.com/apache/arrow/issues/49534)
+
+
+### Compatibility notes
+
+* Arrow no longer builds with GCS enabled on CRAN to avoid failures in their build systems. If you would like a full-featured build of Arrow, we recommend installing from R-universe; see [the Using cloud storage article in the docs](https://arrow.apache.org/docs/r/articles/fs.html) for more information. [GH-49067](https://github.com/apache/arrow/issues/49067)
+
+
+### Relevant bug fixes
+
+* `to_arrow()` now retains grouping [GH-40640](https://github.com/apache/arrow/issues/40640)
+
+## Ruby and C GLib Notes
+
+* Fixed GC-related problems.
+* `GArrowListArray`: Added support for returning the offset buffer.
+* `GArrowLargeListArray`: Added support for returning the offset buffer.
+* `GArrowUnionArray`: Added support for returning fields.
+* Deprecated Feather features.
+
+### Ruby
+
+We've added a pure Ruby Apache Arrow writer implementation to the
+`red-arrow-format` gem.
+
+We've marked the pure Ruby Apache Arrow reader implementation in the
+`red-arrow-format` gem as stable because it passes integration tests
+with other implementations, although it still lacks some features.
+
+The `red-arrow` gem:
+* Added support for converting the following arrays to raw Ruby objects:
+  * `Arrow::LargeBinaryArray`
+  * `Arrow::LargeUTF8Array`
+  * `Arrow::LargeListArray`
+  * `Arrow::FixedSizeListArray`
+  * `Arrow::DurationArray`
+  * `Arrow::DictionaryArray` with `Arrow::LargeBinaryArray` or
+    `Arrow::LargeUTF8Array`
+
+### C GLib
+
+No C GLib-only notes.
+
+
+## Java, JavaScript, Go, .NET, Swift and Rust Notes
+
+The Java, JavaScript, Go, .NET, Swift and Rust projects have moved to separate
+repositories outside the main Arrow [monorepo](https://github.com/apache/arrow).
+
+- For notes on the latest release of the [Java
+implementation](https://github.com/apache/arrow-java), see the latest [Arrow
+Java changelog][7].
+- For notes on the latest release of the [JavaScript
+implementation](https://github.com/apache/arrow-js), see the latest [Arrow
+JavaScript changelog][8].
+- For notes on the latest release of the [Rust
+  implementation](https://github.com/apache/arrow-rs), see the latest [Arrow Rust
+  changelog][5].
+- For notes on the latest release of the [Go
+implementation](https://github.com/apache/arrow-go), see the latest [Arrow Go
+changelog][6].
+- For notes on the latest release of the [.NET
+implementation](https://github.com/apache/arrow-dotnet), see the latest [Arrow .NET changelog][9].
+- For notes on the latest release of the [Swift implementation](https://github.com/apache/arrow-swift), see the latest [Arrow Swift changelog][10].
+
+[1]: https://github.com/apache/arrow/milestone/72?closed=1
+[2]: {{ site.baseurl }}/release/24.0.0.html#contributors
+[3]: {{ site.baseurl }}/release/24.0.0.html#changelog
+[4]: {{ site.baseurl }}/docs/r/news/
+[5]:
+[6]:
+[7]:
+[8]:
+[9]:
+[10]: