Commit 60cbea6

Upgrade to DataFusion 53
1 parent e42775c commit 60cbea6

103 files changed, 2352 additions and 1356 deletions

Cargo.lock

Lines changed: 153 additions & 202 deletions

Cargo.toml

Lines changed: 13 additions & 6 deletions
@@ -48,15 +48,15 @@ tokio = { version = "1.49", features = [
     "rt-multi-thread",
     "sync",
 ] }
-pyo3 = { version = "0.26", features = [
+pyo3 = { version = "0.28", features = [
     "extension-module",
     "abi3",
     "abi3-py310",
 ] }
-pyo3-async-runtimes = { version = "0.26", features = ["tokio-runtime"] }
+pyo3-async-runtimes = { version = "0.28", features = ["tokio-runtime"] }
 pyo3-log = "0.13.3"
-arrow = { version = "57", features = ["pyarrow"] }
-arrow-select = { version = "57" }
+arrow = { version = "58", features = ["pyarrow"] }
+arrow-select = { version = "58" }
 datafusion = { version = "52", features = ["avro", "unicode_expressions"] }
 datafusion-substrait = { version = "52", optional = true }
 datafusion-proto = { version = "52" }
@@ -70,7 +70,7 @@ mimalloc = { version = "0.1", optional = true, default-features = false, feature
 async-trait = "0.1.89"
 futures = "0.3"
 cstr = "0.2"
-object_store = { version = "0.12.4", features = [
+object_store = { version = "0.13.1", features = [
     "aws",
     "gcp",
     "azure",
@@ -82,7 +82,7 @@ parking_lot = "0.12"

 [build-dependencies]
 prost-types = "0.14.3" # keep in line with `datafusion-substrait`
-pyo3-build-config = "0.26"
+pyo3-build-config = "0.28"

 [lib]
 name = "datafusion_python"
@@ -91,3 +91,10 @@ crate-type = ["cdylib", "rlib"]
 [profile.release]
 lto = true
 codegen-units = 1
+
+# TODO: remove when datafusion-53 is released
+[patch.crates-io]
+datafusion = { git = "https://github.com/apache/datafusion.git", rev = "6713439497561fa74a94177e5b8632322fb7cea5" }
+datafusion-substrait = { git = "https://github.com/apache/datafusion.git", rev = "6713439497561fa74a94177e5b8632322fb7cea5" }
+datafusion-proto = { git = "https://github.com/apache/datafusion.git", rev = "6713439497561fa74a94177e5b8632322fb7cea5" }
+datafusion-ffi = { git = "https://github.com/apache/datafusion.git", rev = "6713439497561fa74a94177e5b8632322fb7cea5" }

docs/source/contributor-guide/ffi.rst

Lines changed: 9 additions & 6 deletions
@@ -146,7 +146,7 @@ PyO3 class mutability guidelines

 PyO3 bindings should present immutable wrappers whenever a struct stores shared or
 interior-mutable state. In practice this means that any ``#[pyclass]`` containing an
-``Arc<RwLock<_>>`` or similar synchronized primitive must opt into ``#[pyclass(frozen)]``
+``Arc<RwLock<_>>`` or similar synchronized primitive must opt into ``#[pyclass(from_py_object, frozen)]``
 unless there is a compelling reason not to.

 The :mod:`datafusion` configuration helpers illustrate the preferred pattern. The
@@ -156,7 +156,7 @@ instead of mutating the container directly:

 .. code-block:: rust

-    #[pyclass(name = "Config", module = "datafusion", subclass, frozen)]
+    #[pyclass(from_py_object, name = "Config", module = "datafusion", subclass, frozen)]
     #[derive(Clone)]
     pub(crate) struct PyConfig {
         config: Arc<RwLock<ConfigOptions>>,
@@ -170,7 +170,7 @@ existing instance in place:

 .. code-block:: rust

-    #[pyclass(frozen, name = "SessionContext", module = "datafusion", subclass)]
+    #[pyclass(from_py_object, frozen, name = "SessionContext", module = "datafusion", subclass)]
     #[derive(Clone)]
     pub struct PySessionContext {
         pub ctx: SessionContext,
@@ -186,7 +186,7 @@ field updates:

     // TODO: This looks like this needs pyo3 tracking so leaving unfrozen for now
     #[derive(Debug, Clone)]
-    #[pyclass(name = "DataTypeMap", module = "datafusion.common", subclass)]
+    #[pyclass(from_py_object, name = "DataTypeMap", module = "datafusion.common", subclass)]
     pub struct DataTypeMap {
         #[pyo3(get, set)]
         pub arrow_type: PyDataType,
@@ -232,8 +232,11 @@ can then be turned into a ``ForeignTableProvider`` the associated code is:

 .. code-block:: rust

-    let capsule = capsule.downcast::<PyCapsule>()?;
-    let provider = unsafe { capsule.reference::<FFI_TableProvider>() };
+    let capsule = capsule.cast::<PyCapsule>()?;
+    let data: NonNull<FFI_TableProvider> = capsule
+        .pointer_checked(Some(name))?
+        .cast();
+    let codec = unsafe { data.as_ref() };

 By convention the ``datafusion-python`` library expects a Python object that has a
 ``TableProvider`` PyCapsule to have this capsule accessible by calling a function named
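
Note on the mutability guideline in this file: the "frozen wrapper around an ``Arc<RwLock<_>>``" pattern can be illustrated without PyO3 at all. The sketch below is a minimal, standard-library-only illustration. ``SharedConfig`` and its ``set``/``get`` methods are hypothetical names chosen for this example, not the actual ``PyConfig`` implementation. The point it tries to show is that every method takes ``&self`` and mutation goes through the lock, which is the property a frozen class relies on.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a config wrapper (NOT the real `PyConfig`).
/// The wrapper itself is never mutated; only the state behind the lock is.
#[derive(Clone, Default)]
pub struct SharedConfig {
    options: Arc<RwLock<HashMap<String, String>>>,
}

impl SharedConfig {
    /// Takes `&self`, not `&mut self`: the write goes through the `RwLock`.
    pub fn set(&self, key: &str, value: &str) {
        let mut guard = self.options.write().expect("lock poisoned");
        guard.insert(key.to_string(), value.to_string());
    }

    /// Reads a value under a shared read lock.
    pub fn get(&self, key: &str) -> Option<String> {
        let guard = self.options.read().expect("lock poisoned");
        guard.get(key).cloned()
    }
}

fn main() {
    let config = SharedConfig::default();
    // Cloning copies the `Arc`, so both handles see the same map.
    let alias = config.clone();
    config.set("datafusion.execution.batch_size", "8192");
    assert_eq!(
        alias.get("datafusion.execution.batch_size").as_deref(),
        Some("8192")
    );
}
```

Because no method needs ``&mut self``, a wrapper shaped like this has no reason to ask for exclusive access to the Python object, which is presumably why the guideline treats ``frozen`` as the default for such types.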

docs/source/user-guide/upgrade-guides.rst

Lines changed: 18 additions & 0 deletions
@@ -18,6 +18,24 @@
 Upgrade Guides
 ==============

+DataFusion 53.0.0
+-----------------
+
+This version includes an upgraded version of `pyo3`, which changed the way to extract an FFI object.
+Example:
+
+Before:
+.. code-block:: rust
+    let codec = unsafe { capsule.reference::<FFI_LogicalExtensionCodec>() };
+
+Now:
+.. code-block:: rust
+    let data: NonNull<FFI_LogicalExtensionCodec> = capsule
+        .pointer_checked(Some(c_str!("datafusion_logical_extension_codec")))?
+        .cast();
+    let codec = unsafe { data.as_ref() };
+
+
 DataFusion 52.0.0
 -----------------
