Apache Iceberg version
Most recent PyIceberg
Please describe the bug 🐞
See here and the description below for a failing test.
table = catalog.load_table(f"default.{identifier}")
scan = table.scan()
# assert len(scan.to_arrow()) > 0
scan = scan.filter("ts >= '2023-03-05T00:00:00+00:00'")
assert len(scan.to_arrow()) > 0
This code works fine, but uncommenting the first assertion causes the filter call to throw. The stack trace is immediately helpful:
pyiceberg/table/__init__.py:1710: in filter
return self.update(row_filter=And(self.row_filter, _parse_row_filter(expr)))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <pyiceberg.table.DataScan object at 0x11c065cd0>
overrides = {'row_filter': GreaterThanOrEqual(term=Reference(name='ts'), literal=literal('2023-03-05T00:00:00+00:00'))}
    def update(self: S, **overrides: Any) -> S:
        """Create a copy of this table scan with updated fields."""
>       return type(self)(**{**self.__dict__, **overrides})
E       TypeError: TableScan.__init__() got an unexpected keyword argument 'partition_filters'
pyiceberg/table/__init__.py:1694: TypeError
DataScan has a cached_property, partition_filters (see here), whose cached value turns up in self.__dict__ and is then passed to the constructor by the update method below:
def update(self: S, **overrides: Any) -> S:
    """Create a copy of this table scan with updated fields."""
    return type(self)(**{**self.__dict__, **overrides})
This happens as soon as the cached property has been accessed once, i.e. if the scan has already had plan_files called on it (essentially, if it has been read).
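The mechanism can be reproduced without PyIceberg. The sketch below is a minimal stand-in, not the library's code: `Scan` and `FilteredScan` are hypothetical classes, and only the `update` body and the `partition_filters` name are taken from the report. It shows that `functools.cached_property` writes its result into the instance `__dict__`, which is what breaks the copy. The `FilteredScan.update` variant is one possible way to avoid the error, assuming it is acceptable to keep only the keys that `__init__` accepts; it is not a fix confirmed by the maintainers.

```python
import inspect
from functools import cached_property
from typing import Any


class Scan:
    """Hypothetical stand-in for a table scan; not PyIceberg's actual class."""

    def __init__(self, row_filter: str = "true") -> None:
        self.row_filter = row_filter

    @cached_property
    def partition_filters(self) -> dict:
        # The first access stores the result in self.__dict__["partition_filters"].
        return {"derived_from": self.row_filter}

    def update(self, **overrides: Any) -> "Scan":
        # Same pattern as in the report: every __dict__ entry becomes a keyword argument.
        return type(self)(**{**self.__dict__, **overrides})


class FilteredScan(Scan):
    """Assumed workaround: only forward attributes that __init__ accepts."""

    def update(self, **overrides: Any) -> "FilteredScan":
        accepted = inspect.signature(type(self).__init__).parameters
        current = {k: v for k, v in self.__dict__.items() if k in accepted}
        return type(self)(**{**current, **overrides})


scan = Scan()
fresh_copy = scan.update(row_filter="ts >= 1")  # works: __dict__ only holds row_filter
keys_before = sorted(scan.__dict__)

scan.partition_filters  # populate the cache, as reading the scan would
keys_after = sorted(scan.__dict__)

try:
    scan.update(row_filter="ts >= 1")
    error = None
except TypeError as exc:
    error = str(exc)

filtered = FilteredScan()
filtered.partition_filters  # populate the cache here as well
filtered_copy = filtered.update(row_filter="ts >= 1")  # cached entry is dropped, no error
```

Dropping the cached entry means the copy recomputes `partition_filters` on first access, which seems the right behaviour here since the value depends on the new `row_filter`.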
Willingness to contribute
Snippet referenced above: iceberg-python/pyiceberg/table/__init__.py, lines 1692 to 1694 in 045dd10