Support Appends with TimeTransform Partitions #703
Closed
Commits
81 commits
0cad231
checkpoint
sungwy 96e5533
checkpoint2
sungwy ddfa9ac
todo: sort with pyarrow_transform vals
sungwy 1a5327a
checkpoint
sungwy e067a28
checkpoint
sungwy 069f3bd
fix
sungwy 615d5e3
tests
sungwy c0a0f32
more tests
sungwy d872245
Remove trailing slash from table location when creating a table (#702)
felixscherz a1f4ba8
Build: Bump mkdocs-section-index from 0.3.8 to 0.3.9 (#696)
dependabot[bot] e2f547d
Build: Bump cython from 3.0.8 to 3.0.10 (#697)
dependabot[bot] 29beaf8
Build: Bump tqdm from 4.66.2 to 4.66.3 (#699)
dependabot[bot] 70a45f6
Build: Bump werkzeug from 3.0.1 to 3.0.3 (#706)
dependabot[bot] 0eb0c1c
Build: Bump jinja2 from 3.1.3 to 3.1.4 in /mkdocs (#707)
dependabot[bot] 6a39eda
adopt review feedback
sungwy 990ce80
Make `add_files` to support `snapshot_properties` argument (#695)
enkidulan 0508667
Add support for categorical type (#693)
sungwy 1f39b59
Build: Bump tenacity from 8.2.3 to 8.3.0 (#714)
dependabot[bot] 50a65e5
Build: Bump mkdocstrings from 0.25.0 to 0.25.1 (#715)
dependabot[bot] 3461305
Build: Bump coverage from 7.5.0 to 7.5.1 (#713)
dependabot[bot] 399a9be
Build: Bump sqlalchemy from 2.0.29 to 2.0.30 (#712)
dependabot[bot] 6f72e30
Build: Bump flask-cors from 4.0.0 to 4.0.1 (#718)
dependabot[bot] d14e137
comment
sungwy 4de207d
Build: Bump mkdocs-material from 9.5.20 to 9.5.21 (#719)
dependabot[bot] d02d7a1
Build: Bump getdaft from 0.2.23 to 0.2.24 (#721)
dependabot[bot] aa361d1
Test, write subset of schema (#704)
kevinjqliu b41c98c
Remove pylintrc file (#724)
ndrluis 444dca7
Add kevinjqliu to collaborators (#729)
Fokko 7904fe5
Build: Bump moto from 5.0.6 to 5.0.7 (#733)
dependabot[bot] 0d98ec8
Build: Bump mkdocs-material from 9.5.21 to 9.5.22 (#732)
dependabot[bot] 6c2ba34
Build: Bump griffe from 0.44.0 to 0.45.0 (#731)
dependabot[bot] 20b7b53
Build: Bump pypa/cibuildwheel from 2.17.0 to 2.18.0 (#730)
dependabot[bot] 6d52325
Hive catalog: Add retry logic for hive locking (#701)
frankliee a268e5b
Add create_namespace_if_not_exists method (#725)
ndrluis b40378b
Remove NoSuchNamespaceError on namespace creation (#726)
ndrluis ac84bd5
Build: Bump pyarrow from 16.0.0 to 16.1.0 (#743)
dependabot[bot] 20c2731
Build: Bump mkdocstrings-python from 1.10.0 to 1.10.1 (#744)
dependabot[bot] 4fddcbe
Build: Bump mkdocstrings-python from 1.10.1 to 1.10.2 (#746)
dependabot[bot] 0a58636
Build: Bump boto3 from 1.34.69 to 1.34.106 (#749)
dependabot[bot] c764d6a
--- (#754)
dependabot[bot] 245ab87
--- (#755)
dependabot[bot] 82df57e
--- (#756)
dependabot[bot] aa5a136
[FEAT]register table using iceberg metadata file via pyiceberg (#711)
MehulBatra 5537cb4
modify doc(backward compatibility) typo (#757)
SeungyeopShin e917660
Bump requests from 2.32.1 to 2.32.2 (#759)
dependabot[bot] 7083b2e
Bump griffe from 0.45.0 to 0.45.1 (#760)
dependabot[bot] 03a0d65
Bump mypy-boto3-glue from 1.34.88 to 1.34.110 (#761)
dependabot[bot] 996afd0
Bump mkdocstrings-python from 1.10.2 to 1.10.3 (#762)
dependabot[bot] eba4bee
Initial implementation of the manifest table (#717)
geruh 42afc43
Fix: Table-Exists if Server returns 204 (#739)
c-thiel 959718a
Bump duckdb from 0.10.2 to 0.10.3 (#764)
dependabot[bot] ed83e84
Bump griffe from 0.45.1 to 0.45.2 (#765)
dependabot[bot] b8023d2
Bump typing-extensions from 4.11.0 to 4.12.0 (#767)
dependabot[bot] a132be1
Bump mkdocs-material from 9.5.24 to 9.5.25 (#770)
dependabot[bot] 8968996
Add azure configuration variables (#745)
kevinzwang ee2a7c5
Bump moto from 5.0.7 to 5.0.8 (#771)
dependabot[bot] 54aacb4
Bump coverage from 7.5.1 to 7.5.2 (#772)
dependabot[bot] 756ae62
Introduce hierarchical namespaces into SqlCatalog (#591)
cccs-eric 4fb8ba2
Bump coverage from 7.5.2 to 7.5.3 (#776)
dependabot[bot] ec8d7dc
Bump pydantic from 2.7.1 to 2.7.2 (#775)
dependabot[bot] 7552e03
Bump requests from 2.32.2 to 2.32.3 (#778)
dependabot[bot] e08cc9d
Bump getdaft from 0.2.24 to 0.2.25 (#779)
dependabot[bot] d3ad61c
Remove `record_fields` from the `Record` class (#580)
Fokko cf3bf8a
Unify to double quotes using Ruff (#781)
HonahX 91973f2
Bump moto from 5.0.8 to 5.0.9 (#783)
dependabot[bot] 0339e7f
Support CreateTableTransaction for SqlCatalog (#684)
HonahX 84a2c04
Support CreateTableTransaction for HiveCatalog (#683)
HonahX 8d79664
Support viewfs scheme along side with hdfs (#777)
yothinix 20f6afd
Update `fsspec.py`to respect `s3.signer.uri property` (#741)
c-thiel 5dd846d
checkpoint
sungwy 6357193
checkpoint2
sungwy c30a57c
todo: sort with pyarrow_transform vals
sungwy 541655f
checkpoint
sungwy afe83b1
checkpoint
sungwy 00ca5f0
fix
sungwy 511e988
tests
sungwy 3b784ab
more tests
sungwy 3711b1b
adopt review feedback
sungwy f16d778
comment
sungwy 80d5064
Merge branch 'transform-partition-writes' of https://github.com/syun6…
sungwy 9f0a92b
rebase
sungwy
Instead of adding a public method, how about maintaining a private list of transforms that support the pyarrow transform, and checking against it at the beginning of append? For transforms that do not yet support it, we could throw `NotImplementedError` in `pyarrow_transform`. WDYT?
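For illustration only, the suggestion above could look roughly like the sketch below. The classes are hypothetical stand-ins rather than the project's actual `Transform` hierarchy, and `_PYARROW_SUPPORTED` and `check_append_supported` are made-up names.

```python
import datetime


class Transform:
    """Hypothetical stand-in for a partition transform base class."""

    def pyarrow_transform(self, values):
        # Default: transforms without a pyarrow implementation fail loudly.
        raise NotImplementedError(
            f"{type(self).__name__} does not support pyarrow transform yet"
        )


class YearTransform(Transform):
    def pyarrow_transform(self, values):
        # Placeholder logic (plain Python) so the sketch stays dependency-free.
        return [v.year - 1970 for v in values]


class BucketTransform(Transform):
    pass  # inherits the NotImplementedError default


# Private allow-list (assumed name), consulted once at the start of append.
_PYARROW_SUPPORTED = (YearTransform,)


def check_append_supported(transforms):
    """Raise early if any partition transform is not on the allow-list."""
    unsupported = [
        type(t).__name__ for t in transforms if not isinstance(t, _PYARROW_SUPPORTED)
    ]
    if unsupported:
        raise ValueError(f"Append not supported for transforms: {unsupported}")
```

With this shape, `check_append_supported([YearTransform()])` passes silently, while including a `BucketTransform()` fails before any data is written.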
I was following the existing convention for `can_transform` and `preserves_order`, which are specified as class properties. I'm in favor of keeping it consistent with these.
Thanks for sharing the context. My initial concern was that, unlike `can_transform` and `preserves_order`, which behave differently across transforms, `support_pyarrow_transform` would in the end return `True` for all the transforms, making it a little redundant. But on second thought, since we will likely not support pyarrow transform for other transforms in the 0.7.0 release, this `support_pyarrow_transform` can be a useful property in the upcoming release. We could deprecate this property later, after we support pyarrow for all transforms. WDYT?
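A rough sketch of the property-based approach and the later deprecation discussed above, under stated assumptions: the classes are hypothetical stand-ins, `AllSupportedTransform` represents an imagined future stage, and the standard-library `warnings` module is used here in place of whatever deprecation helper the project prefers.

```python
import warnings


class Transform:
    """Hypothetical base class; unsupported by default."""

    @property
    def support_pyarrow_transform(self) -> bool:
        return False


class DayTransform(Transform):
    @property
    def support_pyarrow_transform(self) -> bool:
        # Time transforms opt in explicitly.
        return True


class BucketTransform(Transform):
    pass  # inherits False until a pyarrow implementation exists


class AllSupportedTransform(Transform):
    """Imagined later stage: every transform is supported, property is deprecated."""

    @property
    def support_pyarrow_transform(self) -> bool:
        warnings.warn(
            "support_pyarrow_transform is deprecated; all transforms support pyarrow",
            DeprecationWarning,
            stacklevel=2,
        )
        return True
```

The property only carries information while transforms differ in their answer; once they all return `True`, it can be retired behind a warning without breaking callers.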
Yeah I'm on the same page! To add to that - I'm not yet sure if we will be able to support the other Transform classes in `support_pyarrow_transform`. The key idea here is that we want to use native pyarrow compute functions to apply the equivalent transform without converting the values back and forth between Arrow and Python data types. I'm not yet sure if we'll be able to do the same for `BucketTransform`, for instance, with the existing range of pyarrow compute functions.