
fix(parquet/pqarrow): fix propagation of FieldIds for nested fields#324

Merged
zeroshade merged 2 commits into apache:main from zeroshade:field-id-metadata
Mar 24, 2025

Conversation

@zeroshade
Member

Rationale for this change

While implementing things for iceberg-go, I found that when an Arrow schema with nested fields (struct/map/list) carries FieldID metadata values, those field IDs are not respected when writing a file using pqarrow.

What changes are included in this PR?

Fixes the propagation of field IDs when constructing a Parquet file from an Arrow schema that contains nested fields. Also adds a FileMetadata function to the FileWriter and pqarrow.FileWriter so that you can inspect the metadata of a written file without having to read it back into memory.
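To illustrate the kind of propagation described above, here is a minimal, stdlib-only sketch. It is not the pqarrow implementation: `Field`, `Node`, `fieldID`, and `ToNode` are hypothetical stand-ins for an Arrow field and a Parquet schema node, and the `PARQUET:field_id` metadata key is assumed to be the key that carries the ID. The point is only that the ID has to be read at every level of nesting, not just for top-level columns.

```go
package main

import (
	"fmt"
	"strconv"
)

// fieldIDKey is the metadata key assumed (for this sketch) to carry the field ID.
const fieldIDKey = "PARQUET:field_id"

// Field is a hypothetical stand-in for an Arrow field with key/value metadata.
type Field struct {
	Name     string
	Metadata map[string]string
	Children []Field // non-empty for nested types such as struct/map/list
}

// Node is a hypothetical stand-in for a Parquet schema node.
type Node struct {
	Name     string
	FieldID  int32 // -1 means "no field ID set"
	Children []Node
}

// fieldID parses the field ID from a field's metadata, or returns -1
// when the key is missing or the value is not a valid 32-bit integer.
func fieldID(f Field) int32 {
	v, ok := f.Metadata[fieldIDKey]
	if !ok {
		return -1
	}
	id, err := strconv.ParseInt(v, 10, 32)
	if err != nil {
		return -1
	}
	return int32(id)
}

// ToNode converts a field to a node, recursing into children so that
// nested fields keep their own IDs instead of silently losing them.
func ToNode(f Field) Node {
	n := Node{Name: f.Name, FieldID: fieldID(f)}
	for _, child := range f.Children {
		n.Children = append(n.Children, ToNode(child))
	}
	return n
}

func main() {
	point := Field{
		Name:     "point",
		Metadata: map[string]string{fieldIDKey: "1"},
		Children: []Field{
			{Name: "x", Metadata: map[string]string{fieldIDKey: "2"}},
			{Name: "y", Metadata: map[string]string{fieldIDKey: "3"}},
		},
	}
	n := ToNode(point)
	fmt.Println(n.FieldID, n.Children[0].FieldID, n.Children[1].FieldID) // prints "1 2 3"
}
```

A converter that only consulted metadata on the top-level field would report `-1` for `x` and `y` here, which is the class of problem this PR addresses for struct, map, and list fields.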

Are these changes tested?

Yes

Are there any user-facing changes?

No

@zeroshade zeroshade requested review from kou and lidavidm March 21, 2025 18:30
@zeroshade zeroshade marked this pull request as ready for review March 21, 2025 18:30
@zeroshade zeroshade merged commit 7aec6ac into apache:main Mar 24, 2025
22 of 23 checks passed
@zeroshade zeroshade deleted the field-id-metadata branch March 24, 2025 13:05
