Description
Search before asking
- I had searched in the issues and found no similar issues.
Please describe the bug 🐞
I’m trying to use XTable to convert a Hudi source to a Delta target and am receiving the exception below. The table is active and frequently updated, and it is being actively queried as a Hudi table.
Is there any other debug information I can provide to make this more useful?
My git head is 4a96627
OS is Linux/Ubuntu
Java 11
Modified log4j2.xml to set level=trace for org.apache.hudi and org.apache.xtable (roughly as shown below)
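For reference, the logger overrides in my log4j2.xml look roughly like this (the rest of the file is the stock config shipped with the bundle; the Console appender name is an assumption):

<Loggers>
  <!-- raised these two packages to trace for this report -->
  <Logger name="org.apache.hudi" level="trace"/>
  <Logger name="org.apache.xtable" level="trace"/>
  <Root level="info">
    <AppenderRef ref="Console"/>
  </Root>
</Loggers>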
Run, with stack trace:
$ java -jar ./xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig config.yaml
WARNING: Runtime environment or build system does not support multi-release JARs. This will impact location-based features.
2024-06-05 23:22:05 INFO org.apache.xtable.utilities.RunSync:148 - Running sync for basePath s3://hidden-s3-bucket/hidden-prefix/ for following table formats [DELTA]
2024-06-05 23:22:05 INFO org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:05 WARN org.apache.hadoop.util.NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2024-06-05 23:22:05 WARN org.apache.hadoop.metrics2.impl.MetricsConfig:136 - Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
2024-06-05 23:22:06 WARN org.apache.hadoop.fs.s3a.SDKV2Upgrade:39 - Directly referencing AWS SDK V1 credential provider com.amazonaws.auth.DefaultAWSCredentialsProviderChain. AWS SDK V1 credential providers will be removed once S3A is upgraded to SDK V2
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableConfig:276 - Loading table properties from s3://hidden-s3-bucket/hidden-prefix/.hoodie/hoodie.properties
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:155 - Loading Active commit timeline for s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded instants upto : Option{val=[20240605231910580__clean__COMPLETED__20240605231918000]}
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableConfig:276 - Loading table properties from s3://hidden-s3-bucket/hidden-prefix/.hoodie/hoodie.properties
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableConfig:276 - Loading table properties from s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata/.hoodie/hoodie.properties
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata
2024-06-05 23:22:08 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded instants upto : Option{val=[20240605231910580__deltacommit__COMPLETED__20240605231917000]}
2024-06-05 23:22:08 INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView:259 - Took 7 ms to read 0 instants, 0 replaced file groups
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.hbase.util.UnsafeAvailChecker (file:/incubator-xtable/xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar) to method java.nio.Bits.unaligned()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.hbase.util.UnsafeAvailChecker
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2024-06-05 23:22:08 INFO org.apache.hudi.common.util.ClusteringUtils:147 - Found 0 files in pending clustering operations
2024-06-05 23:22:08 INFO org.apache.hudi.common.table.view.FileSystemViewManager:243 - Creating View Manager with storage type :MEMORY
2024-06-05 23:22:08 INFO org.apache.hudi.common.table.view.FileSystemViewManager:255 - Creating in-memory based Table View
2024-06-05 23:22:11 INFO org.apache.spark.sql.delta.storage.DelegatingLogStore:60 - LogStore `LogStoreAdapter(io.delta.storage.S3SingleDriverLogStore)` is used for scheme `s3`
2024-06-05 23:22:11 INFO org.apache.spark.sql.delta.DeltaLog:60 - Creating initial snapshot without metadata, because the directory is empty
2024-06-05 23:22:13 INFO org.apache.spark.sql.delta.InitialSnapshot:60 - [tableId=8eda3e8f-9dae-4d19-ac72-f625b8ccb0c5] Created snapshot InitialSnapshot(path=s3://hidden-s3-bucket/hidden-prefix/_delta_log, version=-1, metadata=Metadata(167f7b26-f82d-4765-97b9-b6e47d9147ec,null,null,Format(parquet,Map()),null,List(),Map(),Some(1717629733296)), logSegment=LogSegment(s3://hidden-s3-bucket/hidden-prefix/_delta_log,-1,List(),None,-1), checksumOpt=None)
2024-06-05 23:22:13 INFO org.apache.xtable.conversion.ConversionController:240 - No previous InternalTable sync for target. Falling back to snapshot sync.
2024-06-05 23:22:13 INFO org.apache.hudi.common.table.TableSchemaResolver:317 - Reading schema from s3://hidden-s3-bucket/hidden-prefix/op_date=2024-06-05/3b5d27af-ef39-4862-bbd9-d4a010f6056e-0_0-71-375_20240605231837826.parquet
2024-06-05 23:22:14 INFO org.apache.hudi.metadata.HoodieTableMetadataUtil:927 - Loading latest merged file slices for metadata table partition files
2024-06-05 23:22:14 INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView:259 - Took 1 ms to read 0 instants, 0 replaced file groups
2024-06-05 23:22:14 INFO org.apache.hudi.common.util.ClusteringUtils:147 - Found 0 files in pending clustering operations
2024-06-05 23:22:14 INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView:429 - Building file system view for partition (files)
2024-06-05 23:22:14 DEBUG org.apache.hudi.common.table.view.AbstractTableFileSystemView:435 - #files found in partition (files) =30, Time taken =40
2024-06-05 23:22:14 DEBUG org.apache.hudi.common.table.view.HoodieTableFileSystemView:386 - Adding file-groups for partition :files, #FileGroups=1
2024-06-05 23:22:14 DEBUG org.apache.hudi.common.table.view.AbstractTableFileSystemView:165 - addFilesToView: NumFiles=30, NumFileGroups=1, FileGroupsCreationTime=15, StoreTimeTaken=1
2024-06-05 23:22:14 DEBUG org.apache.hudi.common.table.view.AbstractTableFileSystemView:449 - Time to load partition (files) =57
2024-06-05 23:22:14 INFO org.apache.hudi.metadata.HoodieBackedTableMetadata:451 - Opened metadata base file from s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata/files/files-0000-0_0-67-1304_20240605210834482001.hfile at instant 20240605210834482001 in 9 ms
2024-06-05 23:22:14 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded instants upto : Option{val=[20240605231910580__clean__COMPLETED__20240605231918000]}
2024-06-05 23:22:14 ERROR org.apache.xtable.utilities.RunSync:171 - Error running sync for s3://hidden-s3-bucket/hidden-prefix/
org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata
at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:127) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.xtable.hudi.HudiDataFileExtractor.getFilesCurrentState(HudiDataFileExtractor.java:116) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.xtable.hudi.HudiConversionSource.getCurrentSnapshot(HudiConversionSource.java:97) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.xtable.spi.extractor.ExtractFromSource.extractSnapshot(ExtractFromSource.java:38) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.xtable.conversion.ConversionController.syncSnapshot(ConversionController.java:183) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.xtable.conversion.ConversionController.sync(ConversionController.java:121) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.xtable.utilities.RunSync.main(RunSync.java:169) [xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
Caused by: java.lang.IllegalStateException: Recursive update
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1739) ~[?:?]
at org.apache.avro.util.MapUtil.computeIfAbsent(MapUtil.java:42) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.specific.SpecificData.getClass(SpecificData.java:257) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:508) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:180) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:355) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:186) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:248) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:180) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:161) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:154) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:263) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:248) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeAvroMetadata(TimelineMetadataUtils.java:209) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeHoodieRollbackMetadata(TimelineMetadataUtils.java:177) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.metadata.HoodieTableMetadataUtil.getRollbackedCommits(HoodieTableMetadataUtil.java:1355) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.metadata.HoodieTableMetadataUtil.lambda$getValidInstantTimestamps$37(HoodieTableMetadataUtil.java:1284) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[?:?]
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) ~[?:?]
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) ~[?:?]
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) ~[?:?]
at org.apache.hudi.metadata.HoodieTableMetadataUtil.getValidInstantTimestamps(HoodieTableMetadataUtil.java:1283) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:473) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:429) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getOrCreateReaders$10(HoodieBackedTableMetadata.java:412) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1705) ~[?:?]
at org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:412) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.metadata.HoodieBackedTableMetadata.lookupKeysFromFileSlice(HoodieBackedTableMetadata.java:291) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:255) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:145) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:316) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:125) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
... 6 more
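For context on the inner failure: on JDK 9+, ConcurrentHashMap.computeIfAbsent throws IllegalStateException("Recursive update") when the mapping function re-enters the same map (and the same bin) on the same thread. The nested computeIfAbsent frames above suggest this is happening somewhere in the Avro class-cache path (MapUtil.computeIfAbsent over SpecificData's cache) while Hudi deserializes the rollback metadata, though I can't pin down the exact re-entrancy from the trace alone. A minimal standalone reproduction of just the JDK behavior, nothing Hudi- or Avro-specific (class and key names are arbitrary):

import java.util.concurrent.ConcurrentHashMap;

public class RecursiveUpdateDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
        // The mapping function re-enters computeIfAbsent on the same map and
        // key (hence the same bin); JDK 9+ detects its own reservation node
        // and throws java.lang.IllegalStateException: Recursive update
        cache.computeIfAbsent("key",
                k -> cache.computeIfAbsent("key", k2 -> "value"));
    }
}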
config.yaml:
sourceFormat: HUDI
targetFormats:
  - DELTA
datasets:
  -
    tableBasePath: s3://hidden-s3-bucket/hidden-prefix
    tableName: hidden_table
    partitionSpec: op_date:VALUE
hoodie.properties from the table:
hoodie.table.timeline.timezone=LOCAL
hoodie.table.keygenerator.class=org.apache.hudi.keygen.SimpleKeyGenerator
hoodie.table.precombine.field=ts_millis
hoodie.table.version=6
hoodie.database.name=
hoodie.datasource.write.hive_style_partitioning=true
hoodie.table.metadata.partitions.inflight=
hoodie.table.checksum=2622850774
hoodie.partition.metafile.use.base.format=false
hoodie.table.cdc.enabled=false
hoodie.archivelog.folder=archived
hoodie.table.name=hidden_table
hoodie.populate.meta.fields=true
hoodie.table.type=COPY_ON_WRITE
hoodie.datasource.write.partitionpath.urlencode=false
hoodie.table.base.file.format=PARQUET
hoodie.datasource.write.drop.partition.columns=false
hoodie.table.metadata.partitions=files
hoodie.timeline.layout.version=1
hoodie.table.recordkey.fields=record_id
hoodie.table.partition.fields=op_date
I submitted this to the dev@ mailing list and received no response, so I am filing it as an issue.
Are you willing to submit a PR?
- I am willing to submit a PR!
- I am willing to submit a PR but need help getting started!
Code of Conduct
- I agree to follow this project's Code of Conduct