Skip to content

[server] Fix zk partition residual when using dynamic partition#1187

Merged
luoyuxia merged 2 commits intoapache:mainfrom
zcoo:20250624_dynamic_create_fix
Jun 26, 2025
Merged

[server] Fix zk partition residual when using dynamic partition#1187
luoyuxia merged 2 commits intoapache:mainfrom
zcoo:20250624_dynamic_create_fix

Conversation

@zcoo
Copy link
Copy Markdown
Contributor

@zcoo zcoo commented Jun 24, 2025

Purpose

Linked issue: close #1185

Brief change log

Tests

API and Format

Documentation

Copy link
Copy Markdown
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zcoo Thanks for the pr. Left some comments. PTAL

}
});
if (!lock.isLocked() && lock.getQueueLength() == 0) {
createPartitionLockMap.remove(physicalTablePath, lock);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If any exception happens in above inLock(xxx cde fragement, this lock won't be removed..

}
});
if (!lock.isLocked() && lock.getQueueLength() == 0) {
createPartitionLockMap.remove(physicalTablePath, lock);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering the following case:

TBH, I'm not still thinking out a good way to solve it. Maybe a guava cache?

try {
long partitionId = zookeeperClient.getPartitionIdAndIncrement();
// register partition assignments to zk first
zookeeperClient.registerPartitionAssignment(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we wrap registerPartitionAssignment and registerPartition into a zk transation? We let zk to do the concurrent controll so that we won't need the lock. We will have clean code.
The only bad is that we may fire mutiple zk transation for same partition. But it shouldn't happen frequently, so I think it's fine since the current implemenation also will fire mutiple zk operations for same partition.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very valuable suggestion. I have changed the solution.

@luoyuxia luoyuxia requested a review from Copilot June 26, 2025 07:41
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds per-partition locking around the createPartition flow to prevent residual entries in ZooKeeper when dynamic partitions are created concurrently.

  • Introduce a createPartitionLockMap to manage ReentrantLock per PhysicalTablePath
  • Wrap the partition creation logic in inLock to enforce mutual exclusion
  • Clean up locks from the map after creation completes
Comments suppressed due to low confidence (1)

fluss-server/src/main/java/com/alibaba/fluss/server/coordinator/MetadataManager.java:396

  • The new per-partition locking logic in createPartition introduces concurrency behavior that should be covered by unit or integration tests to verify correct mutual exclusion and lock cleanup under both success and failure scenarios.
        PhysicalTablePath physicalTablePath = PhysicalTablePath.of(tablePath, partitionName);

Comment on lines +484 to +486
});
if (!lock.isLocked() && lock.getQueueLength() == 0) {
createPartitionLockMap.remove(physicalTablePath, lock);
Copy link

Copilot AI Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup removal of the ReentrantLock from createPartitionLockMap will not execute if an exception is thrown inside inLock, potentially leaking locks. Consider wrapping both the inLock call and the cleanup removal logic in a try/finally to ensure removal always runs.

Suggested change
});
if (!lock.isLocked() && lock.getQueueLength() == 0) {
createPartitionLockMap.remove(physicalTablePath, lock);
});
} finally {
if (!lock.isLocked() && lock.getQueueLength() == 0) {
createPartitionLockMap.remove(physicalTablePath, lock);
}

Copilot uses AI. Check for mistakes.
@zcoo zcoo force-pushed the 20250624_dynamic_create_fix branch from 4392148 to 4ab0eb9 Compare June 26, 2025 08:36
@zcoo
Copy link
Copy Markdown
Contributor Author

zcoo commented Jun 26, 2025

@luoyuxia Thanks! I refactor my code to solve problem by using zk transaction instead of Lock Map! PLAT

Copy link
Copy Markdown
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zcoo Thanks for update. Only one comment. Otherwise, LGTM!

}

/** Register partition assignment and metadata in transaction. */
public void registerPartitionAssignmentAndMetadata(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with this method, we can remove methods registerPartition & registerPartitionAssignment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes!

Copy link
Copy Markdown
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@luoyuxia luoyuxia merged commit ff4b120 into apache:main Jun 26, 2025
4 checks passed
polyzos pushed a commit to Alibaba-HZY/fluss that referenced this pull request Aug 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Dynamic partition create redundant partition assignment in Zookeeper

3 participants