Cleanup the persisted assignment state if no resource is on WAGED rebalancer. #1123

jiajunwang · 2020-06-26T04:50:15Z

Issues

My PR addresses the following Helix issues and references them in the PR description:

Description

Here are some details about my PR, including screenshots of any UI changes:

This change prevents the WAGED rebalancer from reading stale assignment records left behind by a previous rebalance pipeline.
For example:

1. Resource A is the only resource and is rebalanced by WAGED, so we have a persisted assignment for A.
2. Resource A is reconfigured to use DelayedRebalancer. The WAGED rebalancer stops because no resource uses WAGED anymore, but the persisted records remain in ZK.
3. Resource A is recreated and uses WAGED again. The previously persisted assignment is no longer valid, so we should treat A as a brand-new resource instead of considering the stale assignment record.

Moreover, this change helps clean up the ZK persisted data when no resource is using WAGED.
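The scenario above can be illustrated with a minimal sketch. It is not the code from this PR: InMemoryAssignmentStore and StaleAssignmentCleanupSketch are hypothetical names, and an in-memory map stands in for the records persisted in ZK. The sketch only shows the intended behavior, assuming the cleanup is triggered when a pipeline run finds no WAGED resource.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the assignment records persisted in ZK.
// None of these names are Helix APIs.
class InMemoryAssignmentStore {
  private final Map<String, String> baseline = new HashMap<>();
  private final Map<String, String> bestPossible = new HashMap<>();

  void persist(String resource, String assignment) {
    baseline.put(resource, assignment);
    bestPossible.put(resource, assignment);
  }

  String getBestPossible(String resource) {
    return bestPossible.get(resource);
  }

  boolean isEmpty() {
    return baseline.isEmpty() && bestPossible.isEmpty();
  }

  // Drop every persisted record so that a later WAGED run starts from scratch.
  void clear() {
    baseline.clear();
    bestPossible.clear();
  }
}

class StaleAssignmentCleanupSketch {
  // Called once per pipeline run with the resources currently configured for WAGED.
  static void onPipelineRun(InMemoryAssignmentStore store, Set<String> wagedResources) {
    if (wagedResources.isEmpty()) {
      // No resource is on WAGED: whatever is persisted can only become stale.
      store.clear();
      return;
    }
    for (String resource : wagedResources) {
      if (store.getBestPossible(resource) == null) {
        // No record exists, so the resource is treated as brand new.
        store.persist(resource, "fresh-assignment-for-" + resource);
      }
    }
  }
}
```

With this behavior, step 2 of the example empties the store, so step 3 finds no record for A and computes a fresh assignment.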

Tests

The following tests are written for this issue:

TestAssignmentMetadataStore, TestWagedRebalancer

The following is the result of the "mvn test" command on the appropriate module:

[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestJobQueueCleanUp.testJobQueueAutoCleanUp » ThreadTimeout Method org.testng....
[INFO]
[ERROR] Tests run: 1149, Failures: 1, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:25 h
[INFO] Finished at: 2020-06-26T00:00:39-07:00
[INFO] ------------------------------------------------------------------------

Rerun

[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.508 s - in org.apache.helix.integration.task.TestJobQueueCleanUp
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 23.728 s
[INFO] Finished at: 2020-06-26T00:01:34-07:00
[INFO] ------------------------------------------------------------------------

Commits

My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters (not including Jira issue reference)
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not "adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

Documentation (Optional)

In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Code Quality

My diff has been formatted using helix-style.xml
(helix-style-intellij.xml if IntelliJ IDE is used)


.../src/test/java/org/apache/helix/controller/rebalancer/waged/TestAssignmentMetadataStore.java

...core/src/main/java/org/apache/helix/controller/rebalancer/waged/AssignmentMetadataStore.java

kaisun2000 · 2020-06-30T19:23:44Z

One high-level question. For public synchronized boolean persistBaseline (or bestPossible), the synchronized is there to make sure that, within a process, persisting to ZK is guarded. Is there a case where a single controller Java process has more than one WAGED pipeline running? My understanding is that for CRUSH full auto, we have a per-cluster controller pipeline. But for WAGED, there is only one for all clusters? Am I wrong here?

kaisun2000 · 2020-06-30T19:27:45Z

My follow-up question is that there still seems to be a potential inconsistency issue that is not addressed here, namely across controllers on different machines.

The pattern is similar to #1066.

If one controller's session expires, the super controller assigns another controller to take care of the cluster. When the first controller comes back up, its controller pipeline may still run and persist its baseline and best possible again. How do we guard against this cross-process race condition?

This is a pattern of similar problems. We should think a bit more generically and address them all in the long term.

jiajunwang · 2020-06-30T19:42:51Z

> One high-level question. For public synchronized boolean persistBaseline (or bestPossible), the synchronized is there to make sure that, within a process, persisting to ZK is guarded. Is there a case where a single controller Java process has more than one WAGED pipeline running? My understanding is that for CRUSH full auto, we have a per-cluster controller pipeline. But for WAGED, there is only one for all clusters? Am I wrong here?

You are wrong.

Both WAGED and the traditional rebalancers run per cluster.
synchronized is not a must if we assume that it will only be used in the WAGED rebalancer and that we are not going to change the WAGED rebalancer.
Having this protection is a nice-to-have. So do you suggest we remove it? I think it won't hurt.

jiajunwang · 2020-06-30T19:48:01Z

> My follow-up question is that there still seems to be a potential inconsistency issue that is not addressed here, namely across controllers on different machines.
>
> The pattern is similar to #1066.
>
> If one controller's session expires, the super controller assigns another controller to take care of the cluster. When the first controller comes back up, its controller pipeline may still run and persist its baseline and best possible again. How do we guard against this cross-process race condition?
>
> This is a pattern of similar problems. We should think a bit more generically and address them all in the long term.

Please note that we have an MVCC implemented in the bucket accessor to ensure cross node consistency.
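As a generic illustration of the multi-version idea mentioned here, the sketch below writes every payload under a new version number and lets readers see only the newest published version. It is an assumption-laden sketch, not the Helix bucket accessor: VersionedStoreSketch and its methods are hypothetical, and a map stands in for ZK.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical multi-version store; the names are illustrative, not Helix APIs.
class VersionedStoreSketch {
  private final Map<Long, String> versions = new ConcurrentHashMap<>();
  private final AtomicLong lastAllocated = new AtomicLong(0);
  private final AtomicLong lastPublished = new AtomicLong(0);

  // Each write lands under its own version, so writers never overwrite each other.
  long write(String data) {
    long version = lastAllocated.incrementAndGet();
    versions.put(version, data);
    // Publish only if this version is newer than what readers already see.
    lastPublished.accumulateAndGet(version, Math::max);
    return version;
  }

  // Readers always get the newest fully written version, or null if none exists.
  String read() {
    long version = lastPublished.get();
    return version == 0 ? null : versions.get(version);
  }
}
```

Two writers racing on such a store each produce a complete version, and a reader never observes a mix of the two.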

kaisun2000 · 2020-06-30T19:49:17Z

> You are wrong.
>
> Both WAGED and the traditional rebalancers run per cluster.
>
> synchronized is not a must if we assume that it will only be used in the WAGED rebalancer and that we are not going to change the WAGED rebalancer.
>
> Having this protection is a nice-to-have. So do you suggest we remove it? I think it won't hurt.

So the invariant is that each cluster has its own copy of persisted baseline and best-possible. If this is clear, removing sync seems to be safe.

jiajunwang · 2020-06-30T19:52:21Z

> > You are wrong.
> >
> > Both WAGED and the traditional rebalancers run per cluster.
> >
> > synchronized is not a must if we assume that it will only be used in the WAGED rebalancer and that we are not going to change the WAGED rebalancer.
> >
> > Having this protection is a nice-to-have. So do you suggest we remove it? I think it won't hurt.
>
> So the invariant is that each cluster has its own copy of persisted baseline and best-possible. If this is clear, removing sync seems to be safe.

But it won't hurt. And IMO, the assumption that we won't change WAGED logic in the future is weak. I prefer to keep it.
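To make the trade-off concrete, here is a minimal sketch of what the synchronized guard buys within a single process. GuardedAssignmentStore and SynchronizedPersistSketch are hypothetical; only the persistBaseline name echoes the thread, and the body is an assumption rather than the Helix implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical store; the method name echoes the thread, but this is not Helix code.
class GuardedAssignmentStore {
  private Map<String, String> baseline = new HashMap<>();
  private int writeCount = 0;

  // synchronized: the compare-then-replace below runs for one thread at a time.
  synchronized boolean persistBaseline(Map<String, String> newBaseline) {
    if (baseline.equals(newBaseline)) {
      return false; // nothing changed, skip the (simulated) write
    }
    baseline = new HashMap<>(newBaseline);
    writeCount++;
    return true;
  }

  synchronized int getWriteCount() {
    return writeCount;
  }
}

class SynchronizedPersistSketch {
  // Starts threadCount threads that all persist the same baseline and
  // returns how many of them actually performed a write.
  static int concurrentWrites(int threadCount) {
    GuardedAssignmentStore store = new GuardedAssignmentStore();
    Map<String, String> assignment = new HashMap<>();
    assignment.put("A", "instance-1");
    Thread[] threads = new Thread[threadCount];
    for (int i = 0; i < threadCount; i++) {
      threads[i] = new Thread(() -> store.persistBaseline(assignment));
      threads[i].start();
    }
    for (Thread thread : threads) {
      try {
        thread.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return store.getWriteCount();
  }
}
```

If only one pipeline thread ever calls the store, the keyword is redundant but harmless, which matches the "it won't hurt" position above.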

kaisun2000 · 2020-06-30T20:02:46Z

> But it won't hurt. And IMO, the assumption that we won't change WAGED logic in the future is weak. I prefer to keep it.

As an aside, I recall waged has an advantage as allocating over all clusters to instance. So this is not in this version?

jiajunwang · 2020-06-30T20:04:30Z

"allocating over all clusters to instance" what does that mean? Could you please give me an example? Cheers, -Jiajun


kaisun2000 · 2020-06-30T20:07:01Z

> Please note that we have an MVCC implemented in the bucket accessor to ensure cross node consistency.

I will try to examine the bucket accessor a little more carefully. Here is the question:

Say controller A handles the allocation for resource R. Then controller A's session expires, and controller B takes care of resource R. Later, when A is back, A and B run at the same time for a while. When serializing, say, the baseline for resource R, if A wins over B, is that going to be a problem for this resource, given that B would later be the controller for R?

jiajunwang · 2020-06-30T20:11:12Z

Whichever wins is fine, since the next pipeline run will take the cached result and keep calculating from there.

* If A wins, then A keeps calculating based on its local state.
* If B wins, then because A caches the result after its write, A will still calculate based on its local state, and its next pipeline will overwrite B's output.

And we are good. In this case, we don't really need B's result.

Cheers, -Jiajun
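The reasoning in this reply can be sketched as follows. ControllerSketch is a hypothetical class, the AtomicReference stands in for the shared record in ZK, and the generation counter is an invented placeholder for a real assignment calculation.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical controller; the AtomicReference stands in for the shared record in ZK.
class ControllerSketch {
  private final String name;
  private final AtomicReference<String> persisted;
  private int generation = 0;
  private String cached; // result of this controller's own last write

  ControllerSketch(String name, AtomicReference<String> persisted) {
    this.name = name;
    this.persisted = persisted;
  }

  // Each run computes from local state only, then writes and caches the result.
  void runPipeline() {
    generation++;
    cached = name + "-gen" + generation;
    persisted.set(cached);
  }

  String getCached() {
    return cached;
  }
}
```

If A and B both run once and B's write lands last, the shared record holds B's output. A's next run still derives its result from its own cached state and overwrites the record, so B's output is never needed.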


kaisun2000 · 2020-06-30T20:35:23Z

I see. Thx for the explanation.

jiajunwang · 2020-06-30T20:37:33Z

Thanks for the review : ) Cheers, -Jiajun


huizhilu

LGTM

jiajunwang · 2020-06-30T22:39:51Z

This PR is ready to be merged, approved by @pkuwm


Jiajun Wang added 2 commits June 25, 2020 21:49

fix test case.

0e04a57

huizhilu reviewed Jun 26, 2020


Address comment.

1e11aa5

kaisun2000 reviewed Jun 30, 2020

...core/src/main/java/org/apache/helix/controller/rebalancer/waged/AssignmentMetadataStore.java

kaisun2000 reviewed Jun 30, 2020

...core/src/main/java/org/apache/helix/controller/rebalancer/waged/AssignmentMetadataStore.java

huizhilu approved these changes Jun 30, 2020


jiajunwang merged commit 0b98a6e into apache:master Jun 30, 2020

jiajunwang deleted the wagedClean branch June 30, 2020 22:41
