Skip to content

Conversation

@jiajunwang
Copy link
Contributor

@jiajunwang jiajunwang commented Jun 26, 2020

Issues

  • My PR addresses the following Helix issues and references them in the PR description:

#1120

Description

  • Here are some details about my PR, including screenshots of any UI changes:

This is to prevent the WAGED rebalancer reads stale assignment records from the previous rebalance pipeline.
For example,

  1. Resource A was the only resource. And it is rebalanced by WAGED, then we have a persisted assignment for A.
  2. Resource A was reconfigured to using DelayedRebalancer, then we stop the WAGED rebalancer since there is no more resource using WAGED. So the persisted records are still in ZK.
  3. Resource A is recreated and using WAGED again. In this case, the previous persisted assignment is no longer valid. We should treat A as a brand new resource instead of considering the stale assignment record.

Moreover, this change will help to clean up the ZK persisted data if no resource is using WAGED.

Tests

  • The following tests are written for this issue:

TestAssignmentMetadataStore, TestWagedRebalancer

  • The following is the result of the "mvn test" command on the appropriate module:

[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestJobQueueCleanUp.testJobQueueAutoCleanUp » ThreadTimeout Method org.testng....
[INFO]
[ERROR] Tests run: 1149, Failures: 1, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:25 h
[INFO] Finished at: 2020-06-26T00:00:39-07:00
[INFO] ------------------------------------------------------------------------

Rerun

[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.508 s - in org.apache.helix.integration.task.TestJobQueueCleanUp
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 23.728 s
[INFO] Finished at: 2020-06-26T00:01:34-07:00
[INFO] ------------------------------------------------------------------------

Commits

  • My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation (Optional)

  • In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Code Quality

  • My diff has been formatted using helix-style.xml
    (helix-style-intellij.xml if IntelliJ IDE is used)

Jiajun Wang added 2 commits June 25, 2020 21:49
…alancer.

This is to prevent the WAGED rebalancer reads stale assignment records from the previous rebalance pipeline.
For example,
1. Resource A was the only resource. And it is rebalanced by WAGED, then we have a persisted assignment for A.
2. Resource A was reconfigured to using DelayedRebalancer, then we stop the WAGED rebalancer since there is no more resource using WAGED. So the persisted records are still in ZK.
3. Resource A is recreated and using WAGED again. In this case, the previous persisted assignment is no longer valid. We should treat A as a brand new resource instead of considering the stale assignment record.

Moreover, this change will help to clean up the ZK persisted data if no resource is using WAGED.
@kaisun2000
Copy link
Contributor

One high level question. public synchronized boolean persistBaseline or bestPossible, the synchronized is to make sure that with in a process, the persisting to ZK is guarded. Is there a case that in a single controller java process, there will be more than one waged pipeline running? My understanding is that for CRUSH full auto, we have per cluster controller pipeline. But for waged, they is only one for all clusters? Am I wrong here?

@kaisun2000
Copy link
Contributor

kaisun2000 commented Jun 30, 2020

My follow-up question is actually that there seems still some potential inconsistency issue here not addressed, namely across controllers over different machine.

The pattern is similar to this #1066.

If one controller session expired, and super controller assign another controller to take care the cluster. When first controller come up, his controller pipeline may still run and persists his baseline and best possible again. How do we guard this across java process race condition?

This is a pattern of similar problems. We should think a little bit more generic and address them all long term.

@jiajunwang
Copy link
Contributor Author

One high level question. public synchronized boolean persistBaseline or bestPossible, the synchronized is to make sure that with in a process, the persisting to ZK is guarded. Is there a case that in a single controller java process, there will be more than one waged pipeline running? My understanding is that for CRUSH full auto, we have per cluster controller pipeline. But for waged, they is only one for all clusters? Am I wrong here?

You are wrong.

  1. Both WAGED and the traditional rebalancers are for each cluster.
  2. synchronized is not a must if we make the assumption that it will only be used in the WAGED rebalancer. And we are not going to change the WAGED rebalancer.
  3. Having this to protect is a nice-to-have thing. So do you suggest we remove it? I think it won't hurt.

@jiajunwang
Copy link
Contributor Author

My follow-up question is actually that there seems still some potential inconsistency issue here not addressed, namely across controllers over different machine.

The pattern is similar to this #1066.

If one controller session expired, and super controller assign another controller to take care the cluster. When first controller come up, his controller pipeline may still run and persists his baseline and best possible again. How do we guard this across java process race condition?

This is a pattern of similar problems. We should think a little bit more generic and address them all long term.

Please note that we have an MVCC implemented in the bucket accessor to ensure cross node consistency.

@kaisun2000
Copy link
Contributor

You are wrong.

  1. Both WAGED and the traditional rebalancers are for each cluster.
  2. synchronized is not a must if we make the assumption that it will only be used in the WAGED rebalancer. And we are not going to change the WAGED rebalancer.
  3. Having this to protect is a nice-to-have thing. So do you suggest we remove it? I think it won't hurt.

So the invariant is that each cluster has its own copy of persisted baseline and best-possible. If this is clear, removing sync seems to be safe.

@jiajunwang
Copy link
Contributor Author

You are wrong.

  1. Both WAGED and the traditional rebalancers are for each cluster.
  2. synchronized is not a must if we make the assumption that it will only be used in the WAGED rebalancer. And we are not going to change the WAGED rebalancer.
  3. Having this to protect is a nice-to-have thing. So do you suggest we remove it? I think it won't hurt.

So the invariant is that each cluster has its own copy of persisted baseline and best-possible. If this is clear, removing sync seems to be safe.

But it won't hurt. And IMO, the assumption that we won't change WAGED logic in the future is weak. I prefer to keep it.

@kaisun2000
Copy link
Contributor

But it won't hurt. And IMO, the assumption that we won't change WAGED logic in the future is weak. I prefer to keep it.

As an aside, I recall waged has an advantage as allocating over all clusters to instance. So this is not in this version?

@jiajunwang
Copy link
Contributor Author

jiajunwang commented Jun 30, 2020 via email

@kaisun2000
Copy link
Contributor

Please note that we have an MVCC implemented in the bucket accessor to ensure cross node consistency.

I will try to examine the bucket accessor a little bit more carefully. Here is the question:

Say controller A is for resource R allocation. Then controller A session expires, controller B take care of resource B. Later when A is back, for a while A and B would run at the same time. When serializing say baseline for resource R, A wins over B, is it going to be a problem for this resource as B would later be controller for R.

@jiajunwang
Copy link
Contributor Author

jiajunwang commented Jun 30, 2020 via email

@kaisun2000
Copy link
Contributor

I see. Thx for the explanation.

@jiajunwang
Copy link
Contributor Author

jiajunwang commented Jun 30, 2020 via email

Copy link
Contributor

@huizhilu huizhilu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jiajunwang
Copy link
Contributor Author

This PR is ready to be merged, approved by @pkuwm

@jiajunwang jiajunwang merged commit 0b98a6e into apache:master Jun 30, 2020
@jiajunwang jiajunwang deleted the wagedClean branch June 30, 2020 22:41
huizhilu pushed a commit to huizhilu/helix that referenced this pull request Aug 16, 2020
…alancer. (apache#1123)

This is to prevent the WAGED rebalancer reads stale assignment records from the previous rebalance pipeline.
For example,

1. Resource A was the only resource. And it is rebalanced by WAGED, then we have a persisted assignment for A.
2. Resource A was reconfigured to using DelayedRebalancer, then we stop the WAGED rebalancer since there is no more resource using WAGED. So the persisted records are still in ZK.
3. Resource A is recreated and using WAGED again. In this case, the previous persisted assignment is no longer valid. We should treat A as a brand new resource instead of considering the stale assignment record.

Moreover, this change will help to clean up the ZK persisted data if no resource is using WAGED.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants