Skip to content

Conversation

@GrantPSpencer
Copy link
Contributor

@GrantPSpencer GrantPSpencer commented May 10, 2025

Issues

  • My PR addresses the following Helix issues and references them in the PR description:

Performance improvement

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Currently, PropertyKey path is built on every call to PropertyKey.getPath() and performs regex match. This can be expensive and redundant as when we add a listener, we loop through the existing handlers and call propertyKey.getPath() for every handler that exists. The performance hit will be most noticeable in clusters with a large instance count.

Technically, this will make it so that PropertyKey.getPath() will no longer update after instantiation if the underlying PropertyPathBuilder's templateMap is updated. However, I did not find this being done anywhere in the code so in effect there should be no change

Flame graph from 5 minute CPU profile taken during controller transiting to LEADER for clusters with thousands of instances.

Screenshot 2025-05-12 at 8 29 49 AM

Screenshot 2025-05-12 at 8 30 04 AM

Tests

  • The following tests are written for this issue:

None

  • The following is the result of the "mvn test" command on the appropriate module:

N/A

Commits

  • My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Code Quality

  • My diff has been formatted using helix-style.xml
    (helix-style-intellij.xml if IntelliJ IDE is used)

Copy link
Contributor

@junkaixue junkaixue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good finding.

@GrantPSpencer
Copy link
Contributor Author

Pull request approved by: @junkaixue
Commit message: Changes PropertyKey so that it only builds the path once when it is instantiated. PropertyKey.getPath() previously would rebuild the path on every call. This could cause a performance hit as it was performing a regex match each time. getPath now returns the path that was constructed when the PropertyKey object is first instantiated

@GrantPSpencer
Copy link
Contributor Author

CI failed due to known flaky test testEvacuateWithDisabledPartition(org.apache.helix.integration.rebalancer.TestInstanceOperation)
Tracked by Issue #2906

@xyuanlu xyuanlu merged commit ee7a8b8 into apache:master May 21, 2025
2 of 3 checks passed
csudharsanan pushed a commit to csudharsanan/helix that referenced this pull request May 27, 2025
Changes PropertyKey so that it only builds the path once when it is instantiated. PropertyKey.getPath() previously would rebuild the path on every call. This could cause a performance hit as it was performing a regex match each time. getPath now returns the path that was constructed when the PropertyKey object is first instantiated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants