-
Notifications
You must be signed in to change notification settings - Fork 242
Integrate statistical measurement in the final OptimalAssignment output #486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| for (AssignableNode instance : instances) { | ||
| Map<String, Integer> capacityUsage = instance.getCapacityUsage(); | ||
| for (String key : capacityUsage.keySet()) { | ||
| usages.computeIfAbsent(key, k -> new ArrayList<>()).add(capacityUsage.get(key)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you spell out lambda variables? Like capacityKey, usageKey, etc.
| } | ||
|
|
||
| return usages.entrySet().stream() | ||
| .collect(Collectors.toMap(Map.Entry::getKey, e -> getCoefficientOfVariation(e.getValue()))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you spell out lambda variables? Like capacityKey, usageKey, etc.
| * @return a simple cumulative count of total movements where differences of movements in terms of | ||
| * location, size, etc is ignored | ||
| */ | ||
| public int getTotalPartitionMovements(Map<String, ResourceAssignment> baseAssignment) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
baseAssignment -> previousAssignment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented in the javadoc, the purpose of the method is to allow caller to pass any resource assignments (baseline or bestpossible) and it will calculate the movements
| */ | ||
| public int getTotalPartitionMovements(Map<String, ResourceAssignment> baseAssignment) { | ||
| Map<String, ResourceAssignment> optimalAssignment = getOptimalResourceAssignment(); | ||
| int movements = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
movements -> movedPartitionCount to be clear.
| ResourceAssignment lastResourceAssignment = baseAssignment.get(resource); | ||
| for (Partition partition : resourceAssignment.getMappedPartitions()) { | ||
| Map<String, String> thisInstanceToStates = | ||
| new HashMap<>(resourceAssignment.getReplicaMap(partition)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to copy the map? Possible to synchronize on the assignment or something?
I'm worried that this might add a significant amount of GC pressure since we'd be doing this for the entire assignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my understanding, this logic is for tuning/testing purposes. So if they are not in this file, but in some tuning class instead, it should be fine.
| Partition partition = new Partition(replica.getPartitionName()); | ||
| ResourceAssignment resourceAssignment = assignmentMap | ||
| .computeIfAbsent(resourceName, key -> new ResourceAssignment(resourceName)); | ||
| ResourceAssignment resourceAssignment = assignmentMap.computeIfAbsent(resourceName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does formatting keep changing?
| public String getFailures() { | ||
| // TODO: format the error string | ||
| return _failedAssignments.toString(); | ||
| public String getErrorMessage() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I'm afraid this might pollute the log big time.
- What value do we have in printing out all failed assignments? I thought as soon as we encounter a failed assignment, we just terminate? So what we'll see here is always just the first failed assignment?
jiajunwang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, the measurement logic is for testing purpose only. It is not a good idea to pollute the production code with testing logic.
Could you please create a separate class for the testing? One option is to create a child class of OptimalAssignment. Then adding the measurement code there. Or you just have a standalone measurement method in the tuning code.
Note that if you put the code in OptimalAssignment, anyone who modifies it will be forced to maintain the measurement code as well.
| String errorMessage = String.format( | ||
| "Unable to find any available candidate node for partition %s; Fail reasons: %s", | ||
| replica.getPartitionName(), optimalAssignment.getFailures()); | ||
| replica.getPartitionName(), optimalAssignment.getErrorMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OptimalAssignment is not an Exception.
The meaning of "ErrorMessage" is not clear.
I would prefer to call it AssignmentFailures or just keep it as it was.
|
|
||
|
|
||
| //TODO: will implement by merging other changes | ||
| public Map<String, Integer> getCapacityUsage() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's don't make it public since it is not supporting the feature.
I would prefer the model class to be clean and no extra method, as long as it is possible.
| ResourceAssignment lastResourceAssignment = baseAssignment.get(resource); | ||
| for (Partition partition : resourceAssignment.getMappedPartitions()) { | ||
| Map<String, String> thisInstanceToStates = | ||
| new HashMap<>(resourceAssignment.getReplicaMap(partition)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my understanding, this logic is for tuning/testing purposes. So if they are not in this file, but in some tuning class instead, it should be fine.
Issues
(#480)
Description
(Write a concise description including what, why, how)
Added CV and movements measurement tools for the statistical understanding of the final optimal assignment
Tests
(List the names of added unit/integration tests)
(Copy & paste the result of "mvn test")
Commits
Documentation
(Link the GitHub wiki you added)
Code Quality