HADOOP-19737: ABFS: Add metrics to identify improvements with read and write aggressiveness #8056
|
💔 -1 overall
This message was automatically generated. |
| public static final int DEFAULT_WRITE_HIGH_TIER_MEMORY_MULTIPLIER = 16; | ||
|
|
||
| /** Percentage threshold of heap usage at which memory pressure is considered high. */ | ||
| public static final int DEFAULT_WRITE_HIGH_MEMORY_USAGE_THRESHOLD_PERCENTAGE = 60; |
There was a problem hiding this comment.
Since we have used PERCENT in the FileSystemConfiguration FS_AZURE_WRITE_HIGH_MEMORY_USAGE_THRESHOLD_PERCENT, it would be better to rename this to DEFAULT_WRITE_HIGH_MEMORY_USAGE_THRESHOLD_PERCENT for consistency.
| public static final int DEFAULT_WRITE_HIGH_MEMORY_USAGE_THRESHOLD_PERCENTAGE = 60; | ||
|
|
||
| /** Percentage threshold of heap usage at which memory pressure is considered low. */ | ||
| public static final int DEFAULT_WRITE_LOW_MEMORY_USAGE_THRESHOLD_PERCENTAGE = 30; |
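For readers following the two defaults above, here is a minimal sketch of how a pair of heap-usage thresholds could be turned into a pressure level. The class name, the `Pressure` enum, and the inclusive boundary handling are assumptions made for illustration; only the values 60 and 30 come from the diff, and the PR's actual logic may differ.

```java
// Hypothetical helper, not part of the PR: classifies heap usage against
// the high (60%) and low (30%) thresholds quoted in the diff.
final class MemoryPressureSketch {
  static final int HIGH_THRESHOLD_PERCENT = 60; // value from the diff
  static final int LOW_THRESHOLD_PERCENT = 30;  // value from the diff

  enum Pressure { LOW, MODERATE, HIGH }

  static Pressure classify(long usedBytes, long maxBytes) {
    if (maxBytes <= 0) {
      throw new IllegalArgumentException("maxBytes must be positive");
    }
    // Floating point avoids overflow when usedBytes is very large.
    double usedPercent = 100.0 * usedBytes / maxBytes;
    // Assumption: both boundaries are inclusive.
    if (usedPercent >= HIGH_THRESHOLD_PERCENT) {
      return Pressure.HIGH;
    }
    if (usedPercent <= LOW_THRESHOLD_PERCENT) {
      return Pressure.LOW;
    }
    return Pressure.MODERATE;
  }
}
```

Under these assumptions, a heap that is 45% full falls between the two thresholds and is reported as `MODERATE`.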
| * | ||
| * @return the metric name. | ||
| */ | ||
| public String getName() { |
There was a problem hiding this comment.
This method and the one below should be annotated with @Override.
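A minimal sketch of what the annotation buys, assuming the metric type implements an interface that declares `getName()`. The interface and enum names here are hypothetical and are not taken from the PR.

```java
// Hypothetical interface standing in for whatever the metric type implements.
interface NamedMetricSketch {
  String getName();
}

// Hypothetical metric enum; the single constant is illustrative only.
enum SampleMetricSketch implements NamedMetricSketch {
  ACTIVE_THREADS("active_threads");

  private final String name;

  SampleMetricSketch(String name) {
    this.name = name;
  }

  // With @Override, the compiler rejects this method if the interface
  // signature ever changes, instead of silently leaving a stray overload.
  @Override
  public String getName() {
    return name;
  }
}
```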
| abfsUriQueryBuilder, cachedSasToken); | ||
|
|
||
| // Retrieve the read thread pool metrics from the ABFS counters. | ||
| AbfsReadResourceUtilizationMetrics readResourceUtilizationMetrics = retrieveReadResourceUtilizationMetrics(); |
There was a problem hiding this comment.
This code is common to both the DFS and Blob clients. We can define a method in the ABFS client class that adds the metrics to the tracing context, and call that method from both places.
There was a problem hiding this comment.
The common method retrieveReadResourceUtilizationMetrics is already defined in the ABFS client class. As with the previous APIs, we make tracing context changes only in the respective client methods. I can take this up if it is still needed.
| case TWO_ID_FORMAT: | ||
| header = TracingHeaderVersion.getCurrentVersion() + COLON | ||
| + clientCorrelationID + COLON + clientRequestId; | ||
| metricHeader += !(metricResults.trim().isEmpty()) ? metricResults : EMPTY_STRING; |
There was a problem hiding this comment.
Why have we removed this?
There was a problem hiding this comment.
metricResults is an empty string unless something has been added to it. With the latest version changes it is always either a populated string or an empty string, so I changed this to: metricResults + COLON + resourceUtilizationMetricResults;
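A minimal sketch of the concatenation described in this reply, assuming both inputs are plain strings that may be empty. The class and method names are hypothetical, and the null handling is an added assumption; only the expression `metricResults + COLON + resourceUtilizationMetricResults` comes from the discussion.

```java
// Hypothetical helper illustrating the header suffix described above.
final class MetricHeaderSketch {
  private static final String COLON = ":";
  private static final String EMPTY_STRING = "";

  static String build(String metricResults,
      String resourceUtilizationMetricResults) {
    // Assumption: treat null like an empty string so the separator is
    // always emitted and the field positions stay fixed.
    String metrics = metricResults == null
        ? EMPTY_STRING : metricResults.trim();
    String utilization = resourceUtilizationMetricResults == null
        ? EMPTY_STRING : resourceUtilizationMetricResults.trim();
    return metrics + COLON + utilization;
  }
}
```

Because the separator is always written, a parser can split on the colon and still find the utilization field in the same position when `metricResults` is empty.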
|
|
||
| static ReadBufferManagerV2 getBufferManager() { | ||
| /** | ||
| * Returns the singleton instance of {@code ReadBufferManagerV2} for the given ABFS client. |
There was a problem hiding this comment.
The comment still needs to be fixed.
| * @param abfsCounters the {@link AbfsCounters} used for managing read operations. | ||
| */ | ||
| private ReadBufferManagerV2() { | ||
| private ReadBufferManagerV2(AbfsCounters abfsCounters) { |
There was a problem hiding this comment.
This constructor will be called only once for all the filesystems.
Are abfsCounters and readThreadPoolMetrics also singletons?
What if these are different for different filesystems?
There was a problem hiding this comment.
We just need a placeholder for initializing the metrics. These metrics are filesystem-agnostic: they only report the current resource utilization, and what matters is the JVM. We can discuss this further offline.
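To make the concern and the reply concrete, here is a minimal sketch of a lazily initialized singleton whose constructor argument is taken from the first caller only, while the value it reports is JVM-wide. Everything here is hypothetical (class name, the `owner` string standing in for the counters object, the heap calculation); it is not the PR's `ReadBufferManagerV2`.

```java
// Hypothetical singleton: the first caller's argument wins, later ones are
// ignored, which is the behavior the reviewer is asking about.
final class JvmWideManagerSketch {
  private static JvmWideManagerSketch instance;

  private final String firstOwner; // stands in for the counters object

  private JvmWideManagerSketch(String owner) {
    this.firstOwner = owner;
  }

  static synchronized JvmWideManagerSketch getInstance(String owner) {
    if (instance == null) {
      instance = new JvmWideManagerSketch(owner);
    }
    return instance;
  }

  String firstOwner() {
    return firstOwner;
  }

  // Heap usage is a property of the JVM, so the result does not depend on
  // which filesystem triggered the initialization.
  int heapUsedPercent() {
    Runtime rt = Runtime.getRuntime();
    long used = rt.totalMemory() - rt.freeMemory();
    return (int) (100.0 * used / rt.maxMemory());
  }
}
```

In this sketch a second filesystem calling `getInstance` receives the same object, and any per-filesystem state passed in by that second caller is dropped. That is harmless only if the recorded values really are JVM-scoped, which is the argument made in the reply.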
|
|
||
| /** | ||
| * Private constructor to prevent instantiation as this needs to be singleton. | ||
| * Initializes a new instance of {@code ReadBufferManagerV2} for the given ABFS client. |
| * Schema: version:clientCorrelationId:clientRequestId:fileSystemId | ||
| * :primaryRequestId:streamId:opType:retryHeader:ingressHandler | ||
| * :position:operatedBlobCount:operationSpecificHeader:httpOperationHeader | ||
| * :networkLibrary:operationMetrics |
There was a problem hiding this comment.
Earlier it was 13, right?
We are only adding one more in this PR.
============================================================
Introduces new performance metrics in the ABFS driver to monitor and evaluate the effectiveness of read and write aggressiveness tuning. These metrics help in understanding how thread pool behavior, CPU utilization, and heap availability impact overall I/O throughput and latency. By capturing detailed statistics such as active thread count, pool size, and system resource utilization, this enhancement enables data-driven analysis of optimizations made to improve ABFS read and write performance under varying workloads.
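As an illustration of the kind of statistics the description mentions, the sketch below reads the active thread count, pool size, and available heap using standard JDK calls (`ThreadPoolExecutor.getActiveCount()`, `getPoolSize()`, `getMaximumPoolSize()`, and `Runtime`). The class name and the output format are assumptions for this example and do not reflect the format the PR emits.

```java
import java.util.concurrent.ThreadPoolExecutor;

// Hypothetical snapshot helper; the key=value layout is illustrative only.
final class PoolSnapshotSketch {
  static String snapshot(ThreadPoolExecutor pool) {
    Runtime rt = Runtime.getRuntime();
    long usedHeap = rt.totalMemory() - rt.freeMemory();
    long availableHeapMb = (rt.maxMemory() - usedHeap) / (1024L * 1024L);
    return "active=" + pool.getActiveCount()   // threads running tasks now
        + ",poolSize=" + pool.getPoolSize()    // threads currently in the pool
        + ",max=" + pool.getMaximumPoolSize()  // configured upper bound
        + ",availableHeapMb=" + availableHeapMb;
  }
}
```

Comparing `active` against `max` over time gives a rough signal of whether the pool is saturated or oversized for the workload. The counts returned by `ThreadPoolExecutor` are approximate while tasks are in flight.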