HADOOP-19622: [ABFS][ReadAheadV2] Implement Read Buffer Manager V2 with improved aggressiveness #7832
base: trunk
public static final int DEFAULT_READAHEAD_V2_MAX_BUFFER_POOL_SIZE = -1;
public static final int DEFAULT_READAHEAD_V2_EXECUTOR_SERVICE_TTL_MILLIS = 3_000;
public static final int DEFAULT_READAHEAD_V2_CPU_MONITORING_INTERVAL_MILLIS = 6_000;
public static final int DEFAULT_READAHEAD_V2_THREAD_POOL_UPSCALE_PERCENTAGE = 20;
For some variables you have used "Percentage" and for others "Percent"; we should keep it consistent across all places.
Taken
public static final int DEFAULT_READAHEAD_V2_EXECUTOR_SERVICE_TTL_MILLIS = 3_000;
public static final int DEFAULT_READAHEAD_V2_CPU_MONITORING_INTERVAL_MILLIS = 6_000;
public static final int DEFAULT_READAHEAD_V2_THREAD_POOL_UPSCALE_PERCENTAGE = 20;
public static final int DEFAULT_READAHEAD_V2_THREAD_POOL_DOWNSCALE_PERCENTAGE = 30;
Same as above
Taken
private static int threadPoolUpscalePercentage;
private static int threadPoolDownscalePercentage;
private static int executorServiceKeepAliveTimeInMilliSec;
private static final double THREAD_POOL_REQUIREMENT_BUFFER = 1.2; // 20% more threads than the queue size
Is this configurable, or have we fixed this number based on POC data? If so, can we explain it a little more for future understanding?
This will only be used while upscaling. It makes sure we have sufficient threads in the thread pool to cater to all the queued buffers, and it only helps decide whether we need to upscale further or not. If we don't have enough queued requests, we won't upscale even if CPU is well below the threshold.
The new thread pool size is still computed using the configured values only.
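To make the reply above concrete, here is a minimal sketch of that upscale check. It is not the code from this patch: the class name, method names, the CPU threshold parameter, and the exact growth formula are all assumptions made for illustration. Only the 1.2 factor and the idea of a configured upscale percentage come from the diff.

```java
// Hypothetical sketch; not taken from ReadBufferManagerV2.
final class UpscaleSketch {
  // From the quoted diff: 20% more threads than the queue size.
  static final double THREAD_POOL_REQUIREMENT_BUFFER = 1.2;

  /** Threads needed to serve all queued buffers, with 20% headroom. */
  static int requiredThreads(int queuedRequests) {
    return (int) Math.ceil(queuedRequests * THREAD_POOL_REQUIREMENT_BUFFER);
  }

  /** Upscale only if CPU has room AND the queue actually needs more threads. */
  static boolean shouldUpscale(double cpuLoad, double cpuThreshold,
      int currentPoolSize, int queuedRequests) {
    return cpuLoad < cpuThreshold
        && currentPoolSize < requiredThreads(queuedRequests);
  }

  /**
   * The new size depends only on configured values (assumed formula):
   * grow by the configured percentage, capped at the configured maximum.
   */
  static int upscaledSize(int currentPoolSize, int upscalePercentage,
      int maxPoolSize) {
    int grown = (int) Math.ceil(
        currentPoolSize * (100 + upscalePercentage) / 100.0);
    return Math.min(grown, maxPoolSize);
  }
}
```

In this sketch the 1.2 factor only gates the decision; it never feeds into the new pool size, which matches the point that the size is derived from configuration alone.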
  setReadAheadBlockSize(readAheadBlockSize);
}
private ReadBufferManagerV2() {
  printTraceLog("Creating Read Buffer Manager V2 with HADOOP-18546 patch");
We should use LOG.trace instead of printTraceLog.
It's the same thing. printTraceLog is just a method call that avoids redundant log-level checks at each call site.
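For readers unfamiliar with the pattern, a helper like this usually looks roughly as follows. This is a sketch, not the method from the patch: `TraceLogger` is a hypothetical stand-in for an SLF4J-style logger so the example has no external dependencies, and the actual signature of printTraceLog in the PR may differ.

```java
// Hypothetical sketch of a trace-logging helper; not the patch's code.
final class TraceLogSketch {
  /** Minimal stand-in for an SLF4J-style logger (assumption). */
  interface TraceLogger {
    boolean isTraceEnabled();
    void trace(String message, Object... args);
  }

  private final TraceLogger log;

  TraceLogSketch(TraceLogger log) {
    this.log = log;
  }

  /** Keeps the level check in one place so call sites stay one-liners. */
  void printTraceLog(String message, Object... args) {
    if (log.isTraceEnabled()) {
      log.trace(message, args);
    }
  }
}
```

A call site then reads `printTraceLog("initialized with {} buffers", count)` without repeating the `isTraceEnabled()` guard.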
      TimeUnit.MILLISECONDS);
}

printTraceLog("ReadBufferManagerV2 initialized with {} buffers and {} worker threads",
Same as above; please change it wherever you have used it.
It's an intentional method, created to avoid code redundancy.
  return stream;
}

public String getETag() {
Can this return null?
Read buffers are created only during queueing, and we set the ETag there.
  // while waiting, so no one will be able to change any state. If this becomes more complex in the future,
  // then the latch cane be removed and replaced with wait/notify whenever getInProgressList() is touched.
} catch (InterruptedException ex) {
  Thread.currentThread().interrupt();
In the catch block, we're interrupting the thread but not informing the caller. Is that expected?
It will be thrown to caller.
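As background for this exchange: calling `Thread.currentThread().interrupt()` in the catch block restores the thread's interrupt flag, so code further up the stack can still observe the interruption. The sketch below shows that general pattern with a `CountDownLatch`. It is an illustration only; the class and method names are hypothetical, and it does not claim to reproduce how the patch surfaces the interruption to its caller.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of the restore-the-interrupt-flag pattern.
final class InterruptSketch {
  /**
   * Waits for the latch. Returns true if the latch reached zero,
   * false if the wait was interrupted.
   */
  static boolean awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
      return true;
    } catch (InterruptedException ex) {
      // await() cleared the flag when it threw; set it again so that
      // callers can still detect the interruption.
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

A caller can then check `Thread.currentThread().isInterrupted()` after `awaitQuietly` returns false and decide whether to abort the read or propagate an exception.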
// getInProgressList(). So this latch is safe to be outside the synchronized block.
// Putting it in synchronized would result in a deadlock, since this thread would be holding the lock
// while waiting, so no one will be able to change any state. If this becomes more complex in the future,
// then the latch cane be removed and replaced with wait/notify whenever getInProgressList() is touched.
Nit: "cane" should be spelled "can".
Taken
int cursor = (int) (position - buf.getOffset());
int availableLengthInBuffer = buf.getLength() - cursor;
int lengthToCopy = Math.min(length, availableLengthInBuffer);
System.arraycopy(buf.getBuffer(), cursor, buffer, 0, lengthToCopy);
We can skip the copy when lengthToCopy = 0.
For length 0 we won't even read.
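For reference, the copy logic under discussion can be sketched in isolation as below, with the suggested guard added. This is a standalone illustration: the flattened parameter list replaces the `buf` accessors from the diff, and the class and method names are hypothetical. Whether the guard is needed in the patch depends on the author's point that a zero-length read never reaches this code.

```java
// Hypothetical standalone version of the copy step in the quoted diff.
final class BufferCopySketch {
  /**
   * Copies up to {@code length} bytes starting at file position
   * {@code position} from a cached buffer that holds
   * {@code sourceLength} valid bytes beginning at file offset
   * {@code sourceOffset}. Returns the number of bytes copied.
   */
  static int copyFromBuffer(byte[] source, long sourceOffset,
      int sourceLength, long position, int length, byte[] dest) {
    int cursor = (int) (position - sourceOffset);
    int availableLengthInBuffer = sourceLength - cursor;
    int lengthToCopy = Math.min(length, availableLengthInBuffer);
    if (lengthToCopy <= 0) {
      return 0; // nothing to copy; skip the arraycopy call entirely
    }
    System.arraycopy(source, cursor, dest, 0, lengthToCopy);
    return lengthToCopy;
  }
}
```

The guard also covers the case where the requested position sits exactly at the end of the buffered range, in which `availableLengthInBuffer` is 0.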
  LOG.debug("issuing read ahead requestedOffset = {} requested size {}",
      nextOffset, nextSize);
- readBufferManager.queueReadAhead(this, nextOffset, (int) nextSize,
+ getReadBufferManager().queueReadAhead(this, nextOffset, (int) nextSize,
This needs changes in the constructor to return an instance of ReadBufferManagerV2 if it is enabled.
Taken
… into RBMV2_HADOOP-19622
+1 LGTM
-1 overall
This message was automatically generated.
Description of PR
JIRA: https://issues.apache.org/jira/browse/HADOOP-19622
Implementing ReadBufferManagerV2 as per the new design document.
Following capabilities are added to ReadBufferManager:
For more details, please refer to the design document attached to the parent JIRA: https://issues.apache.org/jira/browse/HADOOP-19596
How was this patch tested?
TBA