Conversation

@anujmodi2021
Contributor

Description of PR

JIRA: https://issues.apache.org/jira/browse/HADOOP-19622

Implements ReadBufferManagerV2 as per the new design document.
The following capabilities are added to ReadBufferManager:

  1. Configurable minimum and maximum number of prefetch threads.
  2. Configurable minimum and maximum size of the cached buffer pool.
  3. Dynamic adjustment of the thread pool size and buffer pool size based on workload and resource utilization, within the limits defined by the user.
  4. Mapping of prefetched data to the file's ETag, so that multiple streams reading the same file can share the cache and save TPS.

For more details, please refer to the design doc attached to the parent JIRA: https://issues.apache.org/jira/browse/HADOOP-19596
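Capability 4 above can be illustrated with a minimal sketch. The class and method names here are hypothetical; the snippet only shows the ETag-plus-offset keying idea that lets two streams on the same file version share one cached block, not the patch's actual classes:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: completed prefetch buffers keyed by the file's ETag
// plus block offset, so two streams reading the same file version share one
// cached block instead of each issuing a remote read.
public class EtagKeyedCache {
    private final Map<String, byte[]> completed = new ConcurrentHashMap<>();

    private static String key(String eTag, long offset) {
        return eTag + ":" + offset;
    }

    public void put(String eTag, long offset, byte[] data) {
        completed.put(key(eTag, offset), data);
    }

    // Returns the cached block for this (ETag, offset), or null on a miss.
    public byte[] get(String eTag, long offset) {
        return completed.get(key(eTag, offset));
    }
}
```

A stream whose file has been modified gets a new ETag and therefore a clean miss, which is what makes the ETag a safe cache key.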

How was this patch tested?

TBA

@github-actions github-actions bot added the build label Aug 4, 2025

public static final int DEFAULT_READAHEAD_V2_MAX_BUFFER_POOL_SIZE = -1;
public static final int DEFAULT_READAHEAD_V2_EXECUTOR_SERVICE_TTL_MILLIS = 3_000;
public static final int DEFAULT_READAHEAD_V2_CPU_MONITORING_INTERVAL_MILLIS = 6_000;
public static final int DEFAULT_READAHEAD_V2_THREAD_POOL_UPSCALE_PERCENTAGE = 20;
Contributor

Some variables use "Percentage" and others "Percent"; we should keep this consistent across all places.

Contributor Author

Taken

public static final int DEFAULT_READAHEAD_V2_EXECUTOR_SERVICE_TTL_MILLIS = 3_000;
public static final int DEFAULT_READAHEAD_V2_CPU_MONITORING_INTERVAL_MILLIS = 6_000;
public static final int DEFAULT_READAHEAD_V2_THREAD_POOL_UPSCALE_PERCENTAGE = 20;
public static final int DEFAULT_READAHEAD_V2_THREAD_POOL_DOWNSCALE_PERCENTAGE = 30;
Contributor

Same as above

Contributor Author

Taken

private static int threadPoolUpscalePercentage;
private static int threadPoolDownscalePercentage;
private static int executorServiceKeepAliveTimeInMilliSec;
private static final double THREAD_POOL_REQUIREMENT_BUFFER = 1.2; // 20% more threads than the queue size
Contributor

Is this configurable, or is the number fixed based on POC data? If so, can we explain it a little more for future understanding?

Contributor Author

This is only used while upscaling, to make sure the thread pool has sufficient threads to serve all queued buffers. It only determines whether we need to upscale further: if there are not enough queued requests, we won't upscale even when CPU is well below the threshold.

The new thread pool size is still computed using the configured values only.
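The upscaling gate described in the reply can be sketched as a small function. Only the `THREAD_POOL_REQUIREMENT_BUFFER = 1.2` constant comes from the diff; `shouldUpscale` and its parameters are hypothetical names for illustration:

```java
// Illustrative sketch of the upscaling gate: even when CPU has headroom,
// the pool only grows if queued work (padded by the 20% requirement buffer)
// exceeds current pool capacity.
public class UpscaleGate {
    static final double THREAD_POOL_REQUIREMENT_BUFFER = 1.2; // 20% more threads than queue size

    // Returns true only when queued demand justifies adding threads
    // AND CPU utilization is still below the configured threshold.
    static boolean shouldUpscale(int queuedRequests, int currentPoolSize,
                                 double cpuUtilization, double cpuThreshold) {
        boolean cpuHasHeadroom = cpuUtilization < cpuThreshold;
        boolean demandExceedsCapacity =
                queuedRequests * THREAD_POOL_REQUIREMENT_BUFFER > currentPoolSize;
        return cpuHasHeadroom && demandExceedsCapacity;
    }
}
```

With 2 queued requests against a pool of 8, the gate stays closed regardless of CPU, matching the reply's "we won't upscale even if CPU is much below threshold".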

setReadAheadBlockSize(readAheadBlockSize);
}
private ReadBufferManagerV2() {
printTraceLog("Creating Read Buffer Manager V2 with HADOOP-18546 patch");
Contributor

We should use LOG.trace instead of printTraceLog.

Contributor Author

It is the same thing; just a method call to avoid redundant log-level checks.
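The wrapper the author describes might look roughly like the sketch below. The stub logger state is purely illustrative (the real code presumably delegates to an SLF4J `LOG`); the point is that one helper centralizes the level check so call sites stay one-liners:

```java
// Minimal sketch of a trace-log wrapper: a single guard inside the helper
// replaces repeated isTraceEnabled-style checks at every call site.
// The static "out" buffer stands in for a real SLF4J logger here.
public class TraceLogDemo {
    static boolean traceEnabled = true;          // stand-in for LOG.isTraceEnabled()
    static final StringBuilder out = new StringBuilder();

    static void printTraceLog(String format, Object... args) {
        if (traceEnabled) { // one guard, instead of one per call site
            // Mimic SLF4J-style "{}" placeholders with String.format.
            out.append(String.format(format.replace("{}", "%s"), args));
        }
    }
}
```

SLF4J's parameterized `LOG.trace("… {}", x)` already defers formatting, so a wrapper like this mainly buys brevity when extra work (e.g. building a message) must be skipped at disabled levels.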

TimeUnit.MILLISECONDS);
}

printTraceLog("ReadBufferManagerV2 initialized with {} buffers and {} worker threads",
Contributor

Same as above; please change it wherever you have used it.

Contributor Author

It is an intentional method, created to avoid code redundancy.

return stream;
}

public String getETag() {
Contributor

Can this return null?

Contributor Author

Read buffers are created only during queueing, and the ETag is set there.

// while waiting, so no one will be able to change any state. If this becomes more complex in the future,
// then the latch can be removed and replaced with wait/notify whenever getInProgressList() is touched.
} catch (InterruptedException ex) {
Thread.currentThread().interrupt();
Contributor

In the catch block we're interrupting the thread but not informing the caller. Is this expected?

Contributor Author

It will be surfaced to the caller.
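For context on the exchange above, the standard restore-interrupt pattern can be shown in a self-contained sketch (`awaitQuietly` is a hypothetical name, not from the patch): catching `InterruptedException` clears the thread's interrupt flag, so re-setting it is what lets the caller observe the interruption afterwards.

```java
import java.util.concurrent.CountDownLatch;

// Sketch of the restore-interrupt pattern: on InterruptedException, re-set
// the interrupt flag so callers can detect it via Thread.isInterrupted().
public class InterruptDemo {
    // Returns true if the latch opened normally, false if interrupted.
    static boolean awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
            return true;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt(); // preserve the signal for the caller
            return false;
        }
    }
}
```

If the flag were not restored, the interruption would be silently swallowed; restoring it is the conventional alternative to rethrowing.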

// getInProgressList(). So this latch is safe to be outside the synchronized block.
// Putting it in synchronized would result in a deadlock, since this thread would be holding the lock
// while waiting, so no one will be able to change any state. If this becomes more complex in the future,
// then the latch can be removed and replaced with wait/notify whenever getInProgressList() is touched.
Contributor

Nit: "cane" spelling; should be "can".

Contributor Author

Taken

int cursor = (int) (position - buf.getOffset());
int availableLengthInBuffer = buf.getLength() - cursor;
int lengthToCopy = Math.min(length, availableLengthInBuffer);
System.arraycopy(buf.getBuffer(), cursor, buffer, 0, lengthToCopy);
Contributor

We can skip this for cases where lengthToCopy == 0.

Contributor Author

For length 0 we won't even read.
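The copy arithmetic in the snippet above, extracted into a self-contained sketch (the method and parameter names are illustrative, not the patch's API): position a cursor inside the cached block, then copy at most the bytes remaining after the cursor.

```java
// Sketch of a partial copy out of a cached read-ahead block. "blockOffset"
// and "blockLength" describe the cached block; "position" is the absolute
// file position the caller wants to read from.
public class BufferCopy {
    static int copyFromCache(byte[] cached, long blockOffset, int blockLength,
                             long position, byte[] dest, int length) {
        int cursor = (int) (position - blockOffset);          // where in the block to start
        int availableLengthInBuffer = blockLength - cursor;   // bytes left after the cursor
        int lengthToCopy = Math.min(length, availableLengthInBuffer);
        System.arraycopy(cached, cursor, dest, 0, lengthToCopy);
        return lengthToCopy;                                  // may be less than requested
    }
}
```

Note the clamp: a request that starts near the end of the block is satisfied only up to the block boundary, and the caller is told how many bytes it actually got.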


LOG.debug("issuing read ahead requestedOffset = {} requested size {}",
nextOffset, nextSize);
readBufferManager.queueReadAhead(this, nextOffset, (int) nextSize,
getReadBufferManager().queueReadAhead(this, nextOffset, (int) nextSize,
Contributor

This needs changes in the constructor to return an instance of ReadBufferManagerV2 if enabled.

Contributor Author

Taken
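The shape of the change the reviewer asks for can be sketched as a small factory; the class names and the boolean flag below are placeholders, not the patch's actual API:

```java
// Hedged sketch: a factory that hands back the V2 read buffer manager only
// when its feature flag is enabled, so callers never hard-code either version.
public class ReadBufferManagerFactory {
    interface ReadBufferManagerLike {}
    static class V1Manager implements ReadBufferManagerLike {}
    static class V2Manager implements ReadBufferManagerLike {}

    // Chosen once, at stream construction time, based on configuration.
    static ReadBufferManagerLike getReadBufferManager(boolean v2Enabled) {
        return v2Enabled ? new V2Manager() : new V1Manager();
    }
}
```

Routing all call sites through one accessor like this is what makes the V1/V2 switch a one-line configuration change.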


@bhattmanish98 (Contributor) left a comment

+1 LGTM

@anujmodi2021
Contributor Author

============================================================
HNS-OAuth-DFS

[WARNING] Tests run: 202, Failures: 0, Errors: 0, Skipped: 3
[WARNING] Tests run: 874, Failures: 0, Errors: 0, Skipped: 214
[WARNING] Tests run: 158, Failures: 0, Errors: 0, Skipped: 8
[WARNING] Tests run: 271, Failures: 0, Errors: 0, Skipped: 23

============================================================
HNS-SharedKey-DFS

[WARNING] Tests run: 202, Failures: 0, Errors: 0, Skipped: 4
[WARNING] Tests run: 877, Failures: 0, Errors: 0, Skipped: 166
[WARNING] Tests run: 158, Failures: 0, Errors: 0, Skipped: 8
[WARNING] Tests run: 271, Failures: 0, Errors: 0, Skipped: 10

============================================================
NonHNS-SharedKey-DFS

[WARNING] Tests run: 202, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 716, Failures: 0, Errors: 0, Skipped: 279
[WARNING] Tests run: 158, Failures: 0, Errors: 0, Skipped: 9
[WARNING] Tests run: 271, Failures: 0, Errors: 0, Skipped: 11

============================================================
AppendBlob-HNS-OAuth-DFS

[WARNING] Tests run: 202, Failures: 0, Errors: 0, Skipped: 3
[WARNING] Tests run: 874, Failures: 0, Errors: 0, Skipped: 225
[WARNING] Tests run: 135, Failures: 0, Errors: 0, Skipped: 9
[WARNING] Tests run: 271, Failures: 0, Errors: 0, Skipped: 23

============================================================
NonHNS-SharedKey-Blob

[WARNING] Tests run: 202, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 723, Failures: 0, Errors: 0, Skipped: 137
[WARNING] Tests run: 158, Failures: 0, Errors: 0, Skipped: 3
[WARNING] Tests run: 271, Failures: 0, Errors: 0, Skipped: 11

============================================================
NonHNS-OAuth-DFS

[WARNING] Tests run: 202, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 713, Failures: 0, Errors: 0, Skipped: 281
[WARNING] Tests run: 158, Failures: 0, Errors: 0, Skipped: 9
[WARNING] Tests run: 271, Failures: 0, Errors: 0, Skipped: 24

============================================================
NonHNS-OAuth-Blob

[WARNING] Tests run: 202, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 720, Failures: 0, Errors: 0, Skipped: 149
[WARNING] Tests run: 158, Failures: 0, Errors: 0, Skipped: 3
[WARNING] Tests run: 271, Failures: 0, Errors: 0, Skipped: 24

============================================================
AppendBlob-NonHNS-OAuth-Blob

[WARNING] Tests run: 202, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 715, Failures: 0, Errors: 0, Skipped: 195
[WARNING] Tests run: 135, Failures: 0, Errors: 0, Skipped: 4
[WARNING] Tests run: 271, Failures: 0, Errors: 0, Skipped: 24

============================================================
HNS-Oauth-DFS-IngressBlob

[WARNING] Tests run: 202, Failures: 0, Errors: 0, Skipped: 3
[WARNING] Tests run: 748, Failures: 0, Errors: 0, Skipped: 223
[WARNING] Tests run: 158, Failures: 0, Errors: 0, Skipped: 8
[WARNING] Tests run: 271, Failures: 0, Errors: 0, Skipped: 23

============================================================
NonHNS-OAuth-DFS-IngressBlob

[WARNING] Tests run: 202, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 713, Failures: 0, Errors: 0, Skipped: 278
[WARNING] Tests run: 158, Failures: 0, Errors: 0, Skipped: 9
[WARNING] Tests run: 271, Failures: 0, Errors: 0, Skipped: 24

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 21s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 10 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 21m 24s trunk passed
+1 💚 compile 0m 25s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 23s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 checkstyle 0m 18s trunk passed
+1 💚 mvnsite 0m 28s trunk passed
+1 💚 javadoc 0m 23s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 23s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
-1 ❌ spotbugs 0m 45s /branch-spotbugs-hadoop-tools_hadoop-azure-warnings.html hadoop-tools/hadoop-azure in trunk has 178 extant spotbugs warnings.
+1 💚 shadedclient 14m 11s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 14m 23s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 20s the patch passed
+1 💚 compile 0m 18s the patch passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 18s the patch passed
+1 💚 compile 0m 20s the patch passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 20s the patch passed
+1 💚 blanks 0m 1s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 11s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 1 new + 5 unchanged - 9 fixed = 6 total (was 14)
+1 💚 mvnsite 0m 21s the patch passed
-1 ❌ javadoc 0m 17s /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-azure-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 generated 54 new + 1464 unchanged - 0 fixed = 1518 total (was 1464)
-1 ❌ javadoc 0m 16s /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-azure-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 generated 22 new + 1390 unchanged - 0 fixed = 1412 total (was 1390)
-1 ❌ spotbugs 0m 46s /new-spotbugs-hadoop-tools_hadoop-azure.html hadoop-tools/hadoop-azure generated 9 new + 168 unchanged - 10 fixed = 177 total (was 178)
+1 💚 shadedclient 14m 19s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 17s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 21s The patch does not generate ASF License warnings.
59m 45s
Reason Tests
SpotBugs module:hadoop-tools/hadoop-azure
Unknown bug pattern AT_NONATOMIC_64BIT_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setOffset(long) At ReadBuffer.java:org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setOffset(long) At ReadBuffer.java:[line 91]
Unknown bug pattern AT_NONATOMIC_64BIT_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setTimeStamp(long) At ReadBuffer.java:org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setTimeStamp(long) At ReadBuffer.java:[line 172]
Unknown bug pattern AT_STALE_THREAD_WRITE_OF_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setAnyByteConsumed(boolean) At ReadBuffer.java:org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setAnyByteConsumed(boolean) At ReadBuffer.java:[line 196]
Unknown bug pattern AT_STALE_THREAD_WRITE_OF_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setBufferindex(int) At ReadBuffer.java:org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setBufferindex(int) At ReadBuffer.java:[line 123]
Unknown bug pattern AT_STALE_THREAD_WRITE_OF_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setFirstByteConsumed(boolean) At ReadBuffer.java:org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setFirstByteConsumed(boolean) At ReadBuffer.java:[line 180]
Unknown bug pattern AT_STALE_THREAD_WRITE_OF_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setLastByteConsumed(boolean) At ReadBuffer.java:org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setLastByteConsumed(boolean) At ReadBuffer.java:[line 188]
Unknown bug pattern AT_STALE_THREAD_WRITE_OF_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setLength(int) At ReadBuffer.java:org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setLength(int) At ReadBuffer.java:[line 99]
Unknown bug pattern AT_STALE_THREAD_WRITE_OF_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setRequestedLength(int) At ReadBuffer.java:org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setRequestedLength(int) At ReadBuffer.java:[line 107]
Unknown bug pattern AT_STALE_THREAD_WRITE_OF_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setStatus(ReadBufferStatus) At ReadBuffer.java:org.apache.hadoop.fs.azurebfs.services.ReadBuffer.setStatus(ReadBufferStatus) At ReadBuffer.java:[line 141]
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7832/30/artifact/out/Dockerfile
GITHUB PR #7832
JIRA Issue HADOOP-19622
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux c80b7679a995 5.15.0-153-generic #163-Ubuntu SMP Thu Aug 7 16:37:18 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / c7f986e
Default Java Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7832/30/testReport/
Max. process+thread count 636 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7832/30/console
versions git=2.25.1 maven=3.9.11 spotbugs=4.9.7
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

5 participants