[improvement] Optimize Poller #69

Merged
zhongkechen merged 17 commits into main from poller on Feb 13, 2026


@zhongkechen zhongkechen commented Feb 10, 2026

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Issue Link, if available

#50

Description

  • Added two parameters to DurableConfig: PollingInterval (default: 1s) and CheckpointDelay (default: 0)
  • Added a helper class, ApiRequestBatcher, that calls the API with items in batches
  • Added a method, pollForUpdate, to CheckpointBatcher that lets callers poll the backend for updates to specific operations
  • Changed how updates are polled. No thread is created when an operation polls for updates. Instead, a future is registered in CheckpointBatcher and completes when an update for the operation is received.
  • Changed the thread requirements of the internal thread pool. A single background thread polls for updates for all operations.
  • Replaced pollUntilReady with pollForOperationUpdates
  • Removed "DAR" from code
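
The future-based polling described in the list above can be illustrated with a minimal sketch. This is not the SDK's CheckpointBatcher: the class SharedPoller, the String status values, and the fetchUpdates supplier (standing in for the backend call) are all hypothetical names chosen for illustration. It only shows the general shape of the idea, where operations register a future and one scheduled thread completes it.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch, not the SDK's CheckpointBatcher: one background thread
// polls a backend and completes futures registered by waiting operations.
class SharedPoller implements AutoCloseable {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    // fetchUpdates is an assumed stand-in for the backend call; it returns operationId -> status.
    SharedPoller(Duration pollingInterval, Supplier<Map<String, String>> fetchUpdates) {
        scheduler.scheduleWithFixedDelay(
                () -> {
                    if (pending.isEmpty()) {
                        return; // nothing is waiting, so skip the backend call
                    }
                    try {
                        fetchUpdates.get().forEach(this::complete);
                    } catch (RuntimeException e) {
                        // Swallow the failure so the scheduled task keeps running.
                    }
                },
                0, pollingInterval.toMillis(), TimeUnit.MILLISECONDS);
    }

    // Registers interest in an operation without creating a thread for it.
    CompletableFuture<String> pollForUpdate(String operationId) {
        return pending.computeIfAbsent(operationId, id -> new CompletableFuture<>());
    }

    private void complete(String operationId, String status) {
        CompletableFuture<String> future = pending.remove(operationId);
        if (future != null) {
            future.complete(status);
        }
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        try (SharedPoller poller =
                new SharedPoller(Duration.ofMillis(50), () -> Map.of("wait-1", "SUCCEEDED"))) {
            System.out.println(poller.pollForUpdate("wait-1").join());
        }
    }
}
```

In this sketch the number of threads stays at one regardless of how many operations are waiting, which is the property the description attributes to the new design.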

Demo/Screenshots

[INFO] AWS Lambda Durable Execution SDK Parent ............ SUCCESS [  0.420 s]
[INFO] AWS Lambda Durable Execution SDK for Java .......... SUCCESS [ 43.037 s]
[INFO] AWS Lambda Durable Execution SDK Testing Utilities . SUCCESS [ 19.164 s]
[INFO] AWS Lambda Durable Execution SDK Integration Tests . SUCCESS [01:27 min]
[INFO] AWS Lambda Durable Execution SDK Examples .......... SUCCESS [01:13 min]

[INFO] Results:
[INFO] 
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0

Checklist

  • I have filled out every section of the PR template
  • I have thoroughly tested this change

Testing

Unit Tests

Have unit tests been written for these changes? yes

Integration Tests

Have integration tests been written for these changes? N/A

Examples

Has a new example been added for the change? (if applicable) will follow up with an example of using customized polling interval config/checkpoint config

@zhongkechen zhongkechen self-assigned this Feb 10, 2026
@zhongkechen zhongkechen changed the title from "[refactor] Optimize Poller" to "[refactor] Optimize Poller [WIP]" Feb 10, 2026
private final AtomicBoolean isProcessing = new AtomicBoolean(false);
private final AtomicBoolean isRunning = new AtomicBoolean(true);
private final Duration pollingInterval;
private final Map<String, List<CompletableFuture<Operation>>> pollingFutures = new ConcurrentHashMap<>();
Contributor Author


Not too sure if one operation can poll more than once, so I'm using a list here to cover that case.
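
A minimal sketch of how a map of lists can support several pending futures per operation, assuming the update is represented as a plain String. The class PollingFutures and its methods register and completeAll are hypothetical and not taken from the PR. The sketch uses compute so that adding a future cannot race with the removal of its list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry: several futures may wait on the same operation id.
class PollingFutures {
    private final Map<String, List<CompletableFuture<String>>> futures = new ConcurrentHashMap<>();

    // Adds a new future for the operation; compute() makes the list update atomic per key.
    CompletableFuture<String> register(String operationId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        futures.compute(operationId, (id, list) -> {
            if (list == null) {
                list = new ArrayList<>();
            }
            list.add(future);
            return list;
        });
        return future;
    }

    // Completes every future waiting on the operation and returns how many were completed.
    int completeAll(String operationId, String update) {
        List<CompletableFuture<String>> waiting = futures.remove(operationId);
        if (waiting == null) {
            return 0;
        }
        waiting.forEach(f -> f.complete(update));
        return waiting.size();
    }

    public static void main(String[] args) {
        PollingFutures registry = new PollingFutures();
        CompletableFuture<String> first = registry.register("op-1");
        CompletableFuture<String> second = registry.register("op-1");
        registry.completeAll("op-1", "SUCCEEDED");
        System.out.println(first.join() + " " + second.join());
    }
}
```

If an operation turns out to poll at most once, the list could be replaced by a single future per key without changing the callers of register.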

@zhongkechen zhongkechen changed the title from "[refactor] Optimize Poller [WIP]" to "[refactor] Optimize Poller" Feb 11, 2026
@zhongkechen zhongkechen marked this pull request as ready for review February 11, 2026 06:40
@zhongkechen zhongkechen changed the title from "[refactor] Optimize Poller" to "[improvement] Optimize Poller" Feb 11, 2026
*/
class CheckpointBatcher {
private static final int MAX_BATCH_SIZE_BYTES = 750 * 1024; // 750KB
private static final int MAX_ITEM_COUNT = 100; // max updates in one batch
Contributor


This is pretty conservative; I think Python and JS are using 250. That said, it is just a number we picked:

aws/aws-durable-execution-sdk-js#427
aws/aws-durable-execution-sdk-python#281
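
To make the effect of the two limits concrete, here is a rough sketch of splitting a queue of items into batches bounded by both an item count and an estimated byte size. It is not the PR's ApiRequestBatcher; BatchSplitter and its split method are hypothetical, and the behavior for a single oversized item (sent alone in its own batch) is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper: cuts a list into batches limited by item count and estimated size.
class BatchSplitter {
    static <T> List<List<T>> split(
            List<T> items, int maxItemCount, int maxBatchBytes, ToIntFunction<T> estimateSize) {
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>();
        int currentBytes = 0;
        for (T item : items) {
            int size = estimateSize.applyAsInt(item);
            boolean full = current.size() >= maxItemCount || currentBytes + size > maxBatchBytes;
            // Start a new batch when adding this item would exceed either limit.
            if (full && !current.isEmpty()) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            // Assumption: an item larger than maxBatchBytes still goes out, alone in its batch.
            current.add(item);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // Limits are tiny here for readability; the PR uses 100 items and 750 * 1024 bytes.
        System.out.println(split(List.of("aa", "bbb", "c", "dddd"), 2, 4, String::length));
    }
}
```

With a larger item limit, the byte limit becomes the constraint that triggers a flush more often, so the choice between 100 and 250 mostly matters for workloads with many small updates.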

MAX_ITEM_COUNT, MAX_BATCH_SIZE_BYTES, CheckpointBatcher::estimateSize, this::checkpointBatch);
}

/** Queues a checkpoint request for batched execution */
Contributor


what's a "batched execution"?

Contributor Author


Good catch.

This sounds better:

Queues an operation update for batched checkpoint

// The polling will complete the phaser when the backend reports SUCCEEDED
Instant firstPoll = Instant.now().plus(remainingWaitTime).plusMillis(25);
pollForOperationUpdates(firstPoll, Duration.ofMillis(200));
pollForOperationUpdates(remainingWaitTime);
Contributor


I don't get it, why do we need to poll for updates?

So basically, we need to "poll" CheckpointDurableExecution because we're doing this checkpoint operation batching? Wouldn't the future on the operation complete and allow the code to keep executing?

Contributor Author


If the execution doesn't get suspended (for example, some async steps are still running), the SDK needs to poll for updates to operations, as in this case for Wait. pollForOperationUpdates here just registers a poller so that the CheckpointBatcher keeps polling until we receive an update for this operation.

@zhongkechen zhongkechen merged commit c4f440f into main Feb 13, 2026
9 of 13 checks passed
@zhongkechen zhongkechen deleted the poller branch February 13, 2026 23:03