sdk/src/main/java/com/amazonaws/lambda/durable/execution/CheckpointBatcher.java
| private final AtomicBoolean isProcessing = new AtomicBoolean(false); | ||
| private final AtomicBoolean isRunning = new AtomicBoolean(true); | ||
| private final Duration pollingInterval; | ||
| private final Map<String, List<CompletableFuture<Operation>>> pollingFutures = new ConcurrentHashMap<>(); |
I'm not sure whether one operation can poll more than once, so I'm using a list here to cover that case.
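For illustration, here is a minimal sketch of how a map of per-operation future lists could be kept consistent when several callers wait on the same operation. It is not the PR's code: PollingRegistry, register, and complete are hypothetical names, and String stands in for the SDK's Operation type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the pollingFutures field; String replaces the SDK's Operation type.
class PollingRegistry {
    private final Map<String, List<CompletableFuture<String>>> pollingFutures = new ConcurrentHashMap<>();

    /** Registers one more waiter for the given operation id. */
    CompletableFuture<String> register(String operationId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        // compute() runs atomically per key, so a concurrent complete() either sees this waiter or none of the list.
        pollingFutures.compute(operationId, (id, waiters) -> {
            List<CompletableFuture<String>> list = (waiters == null) ? new ArrayList<>() : waiters;
            list.add(future);
            return list;
        });
        return future;
    }

    /** Completes and removes every waiter for the operation; returns how many were completed. */
    int complete(String operationId, String update) {
        List<CompletableFuture<String>> waiters = pollingFutures.remove(operationId);
        if (waiters == null) {
            return 0;
        }
        waiters.forEach(f -> f.complete(update));
        return waiters.size();
    }

    /** True while at least one operation still has a registered waiter. */
    boolean hasWaiters() {
        return !pollingFutures.isEmpty();
    }
}
```

With this shape, a second poll on the same operation simply appends to the existing list, and one backend update completes all of them.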
sdk/src/main/java/com/amazonaws/lambda/durable/operation/BaseDurableOperation.java
| */ | ||
| class CheckpointBatcher { | ||
| private static final int MAX_BATCH_SIZE_BYTES = 750 * 1024; // 750KB | ||
| private static final int MAX_ITEM_COUNT = 100; // max updates in one batch |
This is pretty conservative; I think Python and JS are using 250. That said, it's just a number we picked:
aws/aws-durable-execution-sdk-js#427
aws/aws-durable-execution-sdk-python#281
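To make the effect of the two limits concrete, here is a rough sketch of splitting queued updates by item count and estimated size. It is an assumption about how such limits are typically applied, not the batcher in this PR: BatchSplitter and split are hypothetical names, and the size estimator is passed in by the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper; the PR's real batcher is not shown in this thread.
class BatchSplitter {
    /**
     * Splits items into batches that respect both limits.
     * A single item larger than maxBatchSizeBytes still gets a batch of its own.
     */
    static <T> List<List<T>> split(
            List<T> items, int maxItemCount, int maxBatchSizeBytes, ToIntFunction<T> estimateSize) {
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>();
        int currentBytes = 0;
        for (T item : items) {
            int size = estimateSize.applyAsInt(item);
            boolean full = current.size() >= maxItemCount || currentBytes + size > maxBatchSizeBytes;
            // Flush the current batch before adding an item that would break either limit.
            if (full && !current.isEmpty()) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(item);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

With a shape like this, raising the item limit from 100 to 250 only changes how many flushes happen for small updates; the byte limit still caps batches of large ones.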
| MAX_ITEM_COUNT, MAX_BATCH_SIZE_BYTES, CheckpointBatcher::estimateSize, this::checkpointBatch); | ||
| } | ||
| /** Queues a checkpoint request for batched execution */ |
What's a "batched execution"?
Good catch.
This sounds better:
Queues an operation update for batched checkpoint
| // The polling will complete the phaser when the backend reports SUCCEEDED | ||
| Instant firstPoll = Instant.now().plus(remainingWaitTime).plusMillis(25); | ||
| pollForOperationUpdates(firstPoll, Duration.ofMillis(200)); | ||
| pollForOperationUpdates(remainingWaitTime); |
I don't get it, why do we need to poll for updates?
So basically, we need to "poll" CheckpointDurableExecution because we're doing this checkpoint operation batching? Wouldn't the future on the operation complete and allow the code to keep executing?
If the execution doesn't get suspended (for example, some async steps are still running), the SDK needs to poll for updates to operations, as in this case for Wait. pollForOperationUpdates here just registers a poller so that the CheckpointBatcher keeps polling until we receive an update for this operation.
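As a rough illustration of that idea, the sketch below polls a status source at a fixed interval and completes a future once the status is SUCCEEDED. It is a standalone simplification, not the CheckpointBatcher logic: OperationPoller, pollUntilSucceeded, and the fetchStatus supplier are hypothetical, and the backend call is replaced by a plain Supplier.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical poller; fetchStatus stands in for a call to the backend.
class OperationPoller {
    /** Polls fetchStatus every interval until it returns "SUCCEEDED", then completes the future. */
    static CompletableFuture<String> pollUntilSucceeded(
            ScheduledExecutorService scheduler, Supplier<String> fetchStatus, Duration interval) {
        CompletableFuture<String> result = new CompletableFuture<>();
        scheduleNext(scheduler, fetchStatus, interval, result);
        return result;
    }

    private static void scheduleNext(
            ScheduledExecutorService scheduler,
            Supplier<String> fetchStatus,
            Duration interval,
            CompletableFuture<String> result) {
        scheduler.schedule(() -> {
            try {
                String status = fetchStatus.get();
                if ("SUCCEEDED".equals(status)) {
                    result.complete(status);
                } else {
                    // Not done yet: schedule another poll after the same interval.
                    scheduleNext(scheduler, fetchStatus, interval, result);
                }
            } catch (RuntimeException e) {
                // Surface backend failures to the waiter instead of polling forever.
                result.completeExceptionally(e);
            }
        }, interval.toMillis(), TimeUnit.MILLISECONDS);
    }
}
```

The caller only holds the returned future, so the rest of the execution can keep running async steps while the poller waits for the update.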
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
Issue Link, if available
#50
Description
- Add PollingInterval (default: 1s) and CheckpointDelay (default: 0)
- Add pollForUpdate to CheckpointBatcher that allows operations to poll updates from backend for specific operations.
- Replace pollUntilReady with pollForOperationUpdates
Demo/Screenshots
Checklist
Testing
Unit Tests
Have unit tests been written for these changes? yes
Integration Tests
Have integration tests been written for these changes? N/A
Examples
Has a new example been added for the change? (if applicable) will follow up with an example of using customized polling interval config/checkpoint config