
tool_operate: keep failed partial download for retry auto-resume #15333


Draft: wants to merge 6 commits into master from better_auto_resume
Conversation

@jay jay (Member) commented Oct 18, 2024

  • Keep data from a failed download instead of discarding it on retry in some limited cases when we know it's ok (currently only HTTP 200/206).

Prior to this change, on a failed transfer the tool truncated any outfile data written before retrying the transfer. This change adds an exception for HTTP downloads when the user has requested auto-resume, because in that case we can keep the outfile data and resume from the new position.

Closes #xxxx
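
For illustration only, here is a minimal sketch of the idea using libcurl directly; this is not the PR's code, and the URL, filename, and retry count are placeholders. The point is the shape: open the outfile in append mode so earlier data survives a failed attempt, and resume each retry from the current file size, as curl's -C - (auto-resume) does.

#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  FILE *out = fopen("download.bin", "ab"); /* append: keep partial data */
  CURLcode rc = CURLE_RECV_ERROR;
  int attempts = 3;

  if(!curl || !out)
    return 1;

  curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/file");
  curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);

  while(attempts-- && rc != CURLE_OK) {
    fflush(out);
    fseek(out, 0, SEEK_END);  /* measure what is on disk so far */
    /* resume the transfer from whatever is already on disk */
    curl_easy_setopt(curl, CURLOPT_RESUME_FROM_LARGE,
                     (curl_off_t)ftell(out));
    rc = curl_easy_perform(curl);
  }

  fclose(out);
  curl_easy_cleanup(curl);
  return rc == CURLE_OK ? 0 : 1;
}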


I tested this briefly last night and it seems to work as expected. (I've since added some more conditionals..)

@dfandrich (Contributor)

Analysis of PR #15333 at 5d9ddcda:

Test 498 failed, which has NOT been flaky recently, so there could be a real issue in this PR.

Generated by Testclutch

outs->bytes = 0;
config->resume_from = outs->init;
curl_easy_setopt(curl, CURLOPT_RESUME_FROM_LARGE,
                 config->resume_from);
Member

Isn't there a risk that this option gets set in one retry round because all the conditions are met, but then the next attempt fails again, and in a second retry round the conditions are different so this line is not executed?

If so, the CURLOPT_RESUME_FROM_LARGE value set in the previous round will remain set, which then probably is wrong?

Member Author

If so, the CURLOPT_RESUME_FROM_LARGE value set in the previous round will remain set, which then probably is wrong?

I updated outs->init and config->resume_from, so if your scenario happens it will truncate from the updated resume-from position, which is still correct. The file is truncated to outs->init bytes. Truncation can still happen if the conditions are not met.
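
To make that concrete, here is a small self-contained model of the two retry outcomes being described; the struct and function names are invented for the example, and only init/bytes mirror the fields in the snippet above.

#include <stdio.h>

struct outfile {
  long init;   /* file size when the retry round started (baseline) */
  long bytes;  /* bytes written by the failed attempt */
};

/* model of one retry round: either keep the data and advance the
   baseline, or conceptually truncate the file back to the baseline */
static long retry_round(struct outfile *outs, int keep_data)
{
  long resume_from;
  if(keep_data) {
    outs->init += outs->bytes;  /* new, larger baseline */
    resume_from = outs->init;   /* CURLOPT_RESUME_FROM_LARGE value */
  }
  else {
    /* truncate the file back to outs->init bytes (not shown) */
    resume_from = outs->init;
  }
  outs->bytes = 0;
  return resume_from;
}

int main(void)
{
  struct outfile outs = { 0, 500 };
  printf("round 1 (keep): resume from %ld\n", retry_round(&outs, 1));
  outs.bytes = 200;
  printf("round 2 (truncate): cut back to %ld\n", retry_round(&outs, 0));
  return 0;
}

Because the baseline only advances in rounds where the data was kept, a later round that does truncate still cuts at a correct offset.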

Member Author

@bagder I added a test and made it cover what I think is your scenario. The test takes approximately 5 seconds to run because it has to make several retry requests to cover different scenarios, and the minimum delay for each retry is 1 second, unless there is some way around that I'm not thinking of.
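
For reference, the tool-level behavior under discussion is the combination of --retry with auto-resume; an assumed invocation (the new test's exact options are not shown here) would look like:

curl --retry 5 --continue-at - --output file.bin https://example.com/file

With --continue-at - (-C -), curl derives the resume offset from the existing output file, which is the position this change preserves across retries.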

jay added 2 commits January 26, 2025 03:03
- Keep data from a failed download instead of discarding it on retry in
  some limited cases when we know it's ok (currently only HTTP 200/206).

Prior to this change, on a failed transfer the tool truncated any outfile
data written before retrying the transfer. This change adds an exception
for HTTP downloads when the user has requested auto-resume, because in
that case we can keep the outfile data and resume from the new position.

Closes #xxxx
@jay jay force-pushed the better_auto_resume branch from 5d9ddcd to cf1f3e5 on January 26, 2025 08:05
@github-actions github-actions bot added the tests label Jan 26, 2025
jay added 4 commits January 26, 2025 03:17
If the requested range is open-ended then a server usually replies with
the rest of the content. Note however that the response I changed is
intentionally set to return less than the content length, to cause a retry.