
tool_operate: keep failed partial download for retry auto-resume #15333


Draft: wants to merge 6 commits into master from better_auto_resume
Conversation

@jay jay (Member) commented Oct 18, 2024

  • Keep data from a failed download instead of discarding it on retry in some limited cases when we know it's ok (currently only HTTP 200/206).

Prior to this change, on a failed transfer the tool truncated any outfile data written before retrying the transfer. This change adds an exception for HTTP downloads when the user has requested auto-resume, because in that case we can keep the outfile data and resume from the new position.

Closes #xxxx
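
For illustration only, here is a minimal sketch of the idea using libcurl directly; this is not the PR's code, and the URL, filename, and retry count are placeholders. The point is the shape: open the outfile in append mode so earlier data survives a failed attempt, and resume each retry from the current file size, as curl's -C - (auto-resume) does.

#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  FILE *out = fopen("download.bin", "ab"); /* append: keep partial data */
  CURLcode rc = CURLE_RECV_ERROR;
  int attempts = 3;

  if(!curl || !out)
    return 1;

  curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/file");
  curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);

  while(attempts-- && rc != CURLE_OK) {
    fflush(out);
    fseek(out, 0, SEEK_END);  /* measure what is on disk so far */
    /* resume the transfer from whatever is already on disk */
    curl_easy_setopt(curl, CURLOPT_RESUME_FROM_LARGE,
                     (curl_off_t)ftell(out));
    rc = curl_easy_perform(curl);
  }

  fclose(out);
  curl_easy_cleanup(curl);
  return rc == CURLE_OK ? 0 : 1;
}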


I tested this briefly last night and it seems to work as expected. (I've since added some more conditionals..)

@dfandrich (Contributor)

Analysis of PR #15333 at 5d9ddcda:

Test 498 failed, which has NOT been flaky recently, so there could be a real issue in this PR.

Generated by Testclutch

outs->bytes = 0;
config->resume_from = outs->init;
curl_easy_setopt(curl, CURLOPT_RESUME_FROM_LARGE,
                 config->resume_from);
Member

Isn't there a risk that this option gets set in one retry round because all the conditions are met, but then the next attempt fails again, and in a second retry round the conditions are different so this line is not executed?

If so, the CURLOPT_RESUME_FROM_LARGE value set in the previous round will remain set, which then probably is wrong?

Member Author

If so, the CURLOPT_RESUME_FROM_LARGE value set in the previous round will remain set, which then probably is wrong?

I updated outs->init and config->resume_from, so if your scenario happens it will truncate from the updated resume-from position, which is still correct. The file is truncated to outs->init bytes. Truncation can still happen if the conditions are not met.
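
To make that concrete, here is a small self-contained model of the two retry outcomes being described; the struct and function names are invented for the example, and only init/bytes mirror the fields in the snippet above.

#include <stdio.h>

struct outfile {
  long init;   /* file size when the retry round started (baseline) */
  long bytes;  /* bytes written by the failed attempt */
};

/* model of one retry round: either keep the data and advance the
   baseline, or conceptually truncate the file back to the baseline */
static long retry_round(struct outfile *outs, int keep_data)
{
  long resume_from;
  if(keep_data) {
    outs->init += outs->bytes;  /* new, larger baseline */
    resume_from = outs->init;   /* CURLOPT_RESUME_FROM_LARGE value */
  }
  else {
    /* truncate the file back to outs->init bytes (not shown) */
    resume_from = outs->init;
  }
  outs->bytes = 0;
  return resume_from;
}

int main(void)
{
  struct outfile outs = { 0, 500 };
  printf("round 1 (keep): resume from %ld\n", retry_round(&outs, 1));
  outs.bytes = 200;
  printf("round 2 (truncate): cut back to %ld\n", retry_round(&outs, 0));
  return 0;
}

Because the baseline only advances in rounds where the data was kept, a later round that does truncate still cuts at a correct offset.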

Member Author

@bagder I added a test and made it cover what I think is your scenario. The test takes approximately 5 seconds to run because it has to make several retry requests to cover different scenarios, and the minimum delay for each retry is 1 second, unless there is some way around that I'm not thinking of.
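
For reference, the tool-level behavior under discussion is the combination of --retry with auto-resume; an assumed invocation (the new test's exact options are not shown here) would look like:

curl --retry 5 --continue-at - --output file.bin https://example.com/file

With --continue-at - (-C -), curl derives the resume offset from the existing output file, which is the position this change preserves across retries.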

jay added 2 commits January 26, 2025 03:03
- Keep data from a failed download instead of discarding it on retry in
  some limited cases when we know it's ok (currently only HTTP 200/206).

Prior to this change, on a failed transfer the tool truncated any outfile
data written before retrying the transfer. This change adds an exception
for HTTP downloads when the user has requested auto-resume, because in
that case we can keep the outfile data and resume from the new position.

Closes #xxxx
@jay jay force-pushed the better_auto_resume branch from 5d9ddcd to cf1f3e5 on January 26, 2025 08:05
@github-actions github-actions bot added the tests label Jan 26, 2025
jay added 4 commits January 26, 2025 03:17
If the requested range is open-ended then a server usually replies with
the rest of the content. Note however that the response I changed is
intentionally set to return less than the content length, to cause a retry.