feat(services/onedrive): implement additional OneDrive features #5784
if let Some(version) = args.version() {
    let versions = self.core.onedrive_list_versions(path).await?;
    for item_version in versions {
        if item_version.id == version {
            meta.set_version(version);
            return Ok(RpStat::new(meta));
        }
    }

    return Err(Error::new(ErrorKind::NotFound, "item version not found"));
This is a bit sad regarding this API.
I can imagine that list_with(path).versions(true) and stat_with(item).versions(abc) could be very slow.
I noticed that we're using id as the version ID. How about supporting onedrive_stat_by_id?
OneDrive supports two call conventions:
GET /me/drive/items/{item-id}
GET /me/drive/root:/{item-path}
If we already have the id, we can use it to make a direct call.
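The two call conventions can be sketched as a small URL builder. This is an illustration only: `ItemRef` and `item_url` are hypothetical names, not part of the OneDrive service code, and the sketch skips percent-encoding of the path.

```rust
/// Base URL of the Microsoft Graph v1.0 endpoint.
const BASE: &str = "https://graph.microsoft.com/v1.0";

/// Hypothetical helper type: an item addressed either by id or by path.
enum ItemRef<'a> {
    Id(&'a str),
    Path(&'a str),
}

/// Build the request URL for either call convention.
/// Assumption: the path needs no percent-encoding in this sketch.
fn item_url(item: &ItemRef) -> String {
    match item {
        ItemRef::Id(id) => format!("{}/me/drive/items/{}", BASE, id),
        ItemRef::Path(path) => {
            format!("{}/me/drive/root:/{}", BASE, path.trim_start_matches('/'))
        }
    }
}

fn main() {
    println!("{}", item_url(&ItemRef::Id("ABC123")));
    println!("{}", item_url(&ItemRef::Path("/test.json")));
}
```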
The item ID is not the same as the version ID, unfortunately. Using versions is indeed going to generate lots of API calls. I agree with you that this implementation could be better. I'll work on the get version API. I'll also look into whether I can take advantage of the item ID.
> The item ID is not the same as the version ID
Sorry, I didn't take a deep look at the OneDrive API. Is the id not unique across the entire OneDrive? Or does OneDrive have another concept that is similar to version_id?
No worries, I'll explain. The id for a file is unique.
> I noticed that we're using id as the version ID.
The version ID belongs to a file.
GET /me/drive/items/{item-id}/versions
GET /me/drive/root:/{item-path}/versions
An example response is:
{
"value": [
{
"id": "2.0",
"lastModifiedDateTime": "2025-03-16T17:02:49Z",
"size": 74758
},
{
"id": "1.0",
"lastModifiedDateTime": "2025-03-12T21:59:54Z",
"size": 74756
}
]
}
This is a sub-resource of a file. I didn't find out until I tried
https://graph.microsoft.com/v1.0/me/drive/root:/test.json?$expand=versions. This works as a stat_with_version. I am going to implement version support based on this query parameter.
Listing items with versions is not supported. GET https://graph.microsoft.com/v1.0/me/drive/root/children?$expand=versions returns 400 and a message saying "Operation not supported".
In summary:
list_with(path).versions(true): no API support except N+1 HTTP calls
stat_with(item).versions(abc): one API call
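A minimal sketch of that summary, using only the standard library. The struct mirrors the example response above. The function names are hypothetical and do not come from the PR, and JSON decoding is left out.

```rust
const BASE: &str = "https://graph.microsoft.com/v1.0";

/// One entry of the `versions` sub-resource, mirroring the example response.
struct ItemVersion {
    id: String,
    size: u64,
}

/// URL for a stat that returns the item and its versions in a single request.
fn stat_with_versions_url(path: &str) -> String {
    format!(
        "{}/me/drive/root:/{}?$expand=versions",
        BASE,
        path.trim_start_matches('/')
    )
}

/// Pick the requested version out of the expanded list, if present.
fn find_version<'a>(versions: &'a [ItemVersion], wanted: &str) -> Option<&'a ItemVersion> {
    versions.iter().find(|v| v.id == wanted)
}

/// Requests needed to list a directory with versions:
/// one for the children plus one versions call per child (N+1).
fn list_with_versions_requests(children: usize) -> usize {
    1 + children
}

fn main() {
    let versions = vec![
        ItemVersion { id: "2.0".to_string(), size: 74758 },
        ItemVersion { id: "1.0".to_string(), size: 74756 },
    ];
    println!("{}", stat_with_versions_url("/test.json"));
    if let Some(v) = find_version(&versions, "1.0") {
        println!("version {} has {} bytes", v.id, v.size);
    }
    println!("listing 100 children needs {} requests", list_with_versions_requests(100));
}
```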
Oh, onedrive...
> The version ID belongs to a file.
OK, I got it. So the version ID can't be used as an item-id.
> list_with(path).versions(true): no API support except N+1 HTTP calls
It seems that's the best we can do with the given APIs.
> stat_with(item).versions(abc): one API call
Happy to know.
Indeed, it's a bit disappointing. Maybe OneDrive will support listing with versions a few years from now.
// Read more at https://support.microsoft.com/en-us/office/restrictions-and-limitations-in-onedrive-and-sharepoint-64883a5d-228e-48f5-b3d2-eb39e07630fa#individualfilesize
// However, we can't enable this, otherwise OpenDAL behavior tests will try to test creating huge
// file up to this size.
// write_total_max_size: Some(250 * 1024 * 1024 * 1024),
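For reference, the commented-out limit works out as follows. The constant names are hypothetical. The value does not fit in 32 bits, so this sketch uses `u64`.

```rust
/// Bytes in one GiB.
const GIB: u64 = 1024 * 1024 * 1024;

/// The 250 GiB limit from the commented-out line above, in bytes.
const ONEDRIVE_MAX_FILE_SIZE: u64 = 250 * GIB;

fn main() {
    println!("{} bytes", ONEDRIVE_MAX_FILE_SIZE);
    // The limit exceeds what a 32-bit integer can hold.
    println!("exceeds u32::MAX: {}", ONEDRIVE_MAX_FILE_SIZE > u32::MAX as u64);
}
```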
Would you like to submit an issue for this? I remember that S3 also sets this.
Ok((
    RpWrite::default(),
    oio::OneShotWriter::new(OneDriveWriter::new(self.core.clone(), args, path)),
    oio::OneShotWriter::new(OneDriveWriter::new(
Since we already support chunked writing, we could implement something like MultipartUploadWrite instead of OneShotWrite, which only allows a single write.
Ok, a good time to work on this.
This PR is large enough; it's fine to split that out into another PR.
Ok. I will fix your other comments this week and ping you. I'm tied up at the moment.
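To illustrate what a chunked writer would send instead of a single one-shot body, here is a rough sketch that splits an upload into `Content-Range` values. It is not the PR's implementation: `content_ranges` is a hypothetical helper, and the 320 KiB alignment is an assumption taken from the Microsoft Graph upload session documentation.

```rust
/// Assumption: upload session fragments must be a multiple of 320 KiB.
const CHUNK_ALIGN: u64 = 320 * 1024;

/// Compute the `Content-Range` header value for each chunk of an upload.
/// Panics if `chunk` is zero or not aligned to `CHUNK_ALIGN`.
fn content_ranges(total: u64, chunk: u64) -> Vec<String> {
    assert!(chunk > 0 && chunk % CHUNK_ALIGN == 0, "chunk must be a multiple of 320 KiB");
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < total {
        // Inclusive end offset; the last chunk may be shorter.
        let end = (start + chunk).min(total) - 1;
        ranges.push(format!("bytes {}-{}/{}", start, end, total));
        start = end + 1;
    }
    ranges
}

fn main() {
    for range in content_ranges(700_000, CHUNK_ALIGN) {
        println!("Content-Range: {}", range);
    }
}
```

A one-shot writer sends the whole body in one request, whereas a writer built on ranges like these can accept multiple `write` calls and flush each aligned chunk as it fills.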
let status = response.status();
match status {
match response.status() {
We can implement read_with(path).version(id) in the same way as stat.
All fixed. Please have a look. I must add a new commit to fix "copy"'s behavior. Not sure what OneDrive has updated. I will rebase once you approve the PR.
OPENDAL_TEST=onedrive cargo test behavior --features tests,services-onedrive -- --show-output
Hi, thank you @erickguan for working on this. This PR is almost good to go, would you like to resolve conflicts?
Also:
- Add documentation about consistency
- Allow retry on 409 Conflict (for create_dir)
Technically, we can allow retries on more operations. We will wait for bug reports or feature requests about this nuance, since the behavior tests are likely the heaviest workload among OpenDAL OneDrive users.
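The retry rule described above can be sketched as a predicate. Only the 409-on-create_dir case comes from this PR. The `Op` enum, the function name, and the treatment of 429 and 5xx as retryable are assumptions made for illustration.

```rust
/// Hypothetical operation tag; only `CreateDir` matters for the 409 rule.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Op {
    CreateDir,
    Other,
}

/// Decide whether a response status should be retried.
/// From the PR: 409 Conflict is retried for create_dir only.
/// Assumption (not from the PR): 429 and common 5xx codes are retryable.
fn is_retryable(op: Op, status: u16) -> bool {
    match status {
        409 => op == Op::CreateDir,
        429 | 500 | 502 | 503 | 504 => true,
        _ => false,
    }
}

fn main() {
    println!("create_dir + 409: {}", is_retryable(Op::CreateDir, 409));
    println!("other + 409: {}", is_retryable(Op::Other, 409));
}
```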
4a17e32 to 1e9bcd5
Also remove a few data fields from models that the OneDrive service doesn't use.
1e9bcd5 to 1c393b8
@Xuanwo rebased. I don't know what happened to the huggingface test.
Xuanwo
left a comment
Thank you @erickguan for working on this!
A bit flaky 😆
No worries.
Alright!
Which issue does this PR close?
Part of #5557.
Part of #2611.
Also, add a few features:
Rationale for this change
Mostly for the issues and to implement:
What changes are included in this PR?
I implemented features that I found easy to get to. Also, this PR addresses a few nuances about OneDrive API, including:
Content-Type
Are there any user-facing changes?
Added a new option and doc.
Review
Sorry about making many changes across the OneDrive service, which makes this PR difficult to review. I made an effort to structure the change into a few commits, so reviewing this PR commit by commit is easier.
Also kudos to Microsoft's services I guess - they didn't ban me. I saw the behavior tests generated a huge load for their servers (50x from OneDrive).
OPENDAL_TEST=onedrive cargo test behavior --features tests,services-onedrive -- --show-output