Conversation

@awong234
Contributor

@awong234 awong234 commented Jun 20, 2018

Problem: [Issue #74]

Offset-based pagination has an offset limit of 300,000. This limit is not currently stated in the Box API documentation, but it will be added there following a conversation with the Box team. The original box_pagination function has been renamed box_paginate_offset to preserve the original functionality.

Solution

The team recommended the use of marker-based pagination. Marker-based pagination is, according to the Box API docs, "the preferred method and is most performant", and so this change makes marker-based pagination the default method for GET-ting folder items.
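
For illustration, marker-based pagination can be sketched as the loop below. This is a minimal sketch, not the code in this PR: `fetch_page()` and `paginate_marker()` are hypothetical names, and `fetch_page()` is a stand-in for the HTTP request, assumed to return a list with `entries` and a `next_marker` that is `NULL` on the last page.

```r
# Hypothetical stand-in for a Box API request: returns one page of a
# fake listing plus the marker for the next page (NULL on the last page).
fetch_page <- function(marker = NULL, limit = 2, items = letters[1:5]) {
  start <- if (is.null(marker)) 1L else as.integer(marker)
  end <- min(start + limit - 1L, length(items))
  next_marker <- if (end < length(items)) as.character(end + 1L) else NULL
  list(entries = as.list(items[start:end]), next_marker = next_marker)
}

# Collect pages until no marker is returned or `max` items are gathered.
paginate_marker <- function(fetch, max = Inf) {
  out <- list()
  marker <- NULL
  next_page <- TRUE

  while (next_page) {
    page <- fetch(marker)
    out <- c(out, page$entries)
    marker <- page$next_marker
    next_page <- !is.null(marker) && length(out) < max
  }

  # Trim any overshoot from the final page.
  if (length(out) > max) {
    out <- out[seq_len(max)]
  }
  out
}

all_items <- paginate_marker(fetch_page)
```

Unlike an offset, the marker is an opaque token supplied by the server, so the client never computes a position itself; it only passes back what the previous response returned.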

Tests

In my tests, comparing and syncing my local files against my remote Box folder succeeds with more than 330,000 files, a size at which offset-based pagination returns an HTTP 400 error. A microbenchmark shows comparable times for the two methods to query 100,000 file names:

Unit: seconds
   expr      min       lq     mean   median       uq      max neval
 OFFSET 37.52012 40.04692 41.27804 40.93159 41.68023 50.97874   100
 MARKER 40.87141 42.31870 44.24652 42.98368 43.90257 59.99026   100

@ijlyttle
Member

Hi Alec,

I appreciate this PR. Unfortunately it may be a few days before I will have time to look at it properly.

Ian

@awong234
Contributor Author

Certainly, take your time! I will let you know if I encounter any issues within my fork.

Member

@ijlyttle ijlyttle left a comment

Hi Alec,

Thanks for putting this together - I have a few ideas to make things a bit cleaner.

Unless you have a reason to preserve offset-based pagination, I would redo this by converting box_pagination() to use markers, and dropping support for offsets.

Thanks!

DESCRIPTION Outdated
Type: Package
Title: Interface for the 'Box.com API'
Version: 0.3.4.99999
Version: 0.3.4.999999
Member

Let's leave the version number the same, for the time-being.

R/boxr_misc.R Outdated

out = list()

url$query$usemarker = T
Member

please use TRUE instead of T
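
One common reason to prefer `TRUE` is that `T` is an ordinary variable that can be reassigned, whereas `TRUE` is a reserved word. A short illustration (not package code; the variable names are only for this example):

```r
# `T` is just a variable bound to TRUE in base R, so it can be masked.
T <- 0
masked <- isTRUE(T)   # T now refers to the local value 0

# `TRUE` is a reserved word; attempting to assign to it is an error.
assign_fails <- inherits(
  try(eval(parse(text = "TRUE <- 0")), silent = TRUE),
  "try-error"
)

rm(T)                 # remove the mask so T resolves to base::T again
```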

R/boxr_misc.R Outdated

n_so_far = 0

out = list()
Member

Does this need to be here twice?

R/boxr_misc.R Outdated
box_pagination <- function(url, max = 200) {
box_paginate_marker = function(url, max){

out = list()
Member

Please follow the convention of the package to use <- for assignment

box_ls <- function(dir_id = box_getwd(), limit = 100, max = Inf, fields = NULL, pageMode = 'marker') {

# maybe some logic here to check that limit <= 1000

Member

While you're here, could you add the logic to make sure that limit is no more than 1000? If it is, I think it would be appropriate to set it to 1000 and issue a warning. Thanks!
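
A minimal sketch of the requested check, assuming a standalone helper. The name `clamp_limit()` and its `max_limit` argument are hypothetical and not part of the package:

```r
# Hypothetical helper: cap `limit` at the 1000-item maximum mentioned in
# the review, issuing a warning whenever the value is reduced.
clamp_limit <- function(limit, max_limit = 1000) {
  if (limit > max_limit) {
    warning("`limit` exceeds ", max_limit, "; setting it to ", max_limit, ".")
    limit <- max_limit
  }
  limit
}

small <- clamp_limit(100)                      # returned unchanged
large <- suppressWarnings(clamp_limit(5000))   # reduced to the maximum
```

Warning rather than stopping lets an existing call with an oversized `limit` keep working while still telling the user that the value was changed.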

R/boxr_misc.R Outdated

if (req$status_code == 404) {
message("box.com indicates that no results were found")
return()
Member

I might return NULL here.

This is picky, but it may be better to use if (identical(resp$status_code, 404L))
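
A sketch of both suggestions, using a plain list as a hypothetical stand-in for the response object (`handle_not_found()` is an illustrative name, not package code). `identical()` always returns a single `TRUE` or `FALSE`, even when the status code is missing, whereas `==` on a missing value would give `if` a zero-length condition. Note that `identical()` is type-strict, so the `404L` comparison assumes the status code is stored as an integer.

```r
# Return NULL explicitly on a 404; otherwise pass the response through.
handle_not_found <- function(resp) {
  if (identical(resp$status_code, 404L)) {
    message("box.com indicates that no results were found")
    return(NULL)
  }
  resp
}

not_found <- suppressMessages(handle_not_found(list(status_code = 404L)))
ok <- handle_not_found(list(status_code = 200L))
no_status <- handle_not_found(list())   # missing status code: no error
```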

R/boxr_misc.R Outdated

url$query$usemarker = T

while(!is.null(marker)){
Member

I would use a logical variable here, next_page, like box_paginate_offset().

@awong234
Contributor Author

Hi Ian,

I agree with dropping support for offset-based pagination; there was no benefit that I could see other than preserving legacy function. I will get these changed soon. Thanks!

awong234 added 4 commits June 25, 2018 23:45
…and arguments.

Pagination function is now once again `box_pagination()`.

Warning emitted when limit > 1000

Other minor changes as requested.
@awong234
Contributor Author

Hi Ian,

Wasn't sure if you were notified of this - I made the changes requested. Thanks for reviewing the code!

@ijlyttle
Member

ijlyttle commented Jul 2, 2018

Hi Alec,

I hope to get to this in the next couple of days, thanks!

@awong234
Contributor Author

awong234 commented Jul 2, 2018 via email

Member

@ijlyttle ijlyttle left a comment

Hi Alec,

This looks much cleaner - I am not surprised that it is working well for you and your colleagues. This will be a great improvement to the package as a whole.

My only requests are:

  • formatting, to follow the existing style.
  • to add an item to NEWS.md - you will see an item there describing your previous improvement (on fields). Please refer to the original issue (#74, I believe)
  • if you have not already, please add fixes #74 to a commit message.

Thanks!

Ian

R/boxr_misc.R Outdated
#'
#' @author Brendan Rocks \email{foss@@brendanrocks.com} and Ian Lyttle
#'
#' @author Brendan Rocks \email{foss@@brendanrocks.com} and Ian Lyttle
Member

Could you add yourself to @authors of this function?

R/boxr_misc.R Outdated
box_ls <- function(dir_id = box_getwd(), limit = 100, max = Inf, fields = NULL) {

# maybe some logic here to check that limit <= 1000
if(limit > 1000){
Member

Logic looks good here - could you format to match the spacing and indentation of the code at L41, and use <- for assignment?

Could you do this for the other logic blocks (if, while) and use <- for assignment, generally?

@awong234
Contributor Author

Thanks Ian! I will get to these soon; I'm just returning from travel so I can address the last bits now.

awong234 added 5 commits July 13, 2018 18:54
One of my users attempted to connect to a shared folder that they were not a member of, and the additional information would have been helpful to isolate the problem from my end.
NEWS.md : Added to improvements list for PR r-box#79. Added to bug fixes Issue r-box#74 solution.
@awong234
Contributor Author

Hi Ian,

I made those changes that you requested.

I also took the liberty of making the 404 error a little more explicit (see 12834e4). I had a user try to connect to a shared folder that they were not a member of, and I think the extra information would have helped me see the problem sooner.

Thanks for the review!

Alec

Member

@ijlyttle ijlyttle left a comment

This looks super, thanks!

@ijlyttle ijlyttle merged commit b7c257a into r-box:master Jul 13, 2018