Collaborator

@DarianGill DarianGill commented Apr 2, 2025

This moves the code from the MVP, which was only set up to run in my local environment, into a reproducible R package. It also adds a new app.R file, following the convention set forth in Shiny App Packages, for easy launching during development (this causes a "Non-standard file/directory found at top level" NOTE when running devtools::check()). I also stubbed out some simple tests for the functions in server.R and ui.R. I'm open to feedback about any other changes needed to make this "officially" a package before I get back to work on the app's functionality.

DarianGill and others added 8 commits March 31, 2025 15:43
This problem has been around for over a decade.
When using tidyverse packages like ggplot2 and dplyr that use non-standard
evaluation to map data frame columns into variables in the functions, those
variables are not visible during R CMD check and show up as undeclared global
variables, typically showing up as a NOTE such as:

  server: no visible binding for global variable ‘latitude’

These are fixed by importing `.data` from ggplot2 or rlang, which
then makes the undeclared variables accessible as `.data$variable`.
See https://stackoverflow.com/a/57496617 for background.

Also fixed a single undeclared reference to the `tags` variable.
Member

mbjones commented Apr 2, 2025

@DarianGill Overall this package structure looked great. I am not sure which linting issues you are talking about, but I did see some standard problems with using tidyverse non-standard evaluation approaches inside an R package that produced NOTEs on devtools::check().

This problem has been around for over a decade. When using tidyverse packages like ggplot2 and dplyr that use non-standard evaluation to map data frame columns into variables in the functions, those variables are not visible during R CMD check and show up as undeclared global variables, typically showing up as a NOTE such as server: no visible binding for global variable ‘latitude’.

These are fixed by importing .data from ggplot2 or rlang, which then makes the undeclared variables accessible as .data$variable. See https://stackoverflow.com/a/57496617 for background.

I pushed changes to the branch to fix these, and I also fixed a single undeclared reference to the tags variable. Does that clear up your linter issues? If not, can you provide more details?
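As a minimal sketch of the `.data` fix described above: the function `count_by_location()` and the `plots` data frame are hypothetical and are not taken from this package. It assumes dplyr is installed and R 4.1 or later for the native pipe. In a real package, the pronoun would also be imported (for example with a roxygen2 tag such as `@importFrom rlang .data`) so that R CMD check can see it.

```r
# Hypothetical package function: columns are referenced through the .data
# pronoun, so R CMD check does not treat them as undeclared global variables.
count_by_location <- function(data) {
  data |>
    dplyr::group_by(.data$latitude, .data$longitude) |>
    dplyr::summarize(n = dplyr::n(), .groups = "drop")
}

# Hypothetical example data: two plots share a location, one does not.
plots <- data.frame(
  latitude = c(34.4, 34.4, 36.1),
  longitude = c(-119.8, -119.8, -121.9)
)

result <- count_by_location(plots)
```

Writing `dplyr::group_by(latitude, longitude)` instead would behave the same at run time; the difference only shows up as a NOTE during the check.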

Member

mbjones commented Apr 2, 2025

@DarianGill At some point we should do a more complete review of your package (e.g., the package is missing tests), but it's looking good as a starting point, and a full review is probably not needed now. Let us know when you want a deeper code review in this branch or in develop. We don't want to hold you up at all and want to keep the momentum going.

@DarianGill DarianGill marked this pull request as ready for review April 8, 2025 02:30
@DarianGill DarianGill requested review from mbjones and regetz April 8, 2025 02:40
Collaborator

@regetz regetz left a comment

Good stuff @DarianGill, thanks! I made a bunch of comments. Nothing critical. They are almost entirely either minor editing changes for text docs or code suggestions that simplify the code or make it more readable (IMHO). Happy to chat more about any of them.

I'm marking this as Approved to leave it in your hands to change what you think makes sense, follow up more on anything as you wish, and then be unblocked to merge to develop when you see fit.

Tests are automatically run via GitHub Actions. Check the root `README.md` file
for this GitHub Actions status badge and make sure it says "Passing":
Tests are automatically run via GitHub Actions. Check the root
`README.md` file for e GitHub Actions status badge and make sure it
Collaborator

Should be "the" GitHub Actions status badge?

the `develop` branch can be fast-forwarded to sync with `main` to
start work on the next release.
3. Releases can be downloaded from the [GitHub releases
page](https://github.com/NCEAS/vegbankr/releases).
Collaborator

- add an [issue](https://github.com/DataONEorg/REPONAME/issues) describing your planned changes, or add a comment to an existing issue;
- on GitHub, fork the [repository](https://github.com/DataONEorg/REPONAME)
- add an [issue](https://github.com/NCEAS/vegbankr/issues) describing
Collaborator

- on GitHub, fork the [repository](https://github.com/DataONEorg/REPONAME)
- add an [issue](https://github.com/NCEAS/vegbankr/issues) describing
your planned changes, or add a comment to an existing issue;
- on GitHub, fork the [repository](https://github.com/NCEAS/vegbankr)
Collaborator

the same “printed page” as the copyright notice for easier identification within
third-party archives.

Copyright [yyyy] [name of copyright owner]
Collaborator

Copyright [2025] [Regents of the University of California]

dates_df <- dates_df[!is.na(dates_df$parsed), ]
dates_df <- dates_df[!duplicated(dates_df$parsed), ]
top_dates <- head(dates_df[order(dates_df$parsed, decreasing = TRUE), ], n)
top_dates <- utils::head(dates_df[order(dates_df$parsed, decreasing = TRUE), ], n)
Collaborator

Just for completeness, here's an equivalent way to get the top dates using chained dplyr verbs, a similar pattern to what I suggested in other comments. You could make an argument that it's better to stick with base R functionality, especially when you can do so with readable code, though we already depend on dplyr, and these are very standard usages of core, stable dplyr functions for tabular data manipulation. Toss-up in my mind.

    date_formats <- c("a, d b Y H:M:S z", "d b Y H:M:S", "Y-m-d H:M:S")
    top_dates <- data |>
        # all_of() selects the column whose name is stored in date_field
        dplyr::select(original = dplyr::all_of(date_field)) |>
        dplyr::mutate(parsed = lubridate::parse_date_time(
            original, orders = date_formats)) |>
        dplyr::filter(!is.na(parsed), !duplicated(parsed)) |>
        dplyr::arrange(dplyr::desc(parsed)) |>
        utils::head(n)

Collaborator Author

I find this more readable than the previous two suggestions, but am warming up to all of them, thanks.

Collaborator Author

@regetz Thanks for these suggestions. I think ultimately we may want this summarization to occur on the backend, where it can access the totality of the data or whatever we've queried. The overviews have not been functional since I started using the paginated data in the table, but it's still valuable to see other ways to express the same functionality using dplyr.

aes(x = long, y = lat, group = group), # nolint: object_usage_linter.
fill = "white", color = "gray70", size = 0.3
ggplot2::aes(
x = .data$long,
Collaborator

Similar comment to elsewhere, you should be able to drop the .data$ accessor throughout the ggplot expressions here. But then I think you'll need to use longitude and latitude, assuming those are the full column names in the data. The $ in R does partial matching, I guess for convenience, but in my experience it's just a source of bugs!
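To illustrate the partial-matching behavior of `$` mentioned above, here is a small base R sketch. The `sites` data frame is hypothetical and only stands in for data whose full column names are `longitude` and `latitude`.

```r
# Hypothetical data frame with full-length column names.
sites <- data.frame(
  longitude = c(-119.8, -121.9),
  latitude = c(34.4, 36.1)
)

# `$` on a base data frame falls back to partial matching when there is no
# exact name, so the abbreviated name quietly resolves to `longitude`.
partial <- sites$long

# `[[` matches exactly by default and returns NULL for the abbreviated name.
exact <- sites[["long"]]
```

Setting `options(warnPartialMatchDollar = TRUE)` makes R warn whenever `$` resolves a name by partial matching, which can help surface this kind of bug during development.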

Member

ggplot2::aes(
x = stats::reorder(.data$name, .data$count),
y = .data$count
)
Collaborator

I'm not 100% sure the linter won't balk at this, but I'd move this closing parenthesis to the end of the previous line, then indent the parenthesis on the next line by two spaces. That way the entire ggplot expression is easier to see because all of the continuation lines are indented relative to the initiating ggplot2::ggplot(...) call.

Collaborator Author

Linter doesn't like it

Collaborator

Kinda surprised me to see this file in the root directory, but I do see precedent for it (like here). Seems like another option is maybe in inst/? Just wondering out loud what is best. No need to change anything now. Important thing is that wherever it is, (a) the package validates fine, and (b) we're able to easily deploy and start the Shiny app in both development and production modes.

Member

same. I don't know where it should go, but it seems like the root isn't the best default location. It would be a shiny-specific recommendation is my guess. Mastering Shiny says to put in the R directory. Maybe let's discuss on slack. https://mastering-shiny.org/scaling-packaging.html#converting-an-existing-app

} else {
data_grouped <- data %>%
dplyr::group_by(latitude, longitude) %>% # nolint: object_usage_linter.
data_grouped <- data |>
Collaborator

Not totally sure what's happening here, but the embedded mapply seems atypical in this context. Seems like neither that nor the sprintf is needed? I think this simplified code yields the same thing:

      data_grouped <- data |>
        dplyr::group_by(.data$latitude, .data$longitude) |>
        dplyr::mutate(
          authorobscode_label =
            paste0(
                "<a href=\"#\" onclick=\"Shiny.setInputValue('label_link_click', '",
                obsaccessioncode, "', {priority:'event'})\">", authorobscode,
                "</a>", collapse = "<br>")
        ) |>
        dplyr::ungroup()

Collaborator

And a second question/comment: With the combination of group_by and mutate, the code here is producing one output record per input record. So for records with the same lat and lon, it's producing a "grouped" HTML snippet (per the collapse = "<br>"), but repeatedly for every member of that group. Is that the intent? If the goal is to produce one single HTML snippet for each unique lat x lon grouping, then we should use dplyr::summarize rather than dplyr::mutate.
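The difference between the two verbs can be sketched on a toy example. The `obs` data frame below is hypothetical, with plain text standing in for the HTML links; it assumes dplyr is installed and R 4.1 or later for the native pipe.

```r
# Hypothetical observations: two share a location, one stands alone.
obs <- data.frame(
  latitude = c(34.4, 34.4, 36.1),
  longitude = c(-119.8, -119.8, -121.9),
  authorobscode = c("A1", "A2", "B1")
)

# mutate() keeps every input row, so the collapsed label is repeated for
# each member of a group.
labels_mutate <- obs |>
  dplyr::group_by(.data$latitude, .data$longitude) |>
  dplyr::mutate(label = paste0(.data$authorobscode, collapse = "<br>")) |>
  dplyr::ungroup()

# summarize() returns a single row per latitude x longitude group.
labels_summarize <- obs |>
  dplyr::group_by(.data$latitude, .data$longitude) |>
  dplyr::summarize(
    label = paste0(.data$authorobscode, collapse = "<br>"),
    .groups = "drop"
  )
```

Here `labels_mutate` has three rows, two of which carry the same combined label, while `labels_summarize` has two rows, one per unique location.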

Collaborator Author

Yup, this is what I was referring to when I said I realized after cutting the PR that I was making too many labels and misusing the grouping. I've since changed it to summarize, but I appreciate the validation that that was the right move.

Collaborator Author

Thanks @regetz! I'll implement these in my new branch on Monday. Have a great weekend 🤙

Member

@mbjones mbjones left a comment

Everything built and ran for me this time. I made a few more comments, but it all looks like a great start to the package.

@DarianGill DarianGill deleted the feature-21-create-r-package branch July 16, 2025 22:23