
Conversation

@PietrH
Member

@PietrH PietrH commented Jan 2, 2023

resolves #261
resolves #275
resolves #268

Main changes

  • Added a single line to the SQL query and adapted the test to expect the new column
  • Added the missing field to the datapackage.json schema
  • Made some small stylistic changes to other functions: removing trailing whitespace and reducing column width
  • Bumped the minor version number
  • Created NEWS.md
  • Swapped expect_equal() for expect_identical() in tests, see Replace expect_equal() with expect_identical() in tests where possible #268
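For reference, the practical difference between the two expectations (a minimal illustration, not code from the package):

```r
# expect_identical() is stricter than expect_equal(): it allows no numeric
# tolerance, so subtle changes in returned values now fail loudly.
testthat::expect_equal(1, 1 + 1e-9)  # passes: within the default tolerance
testthat::expect_identical(1L, 1L)   # passes: exactly the same value
# testthat::expect_identical(1, 1 + 1e-9) would fail
```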

Changes to tests

  • The test for download_acoustic_dataset() now uses the OS temp dir instead of a folder in the package root.

  • Added a dependency on frictionless. In test-download_acoustic_dataset.R we now use frictionless to check that the produced data package can actually be read without warnings. I've also added checks for fields that are missing from the schema or in the wrong order; this turned out to be the case, and those fields were added to the datapackage.json file used for testing.

  • I've also switched test-download_acoustic_dataset.R over to snapshots to check the download messaging, instead of the boolean character matching used before. The old approach often caused the test to fail for unclear reasons; the snapshot method stores a standardized, committed markdown file of the messaging that can be examined via a diff. This workflow is part of the 3rd edition of testthat, so I've switched the package over to that edition, which required some changes to other tests, namely replacing expect_is() with expect_type().

  • The snapshot could also notify us if the result of demer_2014 changes, and this is how I implemented it originally, but at the moment I'm only checking the console output (cat), not the actual files generated, as those should be covered by the other tests. If we want to include this data in the package, we need to make sure the rights are cleared under the repo license.

  • I've silenced some of the messaging internal to the tests to clean up the console output during package checking.

  • Added a test for list_values() to check its messaging
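As a sketch of the snapshot approach (the test name and the animal_project_code argument here are illustrative assumptions, not the actual test file):

```r
# The first run records the console output to tests/testthat/_snaps/<file>.md;
# later runs diff against that committed snapshot instead of matching strings.
# "demer_2014" is a placeholder argument for illustration only.
test_that("download_acoustic_dataset() messaging is stable", {
  expect_snapshot(
    download_acoustic_dataset(con, animal_project_code = "demer_2014")
  )
})
```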

Notes / Possible new issues

  • The test for get_acoustic_detections() is slow
    A number of optimizations are possible; most of the time goes to multi-select queries like get_acoustic_detections(con, acoustic_tag_id = c("A69-1601-16129", "A69-1601-16130")), which can take around ten seconds to complete. I was expecting queries like this to be quicker.

  • Quite a bit of code is more than 80 columns wide

We could consider opening issues for these, but neither is urgent since they only impact the development workflow.
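If we do open an issue for the line width, one way to list the offending lines (a sketch; lintr is not currently a dependency of the package):

```r
# Flag every line wider than 80 characters across the package sources.
lintr::lint_package(
  linters = lintr::linters_with_defaults(
    line_length_linter = lintr::line_length_linter(80L)
  )
)
```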

  • citation("etn") output is missing a year:
citation("etn")
#> Warning in citation("etn"): no date field in DESCRIPTION file of package 'etn'
#> Warning in citation("etn"): could not determine year for 'etn' from package
#> DESCRIPTION file
#> 
#> To cite package 'etn' in publications use:
#> 
#>   Peter Desmet, Damiano Oldoni and Stijn Van Hoey (NA). etn: Access
#>   Data from the European Tracking Network. https://github.com/inbo/etn,
#>   https://inbo.github.io/etn.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {etn: Access Data from the European Tracking Network},
#>     author = {Peter Desmet and Damiano Oldoni and Stijn {Van Hoey}},
#>     note = {https://github.com/inbo/etn, https://inbo.github.io/etn},
#>   }

Created on 2023-02-03 with reprex v2.0.2

Slightly different locally:

citation("etn")
#> Warning in citation("etn"): no date field in DESCRIPTION file of package 'etn'
#> Warning in citation("etn"): could not determine year for 'etn' from package
#> DESCRIPTION file
#> 
#> To cite package 'etn' in publications use:
#> 
#>   Desmet P, Oldoni D, Van Hoey S (????). _etn: Access Data from the
#>   European Tracking Network_. https://github.com/inbo/etn,
#>   https://inbo.github.io/etn.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {etn: Access Data from the European Tracking Network},
#>     author = {Peter Desmet and Damiano Oldoni and Stijn {Van Hoey}},
#>     note = {https://github.com/inbo/etn, https://inbo.github.io/etn},
#>   }

Created on 2023-02-03 with reprex v2.0.2

I suggest we add a year (the year of the latest release) for ease of copy-pasting.
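citation() derives the year from a Date field in DESCRIPTION, so one option is to add such a field (the date below is only an example value):

```
Date: 2023-02-03
```

Alternatively, an inst/CITATION file with an explicit year would silence the warning as well.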

@PietrH
Member Author

PietrH commented Jan 3, 2023

Should we do a minor version bump for this change?

@peterdesmet peterdesmet self-requested a review January 3, 2023 14:40
Member

@peterdesmet peterdesmet left a comment


Nice!

  1. Any reasoning behind the position of the field?
  2. Please also include the field in
    {
    "name": "signal_to_noise_ratio",
    "type": "integer"
    },
    (there is currently no test for this, will create issue for that)
  3. Minor version bump sounds good to me

@PietrH
Member Author

PietrH commented Jan 4, 2023

Nice!

1. Any reasoning behind the position of the field?

2. Please also include the field in https://github.com/inbo/etn/blob/35e70f3d284cb64494313e667f238378564d494b/inst/assets/datapackage.json#L569-L572
    (there is currently no test for this, will create issue for that)

3. Minor version bump sounds good to me

1. Position of field

The documentation for get_acoustic_detections refers to the field definitions (which seem out of date), which in turn refer to the following CSV file: https://github.com/inbo/etn/blob/main/inst/assets/etn_fields.csv. On row 112 of that CSV there is a field sensor_value_depth, which I suspect depth_in_meters is based on. I'm also reasoning that depth_in_meters is a property of the sensor, not the animal itself, so it should follow the fields to do with the sensor rather than the animal.

I don't have strong feelings about this, apart from not wanting to add it to the end. I'm fine with moving it to be grouped with the animal fields, such as scientific_name.

  • Move depth_in_meters to after deploy_longitude


Minor version bump

Will do! Thank you for the review

@peterdesmet
Member

Thanks, I would suggest having the field immediately after deploy_longitude then, as part of the "location" information.

@PietrH
Member Author

PietrH commented Jan 4, 2023

Will do

@PietrH
Member Author

PietrH commented Jan 19, 2023

The fields in datapackage.json are in the wrong order, and 5 more fields from the tags table are missing: length, diameter, weight, floating, and archive_memory.

However, I can't find any examples of these fields in use in the tables I have access to, apart from archive_memory in these 3 examples:

[1] "shad_scheldt_dst"
# A tibble: 1 × 5
  length diameter weight floating archive_memory
   <dbl>    <dbl>  <dbl> <chr>    <chr>         
1     NA       NA     NA NA       2 MB          
[2] "FISHINTEL"
# A tibble: 2 × 5
  length diameter weight floating archive_memory
   <dbl>    <dbl>  <dbl> <chr>    <chr>         
1     NA       NA     NA NA       2             
2     NA       NA     NA NA       8             
[3] "2018_EC"
# A tibble: 1 × 5
  length diameter weight floating archive_memory
   <dbl>    <dbl>  <dbl> <chr>    <chr>         
1     NA       NA     NA NA       64 MB  

It seems like archive_memory should be a string; length, diameter and weight seem to be doubles in the returned table, but I have no values to be sure. floating seems to be a string as well.

This didn't come to light earlier because there was no test for this in test-download_acoustic_dataset.R.

@peterdesmet
Member

@PietrH these fields are directly taken from:

etn/R/get_tags.R

Lines 129 to 133 in 35e70f3

tag_device.archive_length AS length,
tag_device.archive_diameter AS diameter,
tag_device.archive_weight AS weight,
tag_device.archive_floating AS floating,
tag_device.device_internal_memory AS archive_memory,

The data type for those fields in the database is:

DBI::dbGetQuery(con, "
  SELECT
    pg_typeof(archive_length) AS length,
    pg_typeof(archive_diameter) AS diameter,
    pg_typeof(archive_weight) AS weight,
    pg_typeof(archive_floating) AS floating,
    pg_typeof(device_internal_memory) AS memory
  FROM
    common.tag_device_limited
  LIMIT 1
")

            length         diameter           weight floating            memory
1 double precision double precision double precision  boolean character varying

Looks like dplyr interpreted the boolean as string 🤷‍♂️

So in frictionless:

  • length: number
  • diameter: number
  • weight: number
  • floating: boolean
  • memory: string
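So the five entries could be added to datapackage.json along these lines (field names follow the SQL aliases above; the exact position within the schema is still to be decided):

```json
{ "name": "length", "type": "number" },
{ "name": "diameter", "type": "number" },
{ "name": "weight", "type": "number" },
{ "name": "floating", "type": "boolean" },
{ "name": "archive_memory", "type": "string" }
```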

@PietrH PietrH self-assigned this Feb 2, 2023
assertthat::assert_that(is.logical(limit), msg = "limit must be a logical: TRUE/FALSE.")
if (limit) {
limit_query <- glue::glue_sql("LIMIT 100", .con = connection)
limit_query <- glue::glue_sql("ORDER BY det.id_pk LIMIT 100", .con = connection)
Member Author


Ordering is necessary to ensure the result of the filter is always the same; if we leave out the ORDER BY, a test will fail. However, this step is very expensive, defeating the point of the LIMIT.

Consider restoring the unordered limiting, but documenting the behavior and perhaps rewriting the test.

Member Author


Ordering makes the examples and testing stages too slow:

✔  checking examples (40m 42.8s)
   Examples with CPU or elapsed time > 5s
                             user system  elapsed
   get_animals             12.828  0.092   15.457
   get_acoustic_detections 10.664  0.552 2406.708
   get_tags                 6.940  0.040    8.048

Member Author


Limiting actually seems to be a little bit slower than not limiting, at least in some circumstances:

Unit: seconds
                                                                              expr
  get_acoustic_detections(con, acoustic_project_code = "demer",      limit = TRUE)
 get_acoustic_detections(con, acoustic_project_code = "demer",      limit = FALSE)
      min       lq     mean   median       uq      max neval
 151.7327 154.1687 164.0103 156.6048 170.1491 183.6935     3
 110.5260 112.2894 123.5281 114.0527 130.0292 146.0056     3


# Selection is case insensitive
expect_equal(
get_acoustic_detections(con, acoustic_project_code = "demer", limit = TRUE),
Member Author


These two (equivalent) queries each create a 33 MB object; we should probably look for acoustic_project_code values that result in smaller objects for testing, which would speed things up.

Member


Try 2015_HOMARUS

PietrH added 4 commits May 4, 2023 10:32
We are not sorting the limit after all, because the sort operation is so expensive that there would be no point in limiting anymore.
@PietrH
Member Author

PietrH commented May 4, 2023

@peterdesmet ready for review

Sometimes it passes, sometimes it does not. It seems to fail when running devtools::check() but pass in the console or when running tests for a single file. The test itself seems fine; I suspect there might be duplicate detection_ids in the acoustic detections table.

The object that we test here, df, could be different on every run, because it uses a LIMIT without an ORDER BY statement.

  • Feel free to fix style issues directly, or just let me know.

@PietrH PietrH requested a review from damianooldoni June 1, 2023 07:46
Member

@peterdesmet peterdesmet left a comment


Reviewed, sorry it took so long! 😅

  • I would move R/testthat-helpers.R to tests/testthat/helper.R (cf. camtraptor)
  • Made a number of stylistic changes, but would be good if all are tested again
  • Are changes made to the citation now?

@PietrH
Member Author

PietrH commented Nov 21, 2023

Tests pass
