@balbesina balbesina commented Jul 25, 2024

Summary

The default makeValid gives a very sad (and useless) result for intersecting polygons within a multipolygon. However, I don't want to send SQL requests like SELECT ST_MakeValid(geom, 'method=structure') from the app to the db.

Here is an extension of #make_valid that accepts kwargs. The "strange" choice of arguments mimics the behaviour of the SQL function.

[symbol] method=:linework|:structure
Applies the structure method only if :structure was provided; otherwise acts as if :linework was given.

[bool] keep_collapsed=true|false
Applies keepCollapsed only when the method is structure; otherwise acts as if false was given.
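To make the argument rules above concrete, here is a minimal Ruby sketch of how the kwargs could be mapped to the integers GEOS works with. The helper name `normalize_make_valid_options` and the two constants are hypothetical and not part of RGeo; the 0/1 values follow the linework=0 / structure=1 description in this PR.

```ruby
# Hypothetical helper, NOT part of RGeo: maps the kwargs described above
# onto the integer values passed down to GEOS.
MAKE_VALID_LINEWORK = 0   # assumed to mirror the two-variant enum in geos_c.h
MAKE_VALID_STRUCTURE = 1

def normalize_make_valid_options(method: :linework, keep_collapsed: false)
  structure = (method == :structure)
  {
    # Anything other than :structure falls back to linework.
    method: structure ? MAKE_VALID_STRUCTURE : MAKE_VALID_LINEWORK,
    # keep_collapsed is only honoured together with the structure method.
    keep_collapsed: (structure && keep_collapsed) ? 1 : 0
  }
end

p normalize_make_valid_options
p normalize_make_valid_options(method: :structure, keep_collapsed: true)
p normalize_make_valid_options(method: :linework, keep_collapsed: true)
```

Calling it without arguments keeps the existing behaviour, which matches the note that passing nothing is still possible.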

Other Information

It is still possible to not pass anything, persisting existing functionality.

I'm not sure what made the devs working on the SQL function choose such a complex way to define the method with a string variable. Maybe we should use a bool flag (e.g. method_structure) instead? Or is it planned to extend makeValid with more possible methods? Right now it works like linework=0 by default and can also take structure=1, so it's effectively a bool. In geos_c.h it is an enum with two variants. keepCollapsed, on the other hand, is a simple int 1/0, a typical flag.

see https://postgis.net/docs/ST_MakeValid.html
Member

@keithdoggett keithdoggett left a comment

@balbesina thanks for the PR! I think the code looks good, but can you add some tests to demonstrate/confirm usage?

@keithdoggett
Member

@BuonOmo the arg parsing looks good to me but I know you're more familiar with this than I am so maybe you have a different opinion

@BuonOmo
Member

BuonOmo commented Aug 7, 2024

@keithdoggett I think it is fine. I'll review further to make sure we are not generating some extra memory leak in the process once you fix CI and add tests @balbesina, thank you for the contribution!

@balbesina
Author

hey, everyone. i've added several tests. let me know if they are suitable. i took my own test examples for multipolygons and the linestring example from the postgis docs (since i don't use linestrings)

@BuonOmo
Member

BuonOmo commented Aug 8, 2024

@balbesina you still have the lint and memcheck failing. Do you need any help with these? I suggest running these tests on your forks so you don't have to wait for approval every time.

I would also add a test for method: :linework

apply correct defaults for non-param method call.
improve tests.
@balbesina
Author

@BuonOmo thanks for the tips. I've reworked the code a little to make it cleaner. Also, thanks to answers from the GEOS folks, I applied default params for a make_valid call without arguments (should be method=linework, keepCollapsed=true).

I ran clang-format, and the change the test is blaming me for (removing spaces before a comment at factory.c:1013) was made by bin/clang-format. If you want, I can roll it back.

for memcheck i think it should be an exclusion added to valgrind/suppressions, because i do call GEOSMakeValidParams_destroy - exactly as it is used in geos tests: link1, link2

but when i try to build docker image for valgrind it fails in test/valgrind/Dockerfile:14 COPY . .
ERROR: failed to solve: cannot copy to non-directory: /var/lib/docker/overlay2/long-random-hash/merged/bin

any advice on that?

@BuonOmo
Member

BuonOmo commented Aug 12, 2024

> i ran clang-format and the change test is blaming me for (removing spaces before comment in file factory.c:1013) was done by bin/clang-format

Please do, I'll check that. Thanks for reporting.

> for memcheck i think it should be an exclusion added to valgrind/suppressions, because i do call GEOSMakeValidParams_destroy - exactly as it is used in geos tests: link1, link2

The error may come from some other place. I'll check that. I also see that you have a failure in ruby 2.7, but maybe we can just remove support for that branch as it is not maintained anymore. I'll push that upstream so you can rebase.

> but when i try to build docker image for valgrind it fails in test/valgrind/Dockerfile:14 COPY . .
> ERROR: failed to solve: cannot copy to non-directory: /var/lib/docker/overlay2/long-random-hash/merged/bin

I've never had that issue and I'm not a Docker expert. Could you dig into it? And if you do, self-answer here for later users! I could try to help soonish but I have very limited bandwidth unfortunately...

EDIT: my Dockerfile has the local changes below. Is it better for you that way?

diff --git a/test/valgrind/Dockerfile b/test/valgrind/Dockerfile
index f7885f2..81a3217 100644
--- a/test/valgrind/Dockerfile
+++ b/test/valgrind/Dockerfile
@@ -1,9 +1,8 @@
 FROM ruby:latest
 
-ARG work_dir=/ bundle_dir=/usr/local/bundle
+ARG work_dir=/app
 
 WORKDIR ${work_dir}
-# RUN bundle config set --local path ${bundle_dir}
 
 RUN apt update && apt install -yqq libgeos-dev valgrind
 
@@ -12,7 +11,7 @@ COPY lib/rgeo/version.rb ./lib/rgeo/
 RUN bundle install
 
 COPY . .
-RUN rake compile
+RUN MAINTAINER_MODE=1 rake compile
 
 
 CMD ["rake", "test:valgrind"]

@BuonOmo BuonOmo mentioned this pull request Aug 12, 2024
@balbesina
Author

balbesina commented Sep 1, 2024

You are right, after changing work_dir=/app docker built it just fine.

To generate the suppression I uncommented valgrind_generate_suppressions: true in the Rakefile. Once I added the generated entry to the supp file, the initial error disappeared. However, a new memcheck error shows up, and I don't see any connection between the two methods (my makeValid changes and method_geometry_contains, which is the equivalent of the Ruby "contains?" method).

288 (24 direct, 264 indirect) bytes in 1 blocks are definitely lost in loss record 16,757 of 20,094
malloc (vg_replace_malloc.c:381)
objspace_xmalloc0 (gc.c:12617)
rb_id_table_create (id_table.c:98)
cache_callable_method_entry (vm_method.c:1439)
callable_method_entry_or_negative (vm_method.c:1497)
callable_method_entry (vm_method.c:1517)
rb_callable_method_entry (vm_method.c:1524)
gccct_method_search_slowpath (vm_eval.c:456)
gccct_method_search (vm_eval.c:505)
rb_call0 (vm_eval.c:546)
rb_class_new_instance_kw (object.c:2152)
rb_exc_new (error.c:1385)
*rbimpl_exc_new_cstr (string.h:1482)
*rgeo_convert_to_geos_geometry (factory.c:849)
*method_geometry_contains (geometry.c:605)
vm_call0_cfunc_with_frame (vm_eval.c:173)
vm_call0_cfunc (vm_eval.c:187)
vm_call0_body (vm_eval.c:233)
vm_call0_cc (vm_eval.c:110)
rb_vm_call0 (vm_eval.c:70)
rb_vm_call_kw (vm_eval.c:330)
vm_call_cfunc_with_frame_ (vm_insnhelper.c:3502)
vm_call_cfunc_array_argv (vm_insnhelper.c:3581)
vm_call_cfunc_only_splat (vm_insnhelper.c:3602)
vm_sendish (vm_insnhelper.c:5593)
vm_exec_core (insns.def:834)
rb_vm_exec (vm.c:2486)
vm_yield_with_cref (vm.c:1634)
vm_yield (vm.c:1642)
rb_yield_0 (vm_eval.c:1366)
rb_yield (vm_eval.c:1382)
rb_ary_each (array.c:2538)
vm_call_cfunc_with_frame_ (vm_insnhelper.c:3502)
vm_call_cfunc_with_frame (vm_insnhelper.c:3530)
vm_sendish.constprop.0 (vm_insnhelper.c:5593)
vm_exec_core (insns.def:814)
rb_vm_exec (vm.c:2486)
vm_yield_with_cref (vm.c:1634)
vm_yield (vm.c:1642)
rb_yield_0 (vm_eval.c:1366)
rb_yield (vm_eval.c:1382)
rb_ary_collect (array.c:3633)
vm_call_cfunc_with_frame_ (vm_insnhelper.c:3502)
vm_call_cfunc_with_frame (vm_insnhelper.c:3530)
vm_sendish.constprop.0 (vm_insnhelper.c:5593)
vm_exec_core (insns.def:814)
rb_vm_exec (vm.c:2486)
rb_proc_call_kw (proc.c:957)
exec_end_procs_chain (eval_jump.c:105)
rb_ec_exec_end_proc (eval_jump.c:120)
rb_ec_teardown (eval.c:159)
rb_ec_cleanup (eval.c:212)
ruby_run_node (eval.c:328)
rb_main (main.c:39)
main (main.c:58)

It is easy to suppress this one too, and then the memcheck error goes away. But I'm a bit worried about whether that is a good solution, since the reason for this behaviour is unclear.

@balbesina
Author

hey, @keithdoggett. tests were added. pls review

Member

@BuonOmo BuonOmo left a comment

I think I found where your new suppression comes from!

@BuonOmo
Member

BuonOmo commented Sep 6, 2024

Actually I just saw that there is now a RUBY_FREE_AT_EXIT env variable that allows us to remove the suppression file! (https://blog.peterzhu.ca/finding-memory-leaks-in-the-ruby-ecosystem/) => #372
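For later readers, a rough sketch of what running a test file under valgrind with that variable set could look like from Ruby. The helper name `valgrind_invocation`, the flag selection and the test path are assumptions for illustration, not the project's actual rake task; as far as I know the variable only has an effect on Ruby 3.3 or newer.

```ruby
# Hypothetical helper: builds the environment and argv for a valgrind run.
# It does not execute anything by itself.
def valgrind_invocation(test_file)
  env = { "RUBY_FREE_AT_EXIT" => "1" } # ask Ruby to free its own memory at exit
  argv = [
    "valgrind",
    "--leak-check=full",   # report each leaked block individually
    "--error-exitcode=1",  # make the process fail when valgrind reports errors
    "ruby", "-Ilib", "-Itest", test_file
  ]
  [env, argv]
end

# The path below is a placeholder, not a real file in the repository.
env, argv = valgrind_invocation("test/some_test.rb")
puts argv.join(" ")
# To actually run it: system(env, *argv)
```

With Ruby freeing its internal allocations at exit, the remaining "definitely lost" records should point at the extension itself rather than at the interpreter.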

Member

@keithdoggett keithdoggett left a comment

The tests are great, thank you for handling that. I agree with @BuonOmo that we should secure that one point to ensure there's no memory leak, but like I said in the comment, I think rb_protect should be easy enough since we shouldn't have to free too much that early in the function.

Member

@BuonOmo BuonOmo left a comment

LGTM overall. I just have a tiny doubt about changing the make_valid method in validity_check.rb, which is supposed to be generic. Also, we lose the check of ValidityCheck loading before any call to make_valid is made. I remember this being pretty important. I don't think I'll have time to check this in the next few days. Maybe you have an idea on that @keithdoggett. Otherwise, if you prefer @balbesina, we can roll back to the C version and just fix it (if you're ok with that I could do it, no big deal). What do you guys think?

@balbesina
Author

> Also we lose the check of ValidityCheck loading before any call to make_valid is made. I remember this being pretty important

could someone explain the importance of validity_check for #make_valid method to me? i see that this module was totally devoted to #check_validity!, but not for #make_valid.

> Otherwise if you prefer @balbesina we can roll back to the c version and just fix it

i have been thinking about my initial implementation all these weeks and it bothered me. the c and cpp code in geos works with 0/1 integers for makeValidWithParams. adding artificial string/symbol handling for the method parameter at the c level feels bad. and since we have a mixture of ruby+c, why not split the responsibility?

but it's up to you, maintainers, to make the decision.

@BuonOmo
Member

BuonOmo commented Sep 11, 2024

> and since we have a mixture of ruby+c, then why not to split the responsibility?

The issue is not about splitting the responsibility here, I definitely like this idea actually. The issue is that this module is not supposed to be geos specific (other implementations are welcome!). And we make it geos specific here.

> could someone explain the importance of validity_check for #make_valid method to me? i see that this module was totally devoted to #check_validity!, but not for #make_valid.

That's actually a fair point. After a brief check I cannot see why we did it nor why we did it for invalid_reason. I think it is written somewhere in our PR history. I'll check further as I know this was a tedious implementation and there might be a very precise reason. @keithdoggett any memory about it?

@keithdoggett
Member

@balbesina @BuonOmo the reason we included make_valid in that module is because this, in combination with valid_op is supposed to be a generic interface for validity testing and we figured lumping in make_valid with this makes sense.

I agree with @BuonOmo that we should not modify the generic definition in ValidityCheck. Since this is just a C implementation, we can override make_valid in the Geos classes only (similar to the FFI classes) and have that handle the kwargs. I'd be open to modifying the generic implementation to handle args in case they are accidentally passed to a cartesian factory geometry, for example, and we don't want an argument error to be raised (though an error would be raised anyway, since we don't implement make_valid for them).
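A rough Ruby sketch of that layering, to make the proposal concrete. All names here (`GenericValidity`, `GeosBackedGeometry`, `CartesianGeometry`, `UnsupportedOperation`, `native_make_valid`) are stand-ins invented for illustration and are not the actual RGeo modules or classes; the private method only echoes its arguments where the real code would call into C.

```ruby
# Stand-in error class (hypothetical).
UnsupportedOperation = Class.new(StandardError)

# Stand-in for the generic, implementation-agnostic module.
module GenericValidity
  # Accepts and ignores kwargs, so a stray option leads to the same
  # "unsupported" error instead of an ArgumentError.
  def make_valid(**_opts)
    raise UnsupportedOperation, "make_valid is not implemented for #{self.class}"
  end
end

# Stand-in for a GEOS-backed geometry class.
class GeosBackedGeometry
  include GenericValidity

  METHOD_CODES = { linework: 0, structure: 1 }.freeze

  # GEOS-specific override: only this class knows about the kwargs.
  def make_valid(method: :linework, keep_collapsed: false)
    code = METHOD_CODES.fetch(method) do
      raise ArgumentError, "unknown method: #{method.inspect}"
    end
    native_make_valid(code, keep_collapsed ? 1 : 0)
  end

  private

  # Placeholder for the C-level call; it just returns what it was given.
  def native_make_valid(method_code, keep_collapsed_flag)
    [method_code, keep_collapsed_flag]
  end
end

# Stand-in for a geometry without a make_valid implementation.
class CartesianGeometry
  include GenericValidity
end

p GeosBackedGeometry.new.make_valid(method: :structure, keep_collapsed: true)
```

The generic module stays free of GEOS details, and the symbol-to-integer translation lives in Ruby while the C side only sees 0/1 values.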

@balbesina
Author

balbesina commented Sep 17, 2024

@BuonOmo I still believe there is an issue with valgrind that requires suppression. Or else, please, explain how to avoid "memory loss".

I did several tests with different ways of processing. Removed absolutely everything except for GEOSMakeValidParams_create and GEOSMakeValidParams_destroy calls: there is no kwargs parsing, there is no arity check, there are no extra variables that are defined. Only params for make valid instantiated then destroyed.

in this case valgrind raises memory loss
8 bytes in 1 blocks are definitely lost in loss record 3 of 96

static VALUE
method_geometry_make_valid(int argc, VALUE* argv, VALUE self)
{
  RGeo_GeometryData* self_data;
  const GEOSGeometry* self_geom;
  GEOSGeometry* valid_geom;
  self_data = RGEO_GEOMETRY_DATA_PTR(self);
  self_geom = self_data->geom;
  if (!self_geom)
    return Qnil;

  GEOSMakeValidParams* params = GEOSMakeValidParams_create();
  valid_geom = GEOSMakeValid(self_geom);
  GEOSMakeValidParams_destroy(params);

  if (!valid_geom) {
    rb_raise(rb_eRGeoInvalidGeometry,
             "%" PRIsVALUE,
             method_geometry_invalid_reason(self));
  }
  return rgeo_wrap_geos_geometry(self_data->factory, valid_geom, Qnil);
}

in this case there is no valgrind error

static VALUE
method_geometry_make_valid(int argc, VALUE* argv, VALUE self)
{
  RGeo_GeometryData* self_data;
  const GEOSGeometry* self_geom;
  GEOSGeometry* valid_geom;
  self_data = RGEO_GEOMETRY_DATA_PTR(self);
  self_geom = self_data->geom;
  if (!self_geom)
    return Qnil;

  GEOSMakeValidParams* params = GEOSMakeValidParams_create();
  GEOSMakeValidParams_destroy(params);
  valid_geom = GEOSMakeValid(self_geom);

  if (!valid_geom) {
    rb_raise(rb_eRGeoInvalidGeometry,
             "%" PRIsVALUE,
             method_geometry_invalid_reason(self));
  }
  return rgeo_wrap_geos_geometry(self_data->factory, valid_geom, Qnil);
}

As you can see, I don't even call GEOSMakeValidWithParams. Obviously my tests fail, but valgrind shows different results for those examples. Why? Could it be because of the extHandle reference that is used inside those methods?

Even if I call GEOSMakeValidWithParams, which should use extHandle internally, it makes no difference.
Still a valgrind error: 8 bytes in 1 blocks are definitely lost in loss record 3 of 98

static VALUE
method_geometry_make_valid(int argc, VALUE* argv, VALUE self)
{
  RGeo_GeometryData* self_data;
  const GEOSGeometry* self_geom;
  GEOSGeometry* valid_geom;
  self_data = RGEO_GEOMETRY_DATA_PTR(self);
  self_geom = self_data->geom;
  if (!self_geom)
    return Qnil;

  GEOSMakeValidParams* params = GEOSMakeValidParams_create();
  valid_geom = GEOSMakeValidWithParams(self_geom, params);
  GEOSMakeValidParams_destroy(params);

  if (!valid_geom) {
    rb_raise(rb_eRGeoInvalidGeometry,
             "%" PRIsVALUE,
             method_geometry_invalid_reason(self));
  }
  return rgeo_wrap_geos_geometry(self_data->factory, valid_geom, Qnil);
}
