ENH: add Spatial Adaptive Agglomerative Aggregation (SA3) regionalisation algorithm #482
The algorithm carries out ``sklearn.cluster.AgglomerativeClustering``
per the specified parameters and extracts clusters from it, using density-clustering
extraction algorithms - Excess of Mass or Leaf. This results in multiscale,
Can we have some reference to what EoM and Leaf mean?
spopt/region/sa3.py
Outdated
from libpysal.graph import Graph
from libpysal.weights import W
from numpy import column_stack, full, unique, where, zeros
can you import numpy as np and use np.where etc?
spopt/region/sa3.py
Outdated
from libpysal.graph import Graph
from libpysal.weights import W
from numpy import column_stack, full, unique, where, zeros
from pandas import Series, concat
Same as with numpy.
spopt/region/sa3.py
Outdated
gdf,
w,
attrs_name,
min_cluster_size=15,
The default is meaningless without knowing the use case. Shall we maybe make it a required arg?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #482 +/- ##
=======================================
+ Coverage 77.8% 78.2% +0.4%
=======================================
Files 27 28 +1
Lines 2638 2716 +78
=======================================
+ Hits 2053 2125 +72
- Misses 585 591 +6
@jGaboardi can we bump the libpysal min req to 4.10 here?
Let's open an issue for that and discuss with @knaaptime, @gegen07, @ljwolf. I think it will be OK, but I want to get their input.
martinfleis
left a comment
This is now fine by me! @u3ks I pushed some changes to future-proof the API (extraction keyword that is not a bool) and to clean the API (kwargs passed directly to sklearn rather than via a dedicated dictionary).
cool
gegen07
left a comment
LGTM
…tion algorithm (pysal#482) * init * notebook * public extraction api * notebook * notebook * change cluster renumbering * formatting and docstrings * reorder imports * ci change * more ci changes * formatting * more formatting * test failures * typo * Update sa3.py * backwards compat * load data within setup_method * lint * better API * fix tests i broke --------- Co-authored-by: Martin Fleischmann <[email protected]>
Hi all,
We implemented an algorithm to delineate contiguous areas within cities that have identical characteristics and configurations of buildings and streets, but we thought it might be useful for other applications since the procedure is quite generic.
The idea is that it is a kind of spatially restricted HDBSCAN, so there is only one parameter to specify: the minimum number of observations needed to form a cluster. The procedure basically consists of two steps: first, carrying out a full, spatially restricted `sklearn.cluster.AgglomerativeClustering` clustering; and second, extracting clusters from the resulting linkage matrix, using density-clustering extraction algorithms - Excess of Mass or Leaf. This results in multiscale (clusters have varying ranges of internal similarity), contiguous clusters with noise (some observations are not attached to any cluster).
I try to explain more about how it works, with examples, advantages and disadvantages, in the `sa3.ipynb` notebook.
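To make the first step concrete, here is a minimal sketch of a full, spatially restricted agglomerative clustering and of how a linkage matrix can be assembled from it. It is not the code in `sa3.py`: the toy 6x6 lattice, the random attributes, the Ward linkage and the use of `grid_to_graph` for contiguity are all assumptions made for illustration (the PR itself takes a libpysal `Graph` or `W`). The second step, the Excess of Mass or Leaf extraction, is not shown.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.image import grid_to_graph

# Toy data (assumption): a 6x6 lattice of observations with two attributes.
rng = np.random.default_rng(0)
n_side = 6
X = rng.normal(size=(n_side * n_side, 2))

# Sparse adjacency matrix linking each lattice cell to its direct neighbours.
# This stands in for a spatial weights object in this sketch.
connectivity = grid_to_graph(n_side, n_side)

# Step 1: build the full merge tree, allowing merges only between
# spatially connected groups of observations.
model = AgglomerativeClustering(
    n_clusters=None,
    distance_threshold=0,  # forces the full tree and populates distances_
    linkage="ward",
    connectivity=connectivity,
)
model.fit(X)

# Count the observations under each merge so the result follows the
# SciPy linkage-matrix layout: [child_a, child_b, distance, size].
n_samples = X.shape[0]
counts = np.zeros(model.children_.shape[0])
for i, merge in enumerate(model.children_):
    size = 0
    for child in merge:
        if child < n_samples:
            size += 1  # leaf node, a single observation
        else:
            size += counts[child - n_samples]  # previously merged node
    counts[i] = size

linkage_matrix = np.column_stack([model.children_, model.distances_, counts])
```

Each row of `linkage_matrix` records one merge, so a tree over `n` observations has `n - 1` rows, and the last row covers all observations. The extraction step would then walk this tree and keep only the groups that satisfy the minimum cluster size.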