@u3ks
Contributor

@u3ks u3ks commented Apr 18, 2025

Hi all,

We implemented an algorithm to delineate contiguous areas within cities that have identical characteristics and configurations of buildings and streets, but we thought it might be useful for other applications since the procedure is quite generic.

The idea is that it is a kind of spatially restricted HDBSCAN, so there is only one parameter to specify: the minimum number of observations needed to form a cluster. The procedure consists of two steps: first, carrying out a full, spatially restricted sklearn.cluster.AgglomerativeClustering run; and second, extracting clusters from the resulting linkage matrix using density-clustering extraction algorithms, either Excess of Mass or Leaf. This results in multiscale (clusters have varying ranges of internal similarity), contiguous clusters with noise (some observations are not attached to any cluster).
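The first step described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the toy data, the chain-shaped contiguity matrix, the choice of Ward linkage, and all variable names are assumptions made for the example. It fits a connectivity-constrained agglomerative clustering over the full tree and assembles a SciPy-style linkage matrix from the fitted model; the density-based extraction step is not shown.

```python
import numpy as np
from scipy.sparse import diags
from sklearn.cluster import AgglomerativeClustering

# Toy attribute values for 12 observations (assumed data, for illustration only).
rng = np.random.default_rng(0)
n = 12
X = rng.normal(size=(n, 2))

# Assumed contiguity: observations form a chain 0-1-2-...-11,
# so each one neighbours only the previous and the next observation.
connectivity = diags(
    [np.ones(n - 1), np.ones(n - 1)], offsets=[-1, 1], format="csr"
)

# distance_threshold=0 with n_clusters=None builds the full tree
# and makes the merge distances available as `distances_`.
model = AgglomerativeClustering(
    n_clusters=None,
    distance_threshold=0,
    linkage="ward",
    connectivity=connectivity,
)
model.fit(X)

# Count the observations under each merge to fill the fourth linkage column.
n_samples = len(model.labels_)
counts = np.zeros(model.children_.shape[0])
for i, merge in enumerate(model.children_):
    size = 0
    for child in merge:
        if child < n_samples:
            size += 1  # leaf node: a single observation
        else:
            size += counts[child - n_samples]  # earlier merge
    counts[i] = size

# Columns: left child, right child, merge distance, cluster size.
linkage_matrix = np.column_stack(
    [model.children_, model.distances_, counts]
).astype(float)
```

Because merges are only allowed between neighbouring observations, every node of the resulting tree is spatially contiguous, which is what the later extraction step relies on.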

I try to explain in more detail how it works, with examples, advantages, and disadvantages, in the sa3.ipynb notebook.

@martinfleis martinfleis self-requested a review April 18, 2025 17:22
@jGaboardi jGaboardi requested review from gegen07 and knaaptime April 19, 2025 16:38
@jGaboardi jGaboardi added enhancement New feature or request region labels Apr 19, 2025

The algorithm carries out ``sklearn.cluster.AgglomerativeClustering``
per the specified parameters and extracts clusters from it, using density-clustering
extraction algorithms - Excess of Mass or Leaf. This results in multiscale,
Member

Can we have some reference to what EoM and Leaf mean?


from libpysal.graph import Graph
from libpysal.weights import W
from numpy import column_stack, full, unique, where, zeros
Member

can you import numpy as np and use np.where etc?

from libpysal.graph import Graph
from libpysal.weights import W
from numpy import column_stack, full, unique, where, zeros
from pandas import Series, concat
Member

Same as with numpy.
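A small sketch of the namespaced style requested in these two comments, assuming a toy label array and hypothetical variable names (this is not the code from the PR). It uses the same functions as the flagged imports, but through the conventional `np` and `pd` aliases, to renumber cluster labels consecutively while keeping noise as -1.

```python
import numpy as np
import pandas as pd

# Assumed toy labels: -1 marks noise, other values are arbitrary cluster ids.
labels = np.array([-1, 3, 3, -1, 7, -1])

is_noise = labels < 0
clustered_idx = np.where(~is_noise)[0]
noise_idx = np.where(is_noise)[0]

# Map the arbitrary cluster ids to consecutive integers 0, 1, ...
_, codes = np.unique(labels[clustered_idx], return_inverse=True)

# Recombine clustered and noise observations in their original order.
result = pd.concat(
    [
        pd.Series(codes, index=clustered_idx),
        pd.Series(np.full(noise_idx.size, -1), index=noise_idx),
    ]
).sort_index()
```

With the aliases, a reader can tell at each call site whether a function such as `where` or `concat` comes from NumPy or pandas.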

gdf,
w,
attrs_name,
min_cluster_size=15,
Member

The default is meaningless without knowing the use case. Shall we maybe make it a required arg?

@codecov

codecov bot commented May 6, 2025

Codecov Report

Attention: Patch coverage is 92.30769% with 6 lines in your changes missing coverage. Please review.

Project coverage is 78.2%. Comparing base (13ca45e) to head (ed383c4).
Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
spopt/region/sa3.py 92.2% 6 Missing ⚠️
Additional details and impacted files

@@           Coverage Diff           @@
##            main    #482     +/-   ##
=======================================
+ Coverage   77.8%   78.2%   +0.4%     
=======================================
  Files         27      28      +1     
  Lines       2638    2716     +78     
=======================================
+ Hits        2053    2125     +72     
- Misses       585     591      +6     
Files with missing lines Coverage Δ
spopt/region/__init__.py 100.0% <100.0%> (ø)
spopt/region/sa3.py 92.2% <92.2%> (ø)
@martinfleis
Member

@jGaboardi can we bump libpysal min req to 4.10 here?

@jGaboardi
Member

> @jGaboardi can we bump libpysal min req to 4.10 here?

Let's open an issue for that and discuss with @knaaptime, @gegen07, @ljwolf. I think it will be OK, but want to get their inputs.

@martinfleis martinfleis changed the title from "Spatial Adaptive Agglomerative Aggregation (SA3) clustering" to "ENH: add Spatial Adaptive Agglomerative Aggregation (SA3) regionalisation algorithm" May 7, 2025
Member

@martinfleis martinfleis left a comment

This is now fine by me! @u3ks I pushed some changes to future-proof the API (extraction keyword that is not a bool) and to clean the API (kwargs passed directly to sklearn rather than via a dedicated dictionary).
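The two API changes mentioned here can be illustrated with a hypothetical sketch. The class name `RegionClusterer`, its attributes, and the `build_model` helper are invented for this example and are not the API merged in this PR. It shows a string-valued `extraction` keyword, which leaves room for further options in a way a boolean flag does not, and extra keyword arguments forwarded directly to `sklearn.cluster.AgglomerativeClustering` rather than collected in a dedicated dictionary argument. In this sketch `min_cluster_size` has no default, which is also an assumption.

```python
from sklearn.cluster import AgglomerativeClustering

_EXTRACTIONS = ("eom", "leaf")


class RegionClusterer:
    """Hypothetical wrapper used only to illustrate the keyword design."""

    def __init__(self, min_cluster_size, extraction="eom", **kwargs):
        # A string keyword can gain new options later; a bool cannot.
        if extraction not in _EXTRACTIONS:
            raise ValueError(
                f"extraction must be one of {_EXTRACTIONS}, got {extraction!r}"
            )
        self.min_cluster_size = min_cluster_size
        self.extraction = extraction
        # Remaining keywords are kept and passed straight through to sklearn.
        self.clustering_kwds = kwargs

    def build_model(self):
        # Build the full tree; user keywords such as `linkage` are forwarded.
        return AgglomerativeClustering(
            n_clusters=None, distance_threshold=0, **self.clustering_kwds
        )


clusterer = RegionClusterer(15, extraction="leaf", linkage="average")
model = clusterer.build_model()
```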

@knaaptime
Member

cool

Member

@gegen07 gegen07 left a comment

LGTM

@martinfleis martinfleis merged commit 5e8fcad into pysal:main May 8, 2025
11 checks passed
fiendskrah pushed a commit to fiendskrah/spopt that referenced this pull request Oct 15, 2025
…tion algorithm (pysal#482)

* init

* notebook

* public extraction api

* notebook

* notebook

* change cluster renumbering

* formatting and docstrings

* reorder imports

* ci change

* more ci changes

* formatting

* more formatting

* test failures

* typo

* Update sa3.py

* backwards compat

* load data within setup_method

* lint

* better API

* fix tests i broke

---------

Co-authored-by: Martin Fleischmann <[email protected]>
Labels

enhancement New feature or request region

5 participants