Feature: v1 endpoints for address stats with pagination and filtering #894

1yam · 2025-12-15T14:05:42Z

The goal of this PR is to add a new endpoint, /api/v1/addresses/stats.json, that serves address stats with filtering and pagination.

Related Clickup or Jira tickets : ALEPH-XXX

Self proofreading checklist

Is my code clear enough and well documented
Are my files well typed
New translations have been added or updated if new strings have been introduced in the frontend
Database migrations file are included
Are there enough tests
Documentation has been included (for new feature)

Changes

This pull request introduces a new, efficient, and flexible system for querying address statistics, including a new API endpoint, database materialized views, and backend logic. The main focus is to enable advanced filtering, sorting, and substring search of addresses, along with robust pagination and improved performance. Comprehensive tests are also added to ensure correctness.

The most important changes are:

Database and Backend Infrastructure:

Added a new Alembic migration to create the address_total_message_stats materialized view, which aggregates total message counts per address and includes indexes (including a trigram index) to support fast substring search and efficient sorting/filtering.
Updated the backend logic in messages.py to:
- Add fetch_stats_for_addresses for advanced address stats queries with filtering, sorting, and pagination.
- Add find_matching_addresses for fast substring search using the new trigram index.
- Ensure materialized views are refreshed together for up-to-date stats.

API and Schema Enhancements:

Introduced the AddressesQueryParams schema, supporting flexible query parameters for filtering, sorting, and pagination of address statistics, including substring search.
Added a new API endpoint /api/v1/addresses/stats.json with the addresses_stats_view_v2 handler, which leverages the new backend logic and schema for efficient address stats queries.

Testing:

Added comprehensive tests for address stats functions, covering substring search, filtering, sorting, and pagination to ensure correctness and robustness of the new querying system.

Copilot

Pull request overview

This PR introduces a new v1 API endpoint /api/v1/addresses/stats.json that provides address statistics with enhanced filtering, sorting, and pagination capabilities. The implementation leverages PostgreSQL materialized views with trigram indexing for efficient substring search.

Key changes:

Database materialized view address_total_message_stats with trigram indexing for fast address substring search
New query parameter schema with support for filtering by message type counts, sorting options, and pagination
Backend functions for fetching address statistics and finding matching addresses
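
As a rough illustration of how a client might call the endpoint summarized above, here is a minimal sketch that only builds the request URL. The host and the query parameter names (`address_contains`, `sort_by`, `page`, `per_page`) are assumptions made for the example; the authoritative names are whatever AddressesQueryParams defines.

```python
from urllib.parse import urlencode

BASE_URL = "https://node.example.org"  # hypothetical node URL


def build_stats_url(address_contains=None, sort_by=None, page=1, per_page=20):
    """Build a request URL for /api/v1/addresses/stats.json.

    Parameter names are assumptions for illustration only.
    """
    params = {"page": page, "per_page": per_page}
    if address_contains is not None:
        params["address_contains"] = address_contains
    if sort_by is not None:
        params["sort_by"] = sort_by
    # urlencode percent-encodes values, so user input is safe to embed.
    return f"{BASE_URL}/api/v1/addresses/stats.json?{urlencode(params)}"


print(build_stats_url(address_contains="0xab", sort_by="messages"))
```

Optional parameters are omitted rather than sent empty, so the server-side defaults apply.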

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 13 comments.


| File | Description |
| --- | --- |
| `deployment/migrations/versions/0040_d6539a42cd51_create_address_summary_view.py` | Creates materialized view for address message counts with trigram index for substring search |
| `src/aleph/schemas/addresses_query_params.py` | Defines query parameters schema with filtering, sorting, and pagination support |
| `src/aleph/db/accessors/messages.py` | Adds `fetch_stats_for_addresses` and `find_matching_addresses` functions with SQL-based queries |
| `src/aleph/web/controllers/accounts.py` | Implements new v2 endpoint handler with pagination and custom JSON encoding for Decimal types |
| `src/aleph/web/controllers/routes.py` | Registers new v1 endpoint route |
| `tests/db/test_address_stats.py` | Comprehensive test coverage for address stats functions including filtering, sorting, and pagination |
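
The summary mentions custom JSON encoding for Decimal types. Aggregates such as SUM typically reach Python as `Decimal`, which `json.dumps` rejects by default. The sketch below shows one common way to handle that with a `default` hook; the function name and sample row are hypothetical and this is not the handler's actual code.

```python
import json
from decimal import Decimal


def encode_decimal(value):
    """Fallback for json.dumps: turn whole-number Decimals into ints."""
    if isinstance(value, Decimal):
        # Message counts are whole numbers; fall back to float otherwise.
        return int(value) if value == value.to_integral_value() else float(value)
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


row = {"address": "0xabc", "messages": Decimal("42")}  # hypothetical row
print(json.dumps(row, default=encode_decimal))
```

Converting to `int` at the query layer instead would remove the need for a custom encoder altogether.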


src/aleph/schemas/addresses_query_params.py

src/aleph/db/accessors/messages.py

src/aleph/web/controllers/accounts.py

src/aleph/db/accessors/messages.py

src/aleph/schemas/addresses_query_params.py

src/aleph/db/accessors/messages.py

src/aleph/schemas/addresses_query_params.py

nesitor

Some parts were done with AI and need to be redone properly, following the patterns we already have and avoiding security issues.

src/aleph/db/accessors/messages.py

deployment/migrations/versions/0041_d6539a42cd51_create_address_summary_view.py

src/aleph/web/controllers/accounts.py

Copilot

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.


deployment/migrations/versions/0040_d6539a42cd51_create_address_summary_view.py

src/aleph/db/accessors/address.py

src/aleph/schemas/addresses_query_params.py

src/aleph/web/controllers/accounts.py

src/aleph/db/accessors/address.py

src/aleph/schemas/addresses_query_params.py

src/aleph/services/cache/node_cache.py

src/aleph/db/accessors/address.py

odesenfans

Found a few bugs.

deployment/migrations/versions/0041_d6539a42cd51_create_address_summary_view.py

odesenfans · 2026-01-05T14:24:28Z

src/aleph/db/accessors/address.py

+    ).group_by(AddressStats.address)
+
+    # Filter by address (list)
+    if addresses:


There's a bug here. If find_matching_addresses returns an empty list (i.e. no match), this will query all of the addresses instead. You should:

Check for addresses is not None rather than relying on truthiness.

Detect the case where there is no match and skip this function entirely (no need for a DB query if we know that there is no match).

Wouldn't it be better to return a 404 here if matched_address is an empty list?

https://github.com/aleph-im/pyaleph/pull/894/changes#diff-a2f5afabe3554da832215ad15f43df800d131001df7ab23cbb75cf061df8e36eR119
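
To make the truthiness issue in this thread concrete, here is a small standalone sketch. It uses an in-memory list in place of the database, and `select_addresses` and `ALL_ADDRESSES` are hypothetical names, not part of the PR.

```python
from typing import List, Optional, Sequence

ALL_ADDRESSES = ["0xaaa", "0xabc", "0xdef"]  # stand-in for the stats table


def select_addresses(addresses: Optional[Sequence[str]]) -> List[str]:
    """Return the addresses to report on.

    None means "no address filter was requested". An empty sequence means
    "the filter matched nothing" and must yield no rows.
    """
    if addresses is not None:
        if not addresses:
            return []  # short-circuit: no query needed when nothing matched
        wanted = set(addresses)
        return [a for a in ALL_ADDRESSES if a in wanted]
    return list(ALL_ADDRESSES)


print(select_addresses(None))       # no filter
print(select_addresses([]))         # filter with no match
print(select_addresses(["0xabc"]))  # filter with one match
```

With `if addresses:` in place of the `is not None` check, the empty-list case would fall through to the unfiltered branch, which is the behaviour described in the review comment.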

odesenfans · 2026-01-05T14:25:18Z

src/aleph/services/cache/node_cache.py

+        self,
+        session: DbSession,
+        filters: Optional[Dict[str, Any]] = None,
+    ) -> int:


What's the logic behind this caching? Is querying the materialized view too slow?

The caching was there to avoid recomputing the total number of addresses every time.
When a user browses the explorer and switches pages, we would otherwise re-query the pagination total on every request.

It was mostly for the filter I added earlier, but if we remove the filter part we might want to remove this as well. It shouldn't be that expensive to query, even if we query it 10 times.

odesenfans · 2026-01-05T14:27:57Z

src/aleph/services/cache/node_cache.py

+            enum_filters = {SortBy(k): v for k, v in filters.items()}
+
+        # Pass per_page=0 to disable pagination for the count query
+        stmt = make_fetch_stats_address_query(filters=enum_filters, per_page=0)


You never specify address_contains here, so you always query all addresses.
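
One way to avoid this kind of drift is to route both the page query and the count query through the same filter. The sketch below illustrates the idea on an in-memory list; all names and sample rows are hypothetical, and the PR's real code builds SQL statements instead.

```python
ROWS = [
    {"address": "0xaaa", "messages": 5},
    {"address": "0xabc", "messages": 12},
    {"address": "0xabd", "messages": 7},
    {"address": "0xdef", "messages": 1},
]


def apply_filter(rows, address_contains=None):
    """Single place where the address filter is defined."""
    if address_contains is None:
        return list(rows)
    needle = address_contains.lower()
    return [r for r in rows if needle in r["address"].lower()]


def fetch_page(rows, page, per_page, address_contains=None):
    matched = apply_filter(rows, address_contains)
    start = (page - 1) * per_page
    return matched[start : start + per_page]


def count_rows(rows, address_contains=None):
    # Reuses apply_filter, so the total always matches what fetch_page sees.
    return len(apply_filter(rows, address_contains))


print(count_rows(ROWS, address_contains="0xab"))
print(fetch_page(ROWS, page=2, per_page=1, address_contains="0xab"))
```

If the count ignored `address_contains`, the reported total would cover every address while the pages only covered the matches.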

odesenfans · 2026-01-05T14:30:28Z

src/aleph/db/accessors/address.py

+
+    address_query = (
+        select(AddressTotalMessages.address)
+        .where(AddressTotalMessages.address.ilike(pattern))


Use lower() to make sure that you hit the GIN index.

Suggested change

- .where(AddressTotalMessages.address.ilike(pattern))
+ .where(func.lower(AddressTotalMessages.address).like(pattern.lower()))

odesenfans · 2026-01-05T14:34:12Z

src/aleph/db/accessors/address.py

+
+
+def find_matching_addresses(
+    session: DbSession, address_contains: str, limit: int = 5000


5000 is a bit big. Could result in performance issues when using the result in later queries, especially for ones with lots of potential matches. It's not a huge problem right now, but maybe a subquery could be better? i.e. pass address_contains to make_fetch_stats_address_query and use a subquery if it is present?
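
The subquery idea can be sketched with the standard-library `sqlite3` module. The table name, column names, and sample rows are invented for the example, and SQLite stands in for PostgreSQL, so this shows only the query shape, not the PR's implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address_stats (address TEXT, type TEXT, nb_messages INTEGER)"
)
conn.executemany(
    "INSERT INTO address_stats VALUES (?, ?, ?)",
    [
        ("0xABC1", "POST", 3),
        ("0xABC1", "STORE", 2),
        ("0xabc2", "POST", 4),
        ("0xDEF3", "POST", 9),
    ],
)


def stats_for_matching(conn, address_contains):
    """Aggregate per address, restricting rows with IN (subquery).

    The matching addresses never leave the database, so no Python-side
    list (and no arbitrary limit on its size) is needed.
    """
    sql = """
        SELECT address, SUM(nb_messages) AS total
        FROM address_stats
        WHERE address IN (
            SELECT DISTINCT address
            FROM address_stats
            WHERE instr(lower(address), lower(?)) > 0
        )
        GROUP BY address
        ORDER BY address
    """
    return conn.execute(sql, (address_contains,)).fetchall()


print(stats_for_matching(conn, "abc"))
```

The search term is passed as a bound parameter, and `instr` performs a plain substring test, so no LIKE pattern has to be assembled from user input.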

odesenfans · 2026-01-05T14:47:14Z

src/aleph/schemas/addresses_query_params.py

+    filters: Dict[SortBy, int] | None = Field(
+        default=None,
+        description="Minimum values required for each sort category. Example: { 'POST': 3 }",
+    )


What is the point of this field? Defining it as a dictionary is also weird, how are you supposed to pass it as query parameters?

If I remember correctly, it was something like:

?filters[POST]=300

The goal was to filter on a minimum value. For example, if you want addresses that sent STORE messages, you can just filter on that.
But I guess this isn't really useful, so I can remove it.
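
For reference, bracket-style parameters such as `?filters[POST]=300` arrive as flat keys and have to be regrouped by hand. The following sketch shows one way to do that with the standard library; `parse_filters` is a hypothetical helper, not something the PR contains.

```python
import re
from urllib.parse import parse_qsl

_BRACKET_KEY = re.compile(r"^filters\[(\w+)\]$")


def parse_filters(query_string: str) -> dict:
    """Collect filters[KEY]=VALUE pairs into {KEY: int(VALUE)}."""
    filters = {}
    for key, value in parse_qsl(query_string):
        match = _BRACKET_KEY.match(key)
        if match:
            filters[match.group(1)] = int(value)
    return filters


print(parse_filters("filters[POST]=300&filters[STORE]=5&page=2"))
```

The need for this kind of manual regrouping is part of why a dictionary field is awkward to expose as query parameters.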

odesenfans · 2026-01-05T14:48:38Z

src/aleph/web/controllers/accounts.py

    return web.json_response(output, dumps=lambda v: json.dumps(v))


+async def addresses_stats_view_v2(request: web.Request):


Add at least a simple integration test that uses most features of the endpoint.

odesenfans · 2026-01-05T14:49:24Z

src/aleph/web/controllers/routes.py

    )

    app.router.add_get("/api/v0/addresses/stats.json", accounts.addresses_stats_view)
+    app.router.add_get("/api/v1/addresses/stats.json", accounts.addresses_stats_view_v2)


Naming: a bit weird to have v1 use a function called v2. Name the original addresses_stats_view_v0 and the new one addresses_stats_view_v1.


odesenfans · 2026-01-06T17:14:56Z

src/aleph/db/accessors/address.py

+    Subquery defining the set of addresses to include.
+    Only used when address filtering is requested.
+    """
+    pattern = f"%{address_contains.lower()}%"


Missed this, this is a risk for SQL injection. Remove this and use func.lower(AddressStats.address).contains(address_contains.lower()) instead.
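
Whatever label one gives the risk, a user-supplied substring placed inside a LIKE pattern lets `%` and `_` act as wildcards unless they are escaped. The sketch below demonstrates literal matching with `sqlite3` and an explicit ESCAPE clause; the table and helper names are hypothetical. SQLAlchemy's `contains()` accepts an `autoescape=True` flag for a similar purpose, though whether the PR relies on it is not shown in this thread.

```python
import sqlite3


def escape_like(term: str, escape: str = "\\") -> str:
    """Escape LIKE metacharacters so the term matches literally."""
    return (
        term.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addresses (address TEXT)")
conn.executemany(
    "INSERT INTO addresses VALUES (?)", [("0xabc",), ("0x_bc",), ("0x%ff",)]
)


def search(conn, term):
    """Substring search where % and _ in the term have no special meaning."""
    pattern = f"%{escape_like(term.lower())}%"
    rows = conn.execute(
        "SELECT address FROM addresses "
        "WHERE lower(address) LIKE ? ESCAPE '\\' ORDER BY address",
        (pattern,),
    )
    return [r[0] for r in rows]


print(search(conn, "_"))   # only the address containing a literal underscore
print(search(conn, "AB"))  # case-insensitive substring match
```

Without the escaping step, a search for `_` would match every address of at least one character.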

src/aleph/schemas/addresses_query_params.py


odesenfans · 2026-01-07T10:11:55Z

src/aleph/db/models/address.py

+class ViewBase:
+    pass
+
+
+# Create the base with the class as a template
+Base = declarative_base(cls=ViewBase)


Why do you declare a new Base class? Any reason not to reuse the one that's in db/models/base.py?

Copilot AI review requested due to automatic review settings December 15, 2025 14:05

Copilot started reviewing on behalf of 1yam December 15, 2025 14:09 View session

1yam requested review from gmolki, nesitor and odesenfans December 15, 2025 14:10

Copilot AI reviewed Dec 15, 2025

View reviewed changes

nesitor requested changes Dec 15, 2025

View reviewed changes

1yam requested review from Copilot and nesitor December 17, 2025 12:38

1yam self-assigned this Dec 17, 2025

Copilot started reviewing on behalf of 1yam December 17, 2025 12:39 View session

Copilot AI reviewed Dec 17, 2025

View reviewed changes

1yam force-pushed the 1yam-address-improvment branch from d2ab60a to d07f760 Compare December 22, 2025 09:31

odesenfans requested changes Jan 5, 2026

View reviewed changes

1yam added 16 commits January 6, 2026 16:37

Feature: v1 endpoints for address stats with pagination and filtering

60f4552

Feature: new node cache functions count_address_stats

9008ec0

Fix: remove pure sql db accessors

06e6de7

Fix: query params for adddress sortby messages instead of MESSAGES

efd2016

Feature: DB model for the two materialized biew address_stats_mat_view and address_total_message_stats

e7e9667

Feature: db accessors for address stats v1 and refactor of the endpoints

ef917e5

fix: unit test for test address stats

531dc78

fix: json encpdong error with decimal

e35824f

fix: docstring wrong revsion

3946255

fix: typo and typing

105735b

fix: use .scalars().all() instead of .all() for address matching

5e5ce3c

fix: keep consistency between type

19ebbef

fix: missing import

ca84832

Fix: count_address_stats missing per_page=0

167ae03

fix: migrations for adddress summary down revisions should be e1f2a3b4c5d6

075bcae

fix: count_address_stats does not need to be in node cache

db06ee9

1yam added 6 commits January 6, 2026 16:37

fix: remove AddressTotalMessages, AddressStats is enough for what we need

6d56c5f

fix: remove refresh of the removed mat view

0ce6f9b

refactor: addresses_stats_view renamed to addresses_stats_view_v0 and addresses_stats_view_v2 to addresses_stats_view_v1

0762329

Fix: use count_address_stats from db instead of nodecache

43954da

Fix: fully replace the added view to the old mat view and news index for faster search

7d98da7

Unit: improvent on unit test for the news stats endpoints / db accessors

d3bf43f

1yam force-pushed the 1yam-address-improvment branch from 7aa2dc4 to d3bf43f Compare January 6, 2026 15:37

odesenfans reviewed Jan 6, 2026

View reviewed changes

1yam added 3 commits January 6, 2026 18:45

Fix: use contains instead of ilike for security reason

23d052d

Fix: count_address_stats should also use make_address_filter_subquery

f92dce1

Fix: address_contains should have a maxsize of 66 char

b43afb8

odesenfans reviewed Jan 6, 2026

View reviewed changes

src/aleph/schemas/addresses_query_params.py Show resolved Hide resolved

Update src/aleph/schemas/addresses_query_params.py

2eb8fd3

odesenfans reviewed Jan 6, 2026

View reviewed changes

src/aleph/schemas/addresses_query_params.py Outdated Show resolved Hide resolved

Update src/aleph/schemas/addresses_query_params.py

0f28a51

odesenfans reviewed Jan 6, 2026

View reviewed changes

src/aleph/schemas/addresses_query_params.py Outdated Show resolved Hide resolved

odesenfans added 6 commits January 7, 2026 00:15

Update src/aleph/schemas/addresses_query_params.py

a51cb5d

fmt

9f8823b

fixes:

ad83852

* use same model for "data" in v0 and v1
* return ints in all cases
* reduce duplication between v0 and v1

fix test

56b1b21

sort by address by default

9986431

fix tests

045f402

odesenfans reviewed Jan 7, 2026

View reviewed changes

odesenfans added 2 commits January 7, 2026 11:18

fixes for review

a558282

ignore views in alembic autogeneration

ca56b6a

odesenfans merged commit cfa8466 into main Jan 7, 2026
5 checks passed

odesenfans deleted the 1yam-address-improvment branch January 7, 2026 10:44
