Environment
{
"server": "arango",
"license": "community",
"version": "3.12.4"
}
Issue Summary
Two nearly identical AQL queries produce different results when a simple filter is added.
- Query 1 (without doc._is_latest == TRUE) returns the expected results.
- Query 2 (with the additional filter doc._is_latest == TRUE) returns no results.
- This seems to happen only when I filter by fields indexed with [*].
- It seems to work correctly with other indexed fields.
- It appears that filtering by any [*] field causes the issue when an additional filter is applied to a field that is not included in the index.
Both queries use the same inverted index (kev_epss_inv) with forceIndexHint: true. The execution plan confirms that the filters are covered by the index.
Expected Behavior
Query 2 should return a filtered subset of the documents returned in Query 1. Since all documents returned by Query 1 already satisfy that condition, both queries should return the same results.
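The expected relationship between the two result sets can be sketched in plain Python. This is only an in-memory model, not ArangoDB code: the helper names query1 and query2 are hypothetical, the two documents are taken from the sample data in this report, and it assumes that doc.labels[*] == 'epss' is meant to match when any element of labels equals 'epss'.

```python
# Hypothetical in-memory model of the two filters; this is not ArangoDB code.
docs = [
    {"id": "report--8a22a246-f149-538c-8d1d-611cb74de030", "_is_latest": True, "labels": ["epss"]},
    {"id": "report--cb0f7af1-cc05-5103-b112-3620f50295cd", "_is_latest": True, "labels": ["epss"]},
]


def query1(documents):
    # Assumed meaning of doc.labels[*] == 'epss': any element equals 'epss'.
    return [d for d in documents if any(label == "epss" for label in d.get("labels", []))]


def query2(documents):
    # Query 1 plus the extra condition doc._is_latest == TRUE.
    return [d for d in query1(documents) if d.get("_is_latest") is True]


result1 = query1(docs)
result2 = query2(docs)
ids1 = [d["id"] for d in result1]
ids2 = [d["id"] for d in result2]

# Every Query 2 hit must also be a Query 1 hit; for this data the sets are equal.
print(set(ids2) <= set(ids1), ids1 == ids2)
```

Under that assumption, the extra condition can only narrow the result, and for this data it narrows nothing.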
Actual Behavior
Query 2 returns an empty result set, even though all documents from Query 1 have _is_latest == true.
Query 1
Query String (200 chars, results cachable: true):
FOR doc IN nvd_cve_vertex_collection OPTIONS {indexHint: "kev_epss_inv", forceIndexHint: true}
FILTER doc.labels[*] == 'epss' //AND doc._is_latest == TRUE
LIMIT 10
RETURN KEEP(doc, 'id', '_is_latest')
Execution plan:
Id NodeType Par Est. Comment
1 SingletonNode 1 * ROOT
8 IndexNode 2602005 - FOR doc IN nvd_cve_vertex_collection /* inverted index scan, index scan + document lookup */
5 LimitNode 10 - LIMIT 0, 10
6 CalculationNode ✓ 10 - LET #4 = KEEP(doc, "id", "_is_latest") /* simple expression */ /* collections used: doc : nvd_cve_vertex_collection */
7 ReturnNode 10 - RETURN #4
Indexes used:
By Name Type Collection Unique Sparse Cache Selectivity Fields Stored values Ranges
8 kev_epss_inv inverted nvd_cve_vertex_collection false true false n/a [ `id`, `labels[*]`, `name` ] [ ] (doc.`labels`[*] == "epss")
Functions used:
Name Deterministic Cacheable Uses V8
KEEP true true false
Optimization rules applied:
Id Rule Name Id Rule Name
1 use-indexes 3 remove-unnecessary-calculations-2
2 remove-filter-covered-by-index 4 async-prefetch
58 rule(s) executed, 1 plan(s) created, peak mem [b]: 0, exec time [s]: 0.00034
Result 1
[
{
"_is_latest": true,
"id": "report--8a22a246-f149-538c-8d1d-611cb74de030"
},
{
"_is_latest": true,
"id": "report--cb0f7af1-cc05-5103-b112-3620f50295cd"
},
{
"_is_latest": true,
"id": "report--cb16f552-f80e-5ef5-bf8a-9b6d1e4f05ba"
},
{
"_is_latest": true,
"id": "report--cb26eb1b-7cb7-5c70-9422-e8ab47c9ea46"
},
{
"_is_latest": true,
"id": "report--cb29d083-25bc-5abd-8755-749083cfefc1"
},
{
"_is_latest": true,
"id": "report--cb2e23bb-e26d-5af2-a018-81cb25a9822c"
},
{
"_is_latest": true,
"id": "report--cb3934e4-b433-5fea-8500-010195624c80"
},
{
"_is_latest": true,
"id": "report--cb46ea40-0943-5611-b15b-ded5b3b58fe9"
},
{
"_is_latest": true,
"id": "report--cb49b842-85b5-567e-be43-d307455af4bb"
},
{
"_is_latest": true,
"id": "report--cb5e8543-e2e9-5667-b065-3d5017d68eca"
}
]
Query 2
Query String (198 chars, results cachable: true):
FOR doc IN nvd_cve_vertex_collection OPTIONS {indexHint: "kev_epss_inv", forceIndexHint: true}
FILTER doc.labels[*] == 'epss' AND doc._is_latest == TRUE
LIMIT 10
RETURN KEEP(doc, 'id', '_is_latest')
Execution plan:
Id NodeType Par Est. Comment
1 SingletonNode 1 * ROOT
8 IndexNode 2602005 - FOR doc IN nvd_cve_vertex_collection /* inverted index scan, index scan + document lookup (filter projections: `_is_latest`, `labels`) */ FILTER ((doc.`labels`[*] == "epss") && (doc.`_is_latest` == true)) /* early pruning */
5 LimitNode 10 - LIMIT 0, 10
6 CalculationNode ✓ 10 - LET #4 = KEEP(doc, "id", "_is_latest") /* simple expression */ /* collections used: doc : nvd_cve_vertex_collection */
7 ReturnNode 10 - RETURN #4
Indexes used:
By Name Type Collection Unique Sparse Cache Selectivity Fields Stored values Ranges
8 kev_epss_inv inverted nvd_cve_vertex_collection false true false n/a [ `id`, `labels[*]`, `name` ] [ ] (doc.`labels`[*] == "epss")
Functions used:
Name Deterministic Cacheable Uses V8
KEEP true true false
Optimization rules applied:
Id Rule Name Id Rule Name Id Rule Name
1 use-indexes 2 move-filters-into-enumerate 3 async-prefetch
57 rule(s) executed, 1 plan(s) created, peak mem [b]: 0, exec time [s]: 0.00066
Result 2
[]
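One possible reading of the two plans, offered as an unconfirmed hypothesis rather than a diagnosis: Plan 1 lists remove-filter-covered-by-index, so only the index lookup decides which documents match. Plan 2 does not list that rule and instead keeps the whole FILTER on the IndexNode (early pruning) after move-filters-into-enumerate. If that retained filter is evaluated as an ordinary AQL expression, doc.labels[*] == "epss" compares an array with a string, which is false in plain AQL, so every document would be rejected. The Python sketch below only models that assumption; index_lookup and post_filter are hypothetical names, not ArangoDB internals.

```python
# Hypothetical model of the suspected mismatch; this is not ArangoDB code.
doc = {"_is_latest": True, "labels": ["epss"]}


def index_lookup(d):
    # Assumed inverted index semantics: match if any array element equals 'epss'.
    return any(label == "epss" for label in d["labels"])


def post_filter(d):
    # Assumed plain expression semantics: the whole array is compared with the
    # string, which is never equal, so the extra condition is irrelevant.
    return d["labels"] == "epss" and d["_is_latest"] is True


# Plan 1: the filter is removed, so the index lookup alone decides.
passes_query1 = index_lookup(doc)

# Plan 2: the index lookup is followed by the retained filter.
passes_query2 = index_lookup(doc) and post_filter(doc)

print(passes_query1, passes_query2)
```

If this reading is right, the empty result would come from the labels comparison being evaluated a second time, not from the _is_latest condition itself.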
Steps to Reproduce
- Create a collection and import the following documents:
[ { "_is_latest": true, "id": "report--8a22a246-f149-538c-8d1d-611cb74de030", "labels": [ "epss" ], "name": "EPSS Scores: CVE-2025-31772" }, { "_is_latest": true, "id": "report--cb0f7af1-cc05-5103-b112-3620f50295cd", "labels": [ "epss" ], "name": "EPSS Scores: CVE-2019-8950" }, { "_is_latest": true, "id": "report--cb16f552-f80e-5ef5-bf8a-9b6d1e4f05ba", "labels": [ "epss" ], "name": "EPSS Scores: CVE-2018-11289" }, { "_is_latest": true, "id": "report--cb26eb1b-7cb7-5c70-9422-e8ab47c9ea46", "labels": [ "epss" ], "name": "EPSS Scores: CVE-2019-6596" }, { "_is_latest": true, "id": "report--cb29d083-25bc-5abd-8755-749083cfefc1", "labels": [ "epss" ], "name": "EPSS Scores: CVE-2018-16071" }, { "_is_latest": true, "id": "report--cb2e23bb-e26d-5af2-a018-81cb25a9822c", "labels": [ "epss" ], "name": "EPSS Scores: CVE-2018-18810" }, { "_is_latest": true, "id": "report--cb3934e4-b433-5fea-8500-010195624c80", "labels": [ "epss" ], "name": "EPSS Scores: CVE-2018-15518" }, { "_is_latest": true, "id": "report--cb46ea40-0943-5611-b15b-ded5b3b58fe9", "labels": [ "epss" ], "name": "EPSS Scores: CVE-2018-17584" }, { "_is_latest": true, "id": "report--cb49b842-85b5-567e-be43-d307455af4bb", "labels": [ "epss" ], "name": "EPSS Scores: CVE-2019-9563" }, { "_is_latest": true, "id": "report--cb5e8543-e2e9-5667-b065-3d5017d68eca", "labels": [ "epss" ], "name": "EPSS Scores: CVE-2018-5800" } ]
- Create the following inverted index on the collection:
{ "type": "inverted", "name": "kev_epss_inv", "inBackground": true, "analyzer": "", "features": [], "cache": false, "includeAllFields": false, "trackListPositions": false, "searchField": false, "fields": [ { "name": "id" }, { "name": "labels[*]" }, { "name": "name" } ], "cleanupIntervalStep": 2, "commitIntervalMsec": 1000, "consolidationIntervalMsec": 1000, "writebufferIdle": 64, "writebufferActive": 0, "writebufferSizeMax": 33554432, "primarySort": { "fields": [ { "field": "", "direction": "asc" } ], "compression": "lz4" }, "storedValues": [ { "fields": [], "compression": "lz4" } ], "consolidationPolicy": { "type": "tier", "segmentsMin": 1, "segmentsMax": 10, "segmentsBytesMax": 5368709120, "segmentsBytesFloor": 2097152, "minScore": 0 } }
- Run the two queries above, ensuring they both use the kev_epss_inv index with forceIndexHint: true.