Commit 5f4d98d

Prevent premature nbtree array advancement.

nbtree array index scans could fail to return matching tuples in rare cases where the missed tuples cover key space that the scan's arrays incorrectly indicate has already been read. These cases involved nearby tuples with NULL values that were evaluated using a skip array key while in pstate.forcenonrequired mode.

To fix, prevent forcenonrequired mode from prematurely advancing the scan's array keys beyond key space that the scan has yet to read tuples from: reset the scan's array keys (to the first elements in the current scan direction) before the _bt_checkkeys call for pstate.finaltup. That way _bt_checkkeys starts from a clean slate, which ensures that it will call _bt_advance_array_keys (while passing it sktrig_required=true). This reliably restores the invariant that the scan's arrays always accurately track its progress through the index's key space (at least when the scan is "between pages").

Oversight in commit 8a51027, which optimized nbtree search scan key comparisons.

Author: Peter Geoghegan <[email protected]>
Reviewed-By: Mark Dilger <[email protected]>
Discussion: https://postgr.es/m/CAH2-WzmodSE+gpTd1CRGU9ez8ytyyDS+Kns2r9NzgUp1s56kpw@mail.gmail.com

1 parent 7e25c93 commit 5f4d98d

2 files changed, +39 -11 lines changed
src/backend/access/nbtree/nbtsearch.c

+7 -1

@@ -1790,9 +1790,13 @@ _bt_readpage(IndexScanDesc scan, ScanDirection dir, OffsetNumber offnum,
             IndexTuple  itup = (IndexTuple) PageGetItem(page, iid);
             int         truncatt;

-            truncatt = BTreeTupleGetNAtts(itup, rel);
+            /* Reset arrays, per _bt_set_startikey contract */
+            if (pstate.forcenonrequired)
+                _bt_start_array_keys(scan, dir);
             pstate.forcenonrequired = false;
             pstate.startikey = 0;   /* _bt_set_startikey ignores P_HIKEY */
+
+            truncatt = BTreeTupleGetNAtts(itup, rel);
             _bt_checkkeys(scan, &pstate, arrayKeys, itup, truncatt);
         }

@@ -1879,8 +1883,10 @@ _bt_readpage(IndexScanDesc scan, ScanDirection dir, OffsetNumber offnum,
             pstate.offnum = offnum;
             if (arrayKeys && offnum == minoff && pstate.forcenonrequired)
             {
+                /* Reset arrays, per _bt_set_startikey contract */
                 pstate.forcenonrequired = false;
                 pstate.startikey = 0;
+                _bt_start_array_keys(scan, dir);
             }
             passes_quals = _bt_checkkeys(scan, &pstate, arrayKeys,
                                          itup, indnatts);
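
The ordering that both hunks enforce (rewind the arrays first, then run the required-mode check against the page's final tuple) can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not nbtree code: `ToyScan`, `toy_start_array_keys`, `toy_advance_required`, and `toy_check_finaltup` are hypothetical names, the "array" is a single sorted list of integers scanned forward, and the stale cursor position stands in for whatever forcenonrequired mode left behind.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy scan state; NOT PostgreSQL's BTScanOpaque. */
typedef struct ToyScan
{
    const int  *elems;      /* sorted array elements, e.g. for "a IN (...)" */
    size_t      nelems;     /* number of elements */
    size_t      cur;        /* index of the current array element */
} ToyScan;

/* Toy stand-in for rewinding the array to its first element (forward scan). */
static void
toy_start_array_keys(ToyScan *scan)
{
    assert(scan->nelems > 0);
    scan->cur = 0;
}

/*
 * Toy "required" advancement: move the cursor forward to the first element
 * that is >= tupval.  It never moves backwards, so a cursor that is already
 * too far ahead stays too far ahead.
 */
static void
toy_advance_required(ToyScan *scan, int tupval)
{
    while (scan->cur < scan->nelems && scan->elems[scan->cur] < tupval)
        scan->cur++;
}

/*
 * Toy check of the page's final tuple.  When the page was processed in the
 * relaxed mode, the cursor cannot be trusted, so rewind it before advancing.
 * Returns the resulting cursor position.
 */
static size_t
toy_check_finaltup(ToyScan *scan, int finaltup, int was_forcenonrequired)
{
    if (was_forcenonrequired)
        toy_start_array_keys(scan);
    toy_advance_required(scan, finaltup);
    return scan->cur;
}
```

In this model, with elements {10, 20, 30, 40}, a stale cursor at index 3, and a final tuple value of 15, skipping the rewind leaves the cursor at 40, so the key space for 20 and 30 would be treated as already read. Rewinding first lands the cursor on 20, which matches the scan's real progress.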

src/backend/access/nbtree/nbtutils.c

+32 -10

@@ -2489,13 +2489,14 @@ _bt_oppodir_checkkeys(IndexScanDesc scan, ScanDirection dir,
  * primscan's first page would mislead _bt_advance_array_keys, which expects
  * pstate.nskipadvances to be representative of every first page's key space.)
  *
- * Caller must reset startikey and forcenonrequired ahead of the _bt_checkkeys
- * call for pstate.finaltup iff we set forcenonrequired=true.  This will give
- * _bt_checkkeys the opportunity to call _bt_advance_array_keys once more,
- * with sktrig_required=true, to advance the arrays that were ignored during
- * checks of all of the page's prior tuples.  Caller doesn't need to do this
- * on the rightmost/leftmost page in the index (where pstate.finaltup isn't
- * set), since forcenonrequired won't be set here by us in the first place.
+ * Caller must call _bt_start_array_keys and reset startikey/forcenonrequired
+ * ahead of the finaltup _bt_checkkeys call when we set forcenonrequired=true.
+ * This will give _bt_checkkeys the opportunity to call _bt_advance_array_keys
+ * with sktrig_required=true, restoring the invariant that the scan's required
+ * arrays always track the scan's progress through the index's key space.
+ * Caller won't need to do this on the rightmost/leftmost page in the index
+ * (where pstate.finaltup isn't ever set), since forcenonrequired will never
+ * be set here in the first place.
  */
 void
 _bt_set_startikey(IndexScanDesc scan, BTReadPageState *pstate)
@@ -2556,10 +2557,31 @@ _bt_set_startikey(IndexScanDesc scan, BTReadPageState *pstate)
         if (key->sk_flags & SK_ROW_HEADER)
         {
             /*
-             * Can't let pstate.startikey get set to an ikey beyond a
-             * RowCompare inequality
+             * RowCompare inequality.
+             *
+             * Only the first subkey from a RowCompare can ever be marked
+             * required (that happens when the row header is marked required).
+             * There is no simple, general way for us to transitively deduce
+             * whether or not every tuple on the page satisfies a RowCompare
+             * key based only on firsttup and lasttup -- so we just give up.
              */
-            break;              /* unsafe */
+            if (!start_past_saop_eq && !so->skipScan)
+                break;          /* unsafe to go further */
+
+            /*
+             * We have to be even more careful with RowCompares that come
+             * after an array: we assume it's unsafe to even bypass the array.
+             * Calling _bt_start_array_keys to recover the scan's arrays
+             * following use of forcenonrequired mode isn't compatible with
+             * _bt_check_rowcompare's continuescan=false behavior with NULL
+             * row compare members.  _bt_advance_array_keys must not make a
+             * decision on the basis of a key not being satisfied in the
+             * opposite-to-scan direction until the scan reaches a leaf page
+             * where the same key begins to be satisfied in scan direction.
+             * The _bt_first !used_all_subkeys behavior makes this limitation
+             * hard to work around some other way.
+             */
+            return;             /* completely unsafe to set pstate.startikey */
         }
         if (key->sk_strategy != BTEqualStrategyNumber)
         {
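
The two exits that the RowCompare branch now has can be restated as a small decision function. This is a hedged sketch rather than code from the patch: the enum and `toy_rowcompare_action` are hypothetical names, and the two boolean parameters only mirror `start_past_saop_eq` and `so->skipScan` as they appear in the hunk above.

```c
#include <stdbool.h>

/* Hypothetical outcome names; the patch itself uses "break" and "return". */
typedef enum ToyRowCompareAction
{
    TOY_STOP_AT_ROWCOMPARE,     /* "break": unsafe to go further */
    TOY_LEAVE_STARTIKEY_UNSET   /* "return": unsafe to set pstate.startikey */
} ToyRowCompareAction;

/*
 * Decide what to do on reaching a RowCompare header key.  If no array was
 * bypassed earlier (no SAOP equality passed over, and not a skip scan), it is
 * enough to stop at the RowCompare.  Otherwise give up entirely, because the
 * arrays could not be recovered safely after forcenonrequired mode.
 */
static ToyRowCompareAction
toy_rowcompare_action(bool start_past_saop_eq, bool skip_scan)
{
    if (!start_past_saop_eq && !skip_scan)
        return TOY_STOP_AT_ROWCOMPARE;
    return TOY_LEAVE_STARTIKEY_UNSET;
}
```

Only the combination with no preceding array takes the milder exit; either flag being set is enough to abandon the optimization for the page.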
