Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Commit 54d2002

Browse files
committed
Fix some problems with selectivity estimation for partial indexes.
First, genericcostestimate() was being way too liberal about including partial-index conditions in its selectivity estimate, resulting in substantial underestimates for situations such as an indexqual "x = 42" used with an index on x "WHERE x >= 40 AND x < 50". While the code is intentionally set up to favor selecting partial indexes when available, this was too much... Second, choose_bitmap_and() was likewise easily fooled by cases of this type, since it would similarly think that the partial index had selectivity independent of the indexqual. Fixed by using predicate_implied_by() rather than simple equality checks to determine redundancy. This is a good deal more expensive but I don't see much alternative. At least the extra cost is only paid when there's actually a partial index under consideration. Per report from Jeff Davis. I'm not going to risk back-patching this, though.
1 parent 2b49e5d commit 54d2002

File tree

4 files changed

+120
-60
lines changed

4 files changed

+120
-60
lines changed

src/backend/optimizer/path/indxpath.c

Lines changed: 75 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@
99
*
1010
*
1111
* IDENTIFICATION
12-
* $PostgreSQL: pgsql/src/backend/optimizer/path/indxpath.c,v 1.217 2007/03/17 00:11:04 tgl Exp $
12+
* $PostgreSQL: pgsql/src/backend/optimizer/path/indxpath.c,v 1.218 2007/03/21 22:18:12 tgl Exp $
1313
*
1414
*-------------------------------------------------------------------------
1515
*/
@@ -57,8 +57,8 @@ static Path *choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel,
5757
static int bitmap_path_comparator(const void *a, const void *b);
5858
static Cost bitmap_and_cost_est(PlannerInfo *root, RelOptInfo *rel,
5959
List *paths, RelOptInfo *outer_rel);
60-
static List *pull_indexpath_quals(Path *bitmapqual);
61-
static bool lists_intersect_ptr(List *list1, List *list2);
60+
static void find_indexpath_quals(Path *bitmapqual, List **quals, List **preds);
61+
static bool lists_intersect(List *list1, List *list2);
6262
static bool match_clause_to_indexcol(IndexOptInfo *index,
6363
int indexcol, Oid opfamily,
6464
RestrictInfo *rinfo,
@@ -562,6 +562,7 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel,
562562
Path **patharray;
563563
Cost costsofar;
564564
List *qualsofar;
565+
List *firstpred;
565566
ListCell *lastcell;
566567
int i;
567568
ListCell *l;
@@ -586,8 +587,7 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel,
586587
* consider an index redundant if any of its index conditions were already
587588
* used by earlier indexes. (We could use predicate_implied_by to have a
588589
* more intelligent, but much more expensive, check --- but in most cases
589-
* simple pointer equality should suffice, since after all the index
590-
* conditions are all coming from the same RestrictInfo lists.)
590+
* simple equality should suffice.)
591591
*
592592
* You might think the condition for redundancy should be "all index
593593
* conditions already used", not "any", but this turns out to be wrong.
@@ -597,10 +597,27 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel,
597597
* non-selective. In any case, we'd surely be drastically misestimating
598598
* the selectivity if we count the same condition twice.
599599
*
600-
* We include index predicate conditions in the redundancy test. Because
601-
* the test is just for pointer equality and not equal(), the effect is
602-
* that use of the same partial index in two different AND elements is
603-
* considered redundant. (XXX is this too strong?)
600+
* We must also consider index predicate conditions in checking for
601+
* redundancy, because the estimated selectivity of a partial index
602+
* includes its predicate even if the explicit index conditions don't.
603+
* Here we have to work harder than just checking expression equality:
604+
* we check to see if any of the predicate clauses are implied by
605+
* index conditions or predicate clauses of previous paths. This covers
606+
* cases such as a condition "x = 42" used with a plain index, followed
607+
* by a clauseless scan of a partial index "WHERE x >= 40 AND x < 50".
608+
* Also, we reject indexes that have a qual condition matching any
609+
* previously-used index's predicate (by including predicate conditions
610+
* into qualsofar). It should be sufficient to check equality in this
611+
* case, not implication, since we've sorted the paths by selectivity
612+
* and so tighter conditions are seen first --- only for exactly equal
613+
* cases might the partial index come first.
614+
*
615+
* XXX the reason we need all these redundancy checks is that costsize.c
616+
* and clausesel.c aren't very smart about redundant clauses: they will
617+
* usually double-count the redundant clauses, producing a too-small
618+
* selectivity that makes a redundant AND look like it reduces the total
619+
* cost. Perhaps someday that code will be smarter and we can remove
620+
* these heuristics.
604621
*
605622
* Note: outputting the selected sub-paths in selectivity order is a good
606623
* thing even if we weren't using that as part of the selection method,
@@ -619,26 +636,46 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel,
619636

620637
paths = list_make1(patharray[0]);
621638
costsofar = bitmap_and_cost_est(root, rel, paths, outer_rel);
622-
qualsofar = pull_indexpath_quals(patharray[0]);
639+
find_indexpath_quals(patharray[0], &qualsofar, &firstpred);
640+
qualsofar = list_concat(qualsofar, firstpred);
623641
lastcell = list_head(paths); /* for quick deletions */
624642

625643
for (i = 1; i < npaths; i++)
626644
{
627645
Path *newpath = patharray[i];
628646
List *newqual;
647+
List *newpred;
629648
Cost newcost;
630649

631-
newqual = pull_indexpath_quals(newpath);
632-
if (lists_intersect_ptr(newqual, qualsofar))
650+
find_indexpath_quals(newpath, &newqual, &newpred);
651+
if (lists_intersect(newqual, qualsofar))
633652
continue; /* consider it redundant */
653+
if (newpred)
654+
{
655+
bool redundant = false;
656+
657+
/* we check each predicate clause separately */
658+
foreach(l, newpred)
659+
{
660+
Node *np = (Node *) lfirst(l);
661+
662+
if (predicate_implied_by(list_make1(np), qualsofar))
663+
{
664+
redundant = true;
665+
break; /* out of inner loop */
666+
}
667+
}
668+
if (redundant)
669+
continue;
670+
}
634671
/* tentatively add newpath to paths, so we can estimate cost */
635672
paths = lappend(paths, newpath);
636673
newcost = bitmap_and_cost_est(root, rel, paths, outer_rel);
637674
if (newcost < costsofar)
638675
{
639676
/* keep newpath in paths, update subsidiary variables */
640677
costsofar = newcost;
641-
qualsofar = list_concat(qualsofar, newqual);
678+
qualsofar = list_concat(list_concat(qualsofar, newqual), newpred);
642679
lastcell = lnext(lastcell);
643680
}
644681
else
@@ -710,35 +747,39 @@ bitmap_and_cost_est(PlannerInfo *root, RelOptInfo *rel,
710747
}
711748

712749
/*
713-
* pull_indexpath_quals
750+
* find_indexpath_quals
714751
*
715-
* Given the Path structure for a plain or bitmap indexscan, extract a list
752+
* Given the Path structure for a plain or bitmap indexscan, extract lists
716753
* of all the indexquals and index predicate conditions used in the Path.
717754
*
718755
* This is sort of a simplified version of make_restrictinfo_from_bitmapqual;
719756
* here, we are not trying to produce an accurate representation of the AND/OR
720757
* semantics of the Path, but just find out all the base conditions used.
721758
*
722-
* The result list contains pointers to the expressions used in the Path,
759+
* The result lists contain pointers to the expressions used in the Path,
723760
* but all the list cells are freshly built, so it's safe to destructively
724-
* modify the list (eg, by concat'ing it with other lists).
761+
* modify the lists (eg, by concat'ing with other lists).
725762
*/
726-
static List *
727-
pull_indexpath_quals(Path *bitmapqual)
763+
static void
764+
find_indexpath_quals(Path *bitmapqual, List **quals, List **preds)
728765
{
729-
List *result = NIL;
730766
ListCell *l;
731767

768+
*quals = NIL;
769+
*preds = NIL;
770+
732771
if (IsA(bitmapqual, BitmapAndPath))
733772
{
734773
BitmapAndPath *apath = (BitmapAndPath *) bitmapqual;
735774

736775
foreach(l, apath->bitmapquals)
737776
{
738-
List *sublist;
777+
List *subquals;
778+
List *subpreds;
739779

740-
sublist = pull_indexpath_quals((Path *) lfirst(l));
741-
result = list_concat(result, sublist);
780+
find_indexpath_quals((Path *) lfirst(l), &subquals, &subpreds);
781+
*quals = list_concat(*quals, subquals);
782+
*preds = list_concat(*preds, subpreds);
742783
}
743784
}
744785
else if (IsA(bitmapqual, BitmapOrPath))
@@ -747,36 +788,36 @@ pull_indexpath_quals(Path *bitmapqual)
747788

748789
foreach(l, opath->bitmapquals)
749790
{
750-
List *sublist;
791+
List *subquals;
792+
List *subpreds;
751793

752-
sublist = pull_indexpath_quals((Path *) lfirst(l));
753-
result = list_concat(result, sublist);
794+
find_indexpath_quals((Path *) lfirst(l), &subquals, &subpreds);
795+
*quals = list_concat(*quals, subquals);
796+
*preds = list_concat(*preds, subpreds);
754797
}
755798
}
756799
else if (IsA(bitmapqual, IndexPath))
757800
{
758801
IndexPath *ipath = (IndexPath *) bitmapqual;
759802

760-
result = get_actual_clauses(ipath->indexclauses);
761-
result = list_concat(result, list_copy(ipath->indexinfo->indpred));
803+
*quals = get_actual_clauses(ipath->indexclauses);
804+
*preds = list_copy(ipath->indexinfo->indpred);
762805
}
763806
else
764807
elog(ERROR, "unrecognized node type: %d", nodeTag(bitmapqual));
765-
766-
return result;
767808
}
768809

769810

770811
/*
771-
* lists_intersect_ptr
812+
* lists_intersect
772813
* Detect whether two lists have a nonempty intersection,
773-
* using pointer equality to compare members.
814+
* using equal() to compare members.
774815
*
775816
* This possibly should go into list.c, but it doesn't yet have any use
776817
* except in choose_bitmap_and.
777818
*/
778819
static bool
779-
lists_intersect_ptr(List *list1, List *list2)
820+
lists_intersect(List *list1, List *list2)
780821
{
781822
ListCell *cell1;
782823

@@ -787,7 +828,7 @@ lists_intersect_ptr(List *list1, List *list2)
787828

788829
foreach(cell2, list2)
789830
{
790-
if (lfirst(cell2) == datum1)
831+
if (equal(lfirst(cell2), datum1))
791832
return true;
792833
}
793834
}

src/backend/utils/adt/selfuncs.c

Lines changed: 27 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@
1515
*
1616
*
1717
* IDENTIFICATION
18-
* $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.229 2007/03/17 00:11:05 tgl Exp $
18+
* $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.230 2007/03/21 22:18:12 tgl Exp $
1919
*
2020
*-------------------------------------------------------------------------
2121
*/
@@ -86,6 +86,7 @@
8686
#include "optimizer/pathnode.h"
8787
#include "optimizer/paths.h"
8888
#include "optimizer/plancat.h"
89+
#include "optimizer/predtest.h"
8990
#include "optimizer/restrictinfo.h"
9091
#include "optimizer/var.h"
9192
#include "parser/parse_coerce.h"
@@ -4775,36 +4776,38 @@ genericcostestimate(PlannerInfo *root,
47754776
List *selectivityQuals;
47764777
ListCell *l;
47774778

4778-
/*
4779+
/*----------
47794780
* If the index is partial, AND the index predicate with the explicitly
47804781
* given indexquals to produce a more accurate idea of the index
4781-
* selectivity. This may produce redundant clauses. We get rid of exact
4782-
* duplicates in the code below. We expect that most cases of partial
4783-
* redundancy (such as "x < 4" from the qual and "x < 5" from the
4784-
* predicate) will be recognized and handled correctly by
4785-
* clauselist_selectivity(). This assumption is somewhat fragile, since
4786-
* it depends on predicate_implied_by() and clauselist_selectivity()
4787-
* having similar capabilities, and there are certainly many cases where
4788-
* we will end up with a too-low selectivity estimate. This will bias the
4789-
* system in favor of using partial indexes where possible, which is not
4790-
* necessarily a bad thing. But it'd be nice to do better someday.
4782+
* selectivity. However, we need to be careful not to insert redundant
4783+
* clauses, because clauselist_selectivity() is easily fooled into
4784+
* computing a too-low selectivity estimate. Our approach is to add
4785+
* only the index predicate clause(s) that cannot be proven to be implied
4786+
* by the given indexquals. This successfully handles cases such as a
4787+
* qual "x = 42" used with a partial index "WHERE x >= 40 AND x < 50".
4788+
* There are many other cases where we won't detect redundancy, leading
4789+
* to a too-low selectivity estimate, which will bias the system in favor
4790+
* of using partial indexes where possible. That is not necessarily bad
4791+
* though.
47914792
*
4792-
* Note that index->indpred and indexQuals are both in implicit-AND form,
4793-
* so ANDing them together just takes merging the lists. However,
4794-
* eliminating duplicates is a bit trickier because indexQuals contains
4795-
* RestrictInfo nodes and the indpred does not. It is okay to pass a
4796-
* mixed list to clauselist_selectivity, but we have to work a bit to
4797-
* generate a list without logical duplicates. (We could just list_union
4798-
* indpred and strippedQuals, but then we'd not get caching of per-qual
4799-
* selectivity estimates.)
4793+
* Note that indexQuals contains RestrictInfo nodes while the indpred
4794+
* does not. This is OK for both predicate_implied_by() and
4795+
* clauselist_selectivity().
4796+
*----------
48004797
*/
48014798
if (index->indpred != NIL)
48024799
{
4803-
List *strippedQuals;
4804-
List *predExtraQuals;
4800+
List *predExtraQuals = NIL;
4801+
4802+
foreach(l, index->indpred)
4803+
{
4804+
Node *predQual = (Node *) lfirst(l);
4805+
List *oneQual = list_make1(predQual);
48054806

4806-
strippedQuals = get_actual_clauses(indexQuals);
4807-
predExtraQuals = list_difference(index->indpred, strippedQuals);
4807+
if (!predicate_implied_by(oneQual, indexQuals))
4808+
predExtraQuals = list_concat(predExtraQuals, oneQual);
4809+
}
4810+
/* list_concat avoids modifying the passed-in indexQuals list */
48084811
selectivityQuals = list_concat(predExtraQuals, indexQuals);
48094812
}
48104813
else

src/test/regress/expected/select.out

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -209,9 +209,13 @@ SELECT onek.unique1, onek.string4 FROM onek
209209
-- test partial btree indexes
210210
--
211211
-- As of 7.2, planner probably won't pick an indexscan without stats,
212-
-- so ANALYZE first.
212+
-- so ANALYZE first. Also, we want to prevent it from picking a bitmapscan
213+
-- followed by sort, because that could hide index ordering problems.
213214
--
214215
ANALYZE onek2;
216+
SET enable_seqscan TO off;
217+
SET enable_bitmapscan TO off;
218+
SET enable_sort TO off;
215219
--
216220
-- awk '{if($1<10){print $0;}else{next;}}' onek.data | sort +0n -1
217221
--
@@ -288,6 +292,9 @@ SELECT onek2.unique1, onek2.stringu1 FROM onek2
288292
999 | LMAAAA
289293
(19 rows)
290294

295+
RESET enable_seqscan;
296+
RESET enable_bitmapscan;
297+
RESET enable_sort;
291298
SELECT two, stringu1, ten, string4
292299
INTO TABLE tmp
293300
FROM onek;

src/test/regress/sql/select.sql

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -59,10 +59,15 @@ SELECT onek.unique1, onek.string4 FROM onek
5959
-- test partial btree indexes
6060
--
6161
-- As of 7.2, planner probably won't pick an indexscan without stats,
62-
-- so ANALYZE first.
62+
-- so ANALYZE first. Also, we want to prevent it from picking a bitmapscan
63+
-- followed by sort, because that could hide index ordering problems.
6364
--
6465
ANALYZE onek2;
6566

67+
SET enable_seqscan TO off;
68+
SET enable_bitmapscan TO off;
69+
SET enable_sort TO off;
70+
6671
--
6772
-- awk '{if($1<10){print $0;}else{next;}}' onek.data | sort +0n -1
6873
--
@@ -81,6 +86,10 @@ SELECT onek2.unique1, onek2.stringu1 FROM onek2
8186
SELECT onek2.unique1, onek2.stringu1 FROM onek2
8287
WHERE onek2.unique1 > 980;
8388

89+
RESET enable_seqscan;
90+
RESET enable_bitmapscan;
91+
RESET enable_sort;
92+
8493

8594
SELECT two, stringu1, ten, string4
8695
INTO TABLE tmp

0 commit comments

Comments
 (0)