Long story short, using PostgreSQL 11 and RUM index you can do both TOP-N query and COUNT(*) query for non-selective FTS queries without fetching all the matching results from heap (and that is certainly much faster). If you’re interested in details, then please read the detailed description below.

At November 1st 2017, Tom Lane committed a patch enabling bitmap scans to behave like index-only scan when possible. In particular, since PostgreSQL 11 COUNT(*) queries can be evaluated using bitmap scans without accessing heap, when corresponding bit in visibility map is set.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
commit 7c70996ebf0949b142a99c9445061c3c83ce62b3
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Wed Nov 1 17:38:12 2017 -0400

    Allow bitmap scans to operate as index-only scans when possible.

    If we don't have to return any columns from heap tuples, and there's
    no need to recheck qual conditions, and the heap page is all-visible,
    then we can skip fetching the heap page altogether.

    Skip prefetching pages too, when possible, on the assumption that the
    recheck flag will remain the same from one page to the next.  While that
    assumption is hardly bulletproof, it seems like a good bet most of the
    time, and better than prefetching pages we don't need.

    This commit installs the executor infrastructure, but doesn't change
    any planner cost estimates, thus possibly causing bitmap scans to
    not be chosen in cases where this change renders them the best choice.
    I (tgl) am not entirely convinced that we need to account for this
    behavior in the planner, because I think typically the bitmap scan would
    get chosen anyway if it's the best bet.  In any case the submitted patch
    took way too many shortcuts, resulting in too many clearly-bad choices,
    to be committable.

    Alexander Kuzmenkov, reviewed by Alexey Chernyshov, and whacked around
    rather heavily by me.

    Discussion: https://postgr.es/m/239a8955-c0fc-f506-026d-c837e86c827b@postgrespro.ru

This commit might seem to be just one of planner and executor optimizations, nice but doesn’t deserve much attention. However, under detailed consideration this patch appears to be significant improvement on the way of making fulltext search in PostgreSQL to be done in the right way.

I’ve started working on FTS improvements in 2012. That time I realized that GIN index is good for selective FTS queries, when the number of matching results is low. See an example below: GIN did the great work by returning just few dozens of matching rows very fast. The rest of operations including rank calculation and sorting are also fast, because they are made over very small row set.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM pgmail
WHERE fts @@ plainto_tsquery('english', 'exclusion constraint')
ORDER BY ts_rank_cd(fts, plainto_tsquery('english', 'exclusion constraint')) DESC
LIMIT 10;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=144.26..144.28 rows=10 width=784) (actual time=320.142..320.149 rows=10 loops=1)
   Buffers: shared hit=7138 read=7794
   ->  Sort  (cost=144.26..144.32 rows=25 width=784) (actual time=320.141..320.147 rows=10 loops=1)
         Sort Key: (ts_rank_cd(fts, '''exclus'' & ''constraint'''::tsquery)) DESC
         Sort Method: top-N heapsort  Memory: 38kB
         Buffers: shared hit=7138 read=7794
         ->  Bitmap Heap Scan on pgmail  (cost=44.20..143.72 rows=25 width=784) (actual time=5.232..315.302 rows=3357 loops=1)
               Recheck Cond: (fts @@ '''exclus'' & ''constraint'''::tsquery)
               Heap Blocks: exact=2903
               Buffers: shared hit=7138 read=7794
               ->  Bitmap Index Scan on pgmail_fts_idx  (cost=0.00..44.19 rows=25 width=0) (actual time=3.689..3.689 rows=3357 loops=1)
                     Index Cond: (fts @@ '''exclus'' & ''constraint'''::tsquery)
                     Buffers: shared hit=11 read=23
 Planning time: 0.176 ms
 Execution time: 320.213 ms
(15 rows)

But the situation is different when FTS query is not selective, and number of matching rows is high. Then we have fetch all the matching rows from the heap, rank them, and then sort. And despite we only need TOP-10 of matching rows, this query takes a lot of time. See the example below: “Tom Lane” is for sure non-selective query over PostgreSQL mailing lists.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM pgmail
WHERE fts @@ plainto_tsquery('english', 'Tom Lane')
ORDER BY ts_rank_cd(fts, plainto_tsquery('english', 'Tom Lane')) DESC
LIMIT 10;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=144.26..144.28 rows=10 width=784) (actual time=18110.231..18110.236 rows=10 loops=1)
   Buffers: shared hit=1358323 read=399077
   ->  Sort  (cost=144.26..144.32 rows=25 width=784) (actual time=18110.229..18110.231 rows=10 loops=1)
         Sort Key: (ts_rank_cd(fts, '''tom'' & ''lane'''::tsquery)) DESC
         Sort Method: top-N heapsort  Memory: 44kB
         Buffers: shared hit=1358323 read=399077
         ->  Bitmap Heap Scan on pgmail  (cost=44.20..143.72 rows=25 width=784) (actual time=70.267..17895.628 rows=224568 loops=1)
               Recheck Cond: (fts @@ '''tom'' & ''lane'''::tsquery)
               Rows Removed by Index Recheck: 266782
               Heap Blocks: exact=39841 lossy=79307
               Buffers: shared hit=1358323 read=399077
               ->  Bitmap Index Scan on pgmail_fts_idx  (cost=0.00..44.19 rows=25 width=0) (actual time=63.914..63.914 rows=224568 loops=1)
                     Index Cond: (fts @@ '''tom'' & ''lane'''::tsquery)
                     Buffers: shared hit=41 read=102
 Planning time: 0.131 ms
 Execution time: 18110.376 ms
(16 rows)(15 rows)

How this situation can be improved? If we would get results from index already pre-ordered by rank, then we would be able to evaluate TOP-N query without fetching the whole set of matching rows from heap. Unfortunately, that appears to be impossible for GIN index, which doesn’t contain enough of information to calculate the rank. However, once we have additional information about terms positions in the index, it might work. That would be enough for raking fulltext results inside the index.

Thus, I’ve a set of patches to the GIN index. Some of improvements were committed to 9.4, including index compression and index search optimization. However, additional information storage for GIN index wasn’t committed, because it changes GIN index structure too much.

Fortunately, we have extensible index access methods since PostgreSQL 9.6. And that enables us to implement things, which wasn’t accepted to GIN and even more, as a separate index access method RUM. Using RUM, one can execute TOP-N FTS query much faster without fetching all the matching rows from the heap.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM pgmail
WHERE fts @@ plainto_tsquery('english', 'Tom Lane')
ORDER BY fts <=> plainto_tsquery('english', 'Tom Lane')
LIMIT 10;
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=48.00..83.25 rows=10 width=1523) (actual time=242.974..248.366 rows=10 loops=1)
   Buffers: shared hit=809 read=25, temp read=187 written=552
   ->  Index Scan using pgmail_idx on pgmail  (cost=48.00..193885.14 rows=54984 width=1523) (actual time=242.972..248.358 rows=10 loops=1)
         Index Cond: (fts @@ '''tom'' & ''lane'''::tsquery)
         Order By: (fts <=> '''tom'' & ''lane'''::tsquery)
         Buffers: shared hit=809 read=25, temp read=187 written=552
 Planning time: 14.709 ms
 Execution time: 312.794 ms
(8 rows)

However, the problem persists if you need to get the total count of matching rows. Then PostgreSQL executor still have to fetch all the matching rows from the heap in order to check their visibility. So, if you need to get a total number of matching rows for pagination, then it might be still very slow.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) FROM pgmail
WHERE fts @@ plainto_tsquery('english', 'Tom Lane');
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=118931.46..118931.47 rows=1 width=8) (actual time=36263.708..36263.709 rows=1 loops=1)
   Buffers: shared hit=800692 read=348338
   ->  Bitmap Heap Scan on pgmail  (cost=530.19..118799.14 rows=52928 width=0) (actual time=74.724..36195.946 rows=224568 loops=1)
         Recheck Cond: (fts @@ '''tom'' & ''lane'''::tsquery)
         Rows Removed by Index Recheck: 266782
         Heap Blocks: exact=39841 lossy=79307
         Buffers: shared hit=800692 read=348338
         ->  Bitmap Index Scan on pgmail_fts_idx  (cost=0.00..516.96 rows=52928 width=0) (actual time=67.467..67.467 rows=224568 loops=1)
               Index Cond: (fts @@ '''tom'' & ''lane'''::tsquery)
               Buffers: shared hit=41 read=102
 Planning time: 0.210 ms
 Execution time: 36263.790 ms
(12 rows)

For sure, there are some modern UI’s techniques, which doesn’t require to show full number of results to user (continuous scrolling, for instance). Alternatively, one can use a planner estimation instead of true number of matching rows. But nevertheless, slow counting of total results number was still a problem for many of RUM users.

Situation was improved after index-only count was committed. Naturally, PostgreSQL maintains visibility map for heap pages. When visibility bit is set for particular heap page, then all the tuples on that page are visible; so, visibility check can be skipped. Using visibility map, index- only scan node is working. But, if index-only scan can skip fetching heap, then why can’t count do the same? Now it can!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) FROM pgmail
WHERE fts @@ plainto_tsquery('english', 'Tom Lane');
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=121794.28..121794.29 rows=1 width=8) (actual time=132.336..132.336 rows=1 loops=1)
   Buffers: shared hit=404
   ->  Bitmap Heap Scan on pgmail  (cost=558.13..121656.82 rows=54984 width=0) (actual time=83.676..116.889 rows=224568 loops=1)
         Recheck Cond: (fts @@ '''tom'' & ''lane'''::tsquery)
         Heap Blocks: exact=119148
         Buffers: shared hit=404
         ->  Bitmap Index Scan on pgmail_idx  (cost=0.00..544.38 rows=54984 width=0) (actual time=61.459..61.459 rows=224568 loops=1)
               Index Cond: (fts @@ '''tom'' & ''lane'''::tsquery)
               Buffers: shared hit=398
 Planning time: 0.183 ms
 Execution time: 133.885 ms
(11 rows)

Therefore, when you run non-selective fulltext query in PostgreSQL 11 with RUM, then neither TOP-N query neither COUNT(*) query fetches the whole set of matching rows from the heap. That’s cool, and I would say that fulltext search is made almost right in this case. It’s “almost”, because it would be nice to get both TOP-N and COUNT(*) in one SQL query and single index scan. And that is a room for improvement in future versions.

Comments