In every database product, sequential scans (full table scans) are resource intensive, so most developers and DBAs see them as a performance killer. In my opinion, this is a false perception or a biased view: in many cases, sequential scans prove to be a performance booster. Yet, for lack of understanding of the internals, many practitioners demonize sequential scans and try to eliminate them from their queries' execution plans. This blog focuses on that and tries to shed some light on the internals of sequential scans.

What is a sequential scan?

It is a method of accessing data in which every record in a table is scanned to see whether it meets the desired search criteria. When data is fetched from the database, the engine has to identify which records satisfy the WHERE clause filter. To accomplish that, database products sometimes need to scan all the records in the table, and the matching records are then set aside for further processing.

It happens in sequential order of pages: every page of the table is read into the "shared buffers" area of shared memory, and the result set is then placed in the per-backend work area (work_mem) for further operations, such as joins, sorting, and so on.

Process

Sequential scans happen in two phases:

– Copying pages in memory
– Fetching data from memory pages

Copying pages in memory

In every database, data stored on the file system has to follow the conventions of both the operating system and the database itself. On disk, data is stored in pages of a uniform size, so in order to read data, the database first needs to read those blocks, also known as pages. In the case of PostgreSQL, the page size is 8 kB. Whenever PostgreSQL needs to read data, the required database pages are loaded into a dedicated section of the memory area known as shared buffers.
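For instance, we can confirm the page size and the size of the shared buffers area on a running server; a minimal sketch (8192 bytes is the compile-time default page size):

```sql
-- Block (page) size is fixed when the server is built; 8 kB by default:
SHOW block_size;

-- Size of the shared buffers area that pages are copied into:
SHOW shared_buffers;
```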

Fetching data from memory pages

After the required pages are loaded into shared buffers, PostgreSQL scans those blocks in memory and extracts rows from them. During this operation, each and every page is read, and every record is compared against the filter condition.



This is the stage where the CPU does its work: it reads pages from shared buffers, extracts rows from them, and applies the filter logic. The processor keeps iterating from one page to the next and from one record to the next until the last record has been read, which is why the CPU stays busy while a sequential scan is running.
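Both phases can be observed with the BUFFERS option of EXPLAIN; a sketch, assuming a hypothetical table t:

```sql
-- "shared read" counts pages copied from disk into shared buffers (phase one);
-- "shared hit" counts pages that were already cached there. Row extraction
-- shows up as CPU time in the "actual time" figures.
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM t;
-- Seq Scan on t  (actual time=... rows=... loops=1)
--   Buffers: shared hit=... read=...
```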

Cost calculation

As described in the process section, the complete operation happens in two phases, and the total cost is presented as the sum of the two costs calculated separately. The cost of the first phase is the I/O cost, which estimates the effort required to read pages from disk. The other is the CPU cost.

Disk run cost

In the case of sequential scans, every tuple is read from disk. On disk, data pages are stored sequentially and are accessed in the same fashion, so the disk reads blocks in sequence; these are also called sequential reads. In PostgreSQL, the estimated cost of fetching one page this way is set by the seq_page_cost parameter, and multiplying it by the total number of pages in the table gives the disk run cost.

(The number of pages in a table is stored in the relpages column of pg_class.)

CPU run cost

When pages are loaded into memory (shared buffers), they are still in the form of raw PostgreSQL blocks and are of no use until the data inside them is extracted. This is where the CPU's job starts: it has to identify the tuples in each page and extract them so that further operations can be performed. In PostgreSQL, the estimated cost of processing one record is set by the cpu_tuple_cost parameter, and multiplying it by the total number of rows in the table gives the CPU run cost (cpu_run_cost).

(The number of records in a table is stored in the reltuples column of pg_class.)
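Both statistics, along with the current cost parameters, are easy to inspect; a sketch for a hypothetical table t:

```sql
-- Per-table statistics the planner uses (populated by ANALYZE):
SELECT relpages, reltuples FROM pg_class WHERE relname = 't';

-- Cost parameters (defaults: seq_page_cost = 1, cpu_tuple_cost = 0.01):
SHOW seq_page_cost;
SHOW cpu_tuple_cost;
```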

The total cost required to perform a sequential scan is the sum of the CPU run cost and the disk run cost: total_cost = (seq_page_cost × relpages) + (cpu_tuple_cost × reltuples).

Source: https://github.com/postgres/postgres/blob/master/src/backend/optimizer/path/costsize.c (see the cost_seqscan() function).

To validate this, we can compute the total cost by hand from the statistics in pg_class and the cost parameters, and then compare it with what EXPLAIN ANALYZE reports: the manually computed value matches the upper bound of the "cost=" figure in the plan.
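A minimal sketch, assuming a hypothetical table t that has been ANALYZEd:

```sql
-- Manual estimate: (seq_page_cost × relpages) + (cpu_tuple_cost × reltuples)
SELECT relpages * current_setting('seq_page_cost')::float8
     + reltuples * current_setting('cpu_tuple_cost')::float8 AS total_cost
FROM pg_class
WHERE relname = 't';

-- The planner's estimate appears as the upper bound of "cost=" in the plan:
EXPLAIN ANALYZE SELECT * FROM t;
-- Seq Scan on t  (cost=0.00..<total_cost> rows=... width=...) (actual time=...)
```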

Sequential-scan-prone conditions

Missing indexes

In any query returning a relatively small data set, an index speeds up data retrieval. An index maps key values to the physical locations (TIDs) of the matching rows, so when table data is queried, the planner tries to use an index associated with the columns or expressions in the WHERE predicate. If there is no index on a column used in the WHERE clause, the executor has no way to locate the matching records directly, and a sequential scan becomes inevitable.
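An illustrative sketch (table, column, and index names hypothetical; the chosen plan also depends on the data):

```sql
-- Without an index on "email", the only way to apply the filter is a Seq Scan:
EXPLAIN SELECT * FROM users WHERE email = 'alice@example.com';
-- Seq Scan on users  Filter: (email = 'alice@example.com')

-- After adding an index, the planner can jump straight to the matching rows:
CREATE INDEX idx_users_email ON users (email);
ANALYZE users;
EXPLAIN SELECT * FROM users WHERE email = 'alice@example.com';
-- Index Scan using idx_users_email on users ...
```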

Stale statistics

In PostgreSQL, cost-based optimization is designed to choose the most efficient execution plan, and it relies strictly on table statistics, which are gathered by ANALYZE. The planner calculates the costs of the different access methods from the available statistics and compares them; the method with the lowest cost wins, and the execution engine goes with it. For example, if the sequential scan's cost is 3000 and the index scan's cost is 150, the index scan is chosen. If statistics are not refreshed frequently, the estimates go stale, which can lead to poor execution plans and unnecessary sequential scans.
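Statistics can be refreshed manually, and their freshness checked; a sketch with a hypothetical table name:

```sql
-- Recompute the statistics the planner depends on:
ANALYZE users;

-- Check when statistics were last gathered (manually or by autovacuum):
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'users';
```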

Large result set

From the above two points, it is clear that up-to-date statistics and proper indexing help avoid sequential scans. However, the planner may still choose to access data sequentially even when both requirements are fulfilled. This happens when the query is expected to return a huge portion of the table; in such a case, going through an index adds overhead (random page reads plus the index traversal itself) and would actually delay execution.
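For example, with an index on id in place, the plan flips depending on how much of the table the predicate matches (hypothetical table; exact plans depend on the data):

```sql
-- Highly selective predicate: the index wins.
EXPLAIN SELECT * FROM users WHERE id = 42;
-- Index Scan using users_pkey on users ...

-- Predicate matching most of the table: the planner prefers a Seq Scan,
-- because reading every page sequentially is cheaper than random index probes.
EXPLAIN SELECT * FROM users WHERE id > 10;
-- Seq Scan on users  Filter: (id > 10)
```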

Conclusion

Sequential scans go through all the records in a table in a two-stage operation: copying blocks into shared memory and extracting rows from those blocks. Sequential scans may happen because of stale statistics or missing indexes; however, when a query returns a large portion of the table, a sequential scan is the better choice rather than a problem.

If you need assistance with optimizing database performance or resolving PostgreSQL issues, be sure to check out our 24/7 PostgreSQL support and services.
