Explain Query Plan and Query Performance with Postgres

DB/RDB 2019. 8. 23. 09:34

1. Overview

Let's analyze the query plan with Postgres to enhance performance using explain and analysis.

2. Description

2.1 Optimizer

The brain of the database, which interprets queries and determines the fastest method of execution. A single query optimization technique can increase database performance drastically.

2.2 Explain & Analyze

2.2.1 Explain

Displaying the execution plan that the PostgreSQL planner generates for the supplied statement
Gives an exact breakdown of a query.
Based on the statistics about the table
Identify the most efficient path to the data
Takes different database paradigms into consideration such as indexes
Only guessing a plan that it thinks it will execute

2.2.2 Analyze

Basically runs a query to find the processing time to execute

3. Down to the nuts and bolts

Most commonly occurring matches are scanned using the seq scan and the least common matches are scanned using an index scan, anything in between is scanned using a bitmap heap scan followed by an index scan. One of the reasons for this is that random I/O is very slow as compared to sequential I/O.

3.1 Bitmap scan (Sequential Scan)

A sequential scan
Sequentially open a shortlist of disk pages and grab every applicable row in each one.

3.2 Index scan

The index is visited row by row in order
Results in disk page being visited multiple times

3.3 Comparison between Sequential scan and index scan

Cases where a sequential scan is faster than an index scan

Usually faster than reading in random order because an index scan requires several I/O for each row which includes looking up a row in the index looking up and retrieving the row from memory
The sequential scan requires a single I/O operation to retrieve more than one block containing multiple rows.

3.4 Query plan on JOINS

3.4.1 Nested Loop

Can use either a sequential scan or index scan
sequential scan used when the second table is small
The basic logic of choosing between a sequential scan and index scan applied here too

3.4.2 Hash Join

Creating a hash table of the smaller table on the join key
The larger table is scanned, searching the hash table for the rows which meet the join condition
Require a lot of memory to store the hash table in the first place

3.4.3 Merge Join

Similar to merge sort algorithm
Planner sorts both tables to be joined on the join attribute
Scanned in parallel to fine the matching values

4. Index

4.1 Multiple column indexes

A multi-column B-Tree index
Most efficient when there are constraints on the leading columns(leftmost).

4.2 Cover-index

When an index includes the actual values of all the fields specified in the query, the index covers the query and does not require an additional step to fetch the actual values from the data service. An index, in this case, is called a covering index and the query is called a covered query. As a result, covered queries are faster and deliver better performance.

4.2.1 Without Covering Indexes

4.2.2 With Covering Indexes

4.3 Unique index

Enforce uniqueness of a column's value or the uniqueness of a combined value of more than one column.
Postgres creates an index for the defined primary key itself.
Creating unique constraint rather than unique index by the Postgres product team
When we create a unique constraint, Postgres automatically creates an index behind the scene.

4.4 Examples

4.4.1 index is not used

LIKE '%scheme', but LIKE 'sheme%' can use the index
The upper/lower case function used in where clause
- or Creating lower case index like below
- CREATE INDEX idx_scheme_name ON schemes (lower(scheme_name))

4.5 Partial Index

A subset of the rows of a table
CREATE INDEX idx_scheme_name ON schemes start_time WHERE start_time < '11:00:00'
Have not indexed all the schemes where start_time is after 11:00 AM

5. Downsides fo indexes in production environments

5.1 Finding unused indexes

Indexed eat memory

5.2 CREATE/DROP index and CREATE/DROP index concurrently

Creating and dropping an index in a large database can take hours or even days and the CREATE INDEX command blocks all the writers on a table
CREATE INDEX CONCURRENT will not acquire locks against writes
- Firstly, scans the table to build indexes
- Run the index once again for the things to be added since the first pass
If something goes wrong during creating an index concurrently, it does not roll back and leaves an invalid index behind can find these like below
- SELECT * FROM pg_class, pg_index WHERE pg_index.indisvalid = false AND pg_index.indexrelid = pg_class.oid;

5.3 Rebuilding indexes

REINDEX rebuilds an index using the data stored in the index table replacing the old copy of the index
- REINDEX INDEX or REINDEX TABLE
REINDEX locks out writes but not reads of the index's parent table.
Also takes an exclusive lock on the specific index being processed, which will block reads that attempt to use that index.

6. References

http://dwgeek.com/improve-performance-netezza-query-plan.html/

https://www.vertica.com/kb/Reading-Query-Plans/Content/BestPractices/Reading-Query-Plans.htm

https://docs.oracle.com/cd/E17952_01/workbench-en/wb-performance-explain.html

https://www.hwaci.com/sw/sqlite/eqp.html

https://blog.gojekengineering.com/the-postgres-performance-tuning-manual-query-plans-52a023c2342d

https://www.postgresql.org/docs/11/using-explain.html#USING-EXPLAIN-ANALYZE

https://blog.gojekengineering.com/the-postgres-performance-tuning-manual-indexes-19a1bdfe28ba

https://wiki.postgresql.org/wiki/Index_Maintenance

https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/covering-indexes.html

'DB > RDB' 카테고리의 다른 글

Dirty Read (0)	2019.09.30
Phantom Read (0)	2019.09.30
ACID (0)	2019.09.30
Transactional locking mechanism and Multi-version concurrency control(MVCC) (0)	2019.09.30
OLTP, OLAP, and ETL (0)	2019.09.11

ABOUT ME

Demyank's Tlog Demyank's Tlog

1. Overview

2. Description

2.1 Optimizer

2.2 Explain & Analyze

2.2.1 Explain

2.2.2 Analyze

3. Down to the nuts and bolts

3.1 Bitmap scan (Sequential Scan)

3.2 Index scan

3.3 Comparison between Sequential scan and index scan

3.4 Query plan on JOINS

3.4.1 Nested Loop

3.4.2 Hash Join

3.4.3 Merge Join

4. Index

4.1 Multiple column indexes

4.2 Cover-index

4.2.1 Without Covering Indexes

4.2.2 With Covering Indexes

4.3 Unique index

4.4 Examples

4.4.1 index is not used

4.5 Partial Index

5. Downsides fo indexes in production environments

5.1 Finding unused indexes

5.2 CREATE/DROP index and CREATE/DROP index concurrently

5.3 Rebuilding indexes

6. References

'DB > RDB' 카테고리의 다른 글

티스토리툴바

ABOUT ME

1. Overview

2. Description

2.1 Optimizer

2.2 Explain & Analyze

2.2.1 Explain

2.2.2 Analyze

3. Down to the nuts and bolts

3.1 Bitmap scan (Sequential Scan)

3.2 Index scan

3.3 Comparison between Sequential scan and index scan

3.4 Query plan on JOINS

3.4.1 Nested Loop

3.4.2 Hash Join

3.4.3 Merge Join

4. Index

4.1 Multiple column indexes

4.2 Cover-index

4.2.1 Without Covering Indexes

4.2.2 With Covering Indexes

4.3 Unique index

4.4 Examples

4.4.1 index is not used

4.5 Partial Index

5. Downsides fo indexes in production environments

5.1 Finding unused indexes

5.2 CREATE/DROP index and CREATE/DROP index concurrently

5.3 Rebuilding indexes

6. References

'DB > RDB' 카테고리의 다른 글

관련글 관련글 더보기

티스토리툴바