-
Database CursorDB/RDB 2020. 2. 24. 01:53
1. Overview
In computer science, a database cursor is a control structure that enables traversal over the records in a database. Cursors facilitate subsequent processing in conjunction with the traversal, such as retrieval, addition, and removal of database records. The database cursor characteristic of traversal makes cursors akin to the programming language concept of an iterator.
2. Description
2.1 Intuition
Using cursors compared to big resultsets is like using video streaming instead of downloading a video in one swoop, and watching it when it has downloaded. If you download, you have to have a few gigs of space and the patience to wait until the download finished. Now, no matter how fast your machine or network may be, everyone watches a movie at the same speed.
Normally any query gets sent to the server, executed, and the resultset sent over the network to you, in one burst of activity. The cursor will give you access to the data row by row and stream every row only when you request it (can actually view it).
2.2 Pros
- A cursor can save you time - because you don't need to wait for the processing and download of your complete recordset
- It will save you memory, both on the server and on the client because they don't have to dedicate a big chunk of memory to resultsets
- Load-balance both your network and your server - Working in "burst" mode is usually more efficient, but it can completely block your server and your network. Such delays are seldom desirable for multiuser environments. Streaming leaves room for other operations.
- Allows operations on queried tables (under certain conditions) that do not affect your cursor directly. So while you are holding a cursor on a row, other processes are able to read, update and even delete other rows. This helps especially with very busy tables, many concurrent reads and writes.
2.3 Cons
2.3.1 Consistency
Using a cursor, you do (usually) not operate on a consistent snapshot of the data, but on a row. So your concurrency/consistency/isolation guarantees drop from the whole database (ACID) to only one row. You can usually inform your DBMS what level of concurrency you want, but if you are too nitpicky (locking the complete table you are in), you will throw away many of the resource savings on the server-side.
2.3.2 inefficient
Transmitting every row by itself can be very inefficient, since every packet has negotiation overhead that you might avoid by sending big, maybe compressed, chunks of data per packet. ( No DB server or client library is stupid enough to transmit every row individually, there's caching and chunking on both ends, still, it is relevant.)
2.3.3 Load
Cursors are harder to do right. Consider a query with a big resultset, motivating you to use a cursor, that uses a GROUP BY clause with aggregate functions. (Such queries are common in data warehouses). The GROUP BY can completely trash your server, because it has to generate and store the whole resultset at once, maybe even holding locks on other tables.
2.4 When to use
- If you work on small, quickly created resultsets, don't use cursors.
- Cursors excel on ad hoc, complex (referentially), queries of sequential nature with big resultsets and low consistency requirements.
"Sequential nature" means there are no aggregate functions in heavy GROUP BY clauses in your query. The server can lazily decide to compute 10 rows for your cursor to consume from a cache and do other stuff meanwhile.
3. Reference
https://en.wikipedia.org/wiki/Cursor_(databases)
https://stackoverflow.com/questions/3861558/what-are-the-benefits-of-using-database-cursor
'DB > RDB' 카테고리의 다른 글
Comparison between MSSQL, Oracle, PostgreSQL, and MySQL (0) 2020.04.11 PostgreSQL (0) 2020.04.10 Trigger (0) 2020.02.24 S3, Block storage, File storage, and Object storage (0) 2020.02.24 DDL, DML, DCL, and TCL (0) 2020.02.23