ABOUT ME

-

Today
-
Yesterday
-
Total
-
  • OLTP, OLAP, and ETL
    DB/RDB 2019. 9. 11. 14:28

    1. Overview

    1.1 On-line Transaction Processing (OLTP)

    OLTP stands for On-line Transaction Processing. OLTP based systems (account, ticket booking, banking systems, money transfer system) are used to perform a large number of short transactions. Almost all of the database queries in OLTP system consist of commands insert, update, delete. Select queries are mainly designed to enable users to select data from different entities. The majority of query is known in advance at the design stage of the system. Thus, critical for OLTP system is the speed and reliability of the performance of short operations updating data.

    1.2 On-line Analytical Processing (OLAP)

    OLAP stands for On-line Analytical Processing. OLAP based systems are used for management decisions and these systems known as Decision Support System(DSS). OLAP-applications operate with large amounts of data accumulated in the operational databases OLTP. Typical applications of OLAP include business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas.

    1.3 Extract, Transform, Load (ETL)

    ETL is short for extract, transform, load, three database functions that are combined into one tool to pull data out of one database and place it into another database.

    2. Online Transaction Processing (OLTP)

    The primary objective is a data processing and not data analysis. Online transaction processing shortly known as OLTP supports transaction-oriented applications in a 3-tier architecture. OLTP administers the day to day transaction of an organization.

    2.1 Usages

    • Online banking
    • Online airline ticket booking
    • Sending a text message
    • Order entry
    • Add a book to shopping cart

    3. Online Analytical Processing (OLAP)

    The primary objective is data analysis and not data processing. Online Analytical Processing, a category of software tools that provide analysis of data for business decisions. OLAP systems allow users to analyze database information from multiple database systems at one time.

    3.1 Usages

    • Any Datawarehouse system is an OLAP system.
    • A company might compare their mobile phone sales in September with sales in October, then compare those results with another location that may be stored in a sperate database.
    • Amazon analyzes purchases by its customers to come up with a personalized homepage with products which likely interest to their customer.

     

    4. Extract, Transform, Load (ETL)

    In the ELT process, after you extracted data from all data sources, you immediately start moving it into a centralized data repository. Today’s cloud-based data warehouse and data lake infrastructure supports large storage and scalable compute. So it’s no longer necessary to prevent the data warehouse from “exploding” by keeping data small and summarized. Many organizations today maintain massive data pools in the cloud at low cost, leveraging ELT tools to process the data.

    4.1 Types of ETL

    • Traditional ETL batch processing - meticulously preparing data and transforming it using a rigid, structured process.
    • ETL with stream processing - using a modern stream processing framework like Kafka, you can pull data in real time from source, manipulate it on the fly using Kafka’s Stream API, and load it to a target system such as Amazon Redshift.
    • Automated data pipeline without ETL - we showed how to use our automated data warehouse, Panoply, to pull data from multiple sources, automatically prep it without requiring a full ETL process, and immediately begin analyzing it using BI tools.

    5. Differences between OLTP and OLAP

    Parameters OLTP OLAP
    Process It is an online transactional system. It manages database modification. OLAP is an online analysis and data retrieving process.
    Characteristic It is characterized by large numbers of short online transactions. It is characterized by a large volume of data.
    Functionality OLTP is an online database modifying system. OLAP is an online database query management system.
    Method OLTP uses traditional DBMS. OLAP uses the data warehouse.
    Query Insert, Update, and Delete information from the database. Mostly select operations
    Table Tables in OLTP database are normalized. Tables in OLAP database are not normalized.
    Source OLTP and its transactions are the sources of data. Different OLTP databases become the source of data for OLAP.
    Data Integrity OLTP database must maintain data integrity constraint. OLAP database does not get frequently modified. Hence, data integrity is not an issue.
    Response time It's response time is in millisecond. Response time in seconds to minutes.
    Data quality The data in the OLTP database is always detailed and organized. The data in OLAP process might not be organized.
    Usefulness It helps to control and run fundamental business tasks. It helps with planning, problem-solving, and decision support.
    Operation Allow read/write operations. Only read and rarely write.
    Audience It is a market orientated process. It is a customer orientated process.
    Query Type Queries in this process are standardized and simple. Complex queries involving aggregations.
    Back-up Complete backup of the data combined with incremental backups. OLAP only need a backup from time to time. Backup is not important compared to OLTP
    Design DB design is application oriented. Example: Database design changes with industry like Retail, Airline, Banking, etc. DB design is subject oriented. Example: Database design changes with subjects like sales, marketing, purchasing, etc.
    User type It is used by Data critical users like clerk, DBA & Data Base professionals. Used by Data knowledge users like workers, managers, and CEO.
    Purpose Designed for real time business operations. Designed for analysis of business measures by category and attributes.
    Performance metric Transaction throughput is the performance metric Query throughput is the performance metric.
    Number of users This kind of Database users allows thousands of users. This kind of Database allows only hundreds of users.
    Productivity It helps to Increase user's self-service and productivity Help to Increase productivity of the business analysts.
    Challenge Data Warehouses historically have been a development project which may prove costly to build. An OLAP cube is not an open SQL server data warehouse. Therefore, technical knowledge and experience is essential to manage the OLAP server.
    Process It provides fast result for daily used data. It ensures that response to the query is quicker consistently.
    Characteristic It is easy to create and maintain. It lets the user create a view with the help of a spreadsheet.
    Style OLTP is designed to have fast response time, low data redundancy and is normalized. A data warehouse is created uniquely so that it can integrate different data sources for building a consolidated database

    5.1 KEY DIFFERENCE:

    • Online Analytical Processing (OLAP) is a category of software tools that analyze data stored in a database whereas Online transaction processing (OLTP) supports transaction-oriented applications in a 3-tier architecture.
    • OLAP creates a single platform for all type of business analysis needs which includes planning, budgeting, forecasting, and analysis while OLTP is useful to administer day to day transactions of an organization.
    • OLAP is characterized by a large volume of data while OLTP is characterized by large numbers of short online transactions.
    • In OLAP, data warehouse is created uniquely so that it can integrate different data sources for building a consolidated database whereas OLTP uses traditional DBMS.

    5. References

    https://en.wikipedia.org/wiki/Data_integrity

    https://www.guru99.com/oltp-vs-olap.html

    https://en.wikipedia.org/wiki/Extract,_transform,_load

    http://www.sql-datatools.com/2015/08/dw-oltp-vs-olap.html

    https://techdifferences.com/difference-between-oltp-and-olap.html

    https://database.guide/what-is-oltp/

    https://panoply.io/data-warehouse-guide/3-ways-to-build-an-etl-process/

    'DB > RDB' 카테고리의 다른 글

    Dirty Read  (0) 2019.09.30
    Phantom Read  (0) 2019.09.30
    ACID  (0) 2019.09.30
    Transactional locking mechanism and Multi-version concurrency control(MVCC)  (0) 2019.09.30
    Explain Query Plan and Query Performance with Postgres  (0) 2019.08.23

    댓글

Designed by Tistory.