Designing Data-Intensive Applications 書本 - Storage and Retrieval 筆記 (2)

發表於 2021-08-30 更新於 2024-04-28 分類於 Designing Data-Intensive Applications book Disqus：文章字數： 2.3k 所需閱讀時間 ≈ 4 分鐘

上次講了 LSM tree and SSTable 的概念，今天來記錄一下何謂 Column-Oriented Storage。

OLTP V.S OLAP

在資料庫的使用情境下，可以分為 OLTP 和 OLAP 的用途，前者指的是資料通常來自於 user 的 input，並且 user 的資料會時常被 updated，因此在這種情境下講求的是能夠實現 online transaction processing，而通常當資料量一大的時候，因應商業需求會需要對使用者的資料進行分析，這時候可能就會需要 OLAP 的情境，也就是能下一些 analytic query needs to scan over a huge number of records, only reading a few columns per record, and calculates aggregate statistics (such as count, sum, or average) rather than returning the raw data to the user.

此外，兩者的情境撈取的資料量也可以說是天差地別，OLTP 講求的是能夠越快的將資料回傳給 user 或是快速地寫入，OLAP 通常是會一次性的撈大量的資料並且做一些統計的分析。

兩者的差別可以參考這張圖：

Column-Oriented Storage

而在 OLTP 的資料庫通常是採用 row-oriented storage 來存放資料，像是我們常見的關聯式資料庫：MySQL，PostgreSQL 等等。之所以 OLTP 的情境要使用 row-oriented storage 是因為通常我們會需要給 user 一個範圍的 row data，而 column-oriented storage 的話如果想要拿到這個 row 的每個欄位的資料，就需要針對每個欄位的每一行去讀取才可以，會做許多不必要 scan。

反之，如果是 OLAP 的情境為何要使用 column-oriented storage，是因為通常分析的需求是要統計這個欄位的 max, min 或是 sum 等等，也就是說我只需要 scan 這個欄位的資料就可以了，其他欄位不在我考量的範圍，如果我採用 row-oriented storage，會需要把其他欄位的資料也 scan 下來。

來看例子會更清楚：

假設我有一個 sales table，這些是會常出現的欄位，可以看的 row-oriented storage 的儲存方式以及下面 column-oriented storage 的差別。

如果你想要在 column-oriented storage 去取得 entire row 的資料，就必須去每一個 column file 去 scan 找出來對應的值。

Column Compression

在 column-oriented storage 有一些壓縮方式可以讓壓縮 data。最常見的方式是採用 bitmap encoding：

也就是說將每個 value 用 bit 來表示，The bit is 1 if the row has that value, and 0 if not。此外，如果一個 column 所擁有的 distinct value 相對的少，可以在進行 run-length encoding，來大幅度地壓縮。

bitmap 的方式可以讓 query 的速度提升不少，例如：

1	WHERE product_sk IN (30, 68, 69):

這個 value 轉成 bit 之後就可以進行 bitwise 的計算，就可以算得非常的快。

COLUMN-ORIENTED STORAGE AND COLUMN FAMILIES

這邊特別要提的是 column-oriented storage 與 column families 是不同的儲存方式，column families 實際上是有所謂的 row key，這個 row key 會對應到許多的 columns 所組成，但是儲存方式還是採用 row-oriented storage，而知名的 Cassandra and HBase 是採用 column families 方式的。

至於為何要使用 column families 可以參考這篇文章：https://mlwmlw.org/2011/01/cassandra-the-definitive-guide/

覺得寫得還不錯，是針對 Cassandra 去講解的。

也可以看這張圖：

Writing to Column-Oriented Storage

通常 column-oriented storage 透過 compression and sorting 可以讓 read query 更加的快速，因此不太會採用 B-Tree 結構，因為 B-Tree 是使用 update-in-place approach，如果想要 insert row 到 sorted table，會需要對每一個 column files 進行 rewrite 的動作。

因此 Column-Oriented Storage 可以採用 LSM-tree，All writes first go to an in-memory store, where they are added to a sorted structure and prepared for writing to disk，It doesn’t matter whether the in-memory store is row-oriented or column-oriented. When enough writes have accumulated, they are merged with the column files on disk and written to new files in bulk。

總結

最後其實要分別出 row-oriented storage 與 column-oriented storage 直接看 code 最容易理解，這邊分享一下大神的示意 code：