This is the second part of my article about column-store databases. In the first part, Column-Oriented Databases – Old Idea, New wave, I focused on the performance and functionality of column-oriented databases and on their comparison to RDBMS, specifically to the Oracle database. This time I will continue comparing the two database camps, column stores and row stores, in the areas of compression and partitioning. I will also mention how Oracle products take advantage of column-oriented storage, for example in the In-Memory Option of the new Oracle 12c database.
Compression of columns vs rows
One of the potential advantages of column-oriented storage is that it can compress well. It is important to understand why compressing the data can be advantageous. What matters is not primarily the cost of having enough disk space to hold the data: disks are relatively cheap and are getting larger and cheaper at a steady rate. Rather, the potential benefit comes when data has to be retrieved from disk as part of query processing. Good I/O bandwidth is not cheap, and techniques such as compression that reduce the amount of data read from storage can be very advantageous, although there is usually some CPU cost associated with compressing and decompressing the data.
Oracle, for example, provides several major mechanisms for using compression to benefit query processing. One is the row-level compression feature for database objects; another is Exadata Hybrid Columnar Compression, or HCC (see below).
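To illustrate why a column layout tends to compress well, here is a minimal sketch in Python using run-length encoding. It is not how Oracle or any particular column store implements compression; the tiny table and the helper names `rle_encode` and `rle_decode` are made up for the illustration. The point is only that values of one column stored next to each other often repeat, while the same values interleaved row by row do not.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical table with two low-cardinality columns: (country, status).
rows = [
    ("DE", "open"),
    ("DE", "open"),
    ("DE", "closed"),
    ("FR", "open"),
    ("FR", "open"),
    ("FR", "open"),
]

# Column layout: all values of one column are stored contiguously.
country_col = [row[0] for row in rows]
status_col = [row[1] for row in rows]
country_runs = rle_encode(country_col)
status_runs = rle_encode(status_col)

# Row layout: values of different columns are interleaved,
# so adjacent values rarely repeat and the encoding finds no long runs.
row_stream = [value for row in rows for value in row]
row_runs = rle_encode(row_stream)

print("country column runs:", country_runs)
print("status column runs: ", status_runs)
print("runs in column layout:", len(country_runs) + len(status_runs))
print("runs in row layout:   ", len(row_runs))
```

In this toy example the column layout needs far fewer runs than the row layout for the same twelve values. Real systems combine several encodings and general-purpose compressors, so the ratio on real data will differ.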
Partitioning VERTICAL vs HORIZONTAL
Column-oriented storage is a form of vertical partitioning of the data. One of the disadvantages of this type of partitioning is …