The In-Memory Column Store feature, which Oracle introduced in database version 12c (12.1.0.2), aims to accelerate database-driven business decision-making to real-time speeds. Since it is an extra-license feature for which Oracle makes you pay around 50% on top of your CPU license (similar to the RAC option), you probably ask yourself a few valid questions. Don’t we cache everything in memory already anyway? Is it really required for my application, my company? Will it work for my workload at all? In this post I’ll give you a short guideline and introduction to the Oracle Database In-Memory feature.
First of all, I personally do not know of any database that works only on disk, and you probably do not either. Indeed, we already cache most of our data, code and intermediate results in memory. Furthermore, there are some extra Oracle database performance features that help you achieve that fairly efficiently, for example:
– KEEP/RECYCLE Pools
– RESULT Cache
– 12c Big Table Caching
– 12c Full Database Caching
The key point of Oracle In-Memory is not “What to cache” but “How”. The major difference is that the Oracle In-Memory Column Store lets individual database segments be loaded into memory in a compressed columnar format. With this technique, segment scans run much faster than against the traditional on-disk format, which boosts performance for analytical and reporting workloads.
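To make the “How” more concrete, here is a minimal Python sketch of the difference between the two layouts. It is a toy illustration only: the sample data and the function names are made up for this example, plain Python lists stand in for storage, and it says nothing about how Oracle actually implements its column store.

```python
# Toy illustration only: plain Python lists stand in for database storage.
# The table contents and all names here are hypothetical.
rows = [
    (1, "EMEA", 120.0),
    (2, "APAC", 80.0),
    (3, "EMEA", 200.0),
]


def sum_amount_row_format(rows):
    """Row format: every complete record is visited to read one field."""
    return sum(row[2] for row in rows)


# Columnar format: the values of each column are stored together.
columns = {
    "id": [row[0] for row in rows],
    "region": [row[1] for row in rows],
    "amount": [row[2] for row in rows],
}


def sum_amount_column_format(columns):
    """Columnar format: only the one column the query needs is scanned."""
    return sum(columns["amount"])


if __name__ == "__main__":
    print(sum_amount_row_format(rows))
    print(sum_amount_column_format(columns))
```

Both functions return the same total; the difference is how much unrelated data a scan has to step over, which is what matters for analytical queries that touch a few columns of many rows.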
I remember how, a long time ago, a database consultant confused my manager by claiming that our Oracle 9i database performed poorly, based on nothing more than a slow response from the dba_segments data dictionary view. That was a nasty trick to blame the DBA and the Oracle database for poor performance. In fact, there were a few Oracle bugs related to those performance issues after switching from dictionary-managed to locally managed tablespaces at that time. Recently I noticed a similar performance degradation on Oracle 11gR2 (22.214.171.124 and 126.96.36.199) when querying the DBA_SEGMENTS or USER_SEGMENTS data dictionary views with the columns BYTES, BLOCKS, or EXTENTS. Queries on the BYTES or BLOCKS columns of DBA_TS_QUOTAS or USER_TS_QUOTAS were also slow.
Even if you personally do not care about these dictionary views, they are still very important, since they are used by some Oracle internal components and by other database tools, including Oracle Enterprise Manager (OEM) Cloud Control and its Database Home Page. Below I’ll describe the problems with those data dictionary views and how to fix their performance issues.
First of all, do not be surprised that queries against those views often seem slow. DBA_SEGMENTS, for example, is a very complex view that is built on another view, SYS_DBA_SEGS. In summary, the DBA_SEGMENTS view on Oracle 11gR2 consists of the following components:
– 25 columns
– around 110 lines of SQL code
– 3 UNION ALL clauses
– Many joins between the following tables: sys.user$, sys.ts$, sys.undo$, sys.seg$, sys.file$
During the last few years I went through several POCs of different Column-Store databases, reviewing their functionality, performance and use cases. Usually, at the beginning of every exercise, I saw impressive vendor promises of rich functionality, great performance and scalability. Some even said: this is a new trend in the database world, even a standard! You do not need an RDBMS anymore!
In cases like these I usually act as a conservative database architect. And you know what? That has always helped to spare companies the additional effort and frustration of implementing specialized database solutions. This time I share some experiences from evaluating Column-Store databases. But let’s start with the basics first.
While most commercial RDBMS products store data in some form of row format, some database vendors provide column-oriented storage of data. The supposed advantages of storing the data by column rather than by row include a better ability to compress the data, which would reduce the need for disk I/O. The idea of column-based storage is not new and has been used in commercial products from the former Sybase and Sand Technology for well over a decade. In reality, each storage format has its own set of advantages and disadvantages, and there is no free lunch, only tradeoffs.
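As a rough illustration of why values stored by column tend to compress well, here is a small Python sketch using run-length encoding. This is just one simple technique chosen for the example, with made-up data and hypothetical function names; it is not a description of what any particular product does.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original list of values."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical "country" column: stored together, equal values sit next to
# each other and form long runs. In a row format they would be interleaved
# with the other fields of each record.
country = ["DE", "DE", "DE", "FR", "FR", "US", "US", "US", "US"]
encoded = run_length_encode(country)

if __name__ == "__main__":
    print(encoded)
    print(run_length_decode(encoded) == country)
```

Nine stored values shrink to three pairs here because the column holds few distinct values in long runs. A column of mostly unique values would gain little or nothing, which is one more reminder that this is a tradeoff, not a free lunch.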
The tradeoffs associated with column-based storage include the cost of tracking and eventual reconstruction of the rows to which the column values belong as well as additional complexity for ETL and OLTP processing. While recognizing that each storage format has its pros and cons and that there are scenarios where a column-based format has some merit, it is worth examining whether the column-based format lives up to its recent hype.
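The reconstruction cost can be pictured with a small Python sketch. The `ColumnTable` class below is hypothetical and exists only for this illustration: it keeps one list per column, so both inserting a single row and reading one back have to touch every column.

```python
class ColumnTable:
    """Toy column-oriented table: one Python list per column (illustrative only)."""

    def __init__(self, column_names):
        self.columns = {name: [] for name in column_names}

    def insert(self, row):
        # OLTP-style single-row insert: every column list must be touched.
        for name, values in self.columns.items():
            values.append(row[name])

    def fetch_row(self, position):
        # Row reconstruction: gather the value at the same position
        # from each column and stitch the record back together.
        return {name: values[position] for name, values in self.columns.items()}


table = ColumnTable(["id", "name", "city"])
table.insert({"id": 1, "name": "Alice", "city": "Zurich"})
table.insert({"id": 2, "name": "Bob", "city": "Basel"})

if __name__ == "__main__":
    print(table.fetch_row(1))
```

With three columns the overhead is trivial, but the work per row grows with the number of columns, which is why wide tables under heavy single-row insert and lookup traffic are the hard case for this layout.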
Beware of disingenuous benchmark numbers
Yes, folks: PERFORMANCE is the main selling point of columnar databases!
There are claims that Column-Stores outperform a commercial row-store RDBMS by large factors. I just want to warn you not to rely blindly on magic performance benchmarks that the vendors have run themselves, in house. These test cases usually do not resemble real production database loads: they are often created for read-only data and run on database engines that lack the RDBMS features and functionality a production system would require.
A second observation is that benchmarks against Column-Stores often do not test joins.
Although Oracle recommends using the new Data Pump tools (expdp/impdp), available as of Oracle Database 10g, many developers and DBAs still use the original Oracle Export and Import (exp/imp) utilities. Below I will quickly summarize some useful Export and Import tips.