After my presentation "Oracle database High Availability strategy, architecture and solutions" at the German Oracle User Group (DOAG) meeting in Nuremberg a few days ago, I decided to write about High Availability (HA) solutions for Oracle database on my DBMS Blog. I must admit that when I started preparing this topic, I realized it is so extensive and complex that I decided to start slowly, describing the things that any DBA or infrastructure architect should understand before building a highly available database system with minimal allowed downtime. This first article focuses on understanding the availability requirements and the Service Level Agreement (SLA).
Understand and develop Service Level Agreement (SLA)
First, the DBA needs to understand the Service Level Agreement (SLA), or the customer’s service requirements.
A Service Level Agreement (SLA) is a negotiated agreement between two or more parties, where one is the customer and the others are service providers. An SLA is usually part of a service contract in which the service is formally defined. For example, IT service providers commonly include Service Level Agreements in their customer contracts to define, in plain language, the level(s) of service being sold. A database SLA typically has a technical definition in terms of the following.
Availability is the main SLA element. It is commonly expressed as a percentage, but it is often more meaningful when expressed as hours. For example, 99.9% availability is roughly equivalent to 8 hours and 45 minutes of maintenance window, or allowed downtime, per year.
|Availability Target||Downtime Per Year (approx.)|
|90 %||36 days|
|98 %||7.3 days|
|99.7 %||26 hours|
|99.99 %||52 minutes|
|99.999 %||5 minutes|
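The figures in the table follow from simple arithmetic: allowed downtime is the unavailable fraction of a year. A minimal sketch in Python, assuming a 365-day year (8,760 hours); the function names `allowed_downtime_hours` and `describe` are illustrative and not part of any Oracle tool:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored (assumption)


def allowed_downtime_hours(availability_pct: float) -> float:
    """Return the allowed downtime per year, in hours, for an availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


def describe(hours: float) -> str:
    """Format a downtime budget in the most readable unit."""
    if hours >= 24:
        return f"{hours / 24:.1f} days"
    if hours >= 1:
        return f"{hours:.1f} hours"
    return f"{hours * 60:.1f} minutes"


# Print the downtime budget for the targets discussed above.
for target in (90, 98, 99.7, 99.9, 99.99, 99.999):
    print(f"{target} % -> {describe(allowed_downtime_hours(target))}")
```

The results are unrounded, so they can differ slightly from the approximate values in the table.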
In this article I continue listing common database configuration issues that can affect Oracle database high availability (HA) by causing unplanned downtime. Make sure you read the first part of Oracle database configuration issues that cause downtime.
Control file limit reached
This issue can occur when you reach the limits of database configuration parameters stored in the control file, such as MAXLOGFILES, MAXLOGMEMBERS and MAXINSTANCES.
Fixing this requires downtime. Oracle 10g has relaxed some of those limitations, though.
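One way to notice the problem before it forces downtime is to compare used and total record slots per control file section. The sketch below is only an illustration: it assumes you have already fetched rows from the `V$CONTROLFILE_RECORD_SECTION` view with a client of your choice, and the helper name `sections_near_limit`, the 80 % threshold and the sample rows are all hypothetical.

```python
# Query whose result set the helper below expects (run it with any Oracle client).
SECTION_QUERY = """
SELECT type, records_total, records_used
FROM v$controlfile_record_section
"""


def sections_near_limit(rows, threshold=0.8):
    """Return (type, used, total) for sections whose usage ratio reaches the threshold."""
    flagged = []
    for section_type, total, used in rows:
        # Skip sections with no slots to avoid dividing by zero.
        if total > 0 and used / total >= threshold:
            flagged.append((section_type, used, total))
    return flagged


# Hypothetical sample rows (type, records_total, records_used); not real measurements.
sample_rows = [
    ("REDO LOG", 16, 15),
    ("DATAFILE", 100, 12),
    ("REDO THREAD", 8, 1),
]

for section_type, used, total in sections_near_limit(sample_rows):
    print(f"{section_type}: {used} of {total} records used")
```

The threshold is a judgment call; pick one that leaves enough time to schedule a maintenance window.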
Oracle ASM instance limits
Oracle ASM can be considered another database instance with its own parameter limits, which you also need to consider carefully.
A well-designed high availability (HA) solution accounts for all these factors in preventing unplanned database downtime. One of the true challenges in designing a highly available (HA) solution is examining and addressing all the possible causes of downtime. It is important to consider the causes of both unplanned and planned downtime. The diagram shown in the slide classifies unplanned database failures.