Friday, 8 July 2016

Delphix for Production Support

Production Support is a very strong Delphix use case.  Delphix can minimize the time taken to provision a database copy for investigation and repair of production data problems whether caused accidentally or maliciously.  Although most customers invest in Delphix to accelerate release cycles its surprising how many have found Delphix to have saved the day for Production Support.




Production Outages

Production application outages are often caused by “logical” data corruption as a result of errors including accidental deletion or modification of data, dropping the wrong table, failed batch jobs and bugs that have eluded testing.   In all these cases you want to repair the production database by restoring the affected data to the most recent consistent state and if the failure was caused by an unknown action you want to identify, diagnose, fix, test and repair the cause of the failure.

There are a variety of methods to establish or maintain an up to date replica copy of a production database to allow operational support teams to quickly troubleshoot production faults.  The server infrastructure is usually paired down significantly as the environment is not intended to support full production concurrent activity or for performance related issue resolution or reproduction but it must be a full replica of production from a data standpoint.

The simplest method is to restore from a backup but this introduces a long delay before you can start investigating the fault even from disk based backups.  The next method is to use a standby database, sometimes delaying the synchronization to avoid the data error being replicated to the standby.  More sophisticated is to use database flashback in combination with a standby, this allows the standby to be kept up to date with the ability to temporarily rewind the standby database prior to the point in time of the failure and then open the database for investigation.

Production Support wth Delphix

Delphix provides the simplest solution for provisioning a Production Support environment on demand as Delphix allows customers to start a private, read write full copy database from any point in time in just a few minutes.  Aside from speed and ease of use Delphix leverages its own compressed, de-duplicated copy of the production database, which it also maintains automatically.   That same copy can be shared to provision as many virtual full copy databases as desired whether for production support, development, testing, reporting, etc.

The time taken to provision a virtual database is less dependent on the database size but more on the rate of change.  All point in time provisioning typically involves starting from a known consistent state and rolling forwards to the desired point by applying transaction logs.    So the majority of the time is consumed by the database rolling forward.  Taking frequent consistent snapshots to minimize the volume of transactions logs to roll forward can reduce the recovery time.

Delphix LiveSource




Delphix LiveSource in Action


For Oracle databases Delphix has a capability called LiveSource this provides an alternative method for synchronizing a Delphix database copy (dSource) with a production database.  Typically Delphix synchronises with an Oracle database using incremental backps and transaction log capture.  Delphix LiveSource starts up an Oracle standby database instance using the Delphix dSource data copy directly, now Oracle Data Guard is keeping the Delphix dSource copy synchronised.  LiveSources are also easier to manage in the case where non logged changes occur on production.  Such changes are not applied by the Data Guard mechanism since they are not recorded in the transaction logs.   Delphix can simply take an incremental backup of the changed blocks and apply them to its own copy.  This is a push button or CLI operation.

This solution has some major advantages for Production Support, in particular for very large databases spanning 100s of Terabytes.  The Delphix copy is compressed so you don’t require adequate storage for a full database copy.  A point in time snapshot of a Delphix LiveSource database takes seconds and incurs microscopic storage overhead.   Even for high change rate source databases the transaction log volume that needs to be applied to perform point in time recovery is minimized.

Oracle Database Flashback


FLASHBACK DATABASE TO SCN 123456789;
FLASHBACK DATABASE TO RESTORE POINT my_restore_point;
FLASHBACK DATABASE TO TIMESTAMP SYSDATE-(5/24/60);

Example Oracle Database Flashback Commands

If that is not enough you can also enable Oracle Flashback Database on the LiveSource standby.  The flashback area used by Oracle to store and access flashback logs, which record past versions of changed blocks, can be provisioned by Delphix and hence these flashback logs are also compressed.

The Delphix LiveSource is just a normal Oracle physical standby database, running on ultra efficient storage and hence with Database Flashback you can rewind the standby to the desired point in time to retrieve lost data, investigate root cause and test corrective actions or fixes.

Once the problem has been resolved you can re-enable the LiveSource which will synchronise the Delphix dSource with production changes that have occurred during the investigation.

Summary

Using Delphix LiveSource for Production Support you can start investigations within minutes regardless of database size or change rate.  For extremely high change rate databases you can use Delphix LiveSource in combination with Oracle Flashback Database.  You then have a choice of flashing back or rolling forward, whichever is the quickest.




No comments:

Post a Comment