Wednesday, 30 December 2015

Agile Archive

When most customers install Delphix they typically add source databases by ingesting an up-to-date backup and then establishing a timeflow of continuous changes from there. However, it is also possible to ingest existing backups stretching back as far as you require.

Why would you want to ingest older backups you have already into Delphix?

There are several challenges in managing backup archives, one of the biggest being actually using them.

1.     Unless you know the required date, how do you know which backup to restore (assuming you are not simply restoring production from the latest backup)?
2.     Restoring involves copying the files and applying recovery. This usually takes a while and requires time from Database Administrators, and possibly Storage Administrators too.
3.     How can you be sure you can restore from a given backup unless the restore process has been tested?
4.     In many cases access to older backups is only required for reporting and compliance reasons.  These are generally not heavy-duty tasks, so why restore an entire physical copy for a relatively lightweight use case?


Using Delphix as an Agile Archive

1.     The Delphix ingest process will implicitly test that your backups are valid.
2.     You can provision a Virtual Database (VDB) in minutes from any of your ingested backups.
3.     You could build a catalogue of the contents of each backup by starting a VDB against the oldest version, querying the content, and refreshing the VDB to the next version for each backup in turn.  You could add this metadata to your version control system.
4.     You can publish those backups via self-service to anyone who may need to access, query or run a report against an archived backup.
5.     You could optionally delete your backups once they have been copied into Delphix.  Delphix will retain them for as long as required, store them far more economically using block de-duplication and compression, and can protect them via replication or backup.
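Point 3 above could be scripted.  Below is a minimal Python sketch, in which run_cli is a hypothetical callable that forwards a command string to the Delphix CLI (for example over SSH); the VDB name, snapshot identifiers and refresh syntax are illustrative, not exact CLI syntax:

```python
# Sketch: walk a catalogue VDB through each ingested backup, oldest
# first, recording which commands would be issued.  run_cli is supplied
# by the caller; a real implementation might wrap "ssh admin@<engine>".

def build_catalogue(vdb_name, snapshots, run_cli):
    """snapshots: list of snapshot identifiers, oldest first."""
    commands = []
    for snap in snapshots:
        # Hypothetical refresh syntax: move the VDB to this snapshot,
        # after which you would query its contents and record them
        # (for example in your version control system).
        cmd = (f'database "{vdb_name}" refresh; '
               f'set timeflowPointParameters.snapshot="{snap}"; commit')
        run_cli(cmd)
        commands.append(cmd)
    return commands

issued = []
cmds = build_catalogue("ArchiveVDB",
                       ["snap-2013", "snap-2014", "snap-2015"],
                       issued.append)
```

The querying and metadata-capture step between refreshes is deliberately left out, since it depends entirely on the database type and what you want in the catalogue.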

The process to use Delphix for Agile Archiving is very straightforward.

Using SQL Server as an example, you would create a Delphix dSource, specifying the initial load as a specific full backup and entering the backup UUID of the oldest backup you want to ingest.


Then you can use the Delphix Command Line Interface (CLI) to ingest a set of backups in chronological sequence, starting from the oldest.  An example command would be:


database "AdventureWorksLT2008R2" sync; set type=MSSqlSyncParameters; set loadFromBackup=true; set backupUUID="4688F509-33E1-404D-A692-F9A0373CEF75"; commit


The UUIDs can refer to full or incremental backups, which you can query from SQL Server.  You can run several of these CLI commands from a single script, and in the Delphix UI you will see a discrete TimeFlow card corresponding to each ingest.
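Since each ingest repeats the same CLI command with a different UUID, a script covering many backups can be generated rather than written by hand.  A minimal Python sketch, using the database name and UUID from the example above (the full list of UUIDs, oldest first, would come from querying SQL Server):

```python
# Generate a Delphix CLI script that ingests a list of SQL Server
# backups in chronological order (oldest first).

def mssql_ingest_script(db_name, backup_uuids):
    template = ('database "{db}" sync; set type=MSSqlSyncParameters; '
                'set loadFromBackup=true; set backupUUID="{uuid}"; commit')
    return "\n".join(template.format(db=db_name, uuid=u)
                     for u in backup_uuids)

script = mssql_ingest_script(
    "AdventureWorksLT2008R2",
    ["4688F509-33E1-404D-A692-F9A0373CEF75"],  # oldest first
)
print(script)
```

The generated text can then be pasted into, or piped through, a CLI session.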



Once your ingest sequence has completed, Delphix will automatically stay synchronised with the source database as new backups, optionally including transaction log backups, are taken.

A similar approach can be used with other database types.  For example, for SAP ASE (Sybase) the CLI script to ingest a sequence of archived backups might look like:



database "pubs3" update sourcingPolicy; edit sourcingPolicy; set logsyncEnabled=false; commit

database "pubs3" sync; set type=ASESpecificBackupSyncParameters; set backupFiles=pubs3.full.10293; commit
database "pubs3" sync; set type=ASESpecificBackupSyncParameters; set backupFiles=pubs3.full.31019; commit
database "pubs3" sync; set type=ASESpecificBackupSyncParameters; set backupFiles=pubs3.full.31499; commit
database "pubs3" sync; set type=ASESpecificBackupSyncParameters; set backupFiles=pubs3.full.32083; commit
database "pubs3" sync; set type=ASESpecificBackupSyncParameters; set backupFiles=pubs3.full.34057; commit
database "pubs3" sync; set type=ASESpecificBackupSyncParameters; set backupFiles=pubs3.full.34071; commit

database "pubs3" update sourcingPolicy; edit sourcingPolicy; set logsyncEnabled=true; commit




In this example we disabled LogSync before the ingest sequence and re-enabled it afterwards.
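Because the ASE sequence is just the same sync command repeated with different dump files, it too can be generated rather than hand-written.  A minimal Python sketch, using the database and dump file names from the example, with the LogSync disable/enable steps wrapped around the syncs:

```python
# Generate a Delphix CLI script that disables LogSync, ingests a list
# of ASE dump files in order, then re-enables LogSync.

def ase_ingest_script(db_name, backup_files):
    lines = [f'database "{db_name}" update sourcingPolicy; '
             f'edit sourcingPolicy; set logsyncEnabled=false; commit']
    for f in backup_files:
        lines.append(f'database "{db_name}" sync; '
                     f'set type=ASESpecificBackupSyncParameters; '
                     f'set backupFiles={f}; commit')
    lines.append(f'database "{db_name}" update sourcingPolicy; '
                 f'edit sourcingPolicy; set logsyncEnabled=true; commit')
    return "\n".join(lines)

ase_script = ase_ingest_script(
    "pubs3", ["pubs3.full.10293", "pubs3.full.31019"])
```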


Delphix Agile Archiving provides faster, more economical access to validated archived backups and guarantees these can be accessed on demand and near instantly.



Thursday, 24 December 2015

Agile or Fragile

At a recent DevOps presentation by Gene Kim I was reminded of some startling facts about what differentiates high-performing IT organisations from everyone else.

High-performing IT organisations do not have code freezes; in fact, they make more changes during the periods that would otherwise be code freezes.  This seems counter-intuitive.  How is this even possible?  A code freeze typically occurs at peak periods of customer activity, when no one wants to make changes that might upset the running of live applications.  That sounds sensible, but what it really means is that just when changes could have the most beneficial impact you don't make them, because of a more fundamental issue: you are not able to make changes safely.  In other words, your code release process is fragile.

Agile development advocates making changes frequently, but for that you must be able to make changes safe in the knowledge that they will not regress the functionality or performance of live applications.  High-frequency changes also mean you can experiment and prove the value of alternative solutions in production, testing not the stability of the change but its suitability.

Agile development teams don't measure releases per year but deployments per day.  A deployment is any change to any component of the application, environment or infrastructure.  This brings to mind another startling observation: another top predictor of IT performance is "version control for all production artifacts".  We all know there are just as many configuration changes as source code changes, and version control allows you to provision, compare and debug holistically.  The first question anyone asks when something breaks is "what changed?", which is relatively easy to answer for source code under source control but usually difficult and time-consuming to answer for everything else.

So how does Delphix virtual data help with agile development?  A key to implementing agile development is making environments available early in the development cycle.  For database-based applications that means provisioning a full read-write copy of the database for every developer, tester and batch test activity.  A de-sensitised copy of the production database is a crucial artifact in the set of components that make up a non-production clone environment.  For most customers this is simply not feasible, but with Delphix full database copies can be version controlled, versioned in seconds, and provisioned or switched between versions in minutes.  Delphix allows you to manipulate and control database versions as quickly and easily as source code files.

The example below shows how Delphix allows developers to submit regression tests on demand by automating and accelerating the data provisioning process in line with automated test suites.  Making environments available early in the development lifecycle allows a developer to get fast feedback on the success and implications of a proposed change.  You also get better value from your test environments, as a greater proportion of time is spent running tests rather than preparing to run them.




With more tests run more frequently and earlier in the cycle, the net result is that problems are discovered early and fixed more quickly, as illustrated below.




Virtual databases are a key enabler to Agile Development, allowing IT to produce higher quality releases, faster.  Be Agile not Fragile.


Monday, 26 October 2015

Delphix versus Storage Snapshots

Many customers ask, "What is the difference between Delphix and Storage Snapshots?" This article lists some of the key capabilities that Delphix provides over and above Storage Snapshot based cloning solutions to meet the increasing business demand for Agile Development.

First it is useful to contrast the distinct goals and implementation behind Storage Snapshots and Delphix.

Storage Snapshots


The primary use for storage snapshots is to enable backups of active databases or file systems.  A backup takes a long time, and most database systems cannot be suspended for its entire duration.  Storage snapshots allow a database to be quiesced only briefly while the snapshot is taken.  The database can then be un-quiesced, allowing transactions to proceed, and a consistent backup can be taken asynchronously from the snapshot.  So a single snapshot can provide a consistent view of a database.  Storage snapshot solutions also support cloning, enabling multiple read-write clones.  These clones can be used to provision database copies quickly and with low storage overhead, consuming only the changed blocks per snapshot.

Supporting clones has an impact on storage performance, so clones are usually driven from a full copy of production maintained on a separate storage server.  This mitigates the performance impact of clones on the production storage array, at the cost of maintaining a replication method and additional storage.

Delphix


Delphix is designed to support Agile Development, not just backups and clones.  Agile Development requires fully automated end-to-end provisioning of database copies, ready-to-use running database environments, and fast database manipulation operations exposed through self-service.  Like storage snapshots, Delphix virtual databases can be created in minutes and each copy consumes very little incremental storage; however, a virtual database can be manipulated directly by the end user, typically a developer or tester, through self-service.

Aside from provisioning database copies, virtual databases support the full range of developer and tester activities, such as bookmarking (saving and sharing a database state), running multiple database versions (branches), rewinding for repeatable consistent tests, and refreshing to operate on up-to-date data.  All of these operations are fully automated, complete in minutes, and can be applied to a set of databases, optionally including the application stack, treated as a single entity.  All this is done without consuming additional database and application server resources.

Delphix synchronises with source databases with zero impact.  There is no need to replicate or copy the source databases as Delphix synchronises directly with the production database or via backups.


Accelerating Release Cycles


Delphix provides the following benefits for all Software Development Life Cycles:

1.    Manage multiple data sources as one

Many applications rely on more than one data source that must be kept synchronized. Delphix can provision and manipulate a set of databases, folders and data files as a single consistent entity.

2.    Masking and non-production data preparation

Delphix allows you to prepare a copy of production data for non-production use by modifying or masking a single virtual database (VDB) once and then creating virtual copies from that master VDB.  The preparation phase is executed once, and every downstream copy benefits from it.

3.    Performance

Delphix's compression and caching reduce the workload on the underlying storage, making it possible to run multiple workloads on Delphix-enabled storage faster than using physical storage directly.

4.    Ease of Use

Delphix has class-leading end-to-end automation covering deployment, synchronisation, administration, security, provisioning and replication, exposed through self-service user interfaces and APIs.

5.    Storage efficient

Delphix uses minimal storage when synchronising, provisioning and archiving data, and is more storage efficient than storage snapshot based solutions.  All data held in Delphix is compressed and implicitly de-duplicated.

6.    Storage Agnostic

Delphix is storage agnostic, allowing customers to switch storage vendors and leverage new developments in storage technology.

7.    Provenance

Delphix has the best provenance of any Data as a Service (DaaS) solution with over 120 of the Fortune 500 companies doubling their application release frequency using Delphix.

In Summary


If you want to take backups you can use storage snapshots to give you a consistent, discrete view of the data; however, you could instead back up directly from Delphix, from any point in time, avoiding any impact on your production storage.

If you want to save storage you could use storage snapshots but Delphix saves even more storage and is storage independent.

If you want to support agile development through agile provisioning and agile manipulation of full sized data sets then Delphix is your only option.

Delphix is simpler, faster, more storage efficient and supports agile development more effectively than storage snapshots.

Thursday, 21 May 2015

Use Delphix to virtualise Oracle database binaries for easier maintenance and improved server utilisation

Delphix is well known for its ability to virtualise and provision databases.  However, Delphix has for some while also been able to virtualise folders and file systems.

One of the most common use cases for Delphix file virtualisation (vFiles) is virtualising application binaries.  Delphix customers have found that the Oracle database binaries stored under $ORACLE_HOME are also a great candidate for virtualisation, for several reasons.  We call this a Virtual Oracle Home (VOH).


Maintaining and Distributing Patched Oracle Database Binaries


Typically this process involves:

  1. Patch the Oracle Home with an Oracle issued PSU
  2. Validate that the Patch works
  3. Patch all of the affected DB Target Servers
Typical Oracle Database Patching Process



With Delphix vFiles this becomes a much more efficient two step process.

Step 1:  Virtualise and synchronise your patched Oracle Home directory.



  1. Establish Patch Master and link to Delphix
  2. Install patch on DB Patch Master
  3. Take Delphix Snapshot to ingest patched binaries
  4. Provision/Refresh Oracle Home to DEV/TEST DB server
  5. Use the Refresh, Rewind and Reset data control features to test the patch, remedy errors and validate it.
        If the patch fails, you can “Rewind” the binaries to the previous patch set



Step 2:  Distribute either virtual or physical copies

Now you can distribute your patched ORACLE_HOME either as a virtual copy, using NFS mounts, or as a physical copy, using Virtual to Physical (V2P) re-hydration of the patched ORACLE_HOME folder hierarchy on the target servers.







Other major use cases for Virtual Oracle Home (VOH)



DevOps - provision entire versioned stack

Archive - ensure exact database version is archived along with the database

Server Utilisation - make better use of your dev/test server pool by provisioning pre-requisite Oracle binaries on demand

Testing-as-a-Service - fully automated Regression Testing by provisioning all the components for your tests on any compatible server and O/S




How do I Implement Virtual Oracle Home (VOH)?


Please see my upcoming blog post on implementing VOH using Delphix vFiles and hook scripts.




Learn more at Delphix Agile Data Management: www.delphix.com

Monday, 18 May 2015

Accelerate ETL by Leveraging Delphix Virtual Databases

Many downstream database and analytics systems need to be routinely refreshed with processed versions of the latest changes from production.   Typically the new data needs to go through an ETL process to prepare it for downstream use in an Analytics Engine, Warehouse, Data Mart, Reader Farm, etc.

There are several major benefits Delphix offers in this scenario.

Firstly Delphix maintains its own up-to-date copy of source data by seamlessly synchronising with the source databases and retaining this in a secure central data hub.

Second, all ETL processes can be run directly against virtual databases (VDBs) provisioned by Delphix, completely offloading production and avoiding the need for physical copies.  You can create as many VDBs as you want and run these processes in parallel.

Third, if you wish to keep pace with new transactions you can use Delphix Live Sources, which use an Oracle standby to keep the Delphix dSource copy continuously updated while also enabling reporting via Oracle Active Data Guard.  If synchronising with a target Hadoop cluster, you can run Apache Sqoop jobs against a virtual data warehouse that is constantly updated from the physical data warehouse, run your ETL directly on the changed data stored in the virtual data warehouse, and import only the processed changes into your Hadoop cluster.

In addition, it is possible to use Delphix to identify the deltas between two versions of a source database by creating two VDBs from two point-in-time snapshots.

If the two VDBs (snap1 and snap2) are based on an Oracle source database, you first create a database link on snap2 that connects to snap1.

On snap2 you create tables to store:

1.     INSERTS, rows that have been added since the old version (snap1)
2.     UPDATES, rows that have been modified between the two versions
3.     DELETES, rows that no longer exist in the new version (snap2)

You can identify INSERTS, UPDATES and DELETES between the two versions using the MINUS set operator between the corresponding tables in the two VDBs.

For example, to find and populate my local table “delta_inserts” with new rows, I use something like:
/* create an empty table with the correct structure
   (src_table is a placeholder for your source table name) */

create table delta_inserts as select * from src_table where 1=0;
alter table delta_inserts nologging;

/* populate the table with new rows only; this runs on snap2,
   with "snap1" the database link to the older VDB */

insert /*+ append */
  into delta_inserts
  with new_rows as
  (
    select id from src_table
     minus
    select id from src_table@snap1
  )
/* fetch the entire row from the latest version */
select latest.*
  from src_table latest,
       new_rows delta
 where latest.id = delta.id;

The SQL is similar for UPDATES and DELETES.  However, for UPDATES we need to compare entire rows in the MINUS operation, since any column value might have changed.  For DELETES we only need to store the key (id) we will use to identify the rows to delete during the apply process.
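The same MINUS-style logic can also be expressed outside the database.  A minimal Python sketch classifying rows between two versions of a table, where each version is held as a dict keyed by row id (the table contents here are purely illustrative):

```python
# Classify rows as inserts, updates or deletes between two versions of
# the same table, mirroring the MINUS-based SQL approach: keys alone
# find inserts and deletes; whole rows must be compared for updates.

def diff_versions(old, new):
    """old, new: dicts mapping row id -> row tuple."""
    inserts = {k: new[k] for k in new.keys() - old.keys()}
    deletes = sorted(old.keys() - new.keys())           # keys only
    updates = {k: new[k] for k in new.keys() & old.keys()
               if new[k] != old[k]}                     # compare whole row
    return inserts, updates, deletes

snap1 = {1: ("ann", 100), 2: ("bob", 200), 3: ("cat", 300)}
snap2 = {2: ("bob", 250), 3: ("cat", 300), 4: ("dan", 400)}
ins, upd, dels = diff_versions(snap1, snap2)
```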

Now we have two tables containing copies of changed rows and one with a list of rows to be deleted.

We can now run our ETL processing on only the changed rows held in delta_inserts and delta_updates making this process substantially more efficient than running against the full dataset.

The final phase is to implement an apply process to merge the processed deltas with the target database.  The options for applying the changes will depend on the nature of the changes and the type of the target database which might be relational, Hadoop (HDFS), etc.

For Hadoop you can use Apache Sqoop to perform incremental imports and merges.  For relational databases there are more options.

The apply process would usually:

1.     delete all the rows in the target table having the keys saved in delta_deletes
2.     insert or append all the rows to the target table having processed rows saved in delta_inserts
3.     merge or update all the rows in the target table having processed rows saved in delta_updates

If there are foreign key dependencies then you might need to disable and re-enable them before and after the apply process.
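The three-step apply process can be sketched in Python against an in-memory target table, again modelled as a dict keyed by row id (the rows and keys are illustrative):

```python
# Apply delta tables to a target table: delete first, then insert the
# new rows, then merge the updated rows, matching the three steps of
# the apply process described above.

def apply_deltas(target, delta_inserts, delta_updates, delta_deletes):
    # 1. delete rows whose keys were saved in delta_deletes
    for k in delta_deletes:
        target.pop(k, None)
    # 2. insert the new (processed) rows from delta_inserts
    target.update(delta_inserts)
    # 3. merge the changed (processed) rows from delta_updates
    target.update(delta_updates)
    return target

target = {1: ("ann", 100), 2: ("bob", 200), 3: ("cat", 300)}
apply_deltas(target,
             delta_inserts={4: ("dan", 400)},
             delta_updates={2: ("bob", 250)},
             delta_deletes=[1])
```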

In summary, Delphix is ideal for provisioning data sets for ETL processing.  Delphix Virtual Databases (VDBs) can also underpin the process to identify changes between two versions of the same database. ETL jobs can then operate against only the changed rows and finally once processed these changes can be used to refresh the target database using a simple apply process.




Learn more at Delphix Agile Data Management