John Watson

Dave Anderson

Geoff Wiland

Mick Hosegood

Bob Benoit

Stretched (Extended Distance) Clusters / ASM

In 2010, Oracle Master John Watson was the development DBA (designer and implementer) behind the successful implementation of a stretched cluster (also known as an extended distance cluster or geo-cluster). Here's the story.

We all know what happens if the database that tracks passengers through security fails: the queues grow longer, and within minutes they stretch halfway round the terminal. The database that tracks the baggage is just as critical: thousands of bags stack up when they should be on the conveyor belts.

At this client (a large airport in Africa) it appeared that even Data Guard would not be adequate to keep things moving. Fast-Start Failover with the Data Guard Broker is fast: it can initiate within seconds. But most DBAs will want to build in a delay of a few minutes. Then it takes time to fail over to the standby and reconnect all the sessions. That would have been too slow, given the speed with which chaos would escalate in that environment. And it would need Enterprise Edition licences.

RAC looks like the answer: near instantaneous failover of services and sessions from one instance to another if you lose a node. But it doesn't protect you against losing the site. Or does it? Yes, if you set up a stretched cluster. At each of two airport terminals, we had a database server and a storage array, connected through a fiber switch. A separate Ethernet network gave the terminals on the security desks and the baggage scanners access, load balanced across both server nodes. ASM handled the mirroring between the two storage arrays. Losing a server node (not uncommon given erratic power supplies and unreliable networks) caused all the broken sessions to reconnect (yes, we automated that) to the surviving node, with a break in service of only seconds. When the node came back online, ASM would re-synchronize the database copies. It really worked.

And, best of all, it worked with Standard Edition licences.
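The story above does not say how the automatic reconnection was implemented, so the following is only a minimal sketch of the general idea: a client that loses its session walks a list of cluster nodes and takes the first one that answers. The function name connect_with_failover, the node names, and the injected connect callable are all hypothetical and stand in for whatever database driver and addresses a real deployment would use.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def connect_with_failover(
    nodes: Sequence[str],
    connect: Callable[[str], T],
    rounds: int = 3,
    delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try each node in order and return the first successful connection.

    `connect` is a hypothetical stand-in for a real driver call; it is
    expected to raise ConnectionError when a node is unreachable.
    """
    last_error: Optional[Exception] = None
    for attempt in range(rounds):
        for node in nodes:
            try:
                return connect(node)
            except ConnectionError as exc:
                # Remember the failure and move on to the next node.
                last_error = exc
        if attempt < rounds - 1:
            # Every node failed in this round: pause briefly, then retry.
            sleep(delay_s)
    raise ConnectionError(
        f"no node reachable after {rounds} rounds"
    ) from last_error
```

A caller would pass both node addresses and invoke the function again whenever an existing session breaks. Keeping the pause between rounds short is what keeps the interruption to seconds while one node is still up.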

