Database High Availability Simplified

FlashGrid Storage Fabric

FlashGrid Storage Fabric turns local drives into shared drives accessible from all nodes in the cluster. The local drives shared via FlashGrid Storage Fabric can be block devices of any type, including Elastic Block Store (EBS) volumes, local SSDs, or LVM volumes. Sharing is done at the block level with concurrent access from all nodes.

Architecture Highlights

  • Shared storage based on local storage devices
  • Broad range of storage devices: NVMe SSD, SAS SSD, virtual disks, LVM volumes
  • Physical storage located inside the database nodes (hyper-converged) or in separate storage nodes
  • Standard x86 servers, on-premises VMs, or public cloud VMs used as database and storage nodes
  • Fully distributed architecture with no single point of failure
  • 2-way or 3-way mirroring of data across separate nodes (done by ASM)
  • Choice of standard Ethernet or RDMA (InfiniBand, RoCE) for network connectivity
  • FlashGrid Read-Local Technology minimizes network overhead by serving reads from local storage devices
  • Oracle ASM and Clusterware integration

Shared Access

FlashGrid Storage Fabric makes every storage device accessible from every database node in the cluster.

Converged Nodes or Separate Storage Nodes

FlashGrid Storage Fabric allows storage devices to be attached either to the database nodes or to dedicated storage nodes. In 2-node or 3-node clusters, attaching the storage devices to the database nodes is preferred in most cases. A configuration with separate dedicated storage servers may be preferred with four or more database nodes in a cluster, or when the database nodes lack the physical room for storage, for example with blades or 1U database servers.

High Availability and Data Mirroring

FlashGrid has a fully distributed architecture with no single point of failure. FlashGrid leverages Oracle ASM's existing capabilities for mirroring data. In Normal Redundancy mode each block of data has two mirrored copies; in High Redundancy mode it has three. Each ASM disk group is divided into failure groups, one failure group per node, and each disk is configured to be part of the failure group that corresponds to the node where the disk is physically located. ASM ensures that mirrored copies of a block are placed in different failure groups. As a result, in Normal Redundancy mode the cluster can withstand the loss of one (converged or storage) node without interruption of service, and in High Redundancy mode it can withstand the loss of two.
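The placement rule above can be sketched in a few lines of Python. This is a simplified illustration, not ASM's actual allocation algorithm: the round-robin choice of failure groups and the node names are assumptions for the example; the invariant it demonstrates is the one the text describes, that no two copies of a block land in the same failure group.

```python
def place_mirrors(extent_id, failure_groups, copies):
    """Pick `copies` distinct failure groups for one extent's mirrors.

    Round-robin over failure groups is a simplification of ASM's real
    allocation policy; the key property is that all copies of an
    extent land in distinct failure groups (i.e., distinct nodes).
    """
    if copies > len(failure_groups):
        raise ValueError("need at least as many failure groups as copies")
    start = extent_id % len(failure_groups)
    return [failure_groups[(start + i) % len(failure_groups)]
            for i in range(copies)]

groups = ["node1", "node2", "node3"]  # hypothetical node names
print(place_mirrors(0, groups, 2))  # Normal Redundancy: 2 copies
print(place_mirrors(1, groups, 3))  # High Redundancy: 3 copies
```

Losing one node removes at most one copy of any extent, which is why Normal Redundancy survives a single node failure and High Redundancy survives two.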

FlashGrid Read-Local™ Technology

In converged clusters read traffic can be served from local SSDs at the speed of the PCIe bus instead of travelling over the network. In 2-node clusters with 2-way mirroring, or 3-node clusters with 3-way mirroring, 100% of the read traffic is served locally because each node has a full copy of all data. The reduced network traffic also makes write operations faster. As a result, even a 10 GbE network fabric can be sufficient for outstanding performance in such clusters, for both data warehouse and OLTP workloads. For example, a 3-node cluster with four NVMe SSDs per node can provide 30 GB/s of read bandwidth, even on a 10 GbE network.
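The arithmetic behind the example is worth making explicit: with Read-Local, aggregate read bandwidth scales with the total number of SSDs rather than with network speed. The per-SSD figure of 2.5 GB/s below is an assumption chosen to match the source's 30 GB/s example, not a number stated in the text.

```python
def aggregate_read_bandwidth(nodes, ssds_per_node, gbps_per_ssd):
    """With Read-Local, every node reads only from its own SSDs,
    so total read bandwidth is the sum across all local SSDs and
    the network is not a bottleneck for reads."""
    return nodes * ssds_per_node * gbps_per_ssd

# 3 nodes x 4 NVMe SSDs x ~2.5 GB/s each (assumed per-SSD figure)
print(aggregate_read_bandwidth(3, 4, 2.5))  # 30.0 GB/s
```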