
An Oracle White Paper

January 2014

Architectural Overview of the Oracle ZFS Storage Appliance

Introduction
Overview
Architectural Principles and Design Goals
The DRAM-Centric Hybrid Storage Pool
Storage Object Structures and Hierarchy
ZFS Data Protection
ZFS Data Reduction
Snapshot and Related Data Services
Other Major Data Services
    File Protocols
    Shadow Migration
    Block Protocols
    NDMP for Backup
Management, Analytics, and Diagnostic Tools
Conclusion
Related Links

Introduction
The Oracle ZFS Storage Appliance is a multiprotocol enterprise storage system designed to accelerate
application performance and simplify management in a budget-friendly manner. The Oracle
ZFS Storage Appliance is suitable for a wide variety of workloads in heterogeneous vendor environments
and is best-of-breed multiprotocol storage in any environment. Additionally, because of collaborative co-engineering within Oracle, the Oracle ZFS Storage Appliance offers additional unique storage
performance and efficiency advantages in Oracle-on-Oracle environments. See "Realizing the Superior
Value of the Oracle ZFS Storage Appliance" to learn more about the business value of the Oracle ZFS
Storage Appliance. The purpose of this white paper, however, is to explore the architectural details of the
Oracle ZFS Storage Appliance, examine how it works at a high level, and explain why this approach
was taken to develop a unique enterprise storage product that is able to drive extreme performance and
efficiency advantages at an affordable cost.

Overview
To deliver high performance and advanced data services, the Oracle ZFS Storage Appliance uses a
combination of standard enterprise-grade hardware and a unique, storage-optimized operating system
based on the Oracle Solaris kernel with Oracle's ZFS file system at its core. The storage controllers are
based upon powerful Sun x86 Servers that can deliver the exceptional compute power required to
concurrently run multiple modern storage workloads along with advanced data services. Each Oracle ZFS
Storage Appliance system can be configured as a single controller or as a dual-controller system. In the
case of a dual-controller system, two identical storage controllers work as a cluster, monitoring one
another so that a single controller can take over the storage resources managed by the other controller in
the event of a controller failure. Dual-controller systems are required when implementing a high
availability (HA) environment.
Each controller ingests and sends the traffic from and to the storage clients via a high-performance
network. Ethernet, Fibre Channel, and InfiniBand connectivity options are Architectural Overview of the Oracle ZFS Storage
Appliance

supported for this front-end traffic. The controller then handles the computations required to implement
the selected data protection (i.e. mirroring, RAIDZ), data reduction (i.e. inline compression, deduplication),
and any other relevant data services (i.e. remote replication). The controllers also handle the caching of
stored data in both DRAM and flash. Our unique caching algorithm is key to the spectacular performance
that can be obtained from an Oracle ZFS Storage Appliance. As it processes this traffic, the storage
controller then sends the data to or receives the data from the storage media. A SAS fabric is used for this
back-end controller connectivity.
The disk/flash pools reside in enterprise-grade SAS drive enclosures. These drive enclosures contain
either high-speed or high-capacity SAS spinning disks, along with SAS SLC SSDs, which are used to
stage random writes so that they can be transferred sequentially to the spinning disks, thus accelerating
performance.
Both the controllers and the drive enclosures are designed with availability as the foremost consideration.
Redundancy is built into all systems, with features like dual power supplies, redundant SAS loops, and
redundant OS boot drives.
Figure 1: Highlights of some key hardware features of an Oracle ZFS Storage ZS3-2 system
Current hardware models are the Oracle ZFS Storage ZS3-2 (for midrange enterprise storage workloads)
and the Oracle ZFS Storage ZS3-4 (for high-end enterprise storage workloads). For specific details of the
current Oracle ZFS Storage Appliance systems hardware and configuration options, see the Oracle
storage product documentation pages.
All Oracle ZFS Storage Appliance systems run the same enterprise storage OS. This storage OS offers
multiple data protection layouts, end-to-end checksumming to prevent silent data corruption, and an
advanced set of data services, including compression, snapshot and cloning, remote replication, and
many others. Analytics is one of the most compelling and unique features of the Oracle ZFS Storage
Appliance and is a rich user interface to DTrace, a technology that runs within Oracle Solaris. This
analytics feature can probe anywhere along the data pipeline, giving unique end-to-end visibility of the
process with the ability to drill down on attributes of interest.
Figure 2: A sampling of important data services available on the Oracle ZFS Storage Appliance.

Finally, all of these data services can be managed by an advanced management framework, available as
either a command line interface (CLI) or a browser user interface (BUI). The BUI
incorporates an advanced analytics environment based on DTrace, which runs within the OS on the
storage controller itself, offering unparalleled end-to-end visibility of key metrics.
For information on the current Oracle ZFS Storage Appliance hardware specifications and options, as well
as the latest listing of data services, please review the product data sheet.

Architectural Principles and Design Goals


An overarching development goal of the Oracle ZFS Storage Appliance is to provide
maximum possible performance from standard enterprise hardware while providing
robust end-to-end data protection and simplified management. To take maximum
advantage of standard hardware, Oracle Solaris is used as the basis of the storage
operating system. Oracle Solaris is a modern, symmetric multiprocessing (SMP)
operating system that is able to take full advantage of modern Intel x86 multicore
CPUs.
Intel Hyper-Threading Technology provides two hardware threads per core, yielding 20
hardware threads in a 10-core processor, for example. This helps ensure optimal
scheduling of software threads to minimize resource contention and maximize
performance. In addition, years of performance optimization work have gone into the
Oracle Solaris threading model so that the Oracle Solaris multithreaded kernel and
threaded user applications both can take full advantage of Intel multicore processors.
Since the Oracle ZFS Storage Appliance includes many powerful data services that can
be executed in parallel with other activities, the ability to effectively use the available
threads on the Intel Xeon processor enables Oracle ZFS Storage Appliance to increase
overall performance throughput.
Intel QuickPath Technology is a platform architecture that provides high-speed (up to
25.6 GB/sec) point-to-point connections between processors as well as between
processors and the I/O hub. Each processor has its own dedicated memory that it
accesses directly through an integrated memory controller. If a processor needs to
access the dedicated memory of another processor, it can do so through a high-speed
Intel QuickPath Interconnect that links all the processors. Intel QuickPath Interconnect
architecture also includes capabilities such as an optimized scheduler and memory
placement optimization capability that work together with Oracle Solaris to deliver
proven performance benefits. The result is faster movement of data between
processors and memory so that the Oracle ZFS Storage Appliance can move data to
and from DRAM or SSD cache quickly. This, in turn, improves read and write
performance for I/O requests.
Multiprocessor systems generally demonstrate some memory locality effects, which
means that when a processor requests access to data in memory, that operation will
occur with somewhat lower latency if the memory bank is physically close to the
requesting processor. Oracle Solaris Memory Placement Optimization determines the
non-uniform memory access (NUMA) configuration of the underlying hardware
platform at boot time and then uses this information to allocate memory and schedule
software threads so that memory is as close as possible to the processors that access
it.
Intel Turbo Boost Technology, together with Intel Intelligent Power Technology,
delivers performance on demand, letting processors operate above the rated frequency
to speed specific workloads and reduce power consumption during low utilization
periods. In those situations where Oracle Solaris determines that maximum processing
power is required, the Intel Xeon processor increases the frequency in the active
core(s) to the maximum extent possible given other conditions such as load, power
consumption, and temperature.
All of these features within the OS mean that the Oracle ZFS Storage Appliance can
handle the computational load of running data protection algorithms, checksumming,
and advanced data services (such as compression), and of managing the appliance's
advanced automatic data tiering, all while
simultaneously maintaining excellent throughput and transactional performance
characteristics. This is also a reason the Oracle ZFS Storage Appliance delivers high
performance in high-burst, random I/O environments like virtualization.
To use hardware most effectively and cost efficiently in transactional workloads, the
ZFS file system is employed to manage traffic to and from clients in a way that isolates
it from the latency penalties associated with spinning disks. This caching, or auto-tiering,
approach is referred to as the Hybrid Storage Pool architecture. The Hybrid
Storage Pool is an exclusive feature of the Oracle ZFS Storage Appliance.

The DRAM-Centric Hybrid Storage Pool


The Hybrid Storage Pool architecture is the core technology that enables the Oracle
ZFS Storage Appliance's superior performance. Using an intelligent and adaptive set
of algorithms to manage I/O, the Oracle ZFS Storage Appliance is able to make the most
efficient use of modern hardware resources: automatically placing data on dynamic
random access memory (DRAM), read- and write-optimized flash-based SSDs, and SAS
disk for optimal performance efficiency. Read and write paths are each handled in a
distinct manner to address the unique performance and data integrity needs of each.
In the Hybrid Storage Pool architecture, DRAM is treated as a shared resource. As in
most storage operating systems, DRAM serves as a resource for controller operational
overhead and for the operating system itself. But uniquely, and crucially for performance,
DRAM is also used as a primary cache to accelerate reads. Because DRAM is a much
faster media type than either disk or flash for transactional workloads, having a high
proportion of read operations served out of DRAM radically accelerates overall system
performance. The portion of DRAM used to serve as a read cache is known as the
Adaptive Replacement Cache (ARC). DRAM allocation to ARC is managed by the
operating system on a dynamic basis to maximize overall system performance. In the
ARC, blocks are classified as most recently used (MRU), most frequently used (MFU),
least recently used (LRU), or least frequently used (LFU). The idea is to keep the
hottest portion of the overall data set in DRAM. As the ARC becomes saturated and
hotter data needs to replace cooler data in the ARC, the Hybrid Storage Pool will evict
the coolest data in DRAM to a read flash cache device. This is known as the Level 2
ARC (L2ARC), for which the Oracle ZFS Storage Appliance uses SSDs. Read requests
for data that had not been judged hot enough to be placed in either ARC or L2ARC
must be served from spinning disk, resulting in a higher latency on those reads.
However, in practice, it is common to have ARC hit rates in excess of 80 percent across
a wide sampling of installed base systems. This means that in the vast majority of
workloads, the performance will tend toward accelerated, cached DRAM speeds. Of
course, DRAM is orders of magnitude faster than flash, which is in turn orders of magnitude
faster than spinning disk, which is why it is important that the Oracle ZFS Storage
Appliance features this DRAM-centric architecture for serving reads. This is beneficial
for performance in any latency-sensitive workload. For example, in database
workloads, where the hot portion of a data set can automatically be placed so that
reads are served from DRAM, or in server virtualization use cases, where OS images
can be automatically cached in DRAM, this DRAM-centric architecture can dramatically
reduce read latency, accelerating application host performance.
Figure 3: Illustration of the relative latencies of reads served from different media types.
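
To make the eviction flow concrete, the following JavaScript sketch models the behavior described above. It is purely illustrative (the real ARC is a kernel implementation, and the block "heat" score here is a simplification): a fixed-size DRAM map tracks recency and frequency per block, and the coolest block is demoted to an L2ARC map rather than discarded.

    // Illustrative sketch of Hybrid Storage Pool read caching (not appliance code).
    // Blocks live in a DRAM "ARC" map; when the ARC is full, the block with the
    // lowest heat (a simplified blend of frequency and recency) is demoted to a
    // flash "L2ARC" map instead of being discarded.
    const ARC_CAPACITY = 4;            // tiny, for demonstration only
    const arc = new Map();             // blockId -> { hits, lastUsed }
    const l2arc = new Map();           // read-optimized SSD cache
    let clock = 0;

    function read(blockId) {
      clock += 1;
      if (arc.has(blockId)) {                    // DRAM hit: the fastest path
        const e = arc.get(blockId);
        e.hits += 1;
        e.lastUsed = clock;
        return 'ARC hit';
      }
      if (l2arc.has(blockId)) {                  // flash hit: slower than DRAM
        const e = l2arc.get(blockId);
        l2arc.delete(blockId);
        promote(blockId, { hits: e.hits + 1, lastUsed: clock });
        return 'L2ARC hit';
      }
      promote(blockId, { hits: 1, lastUsed: clock });
      return 'disk read';                        // uncached: spinning-disk latency
    }

    function promote(blockId, entry) {
      if (arc.size >= ARC_CAPACITY) {
        // Evict the coolest block: least frequently, then least recently, used.
        const [coolId, coolEntry] = [...arc.entries()].sort(
          (a, b) => (a[1].hits - b[1].hits) || (a[1].lastUsed - b[1].lastUsed))[0];
        arc.delete(coolId);
        l2arc.set(coolId, coolEntry);            // demote to flash, don't discard
      }
      arc.set(blockId, entry);
    }

    ['a', 'b', 'c', 'd', 'a', 'a', 'e', 'b'].forEach((id) => console.log(id, read(id)));
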

It is important to note that, for the read path, all data resides on spinning disk,
whether cached or uncached. That is to say, while the automated caching done by the Hybrid
Storage Pool puts duplicate sets of blocks in the ARC or L2ARC, the data is also safely
stored on disk. This is important because, in the event of a controller failure, all data is
protected because it is persistently stored on spinning disk. In a dual-controller
system, upon failure of a pool's primary controller, the second controller can take over
the pool's disk resources, access all data, and serve reads just as the first
controller would have done. Any cached reads are checksummed against the persistent
storage to ensure that any changed blocks are updated before serving the read to the
client.

Figure 4: Graphical depiction of the Hybrid Storage Pool architecture.

The write path is handled differently. Incoming writes to the appliance initially land in
DRAM. (Clients can potentially issue writes to storage either synchronously or
asynchronously, although for most enterprise client OSs, hypervisors, and applications,
writes are generally requested synchronously for data protection and consistency
reasons.) Asynchronous writes are acknowledged as complete to the client
immediately upon landing in DRAM. While this results in extremely low latency, it is
risky from a consistency and data integrity standpoint because, should something fail
after acknowledgement of a write-complete but before destaging the write to
persistent storage, there is the opportunity for that write to be lost and, potentially, a
loss of consistency of the data set. For this reason, most writes are requested
synchronously. Synchronous writes to the Oracle ZFS Storage Appliance are not
acknowledged immediately upon landing in DRAM. Instead, they are acknowledged
only once they are persistently stored on disk or flash. The ZFS file system has a
mechanism for reducing write latency known as the ZFS Intent Log (ZIL). On the
Oracle ZFS Storage Appliance, the storage device used to hold the ZIL is a low-latency,
high-durability SLC SSD. A logbias property can be set under the share
properties to either latency or throughput, depending on the particular workload
that an administrator expects a particular share to experience.
For latency workloads, such as redo log shares for transactional databases and certain
VM environments, the logbias = latency share setting is selected so that the ZIL is
enabled and writes are immediately copied from the system DRAM buffer into the SLC
SSD. Once stored on the SSD, the write is persistent and therefore is acknowledged as
write-complete to the client. The SSD accumulates these random writes and every five
seconds the contents of the system DRAM buffer are flushed to spinning disk and the
space used by the copy on SSD is freed. A failure of an SSD containing write data not
yet flushed to disk does not impact data integrity since the data is still in DRAM. The
copy of the write on the SSD is there so that if the DRAM contents should be lost due
to power outage or component failure, then the contents can be retrieved from the
persistent SSD upon redundant controller takeover or primary controller restoration.
This content is then placed back into the controller's DRAM (a process known as ZIL
replay) and ultimately migrated back to spinning disk. Additionally, SSDs can be
mirrored for additional data protection. Inside this specialized write-optimized
SSD are three primary components: the NAND flash used to persistently
store the ZIL data; a DRAM buffer that stages data entering the SSD before it is
transferred to the NAND flash; and a supercapacitor designed to provide enough power
to flush the DRAM buffer to the NAND flash should the SSD lose power before the
buffer has been cleared to flash. The SSD contains the embedded circuitry
required to ensure persistent storage of the acknowledged data and handle any flash
error corrections needed to complete the restore upon power resumption. In this
manner, the SSD serves as a mechanism to persistently and safely stage writes and
accelerate write-complete acknowledgement, dramatically reducing write latency
without risking the integrity of the data set.
For throughput or streaming workloads, such as query-intensive database workloads
or media streaming, latency is not as critical as data transfer rates. In these workloads,
the logbias = throughput share setting should be used so that writes bypass the ZIL,
thus skipping the SSD and going straight from the controller DRAM buffer to spinning
disk. (Contrary to common misperception, groups of spinning disks are actually faster
than SSDs for throughput workloads.) In this case, once stored persistently on
spinning disk, the write-complete acknowledgement is delivered to the client.
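
The two write paths can be summarized in a short sketch. The following JavaScript model is conceptual only (the dramBuffer, zilSsd, and spinningDisk objects are stand-ins invented for illustration, and the five-second flush is the interval described above), but it captures when each path acknowledges the client.

    // Conceptual sketch of the synchronous write path (not appliance code).
    const dramBuffer = [];
    const zilSsd = { blocks: [], persist(b) { this.blocks.push(b); } };
    const spinningDisk = { blocks: [], persist(b) { this.blocks.push(b); } };

    function syncWrite(share, block) {
      dramBuffer.push(block);                 // all writes land in DRAM first
      if (share.logbias === 'latency') {
        zilSsd.persist(block);                // durable copy on low-latency SLC SSD
        return 'ack';                         // acknowledged once the ZIL copy is safe
      }
      spinningDisk.persist(dramBuffer.pop()); // 'throughput': DRAM straight to disk
      return 'ack';                           // acknowledged once on spinning disk
    }

    // Roughly every five seconds the DRAM buffer is flushed sequentially to
    // spinning disk and the corresponding ZIL copies on the SSD are freed; the
    // SSD copy exists only for ZIL replay after a power or controller failure.
    function flush() {
      dramBuffer.splice(0).forEach((b) => spinningDisk.persist(b));
      zilSsd.blocks.length = 0;
    }

    console.log(syncWrite({ logbias: 'latency' }, 'redo-log-record'));
    flush();
    console.log(spinningDisk.blocks);         // -> [ 'redo-log-record' ]
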
With this architecture, reads are mostly cached in DRAM for optimal read
performance, and writes are handled either to optimize latency performance or
throughput performance, all while ensuring data integrity and persistency. The
performance benefits of the Oracle ZFS Storage Appliance are well documented and
independently verified. Oracle periodically publishes Storage Performance Council
SPC-1 and SPC-2 benchmark results, as well as Standard Performance Evaluation
Corporation SPECsfs benchmark results, to demonstrate performance results for the
Oracle ZFS Storage Appliance. Visit the Storage Performance Council's website
(www.storageperformance.org) and the Standard Performance Evaluation Corporation's
website (www.spec.org) for the latest independently audited, standardized storage
benchmark results for the Oracle ZFS Storage Appliance and for results of many
competitors.

Storage Object Structures and Hierarchy


In the ZFS file system, it is sometimes stated that "everything is thin provisioned."
There is some degree of truth to this, and the Oracle ZFS Storage Appliance inherits
these characteristics and arranges them within an intelligent logical hierarchy for the
storage administrator to manage. (Of course, for customers who desire to selectively
turn off thin provisioning, there is a method to accomplish this as well.) Interestingly,
there is virtually no limitation on file system size, or on the number of files or shares,
beyond the physical capacity of the system and its pools.

Figure 5: Conceptual illustration of thin provisioning (versus traditional or thick provisioning).

The linkage between physical hardware resources and the rest of the logical objects
that comprise ZFS is referred to as a pool. Disks, write flash-accelerating SSDs, and
read-accelerating SSDs are physically grouped together in these physical pools of
resources. Within a given pool, the storage devices are all subject to the same layout
(e.g., stripe, mirror, RAID-Z) and are managed by the same assigned storage controller.
(Thus, in a dual-controller cluster, for an active/active setup, users must provision the
physical devices in at least two distinct pools so that each active controller can manage
at least one pool.) Each pool can contain multiple ZFS virtual devices (vdevs). For
example, if a single-parity RAID layout is selected for the pool, a 3+1 stripe width is
used based upon the built-in best practice, and each vdev will contain four disks. The
Oracle ZFS Storage Appliance OS effectively masks the vdevs from the user, however,
and hot spares (unused disks that are preinstalled and ready for use as a
replacement in case of failure of an active data disk) are also provisioned automatically
based upon built-in best practices. Therefore, all the administrator needs to be concerned
with in terms of physical components is the pool level. (Note that the ARC is shared across
all pools; the Hybrid Storage Pool read-tiering algorithm is global and not associated
with any particular pool, but L2ARC is assigned per pool.)

Figure 6: BUI screenshot shows the result of a pool layout example.
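
As a rough illustration of the built-in best practice just described, the sketch below (illustrative only; the appliance OS performs this provisioning itself and masks the vdevs) groups a shelf of disks into 3+1 single-parity vdevs and leaves the remainder as hot spares.

    // Illustrative grouping of disks into single-parity (3+1) vdevs plus hot spares.
    // The appliance applies its own built-in best practices; this only mirrors the idea.
    function layoutSingleParity(disks, stripeWidth = 4) {
      const vdevs = [];
      for (let i = 0; i + stripeWidth <= disks.length; i += stripeWidth) {
        vdevs.push(disks.slice(i, i + stripeWidth));      // 3 data + 1 parity per vdev
      }
      const spares = disks.slice(vdevs.length * stripeWidth); // leftovers become spares
      return { vdevs, spares };
    }

    const shelf = Array.from({ length: 22 }, (_, i) => 'disk' + i);
    const pool = layoutSingleParity(shelf);
    console.log(pool.vdevs.length, 'vdevs;', pool.spares.length, 'hot spares');
    // -> 5 vdevs; 2 hot spares
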

Once a pool has been created with the admin-specified devices and layout, that pool is
the basic physical resource that the system and the administrator have to work with.
Everything within the pool is essentially thin provisioned up to the limitations of that
pool. For example, each file system and each LUN associated with that pool will have a
maximum available capacity equal to the remaining formatted capacity of the whole
pool, nominally. File systems and LUNs have various share setting options, however, so
a share quota can be established so that the capacity consumed within that share cannot
exceed a threshold, and the share therefore cannot consume more than its quota of the pool. A share
reservation also can be set such that the other shares in the pool cannot occupy more
of the pool than is possible without infringing on the reserved capacity. However, if a
share's reservation is set to a number and the share's quota is set to that same number,
then thin provisioning is, in effect, defeated; this is how an admin would create thick-provisioned
shares, if desired.
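
The accounting behind quotas and reservations can be modeled compactly. The sketch below is an illustration (the units and helper names are invented for the example), but it shows the key point: setting a share's quota equal to its reservation yields a thick-provisioned share.

    // Illustrative accounting for thin-provisioned shares in a pool (not appliance code).
    const pool = { capacity: 1000, shares: [] };       // units are arbitrary (e.g., GB)

    function addShare(name, { quota = Infinity, reservation = 0 } = {}) {
      pool.shares.push({ name, quota, reservation, used: 0 });
    }

    // Capacity a given share may still consume: bounded by its own quota and by
    // pool space not claimed by other shares' reservations.
    function available(share) {
      const reservedElsewhere = pool.shares
        .filter((s) => s !== share)
        .reduce((sum, s) => sum + Math.max(s.reservation - s.used, 0), 0);
      const used = pool.shares.reduce((sum, s) => sum + s.used, 0);
      return Math.min(share.quota - share.used,
                      pool.capacity - used - reservedElsewhere);
    }

    addShare('thin');                                    // may grow to whatever remains
    addShare('thick', { quota: 200, reservation: 200 }); // quota == reservation: thick
    console.log(available(pool.shares[0]));              // -> 800 (thick holds 200 back)
    console.log(available(pool.shares[1]));              // -> 200
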
After the pool, the next hierarchical data structure is known as the project. Think of a
project as a sort of a template for creating shares. Shares have many settings that can
be optimized for different use cases, and projects provide a way to make a template so
that shares can be easily provisioned for same or similar use cases repeatedly over
time without undue effort. For example, a typical Oracle ZFS Storage Appliance might
have three concurrently running workloads: user home directories in SMB shares, VM
files in NFS shares, and some LUNs associated with different e-mail or collaboration
software. A project could be created to group the shares separately for each of these
use cases. As new VM servers are deployed, or as new user home directories are
added, or as new e-mail or collaboration accounts are created, an admin might want to
create a new share that is totally new and empty, but has the raw properties that the
other similar shares utilize. To do so, the user will simply create a new share under the
correct project. (Once a share has been created based upon a project, the settings of
the individual share can simply be edited so that they are unique to that share, if needed,
without changing the project. In
other words, just because the user started with the template does not imply that the
user has to conform to it later. If the user changes the settings at the share level, that
share will have a setting that is distinct from the rest of the shares in that project. If,
however, the user changes a setting at the project level, then all shares in that project
will inherit the new setting.) Note that projects are completely thin provisioned in that
they have no notion of size; a project is simply a set of attributes that apply to its
associated shares. Thus, if physical capacity is added to a pool, the projects in that
pool are not impacted in any way. It simply increases the total available capacity for all
files, all shares, and all projects associated with the pool.
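
Project-to-share property inheritance behaves like a simple two-level lookup, as the following illustrative model shows (property names such as compression and logbias mirror the share settings discussed in this paper; the model itself is not appliance code).

    // Illustrative model of project-to-share property inheritance (not appliance code).
    const project = { name: 'vmfiles', props: { compression: 'lzjb', logbias: 'latency' } };

    function makeShare(name, proj) {
      return { name, project: proj, overrides: {} };
    }
    function getProp(share, key) {
      // A share's own setting wins; otherwise the project template applies.
      return key in share.overrides ? share.overrides[key] : share.project.props[key];
    }

    const vm1 = makeShare('vm1', project);
    const vm2 = makeShare('vm2', project);
    vm2.overrides.compression = 'gzip-9';      // distinct setting for this share only

    project.props.logbias = 'throughput';      // a project-level change...
    console.log(getProp(vm1, 'logbias'));      // -> 'throughput' (inherited by all)
    console.log(getProp(vm1, 'compression'));  // -> 'lzjb'
    console.log(getProp(vm2, 'compression'));  // -> 'gzip-9' (override preserved)
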

ZFS Data Protection


The Oracle ZFS Storage Appliance provides robust data protection at both microscopic
and macroscopic levels, ensuring data integrity and protecting against silent data
corruption while also protecting data from hardware failures. At the microscopic level,
ZFS is able to perform end-to-end checksumming throughout all controller-based
cache components all the way down to the disk level. This is because the Oracle ZFS
Storage Appliance uses no RAID controllers; rather, the storage controller's operating
system itself runs ZFS data protection. This provides the file system full end-to-end
control of data movement and visibility of the blocks so that full and integrated end-to-end
checksumming can be performed. ZFS checksumming is a more advanced, hierarchical
type of checksumming versus the traditional flat checksumming approach.
Traditional checksumming can check the integrity of only one block in isolation from
others, meaning that media bit rot can be successfully screened, but other types of
data corruption across the I/O path cannot be identified. ZFS checksumming uses a
Merkle tree, whereby data blocks are distinct from address block checksums and the
entire I/O path can be protected from end to end.
Figure 7: The ZFS approach to checksums can detect more types of errors than the traditional approach.

Another advantage of ZFS checksumming is that it enables a self-healing architecture.
In some traditional checksum approaches, wherever data blocks get replicated from
one location to another there is an opportunity to propagate data corruption. This is
because, with traditional checksum approaches, the newest data block simply gets
replicated. With ZFS checksums, each block replica pair has a checksum calculated
independently. If one is found to be corrupted, the healthy block is then used as a
reference to repair the unhealthy block.
Figure 8: Example of how the ZFS self-healing architecture could correct a corrupted block.
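
A simplified model makes the self-healing flow concrete. The sketch below uses Node.js's built-in crypto module and is conceptual only: as in ZFS's Merkle tree, the checksum is stored apart from the data it describes, so each mirror copy can be verified independently and a corrupt copy repaired from its healthy twin.

    // Simplified illustration of ZFS-style checksum self-healing (not ZFS source code).
    // As in a Merkle tree, the checksum lives apart from the block it describes
    // (in ZFS, in the parent block pointer), so each copy is verified independently.
    const crypto = require('crypto');
    const sha = (buf) => crypto.createHash('sha256').update(buf).digest('hex');

    const data = Buffer.from('customer record 42');
    const parentPointer = { checksum: sha(data) };          // stored apart from the data
    const copies = [Buffer.from(data), Buffer.from(data)];  // a mirrored pair
    copies[1][0] ^= 0xff;                                   // simulate silent corruption

    function readSelfHealing(blocks, pointer) {
      const good = blocks.find((c) => sha(c) === pointer.checksum);
      if (!good) throw new Error('all copies corrupt: unrecoverable');
      blocks.forEach((c, i) => {                            // heal any failing sibling
        if (sha(c) !== pointer.checksum) blocks[i] = Buffer.from(good);
      });
      return good;                                          // verified-good data
    }

    console.log(readSelfHealing(copies, parentPointer).toString());
    // -> 'customer record 42', and the corrupt mirror has been repaired
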

This robust protection from data corruption has yielded a surprising track record: the
Oracle ZFS Storage Appliance has surpassed 100 million hours in production with no
known data corruption errors to date.
The Oracle ZFS Storage Appliance also provides robust protection from hardware
failures. The most typical hardware failure in enterprise storage is, of course, disk
failure. ZFS provides multiple options for protection from disk failures. An Oracle ZFS
Storage Appliance administrator will form pools of physical disks when provisioning
storage. Each pool has a layout assigned to it at creation that defines how the data will
be protected. Available layout options are stripe (no media failure protection), mirror
(protection against a single disk failure within a paired set), triple mirror (protection
against dual disk failures within a triple set), RAID-Z1 single-parity RAID (single disk
failure protection within a four-disk set), RAID-Z2 dual-parity RAID (dual disk failure
protection within a 9-, 10-, or 12-disk set, depending on pool drive count), and RAID-Z3
triple-parity RAID (triple disk failure protection within a multiple-disk set, where stripe
width varies depending on pool disk count).
Mirroring tends to offer the highest performance for latency-sensitive transactional
workloads, with triple mirroring being a good transactional performance
option as well when a higher protection level is desired. Single-parity RAID tends to
offer excellent throughput performance in streaming workloads, with dual-parity RAID
being a reasonable throughput option as well when higher protection is desired with
some performance expense. Write SSDs can be either striped or mirrored, while read
SSDs are always striped. Oracle has established best practices for balancing media
protection requirements with performance requirements for a variety of use cases.
Oracle Sales Consultants can advise users about their particular environment or users
can review publicly available documents on the Oracle Technology Network pages.

ZFS Data Reduction


The Oracle ZFS Storage Appliance has several options for data reduction, offering four
levels of compression plus a deduplication option. These are share options that can be
set at either the share or project level and can be applied to file systems or LUNs
equally. All of these data reduction mechanisms work inline at the ZFS block level. So,
although the options can be altered on the fly after initial share creation, only new data
is subjected to the new policy. The at-rest data is unaffected by the new setting unless
and until those ZFS blocks at rest are accessed and changed.
Typically, one of the compression options will offer the best combination of data
reduction and performance. The four compression algorithms offered are LZJB, GZIP-2,
GZIP, and GZIP-9 (listed in order of lowest to highest compression). The higher the
compression level selected, the more computationally intensive it is, thereby driving
higher CPU utilization in the storage controller. Thus, for systems with substantially
excess CPU capacity given their workloads, the higher compression settings will have
much lower (if any) performance impact than they will on heavily taxed systems.

LZJB is an extremely lightweight compression option. Although compression rates vary
widely depending on data type, 2:1 to 3:1 compression ratios are not uncommon.
uncommon. (Again, bear in mind that all workloads compress differently and nothing
can ever be guaranteed in terms of compression ratio.) Sometimes, storage admin
organizations with Oracle ZFS Storage Appliance will use LZJB by default as an
organizational policy because they find it so rarely (if ever) causes any significant
performance impact for their workloads (hence the moniker "compression for free").
In fact, using LZJB sometimes even creates a performance increase. This is because
LZJB uses so little compute overhead, and because implementing inline compression means that the
traffic across the SAS fabric between the controller and the disks is compressed. The
net effect is increased bandwidth utilization and, often, a net increase in performance
versus using no compression at all (though this is not always the case).
GZIP-2, GZIP, and GZIP-9 are standard compression algorithms that are commonly
used industry wide in a variety of IT implementations (www.gzip.org provides more
information from the developers of these algorithms). These options have been
implemented as options on the Oracle ZFS Storage Appliance and can provide higher
compression than LZJB, but typically with greater compute resource utilization. In
systems that have significant excess compute capacity relative to the workload, or in
environments where data reduction is more important than maximum performance, it
is often satisfactory to implement GZIP or even GZIP-9. GZIP-2 strikes a compromise
between LZJB and GZIP for many customers.
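
The level-versus-CPU trade-off is easy to observe with the standard gzip algorithm. The following Node.js sketch is illustrative (the appliance applies these algorithms inline at the ZFS block level, not through this API): it compares light, default, and maximum gzip levels on the same data.

    // Demonstration of the gzip level/CPU trade-off using Node.js's zlib module.
    // GZIP-2 and GZIP-9 on the appliance correspond conceptually to these levels.
    const zlib = require('zlib');

    const sample = Buffer.from('log line with repeated text. '.repeat(10000));

    for (const level of [2, 6, 9]) {       // ~GZIP-2, GZIP (default), GZIP-9
      const start = process.hrtime.bigint();
      const out = zlib.gzipSync(sample, { level });
      const ms = Number(process.hrtime.bigint() - start) / 1e6;
      console.log('level ' + level + ': ' +
                  (sample.length / out.length).toFixed(1) + ':1 ratio in ' +
                  ms.toFixed(1) + ' ms');
    }
    // Higher levels squeeze harder but burn more CPU per block; a lightly loaded
    // controller may absorb that cost where a heavily taxed one cannot.
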
ZFS deduplication is another data reduction option available on the Oracle ZFS Storage
Appliance. This option must be used with caution. It uses a hash table to reference
duplicate ZFS blocks, meaning that each write that generates unique ZFS blocks
produces another entry in the table, and each read requires a lookup on the table. This
can become very computationally intensive in some workloads as the hash table grows
over time. Streaming workloads are particularly ill suited for ZFS deduplication. Thus,
ZFS deduplication should never be used for Oracle Recovery Manager (Oracle RMAN)
backups, for example. Oracle RMAN is a feature of Oracle Database. (The Oracle ZFS
Storage Appliance is an excellent option for Oracle RMAN backup workloads, but
deduplication is not recommended on the Oracle ZFS Storage Appliance for this
particular use case. Other superior data reduction options are advised, such as the
ZFS compression options, Oracle RMAN compression, Hybrid Columnar
Compression, and others.) ZFS deduplication can be a good option, however, in
workloads that do not cause the hash table to continuously grow, such as certain boot
image deduplication situations. If using ZFS deduplication, it is advisable to segregate
workloads by project, using deduplication only on the workloads for which it works
well and putting the other workloads on other shares that use compression instead.
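
Conceptually, inline deduplication keeps a table keyed by block checksum. The illustrative sketch below (not the ZFS implementation) shows why workloads that keep generating unique blocks cause the table, and the cost of consulting it, to grow without bound.

    // Illustrative block-level deduplication table (not the ZFS implementation).
    const crypto = require('crypto');
    const dedupTable = new Map();          // checksum -> { refs }; grows per unique block

    function writeBlock(buf) {
      const key = crypto.createHash('sha256').update(buf).digest('hex');
      const entry = dedupTable.get(key);
      if (entry) {
        entry.refs += 1;                   // duplicate: store a reference, not the data
        return 'deduplicated';
      }
      dedupTable.set(key, { refs: 1 });    // unique: new table entry, data written out
      return 'stored';
    }

    // Boot images dedupe well: the same blocks recur and the table stays small.
    ['osblock1', 'osblock2', 'osblock1']
      .forEach((b) => console.log(writeBlock(Buffer.from(b))));
    // A streaming backup writes mostly unique blocks: every write grows the table,
    // and every access must consult it, which is why such workloads are ill suited.
    console.log('table entries:', dedupTable.size);
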
With proper usage of the data reduction options, in concert with the variety of host-based
data reduction options available, excellent data reduction rates can be achieved
along with optimized performance. Oracle has many documented solutions and best
practices for a variety of environments, particularly Oracle Database environments, to
optimize overall system performance and data reduction. Additional information is
available on the Oracle Technology Network pages, the Oracle Optimized Solutions page, or
from an Oracle Sales Consultant.

Snapshot and Related Data Services


The Oracle ZFS Storage Appliance features a snapshot data service, as well as several
other data services built using this snapshot capability as a foundation. Snapshots
themselves serve as point-in-time, read-only copies of data. Snapshots can serve as a
restore point for a data set should it be desired to roll the state of the data set back to
a previous point in time. Note that while snapshots create a logical restore point, they
do not provide a physical backup. (Data should always be appropriately backed up
physically, in accordance with criticality-appropriate data redundancy policies.)
Snapshots also can be referenced as read-only shares from clients (unless
this option is disabled by an administrator) by mounting the root of the file system and
changing directory to .zfs/snapshot. The maximum number of snapshots in ZFS is
virtually unlimited; they take up almost no space and very little time to create, so
many customers use them liberally. Because of the Merkle tree and copy-on-write
technology used in ZFS, snapshots are effectively penalty-free. Snapshots can be
scheduled or taken manually, depending on usage and policies.
The Cloning feature of Oracle ZFS Storage Appliance is one data service based upon
the snapshot technology. In ZFS, clones are essentially zero-copy read/write shares
based on a snapshot. Clones are created from existing snapshots, turning that
snapshot into an object that behaves as any regular share to the user. But in the
background, ZFS is tracking and storing only changes to blocks and referencing the
original data set, thus avoiding any requirement to fully duplicate the entire data set to
have a second, distinct working copy. Clones provide a simple, fast way to provision
test/dev/QA environments with minimal incremental storage consumption over the
data set upon which the snapshot is based, for example. (Again, note that clones do not
provide a physical backup, although mounting a clone and copying it to a distinct
location on physically distinct storage could.)
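
The penalty-free nature of snapshots and clones follows from copy-on-write block sharing, which the following illustrative model captures (not ZFS source): a snapshot is a frozen block map, and a clone is a read/write view that stores only the blocks that later change.

    // Illustrative copy-on-write model of ZFS snapshots and clones (not ZFS source).
    const share = { blocks: new Map([['b0', 'data-A'], ['b1', 'data-B']]) };

    function snapshot(src) {
      return { blocks: new Map(src.blocks) };    // freeze the block map; no data copied
    }
    function clone(snap) {
      return { base: snap, changed: new Map() }; // zero-copy read/write view of the snapshot
    }
    function cloneWrite(c, id, data) { c.changed.set(id, data); }  // store deltas only
    function cloneRead(c, id) {
      return c.changed.has(id) ? c.changed.get(id) : c.base.blocks.get(id);
    }

    const snap = snapshot(share);                // point-in-time, read-only
    const dev = clone(snap);                     // e.g., a test/dev copy of production
    cloneWrite(dev, 'b1', 'data-B-modified');
    console.log(cloneRead(dev, 'b0'));           // -> 'data-A' (shared with the snapshot)
    console.log(cloneRead(dev, 'b1'));           // -> 'data-B-modified' (only block stored)
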
The Replication feature of Oracle ZFS Storage Appliance is another available data
service, and is also based on the snapshot technology. Replication can be used internal
to an appliance to replicate a data set from one pool to another, for example.
Replication also can be used externally to replicate a data set from one Oracle ZFS
Storage Appliance to another. This is useful for facilitating backups or for DR purposes,
or simply to move a data set from one physical location to another for some other
purpose. Oracle ZFS Storage Appliance Replication is asynchronous, meaning that
WAN latency does not impact acknowledgment of client requests. (Note that for
environments requiring synchronous replication, a variety of host-based solutions are
available to accomplish this goal in a higher availability, higher performance manner
than storage-based synchronous replication typically would. Additional information is
available on the Oracle Technology Network pages, or from an Oracle Sales Consultant.)
Replication can be handled manually, can be scheduled, or can be automated with
scripting. Another option is continuous replication, which starts the next replication
run as soon as the previous one completes. This ensures the minimum possible
gap/data-loss window. Because it is snapshot-based, replication is incremental in nature,
meaning that the initial replication will move the entire data set but subsequent
replication of the same data set will move only changed blocks, resulting in shorter
subsequent replication times. Replication can be invoked at either the project or share
level on the source appliance.
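
Snapshot-based incremental replication amounts to diffing two block maps, as this conceptual sketch shows (a model only, not the appliance's replication engine): the first run sends everything, and each later run sends only changed blocks.

    // Conceptual model of snapshot-based incremental replication (not appliance code).
    function diff(prevSnap, currSnap) {
      const delta = new Map();
      for (const [id, data] of currSnap) {
        if (prevSnap.get(id) !== data) delta.set(id, data);  // new or changed blocks only
      }
      return delta;
    }

    const target = new Map();
    function replicate(prevSnap, currSnap) {
      const delta = prevSnap ? diff(prevSnap, currSnap) : currSnap; // first run: full send
      for (const [id, data] of delta) target.set(id, data);
      return delta.size;
    }

    const snap1 = new Map([['b0', 'A'], ['b1', 'B']]);
    console.log(replicate(null, snap1), 'blocks sent');      // -> 2 (initial full send)
    const snap2 = new Map([['b0', 'A'], ['b1', 'B2'], ['b2', 'C']]);
    console.log(replicate(snap1, snap2), 'blocks sent');     // -> 2 (only the changes)
    // Continuous replication simply starts the next run as soon as this one
    // finishes, minimizing the data-loss window.
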

Other Major Data Services


File Protocols
The Oracle ZFS Storage Appliance supports many common protocols, both file and
block. NFS versions 2, 3, and 4 are supported. Kerberized NFS is also supported. SMB
is supported, along with key SMB features such as opportunistic locks, SMB signing,
and Active Directory. (Note that the Oracle ZFS Storage Appliance supports
membership in only one Active Directory domain or one Kerberos realm; multiple
domain/realm membership is not supported.) A single share
can be accessed via both NFS and SMB simultaneously. An identity management
service is built in to facilitate sharing of identities between Windows (SMB) and UNIX
(NFS) systems so that mixed clients can access the same share simultaneously. FTP is
supported (along with SFTP and TFTP) so that the Oracle ZFS Storage Appliance can
be used as an FTP server. Access to shares is also possible via HTTP or HTTPS as the
appliance implements the WebDAV extension. NIS and LDAP authentication are
supported for NFS, HTTP, and FTP access. A virus scanning option is built in and, when
invoked, performs inline scanning whenever a file is accessed via any of these file
protocols and will quarantine files as necessary. DNS, dynamic routing, IPMP, LACP, and
NIS are all supported.
Shadow Migration
Another feature of the Oracle ZFS Storage Appliance is Shadow Migration. This
service uses the NFS protocol to migrate data from any legacy NFS system to the
Oracle ZFS Storage Appliance in a low-downtime manner. Clients are disconnected
from the legacy filers and then reconnected to the Oracle ZFS Storage Appliance, and
then the Oracle ZFS Storage Appliance is connected as a client to the legacy filer. The
directory structure of the legacy filer is scanned and data begins migrating to the
Oracle ZFS Storage Appliance. The system is online, so if a client requests a file not
yet present on the Oracle ZFS Storage Appliance, it will be retrieved from the legacy
filer, copied to the Oracle ZFS Storage Appliance, and passed on to the client. Other
files migrate as a background activity. While not intended as a fast migration method,
it does provide for extremely low downtime migration.
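
The on-demand behavior can be modeled in a few lines. The sketch below is an illustrative model (the real service interposes over NFS at the directory-tree level): reads of not-yet-migrated files are satisfied from the legacy filer and copied in on the way through, while a background walker migrates the rest.

    // Illustrative model of Shadow Migration's interpose-and-copy behavior.
    const legacyFiler = new Map([['/home/a.txt', 'aaa'], ['/home/b.txt', 'bbb']]);
    const newAppliance = new Map();

    // Client read: serve locally if migrated; otherwise fetch from the legacy
    // filer, keep a copy, and pass the data through to the client.
    function readFile(path) {
      if (!newAppliance.has(path)) {
        newAppliance.set(path, legacyFiler.get(path));  // migrate on first access
      }
      return newAppliance.get(path);
    }

    // Background task walks the legacy namespace and migrates everything else.
    function backgroundMigrate() {
      for (const [path, data] of legacyFiler) {
        if (!newAppliance.has(path)) newAppliance.set(path, data);
      }
    }

    console.log(readFile('/home/a.txt')); // client sees data immediately: low downtime
    backgroundMigrate();                  // remaining files move without client impact
    console.log(newAppliance.size);       // -> 2
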
Block Protocols
Block protocol support is also present in the Oracle ZFS Storage Appliance. LUNs can
be exported via iSCSI or FC. SRP and iSER (via 40 Gb InfiniBand) are also supported.
For iSCSI, both RADIUS and iSNS discovery are supported. The Oracle ZFS Storage
Appliance can serve as an FC target or an FC initiator to facilitate backups. For details
of FC use as a target, please read the Oracle white paper "Understanding the Use of
Fibre Channel in the Oracle ZFS Storage Appliance." This document should be
considered essential reading for anyone considering using the Oracle ZFS Storage
Appliance in FC environments.
NDMP for Backup
NDMP is also supported for backups in DMA environments. NDMP backups can be
produced in dump, tar, or zfs formats. Note that, while zfs NDMP format may provide a
performance benefit, it does not support DAR, so direct file access is not possible with
zfs NDMP format specifically. Dump or tar must be used if DAR support is required.

Management, Analytics, and Diagnostic Tools


The Oracle ZFS Storage Appliance includes an advanced command line interface (CLI)
and browser user interface (BUI). These interfaces contain the same management
options and are designed to mask the complexity of the underlying OS while still
offering a powerful and deep command set. Most common administrative tasks can be
accomplished quickly and easily with just a few commands or mouse clicks. The BUI
also offers a built-in, industry-leading visual analytics package based on DTrace that
runs in the storage controller OS itself. This analytics package gives unparalleled
visibility into the entire storage stack, all the way down to disk and all the way up to
client network interfaces, including cache statistics, CPU metrics, and many other
parameters. It is extremely useful for identifying bottlenecks and tuning the overall
system for optimal performance. It is also very helpful in troubleshooting alongside a
client system administrator, as the information can help the storage administrator
clearly distinguish upstream issues.
This is particularly useful, for example, in large-scale server virtualization
environments where one VM out of thousands is experiencing or causing a performance issue
that needs to be discretely addressed. For further information on analytics, please see
the Analytics Guide on the Oracle ZFS Storage Appliance Product Documentation
website.
The Oracle ZFS Storage Appliance also offers an advanced scripting language,
ECMAScript, which is based on JavaScript. Workflows can be used to store scripts on
the appliance and, unlike many competing frameworks that run on external machines,
the Oracle ZFS Storage Appliance can take inputs from users or other systems when
workflows are executed. Workflows can be invoked via either the BUI or CLI, or by
system alerts and timers. Scripting is a powerful way for administrators to automate
complex but repetitive tasks while allowing for customization. For more information on
scripting, see the Oracle white paper on the topic.
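
As a concrete example, a workflow is an ECMAScript object stored on the appliance. The minimal sketch below follows the documented workflow structure; the specific CLI commands passed to run() are illustrative and should be checked against the scripting documentation for your software release.

    /*
     * Minimal Oracle ZFS Storage Appliance workflow sketch. A workflow is stored
     * on the appliance and can take typed parameters supplied at execution time
     * from the BUI, the CLI, or an alert/timer. The parameter names and the CLI
     * commands passed to run() below are illustrative.
     */
    var workflow = {
      name: 'Create project share',
      description: 'Creates a filesystem share under a chosen project',
      parameters: {
        projName: { label: 'Project name', type: 'String' },
        shareName: { label: 'Share name', type: 'String' }
      },
      execute: function (params) {
        run('shares select ' + params.projName);   // run() issues CLI commands
        run('filesystem ' + params.shareName);
        run('commit');
        return 'Created ' + params.projName + '/' + params.shareName;
      }
    };
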

Conclusion
The Oracle ZFS Storage Appliance is designed to extract maximum storage
performance from standard enterprise-grade hardware while providing robust data
protection, management simplicity, and compelling economics. The unique
architecture, based upon Hybrid Storage Pool, and the wide variety of advanced data
services make the Oracle ZFS Storage Appliance an excellent choice for a wide variety
of enterprise storage workloads that demand high performance.

Related Links
Main Oracle ZFS Storage Appliance website: http://www.oracle.com/zfsstorage

Oracle Technology Network Oracle ZFS Storage Appliance page:
http://www.oracle.com/technetwork/server-storage/sun-unified-storage/overview/index.html

Product Data Sheet:
http://www.oracle.com/us/products/servers-storage/storage/nas/zs3-ds-2008069.pdf

Business Value White Paper:
http://www.oracle.com/us/products/servers-storage/storage/nas/resources/zfs-sa-businessvaluewp-final-1845658.pdf

Analyst White Paper on Oracle Integration:
http://www.oracle.com/us/products/servers-storage/storage/nas/esg-brief-analyst-paper-2008430.pdf

Architectural Overview of the Oracle ZFS Storage Appliance, January 2014
Author: Bryce Cracco, Product Manager
Oracle Corporation World Headquarters, 500 Oracle Parkway, Redwood Shores, CA 94065, U.S.A.
Worldwide Inquiries: Phone: +1.650.506.7000, Fax: +1.650.506.7200
oracle.com

Copyright © 2014, Oracle and/or its affiliates. All rights reserved. This document is provided
for information purposes only, and the contents hereof are subject to change without notice.
This document is not warranted to be error-free, nor subject to any other warranties or
conditions, whether expressed orally or implied in law, including implied warranties and
conditions of merchantability or fitness for a particular purpose. We specifically disclaim
any liability with respect to this document, and no contractual obligations are formed either
directly or indirectly by this document. This document may not be reproduced or transmitted
in any form or by any means, electronic or mechanical, for any purpose, without our prior
written permission.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be
trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered
trademarks of Intel Corporation. All SPARC trademarks are used under license and are
trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo,
and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices.
UNIX is a registered trademark of The Open Group. 0113
