New HP 3PAR Thin Technologies
Table of contents
Introduction
Overview of HP 3PAR Thin Technologies for data compaction
Product highlights
HP 3PAR Thin Provisioning software
HP 3PAR Thin Deduplication software
HP 3PAR Thin Clones software
HP 3PAR Thin Persistence software
HP 3PAR Thin Conversion software
HP 3PAR Thin Copy Reclamation software
HP 3PAR ASIC with Thin Built In™
The benefits of Thin Provisioning
Avoid frequent storage capacity addition to servers
Accelerate time to market
Chargeback model in utility storage
Thin Deduplication
Using fully provisioned volumes
System Requirements
HP 3PAR Volume Manager
Physical disks
Logical disks
Common provisioning groups
Virtual volumes
The anatomy of a Thin Volume
Thin Volume metadata
Thin Deduplication implementation
Express Indexing
Common provisioning groups
CPGs and workloads
Availability level
Reasons to create multiple CPGs
CPG automatic growth
Introduction
Compaction technologies such as thin provisioning, thin deduplication, and thin reclamation offer efficiency benefits for primary storage that can significantly reduce both capital and operational costs. Thin provisioning has achieved widespread adoption because it dramatically increases capacity efficiency; it has become a data center “must have” for its ability to break the connection between logical and physical capacity. Deduplication is also an essential consideration when deploying workloads onto a flash tier or an all-flash array. Thin technologies vary widely in how they are implemented: some are complex to deploy, while others use coarse allocation units and cannot deliver the required space savings. Not only is HP 3PAR StoreServ Storage viewed as the industry’s thin technology leader, but third-party testing and competitive analysis confirm that HP 3PAR StoreServ offers the most comprehensive and efficient thin technologies among the major enterprise storage platforms 1. HP 3PAR Thin Technologies, including HP 3PAR Thin Deduplication, Thin Provisioning, Thin Conversion, Thin Persistence, and Thin Copy Reclamation, achieve advanced data compaction by leveraging built-in hardware capabilities and Express Indexing technology.
Thin provisioning allows a volume to be created and made available as a logical unit number (LUN) to a host without the
need to dedicate physical storage until it is actually needed. HP 3PAR Thin Provisioning software has long been considered
the gold standard in thin provisioning for its simplicity and efficiency. Unlike other “bolt-on” implementations, HP 3PAR Thin
Provisioning software is simple and efficient, helps your organization start new projects more quickly and on demand, and
saves millions of dollars. HP 3PAR Thin Provisioning leverages the dedicate-on-write approach of HP 3PAR StoreServ
Storage, allowing enterprises like yours to purchase only the disk capacity you actually need. HP 3PAR Thin Provisioning
integrates seamlessly with VMware vSphere, Windows® Server 2012, Red Hat® Enterprise Linux (RHEL), and Symantec
Storage Foundation—greatly enhancing the operational and administrative efficiency of these platforms.
While HP 3PAR Thin Technologies software is extremely simple to deploy and use, a certain amount of planning is
advantageous to help maximize its benefits. This paper documents best practices for thin provisioning on HP 3PAR
StoreServ Storage and is intended for administrators looking to get the most out of their HP 3PAR StoreServ deployment. In
addition, it describes other HP 3PAR Thin Technologies that you can use in conjunction with HP 3PAR Thin Provisioning
software to help maximize its effectiveness. Unique to HP 3PAR StoreServ, HP 3PAR Thin Conversion software enables you
to reduce capacity requirements by 50 percent or more by deploying HP 3PAR StoreServ in place of legacy storage 2.
HP 3PAR Thin Persistence software and other thin-reclamation solutions enable thin-provisioned storage on HP 3PAR
StoreServ arrays to stay thin over time by helping ensure that unused capacity is reclaimed for use by the array on an
ongoing basis.
Now, HP 3PAR Thin Deduplication and related HP 3PAR Thin Clones software take thin efficiency to the next level. In
addition, HP 3PAR Thin Technologies protect SSD performance and extend flash-based media life span while ensuring
resiliency.
1. HP Thin Technologies: A Competitive Comparison, Edison Group 2012. h20195.www2.hp.com/V2/GetDocument.aspx?docname=4AA4-4079ENW&cc=us&lc=en
2. Based on documented client results that are subject to unique business conditions, client IT environment, HP products deployed, and other factors. These results may not be typical; your results may vary.
With HP 3PAR Thin Conversion, you can quickly shrink your storage footprint, reduce storage TCO, and meet your green IT
targets. HP 3PAR Thin Conversion software makes this possible by leveraging the zero-detection and in-line deduplication
capabilities within the HP 3PAR ASIC and HP 3PAR Thin Engine (a unique virtualization mapping engine for space
reclamation) to power the simple and rapid conversion of inefficient, “fat” volumes on legacy arrays to more efficient,
higher-utilization “thin” volumes. Getting thin has never been so easy.
HP 3PAR Thin Conversion software is an optional feature that converts a fully provisioned volume to a thin provisioned or
thin deduplication volume during data migration. Virtual volumes (VVs) with large amounts of allocated but unused space
are converted to thin volumes that are much smaller than the original volume. During the conversion process, allocated but
unused space is discarded and the result is a volume that uses less space than the original volume.
When planning to collect charging data, it is recommended that the names chosen for objects like CPGs, VVs, snapshots, and
domains contain a meaningful prefix or suffix referring to the project, application, line of business (LOB), or department the
objects belong to. This enables the grouping of objects in the chargeback report. The HP 3PAR OS allows up to 31 characters
for the name of an object.
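As an illustration (the prefix and volume names are hypothetical), giving every volume of a project a common prefix allows a wildcard in reporting commands such as showvv to group them for the chargeback report:

cli% showvv -space ORA_CRM_*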
Thin Deduplication
Deduplication has become standard with disk-based backup: because backup and archive workloads have a high degree of data redundancy and place less emphasis on high performance, they have been an ideal target for deduplication technologies. Traditional primary storage workloads such as OLTP have lower data redundancy and hence lower deduplication ratios, so deduplication of primary storage has not been seen as beneficial. However, the landscape around primary storage deduplication is changing. The high data redundancy of server virtual machine (VM) images and of client virtualization environments with hosted virtual desktops gives these workloads tremendous potential for benefiting from deduplication. Home and file directory consolidation is another area where primary storage deduplication can offer significant space savings.
With the increasing use of SSD storage, deduplication for primary storage arrays has become critical. The cost differential
between SSDs and hard disk drives (HDDs) requires compaction technologies like thin provisioning and deduplication to
make flash-based media more cost-efficient. The widespread deployment of server virtualization is also driving the demand
for primary storage deduplication.
The following should be taken into account before implementing Thin Deduplication:
• It is only applicable to Virtual Volumes residing solely on SSD storage. Any system with an SSD tier can take advantage of thin deduplication. Because thinly deduplicated volumes can only reside on SSD storage, they are not compatible with the sub-LUN tiering of Adaptive Optimization (AO). If a thinly deduped volume exists within a Common Provisioning Group (CPG), then the CPG is not available for use in an AO configuration. Conversely, if a CPG is already in an AO configuration, then it is not possible to create a thinly deduped volume in the CPG.
• The granularity of deduplication is 16 KiB and therefore the efficiency is greatest when I/Os are aligned to this granularity. For hosts that use file systems with tunable allocation units, consider setting the allocation unit to 16 KiB or a multiple of 16 KiB. With Microsoft Windows hosts that use NTFS, the allocation unit can be set in the format dialog box 3. For applications that have tunable block sizes, consider setting the block size to 16 KiB or a multiple of 16 KiB.
• Deduplication is performed on the data contained within the Virtual Volumes of a CPG. For maximum deduplication store
data with duplicate affinity on Virtual Volumes within the same CPG.
• Thin Deduplication is ideal for data that has a high level of redundancy. Data that has previously been deduplicated, compressed, or encrypted is not a good candidate for deduplication and should be stored on thin provisioned volumes.
3. For Microsoft Windows PowerShell Format-Volume cmdlet syntax see technet.microsoft.com/en-us/library/hh848665.aspx
If a significant part of the array will be utilized for thin volumes, it is advised to use thin provisioning for all volumes to help minimize management overhead and maximize space efficiency.
System Requirements
HP 3PAR Thin Technologies are included as part of the HP 3PAR OS Software Suite and are available on all HP 3PAR
StoreServ models.
The functionality offered by each system license is summarized in table 1.
Table 1. License functionality

License                  Functionality
Thin Conversion          Zero detection to prevent allocation of space and enables T10 UNMAP support
Thin Copy Reclamation    Reclaiming unused space resulting from deleted virtual copy snapshots and remote copy volumes
Physical disks
Every physical disk (PD) that is admitted into the system is divided into 1 GB chunklets. A chunklet is the most basic element
of data storage of the HP 3PAR StoreServ. These chunklets form the basis of the RAID sets; depending on the sparing
algorithm and system configuration, some chunklets are allocated as spares.
Logical disks
The logical disk (LD) layer is where the RAID functionality occurs. Multiple chunklet RAID sets, typically from different PDs,
are striped together to form a LD. All chunklets belonging to a given LD will be from the same drive type. LDs can consist of
all Nearline (NL), Fibre Channel (FC), or solid-state drive (SSD) type chunklets.
There are three types of logical disk:
1. User (USR) logical disks provide user storage space to fully provisioned virtual volumes.
2. Shared data (SD) logical disks provide the storage space for snapshots, virtual copies, thin provisioned virtual volumes
(TPVV) or thinly deduplicated virtual volumes (TDVV).
3. Shared administration (SA) logical disks provide the storage space for snapshot and TPVV/TDVV administration. They contain the bitmaps pointing to which pages of which SD LD are in use.

The LDs are divided into “regions,” which are contiguous 128 MiB blocks. The space for the virtual volumes is allocated across these regions.
Virtual volumes
The top layer is the virtual volume (VV). VVs draw their resources from CPGs, and are the only data layer visible to hosts
when they are exported as virtual logical unit numbers (VLUNs).
A VV is classified by its provisioning type, which can be one of the following:
• full—Fully provisioned VV, either with no snapshot space or with statically allocated snapshot space.
• tpvv—Thin Provisioned VV, with space for the base volume allocated from the associated user CPG and snapshot space
allocated from the associated snapshot CPG (if any).
• tdvv—Thin Deduplicated VV, with space for the base volume allocated from the associated user CPG and snapshot space
allocated from the associated snapshot CPG (if any).
• cpvv—Commonly provisioned VV. The space for this VV is fully provisioned from the associated user CPG and the
snapshot space allocated from the associated snapshot CPG.
On creation of a thin volume (TPVV or TDVV), the size of the VV is specified, but no storage is allocated. Storage is allocated on demand in the shared data area as required by the host operation being performed. The shared admin area contains the metadata indexes that point to the user data in the SD area. Since the SA metadata needs to be accessed to locate the user data, the indexes are cached in memory to help minimize the performance impact of the lookups.
Thin volumes associated with the same CPG share the same LDs and draw space from that pool as needed, allocating space on demand in small increments for each controller node. As the volumes that draw space from the CPG require additional storage, the HP 3PAR OS automatically creates additional logical disk storage until either all available storage is consumed or, if specified, the CPG reaches the user-defined growth limit, which restricts the CPG’s maximum size. The size limit for an individual Virtual Volume is 16 TiB.
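As a sketch of how thin and thinly deduplicated volumes are typically created from a CPG (the CPG names, volume names, and sizes are illustrative, and the createvv options shown should be verified against the CLI reference for your HP 3PAR OS version):

cli% createvv -tpvv FC_r5 vol_tp 500g
cli% createvv -tdvv SSD_r5 vol_dedup 500g

No physical storage is allocated at creation time; space is drawn from the CPG on demand as the host writes data.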
The relationship between the HP 3PAR OS abstraction layers is illustrated in figure 1.
Figure 1. Overview of thin virtual volumes
As data is written to thin volumes, SD space usage will increase (more LDs may be added or the existing LDs expanded), but the amount of SA space will not expand until the initial 8 GB allocation is used. The CPG will then grow the SA space based on the growth parameters that have been set.
On a small system with a few thin volumes, the SA space may be as much as 10 percent of the SD space, but on a medium to large system, the SA space used is typically only 1 percent of the SD space.
Express Indexing
HP 3PAR Thin Deduplication uses an advanced metadata lookup mechanism called Express Indexing, which is unique in that it combines the built-in hash generation capability of the HP 3PAR ASIC with the HP 3PAR Thin Provisioning metadata lookup tables for extremely fast hash comparisons.
Once the hash signature of the incoming data has been generated, there must be a check to see if data with the same
signature already exists. This is typically a CPU and memory intensive operation involving search mechanisms on large pools
of reserved memory containing the signatures of existing data. The HP 3PAR OS instead uses a technique called Express
Indexing to detect duplicate page data. This process takes advantage of the highly optimized and robust address space to
physical storage page indexing mechanism of thin provisioned volumes.
When a new write I/O request comes in, the Logical Block Address (LBA) is used as an index into the page tables as for a regular TPVV, but instead of allocating a new page, the hash signature of the incoming data page is computed by the HP 3PAR ASIC and compared to the signatures of the data already stored in the CPG. If a match is found, then the existing block is compared at a bit level with the new block by the ASIC. A successful comparison is a “dedupe hit”, in which case the virtual volume pointers are updated to reference the existing data. In the unlikely event a hash collision is detected, the data is stored in the virtual volume and not treated as a duplicate. If the hash of the new data was not found by the lookup, a new page is allocated in the Dedup Store (DDS). The value of the hash is used as the offset of the page in the DDS; therefore a simple page translation will turn the hash into a physical page location.
When a read is performed the page translation of the TDVV LBA can point either to the local store or to the DDS volume.
Availability level
To provide HA, chunklets from the same RAID set should be distributed across multiple components. 4
There are three levels of availability that can be selected with HP 3PAR StoreServ.
• HA CAGE means that no two members of the same RAID set can be in the same drive enclosure. For example, to support
RAID 5 3+1 (set size four), four drive chassis connected to the same node pair are required. This helps ensure that data is
still available in the event that access to an entire drive cage is lost. This applies to drive chassis that are point-to-point
connected to the nodes (no daisy chain).
• HA MAG means that no two members of the same RAID set are in the same drive magazine. This allows a wider stripe
with fewer drive chassis; for example, a RAID 5 stripe size of 7+1 (set size eight) would be possible with only four drive
chassis, provided each chassis had at least two drive magazines.
• HA PORT applies only to daisy-chained drive chassis. When this level of availability is selected, no two members of the
same RAID set can be in drive chassis that are dependent on one another for node connectivity. For example, in a system
in which there are eight drive chassis with four of the drive chassis connected to another drive chassis for node access, HA
PORT would only allow RAID 5 3+1 (set size four) in order to prevent the loss of one drive chassis from causing a loss of
data access. On systems that do not have daisy-chained cages, such as the HP 3PAR StoreServ 10000, setting HA PORT is
the same as setting HA CAGE.
4. It is important to understand that drive magazines consist of four drives for HP 3PAR StoreServ 10000. Drive magazines consist of only a single drive in the HP 3PAR StoreServ 7000 and 7450 Storage systems.
• There can be multiple deduplication CPGs in the system, each deduplication CPG providing storage for different sets of
volumes. This allows volumes with similar datasets to be grouped together and facilitates deduplication at the virtual
domain level in a multi-tenancy environment.
• When virtual domains are used, because a CPG can only belong to one virtual domain.
• When HP 3PAR Adaptive Optimization (AO) software is used, because a CPG can only belong to one Adaptive Optimization
policy.
• When using thin deduplication there is a limit of 256 TDVVs per CPG.
While there are several reasons to create multiple CPGs, it is recommended that the number of CPGs be kept low as each
CPG will reserve its own growth space.
Table 2. Default and limits for the growth increment per node pair

Number of nodes   Default growth increment (GB)   Minimum growth increment (GB)   Maximum growth increment (GB)
2                 32                              8                               2,047.75
4                 64                              16                              2,047.75
6                 96                              24                              2,047.75
8                 128                             32                              2,047.75
Sizing guidelines
What is the ideal size for a Thin Volume? There is no definite answer to this, but you should consider the following when
deciding on the size for thin volumes:
• The minimum size for a Thin Volume is 256 MB; the maximum size is 16 TB for all HP 3PAR OS versions on all types of HP
3PAR StoreServ systems.
• Thin Provisioning demonstrates the highest value in situations involving large-scale consolidation. For small virtual volumes (256 MB up to a few tens of GB), the growth increment of the CPG of the TPVV may be many times larger than the TPVV size, which means that minimal benefit is realized after one growth increment is applied.
• It is possible to increase the size of a thin volume. This provides a significant amount of flexibility. However, the impact of growing volumes on the host OS needs to be considered. It is not possible to shrink a TPVV.
• When faced with a choice, it is preferable to make volumes larger than needed over making them too small. If the
volumes that are presented are too small for the ongoing requirements, then TPVV growth or additional volumes will be
required in the future, which is something that needs to be managed.
The HP 3PAR OS provides several mechanisms for monitoring and controlling capacity consumption. These include:
• Allocation warnings and limits for TPVVs
• Growth warnings and limits for CPGs
• Used physical capacity alerts
• Free physical capacity alerts
Allocation warnings
Allocation warnings provide a mechanism for informing storage administrators when a specific capacity threshold is
reached. An allocation warning can be specified independently for each TV and each CPG. It is recommended that allocation
warnings be used, at least on the CPG level, and acted upon when they are triggered.
The relevant CLI commands for setting allocation and growth warnings are:
• setvv -usr_aw <percent> <TV>: sets the allocation warning for the user space of the TV as a percentage of the TV size
• setvv -snp_aw <percent> <TV>: sets the allocation warning for the snapshot space of the TV as a percentage of the TV size
• setcpg -sdgw <num> <CPG>: sets the growth warning for the CPG in MB (append to the value num “g” or “G” for GB, or “t” or “T” for TB)
These warnings can be changed at any time and are effective immediately. The CLI commands showvv -alert and showcpg -alert list the allocation warnings that were set per TV and CPG.
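For example, the following commands (with illustrative volume and CPG names and thresholds) set a user-space allocation warning at 75 percent of the volume size and a CPG growth warning at 500 GB, then list the warnings that are in effect:

cli% setvv -usr_aw 75 vv1
cli% setcpg -sdgw 500g CPG_FC
cli% showvv -alert
cli% showcpg -alert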
Allocation limits
Applications sometimes get into an abnormal state, writing data continuously to the storage device. Allocation limits provide
a mechanism to prevent such “runaway” applications from consuming disk capacity beyond a specified threshold. Allocation
limits can be specified independently for each TV and each CPG. For a TV, after the allocation limit is reached, the capacity
allocated to the TV stops growing and new writes by the application fail. Similarly, for a CPG, after the allocation limit is
reached, the automatic creation of new LDs, if configured, is disabled.
The relevant CLI commands related to setting allocation and growth limits are:
• setvv -usr_al <percent> <TV>: sets the allocation limit for the user space of the TV as a percentage of the TV size
• setvv -snp_al <percent> <TV>: sets the allocation limit for the snapshot space of the TV as a percentage of the TV size
• setcpg -sdgl <num> <CPG>: sets the growth limit of the CPG in MB (append to the value num “g” or “G” for GB, or “t” or “T” for TB)
These limits can be changed at any time and are effective immediately. The CLI commands showvv -alert and showcpg -alert list the allocation limits that were set per TV and CPG.
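Continuing the illustrative example above, the limits can be set above the corresponding warnings so that the warnings trigger well before writes can fail:

cli% setvv -usr_al 90 vv1
cli% setcpg -sdgl 1t CPG_FC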
The VV allocation limits and warnings can be set with the HP 3PAR Management Console by selecting Show advanced
options checkbox when creating or editing a VV as shown in figure 5.
Figure 5. VV allocation limit and warning options
The CPG growth limits and warnings can be set with the HP 3PAR Management Console by selecting Show advanced
options checkbox when creating or editing a CPG as shown in figure 6.
Figure 6. CPG allocation limit and warning options
It is important to note that the growth limit for a CPG is a hard limit and the CPG will not grow beyond it. Once the CPG hard limit has been reached, any VVs that require more space will not be able to grow. This will result in write errors to host systems until the CPG growth limit is raised. Therefore, it is recommended that TV, CPG, and free space warnings and limits are set to sensible levels and managed when they are triggered. As an example, the CPG growth warning should be set sufficiently below the CPG growth limit so that it alerts the storage administrator with ample time to react before the growth limit is reached.
The used and free physical capacity alerts serve as array-wide advance warnings to the storage administrator to plan for and add necessary physical capacity. The alerts generated should be monitored and promptly acted upon to prevent all free space of a particular drive type from being consumed.
Alerts can be forwarded (setsys RemoteSyslogHost) to a log host for viewing in an enterprise management application.
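A minimal sketch, assuming 192.0.2.10 is the address of the log host (the RemoteSyslogHost parameter name is taken from the text above; the companion RemoteSyslog enable parameter is an assumption that should be verified against the CLI reference for your HP 3PAR OS version):

cli% setsys RemoteSyslog 1
cli% setsys RemoteSyslogHost 192.0.2.10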
User recommendations
The monitoring of alerts for available capacity by storage administrators and internal business processes is a critical component of a successful HP 3PAR Thin Provisioning management and administration strategy. You should nominate a primary and, if possible, a backup storage administrator for each site with HP 3PAR StoreServ equipment. The storage administrator’s roles include:
• Proactively monitor free space availability per TV and CPG.
• Proactively monitor consumption rates for TVs and CPGs.
• Proactively monitor consumed TV capacity and compare to licensed thin provisioning capacity.
• Proactively monitor physical capacity thresholds for each disk type and for the entire array.
• Ensure adequate purchasing and installation of additional physical disk capacity buffer and thin-provisioning license
upgrades in a timely manner.
• Nominate an escalation contact who has proper authority to drive the customer responsibilities outlined in this document
if the nominated storage administrators fail to carry out their responsibilities.
If you have a network connection with HP 3PAR Central via the Service Processor, the health of the HP 3PAR StoreServ can
be proactively monitored for CPG growth problems, and you can request to receive thin provisioning and other alerts by mail
or via phone. You retain responsibility for managing the thin-provisioning capacity and CPGs; HP is not responsible for any
failure when thresholds are met or exceeded.
Capacity Efficiency
HP 3PAR OS 3.2.1 MU1 introduces two metrics to measure the capacity efficiency of the HP 3PAR Thin Technologies: the compaction ratio and the dedup ratio. The compaction ratio is how much physical storage space a volume consumes compared to its virtual size and applies to both thin provisioned and thinly deduped volumes. The dedup ratio is how much physical storage space would have been used without deduplication, compared to the actual storage space used by a thinly deduped volume. The ratios are shown as decimals with an implied “:1”; i.e., 4.00 is actually 4:1 (4 to 1). The dedup ratio does not include savings from inline zero-detection.
These capacity efficiencies can be shown per volume, volume family, CPG, virtual domain, or system and are in terms of usable storage (i.e., not including RAID overhead). The efficiencies displayed depend on the scope of the request. For example, it is likely that the dedup ratio of a CPG will be higher than that of the individual TDVVs because the data in the Dedup Store can be shared by multiple TDVVs. Note that the capacity efficiency ratios are calculated every 5 minutes and therefore changes to the data may not be immediately reflected in the ratios displayed.
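As a quick worked illustration of the notation (round numbers, not measured results): a thin volume with a virtual size of 100 GB that currently consumes about 10 GB of physical space (admin plus user data) has a compaction ratio of 100 / 10 = 10.0, which reads as 10:1.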
Base Volumes
The capacity efficiencies of a base volume are shown by the showvv -space command and are calculated as follows:
• The compaction ratio of a TPVV is the virtual size of the volume divided by the sum of its used admin space and its used
data space.
• The compaction ratio of a TDVV is the virtual size of the volume divided by the sum of its used admin space, its used data
space and its used Dedup Store space.
• The dedup ratio of a TDVV is the size of the data written to the TDVV divided by the sum of the data stored in the TDVV
and the data associated with the TDVV in the Dedup Store.
In the following example, vv1 is a TPVV that has a virtual size of 100 GB and has 10 GB of data written to it, vv2 is a TDVV that has a virtual size of 100 GB and has 10 GB of non-deduplicable data written to it, and vv3 is a TDVV that has a virtual size of 100 GB and has two copies of the vv2 data written to it.
cli % showvv -space vv1 vv2 vv3
---Adm--- ---------Snp---------- ----------Usr-----------
--(MB)--- --(MB)--- -(% VSize)-- ---(MB)---- -(% VSize)-- -----(MB)------ -Capacity Efficiency-
Id Name Prov Type Rsvd Used Rsvd Used Used Wrn Lim Rsvd Used Used Wrn Lim Tot_Rsvd VSize Compaction Dedup
383 vv1 tpvv base 256 10 0 0 0.0 -- -- 24576 10240 10.0 0 0 24832 102400 10.0 --
380 vv2 tdvv base 256 8 0 0 0.0 -- -- 8704 1 0.0 0 0 8960 102400 10.0 1.0
382 vv3 tdvv base 256 13 0 0 0.0 -- -- 8704 2 0.0 0 0 8960 102400 10.0 2.0
------------------------------------------------------------------------------------------------------------------
3 total 768 31 0 0 41984 10243 33024 307200 10.0 1.5
Note that vv2 and vv3 have very low counts in the Usr Used column as the majority of their data resides in the Dedup Store. Only the data from blocks that had hash collisions has been stored in the TDVVs themselves. The capacity efficiencies in the total row are averages of the volumes displayed.
The base volume savings can also be seen in the HP 3PAR Management Console by selecting the particular VV. Figure 7
shows the space savings displayed by the Management Console for vv3.
Figure 7. Virtual Volume space savings
Volume Families
If a volume has snapshots, then the showvv command will display the individual snapshots below their parent volumes, but the capacity efficiency displayed is that of the entire volume family because all the snaps of a base volume share the same snapshot area. If a block changes on a base volume, the old data is copied to the snapshot area for that volume and all snaps of the volume point to that single block. This allows a volume to have hundreds of snaps without requiring additional space or incurring additional performance impact.
The capacity efficiencies of a volume family are shown by the showvv -space command and are calculated as follows:
• The compaction ratio of a TPVV volume family is the sum of the virtual size of the volume and the virtual volume sizes of
all its snapshots divided by the sum of the used admin space and the used data space.
• The compaction ratio of a TDVV volume family is the sum of the virtual size of the volume and the virtual volume sizes of
all its snapshots divided by the sum of the used admin space, the used data space and the used Dedup Store space of the
volume and its snapshots.
• The dedup ratio of a TDVV volume family is the sum of the data written to the TDVV and all its snapshots divided by the sum of the data stored in the TDVV and the Dedup Store for the volume and its snapshots.
In this example, snapshots were created from the previous VVs, 1 GB of data was changed on vv1, and 100 MB of data was changed on vv3. The result is that the compaction ratio for the vv2 volume family is twice that of the vv2 base volume, while the compaction ratios of the other volume families have increased but not by as much because of the changes to the data.
cli% showvv -space vv*
---Adm--- ---------Snp---------- ----------Usr-----------
--(MB)--- --(MB)--- -(% VSize)-- ---(MB)---- -(% VSize)-- -----(MB)------ -Capacity Efficiency-
Id Name Prov Type Rsvd Used Rsvd Used Used Wrn Lim Rsvd Used Used Wrn Lim Tot_Rsvd VSize Compaction Dedup
395 vv1 tpvv base 256 12 8704 1000 1.0 0 0 24576 10240 10.0 0 0 33536 102400 18.2 --
400 vv1_snp snp vcopy -- *0 -- *0 *0.0 0 0 -- -- -- -- -- -- 102400 -- --
397 vv2 tdvv base 256 9 512 0 0.0 0 0 8704 1 0.0 0 0 9472 102400 20.0 1.0
402 vv2_snp snp vcopy -- *0 -- *0 *0.0 0 0 -- -- -- -- -- -- 102400 -- --
398 vv3 tdvv base 256 14 512 0 0.0 0 0 8704 2 0.0 0 0 9472 102400 19.8 2.0
404 vv3_snp snp vcopy -- *0 -- *0 *0.0 0 0 -- -- -- -- -- -- 102400 -- --
-----------------------------------------------------------------------------------------------------------------------
6 total 768 35 9728 1000 41984 10243 52480 614400 19.3 1.5
Note that the space usage columns for the snapshots contain “--” as the space usage of the snapshots is maintained in the base volume.
CPGs
The capacity efficiencies of a CPG are shown by the showcpg -space command and are calculated as follows:
• The compaction ratio is the sum of the virtual sizes of all the volumes and snapshots in the CPG (CPVV, TPVV and TDVV) divided by the sum of their in-use admin space, data space, snapshot space and used Dedup Store space.
• The dedup ratio is the sum of all the data written to the TDVVs and TDVV snapshots of a CPG divided by the size of the Dedup Store of the CPG (metadata and data).
In this example the TPVV_CPG CPG contains only TPVVs so it only has a compaction ratio, whereas the TDVV_CPG CPG
contains TDVVs so it also has a dedup ratio.
cli% showcpg -space
---------------(MB)---------------
--- Usr --- -- Snp --- --- Adm --- - Capacity Efficiency -
Id Name Warn% Total Used Total Used Total Used Compaction Dedup
0 TPVV_CPG - 16896 16896 15872 0 8192 256 10.0 -
4 TDVV_CPG - 34304 34304 31232 0 24576 10496 10.7 1.6
----------------------------------------------------------------------------
2 total 51200 51200 47104 0 32768 10752 10.4 1.6
Note that unlike the ratios of the virtual volumes, the CPG calculations include the space consumed by the metadata (admin
space) associated with the virtual volumes and the Dedup Store. For small data sets the CPG values may appear lower as a
result.
Virtual Domains
The capacity efficiencies of a virtual domain are shown by the showsys -domainspace command and are calculated as
follows:
• The compaction ratio is the sum of the virtual sizes of all the volumes in the virtual domain (CPVV, TPVV and TDVV)
divided by the sum of the used admin space, the used data space and the used snapshot space of all CPGs in the virtual
domain.
• The dedup ratio is the sum of all the data written to the TDVVs and TDVV snapshots in the virtual domain divided by the
sum of the space used by the TDVVs, the TDVV snapshots and the Dedup Stores in the virtual domain.
In this example there is an additional dom1 virtual domain which contains just TPVVs.
cli% showsys -domainspace
--------------CPG(MB)---------------
-Non-CPG(MB)- -----Usr----- ----Snp---- ----Adm----- -----(MB)------ -Capacity Efficiency-
Domain Usr Snp Adm Total Used Total Used Total Used Unmapped Total Compaction Dedup
- 0 0 0 0 0 0 0 0 0 0 0 0.0 -
dom0 0 0 0 102400 102400 94208 1024 98304 32256 0 294912 10.5 1.6
dom1 0 0 0 33792 33792 31744 0 24576 768 0 90112 10.0 -
-------------------------------------------------------------------------------------------------
0 0 0 136192 136192 125952 1024 122880 33024 0 385024 10.3 1.6
System
The capacity efficiencies of the system are shown by the showsys -space command and are calculated as follows:
• The compaction ratio is the sum of the virtual sizes of all the volumes in the system (CPVV, TPVV and TDVV) divided by
the sum of the used admin space, the used data space and the used snapshot space of all CPGs.
• The dedup ratio is the sum of all the data written to the TDVVs and TDVV snapshots in the system divided by the sum of the space used by all the TDVVs, the TDVV snapshots and the Dedup Stores.
This example shows how the system wide capacity efficiencies are displayed.
cli% showsys -space
------------- System Capacity (MB) -------------
Total Capacity : 20054016
Allocated : 1542144
Volumes : 385024
Non-CPGs : 0
User : 0
Snapshot : 0
Admin : 0
CPGs (TPVVs & TDVVs & CPVVs) : 385024
User : 136192
Used : 136192
Unused : 0
Snapshot : 125952
Used : 1024
Unused : 124928
Admin : 122880
Used : 33024
Unused : 89856
Unmapped : 0
System : 1157120
Internal : 321536
Spare : 835584
Used : 0
Unused : 835584
Free : 18511872
Initialized : 18511872
Uninitialized : 0
Unavailable : 0
Failed : 0
------------- Capacity Efficiency --------------
Compaction : 10.3
Dedup : 1.6
The ratios correspond to the following space savings:

Ratio    Space saved
1:1      0%
1.5:1    33%
2:1      50%
4:1      75%
10:1     90%
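These percentages follow directly from the ratio: space saved = 1 - 1/ratio. For example, a 4:1 ratio gives 1 - 1/4 = 75 percent less physical space consumed, and a 10:1 ratio gives 90 percent less.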
TPVVs
Use the showvv -s -p -prov tpvv command to see how much admin, user, and snapshot space is used by each TPVV. The reserved totals show how much space has been allocated, whereas the used totals show how much of the space is currently in use by the VV. A significant difference between the space in use and the reserved space would indicate that space reclaim has been initiated on the VV, and the reserved space will decrease over time as the space is reclaimed in the background. This is an example of the showvv -s output:
cli% showvv -s -p -prov tpvv
---Adm--- ---------Snp---------- -----------Usr------------
--(MB)--- --(MB)--- -(% VSize)-- ----(MB)----- -(% VSize)-- -----(MB)------ -Capacity Efficiency-
Id Name Prov Type Rsvd Used Rsvd Used Used Wrn Lim Rsvd Used Used Wrn Lim Tot_Rsvd VSize Compaction Dedup
370 vv1 tpvv base 256 16 0 0 0.0 -- -- 25088 21582 21.1 0 0 25344 102400 4.7 --
371 vv2 tpvv base 256 40 0 0 0.0 -- -- 66048 61642 60.2 0 0 66304 102400 1.7 --
372 vv3 tpvv base 256 47 0 0 0.0 -- -- 74240 73378 71.7 0 0 74496 102400 1.4 --
373 vv4 tpvv base 256 28 0 0 0.0 -- -- 47616 41062 40.1 0 0 47872 102400 2.5 --
---------------------------------------------------------------------------------------------------------------------
4 total 1024 131 0 0 212992 197664 214016 409600 2.1 --
The same information is displayed in the HP 3PAR Management Console when viewing the VV provisioning as shown in
figure 8.
Figure 8. VV allocation sizes
CPGs
Space in use on the array can be tracked per CPG. The showcpg -r command shows the user, snapshot, and admin space in Used and Raw Used amounts. You can work out the unallocated space within the CPGs by subtracting the used space from the Totals listed.
In addition to showing the CPG usage, the showspace -cpg command will also show how much LD space may still be created, given the number of free chunklets in the system and the CPG parameters (e.g., RAID level, HA level, device types, etc.).
cli% showspace -cpg *
------------------------------(MB)----------------------------
CPG -----EstFree------- --------Usr------ -----Snp---- ----Adm---- -Capacity Efficiency-
Name RawFree LDFree Total Used Total Used Total Used Compaction Dedup
TPVV_CPG 18499584 9249792 16896 16896 15872 512 8192 256 10.0 -
TDVV_CPG 18499584 9249792 34304 34304 31232 0 24576 10496 10.7 1.6
FC_r5 176078848 154068992 21951616 21951616 104320 3072 32768 8480 13.1 -
NL_r6 54263808 40697856 27969280 27969280 96512 69632 23552 21248 8.3 -
System space
Not all space on the physical disks is used for storing your data. A small portion of the space on the array is dedicated to
volumes with an administrative function.
There is a fully provisioned volume named admin that is used to store system administrative data such as the System
Event Log. The logging LDs, starting with the name log, are used to store data temporarily during physical disk failures and
disk replacement procedures. There are also preserved data space logical disks (PDSLDs) which start with the name pdsld.
Preserved data is the data moved from the system’s cache memory to the PDSLD space in the event of multiple disk
or cage failures. On HP 3PAR StoreServ systems running HP 3PAR OS 3.1.2, the HP 3PAR System Reporter software is
integrated into the OS and executed on the controller nodes, and the database files are stored in a fully provisioned volume
called .srdata.
The amount of raw disk space consumed by these system functions is summarized in table 4.
2 240 GB 20 GB 128 GB 60 GB
4 480 GB 20 GB 256 GB 80 GB
5. These are maximum values. The exact values will vary depending on the HP 3PAR StoreServ model and disk configuration.
6. The total capacity of the PDSLDs is equal to the sum of all data cache memory located in the controller nodes of the system.
The value in the System section of the output is the raw space in use by the admin and .srdata volumes, the PDSLDs, the logging LDs, and the spare chunklets. The Internal entry lists the space associated with the local boot disks of the nodes, so this should be treated separately from the space allocated from the regular drives.
7. The System Reporter functionality has been included in HP 3PAR OS 3.1.2. Prior versions of HP 3PAR OS will require the optional HP 3PAR System Reporter software on an external host.
Note
If using Adaptive Optimization on all the CPGs, automatic CPG compaction is a configurable option of Adaptive Optimization.
It is recommended to run the startao command with the -compact trimonly option to defer the LD
defragmentation until the scheduled CPG compaction runs.
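For example, with AO_cfg1 as an illustrative Adaptive Optimization configuration name (the positional argument and exact option spelling should be verified against the startao help on your HP 3PAR OS version):

cli% startao -compact trimonly AO_cfg1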
reclaimed back to the CPG. Defragmentation occurs if there is more than 1 GB of available space to reclaim and results in
free 128 MiB regions. Up to 16 volumes at a time can be queued for reclaim processing.
How quickly the space is reclaimed depends on a number of factors. If there is a large amount of freed space on a volume, it may not be processed within a single reclaim period, and once the reclaim process runs on a TPVV, it will not run again on that TPVV for at least 90 minutes. Therefore, large space reclaims can take several hours to complete.
In addition, the reclamation on a TPVV can be deferred for various reasons. For example, if the SD space of a TPVV is grown, then reclaims on the volume will be deferred for 60 minutes. Also, if reclaim is defragmenting a TPVV and the defragmentation does not complete during the reclaim interval, further reclaim will be deferred for 8 hours.
Thin Persistence reclamation may not reclaim all the free space on a volume. There is a 4 GB per node threshold below
which the TPVV will not be inspected for available 128 MiB regions that can be reclaimed back to the CPG. The free space will
still be available for reuse by the TPVV.
Those new to Thin Provisioning often like to verify Thin Persistence reclamation by creating a test scenario of filling a file
system then deleting the files and running a space reclamation tool. It is important to understand that the space will not be
returned to the CPG immediately. The showvv -s command will show how much space has been allocated to the TPVV
and the difference between the space in use and the reserved space shows the amount of space reclaimed for use within
the TPVV. The amount of reserved space will decrease over time as the space is reclaimed back to the CPG in the
background by the reclaim thread.
Thin Persistence and deduplication
From a host perspective, the methods used to initiate space reclaim on TDVVs are the same as those on TPVVs (i.e., zero files or UNMAP), but internally there are differences in the way space reclaim operates. With TDVVs it is not necessary to have a zero detection mechanism scanning all incoming I/Os, as all zero blocks will be reduced to a single zero block by the deduplication engine. However, the original pages in the Dedup Store cannot be removed as they may be in use by other TDVVs. This is also the case for space reclaimed by the UNMAP command.
HP 3PAR Thin Deduplication uses an online garbage collection process that periodically checks for data in a Dedup Store
that is not referenced by any TDVVs in that CPG.
When reclaim is initiated on a TDVV, the metadata for the pages being freed is updated to point to the deduped zero block. The garbage collection process then scans all the TDVVs belonging to the same CPG and builds a list of hashes being referenced. This list is then compared with the hashes in the Dedup Store, and if any pages are no longer referenced, they are marked as free.
Once the garbage collection has completed the SD space associated with the TDVVs and Dedup Store is processed by the
normal Thin Persistence reclaim thread and the space is returned to the CPG.
To achieve good reclaim rates, the utilities need to fill the majority of the available free space, so they are generally run manually during an outage window or scheduled to run during a quiet period to avoid applications failing due to a lack of space.
For the zerofile utilities to work, the zero-detect policy needs to be set for each TPVV. Blocks of 16 KiB of contiguous zeros are freed and returned for reuse by the VV; if 128 MiB of space is freed, it is returned to the CPG for use by other volumes.
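The zero-detect policy is set per volume with the setvv command; a minimal sketch with an illustrative volume name (verify the policy keyword against the CLI reference for your HP 3PAR OS version):

cli% setvv -pol zero_detect vol_tp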
For more information on Thin Persistence methods for various operating systems see Appendix B.
The replication can be between volumes of different types on different tiers. For example, a thinly deduplicated volume on SSD drives on a primary array can be in a remote copy relationship with a thin provisioned volume on NL drives on a secondary array.
While deduplicated data is rehydrated for replication, HP 3PAR Remote Copy offers advanced data transmission techniques to optimize bandwidth utilization. For more information, please refer to the HP 3PAR Remote Copy white paper 8.
8. h20195.www2.hp.com/V2/GetPDF.aspx%2F4AA3-8318ENW.pdf
HP 3PAR Dynamic Optimization is a powerful software feature that enables storage administrators to perform several online volume optimizations:
• Perform online data movement of existing volumes to different RAID levels and different tiers of storage. For example, an application that requires high performance only during certain windows can be tuned to RAID 1 on SSDs and, during lower activity periods, moved to more cost-effective RAID 6 storage on Nearline disks.
• Convert existing volumes to a different volume type. For example, a thinly provisioned volume on an HDD CPG can be converted to a thinly deduped volume on an SSD tier, or a full volume can be converted to a thinly provisioned volume. This conversion happens within the HP 3PAR StoreServ Storage system transparently and non-disruptively.
This “thinning” of volumes enables data centers to meet green IT targets, reduce wasted space, and increase utilization
rates. Because the fat-to-thin conversion mechanism is built into the array hardware, volume conversions take place inline
and at wire speeds, while preserving service levels, and without causing disruption to production workloads.
The conversion requires the HP 3PAR Dynamic Optimization Software license.
To convert the volume to a thin deduplication volume in the SSD_r5 CPG the following command would be used:
cli% tunevv usr_cpg SSD_r5 -tdvv vol01
When the -tpvv, -tdvv or -full options for the usr_cpg subcommand are specified, the tune will automatically roll back on a failure. These options do not support virtual volumes with remote copy. These options will only convert virtual volumes using snapshots if the -keepvv option is used, but the snapshots will reside in the virtual volume specified by the -keepvv option.
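A hypothetical variant of the command above for a volume that has snapshots, assuming -keepvv is given with the other options before the volume name and that vol01_old is an illustrative name for the volume that will hold the retained snapshots:

cli% tunevv usr_cpg SSD_r5 -tdvv -keepvv vol01_old vol01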
During the thin conversion, the HP 3PAR ASIC will assist in reducing the amount copied by using its zero-detect capability to
remove the need to copy blocks of zeros and deduplicate data if converting to a TDVV. To make optimal use of this feature,
it is advantageous to write zeros to the allocated but unused space on the fully provisioned volume prior to the conversion.
9. For earlier versions of HP 3PAR OS, this “thinning” operation does not complete online: a brief disruption of service is required to change the host mapping from the full to the thin-provisioned volume. An HP 3PAR Full Copy software (or Physical Copy) operation of the source volume is required for HP 3PAR OS 3.1.1 and prior, the license for which is included in HP 3PAR OS.
The Dedup Estimate can also be launched from the HP 3PAR Management Console by selecting a VV set or multiple TPVVs, then right-clicking and choosing the Dedup Estimate menu item as shown in figure 9.
Figure 9. Dedup Estimate
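The same estimate can be started from the CLI; as the task output below suggests, it runs as a dedup_dryrun task of the checkvv command. A sketch against two hypothetical TPVVs (verify the option name for your HP 3PAR OS version):

cli% checkvv -dedup_dryrun vv1 vv2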
The estimated dedup ratio will be shown under the task's detailed information, which can be accessed via showtask -d.
cli% showtask -d 11705
Id Type Name Status Phase Step -------StartTime-------- -------FinishTime------- -Priority- -User--
11705 dedup_dryrun checkvv done --- --- 2014-10-12 22:25:45 CEST 2014-10-12 22:26:15 CEST n/a 3parsvc
-----(MB)------ (DedupRatio)
Id Name Usr Estimated Estimated
395 vv1 10240 -- --
431 vv2 10240 -- --
--------------------------------------
2 total 20480 10239 1.82
In this example, checkvv is estimating that if vv1 and vv2 were converted to TDVVs and put in the same CPG, there would be a dedup ratio of 1.82:1 for the CPG.
Conclusion
HP 3PAR StoreServ Storage is the only platform that offers a comprehensive thin strategy that not only allows storage to
start thin, but to get thin and stay thin. Compaction technologies such as thin deduplication, thin provisioning and thin
reclamation offer efficiency benefits for primary storage that can significantly reduce both capital and operational costs with
spinning media and SSDs. Not only is HP 3PAR StoreServ Storage viewed as the industry’s thin technology leader, but third-
party testing and competitive analysis confirm that HP 3PAR StoreServ offers the most comprehensive and efficient thin
technologies among the major enterprise storage platforms. In addition, HP 3PAR Thin Technologies protect SSD
performance and extend flash-based media life span while ensuring resiliency.
HP 3PAR Thin Technologies provide simple yet powerful tools for improving the efficiency of storage. Following the best
practices outlined in this paper will allow IT staff to help maximize the benefit of HP 3PAR Thin Technologies and do more
with less. To supplement the dramatic savings of thin provisioning, HP 3PAR StoreServ features a unique HP 3PAR ASIC with
thin capabilities built in and a range of software offerings that can save enterprises 50 percent or more on the cost of a
storage technology refresh.
Oracle
In an Oracle environment, HP 3PAR Thin Provisioning provides better storage space utilization when used in the following cases:
• Autoextend-enabled tablespaces—Oracle Automatic Storage Management (ASM)-managed datafiles with the autoextend
feature grow as the need for tablespace grows. Using such auto-extendable datafiles on thin-provisioned volumes allows the
system to allocate disk space to a database as it grows.
However, Oracle’s process of extending a tablespace is I/O intensive and can affect the performance of the database during the
file extension. Lessen the frequency of the performance impact by increasing the increment of space that Oracle adds (the
AUTOEXTEND ON NEXT parameter of the CREATE TABLESPACE command). Set autoextend to a larger value for rapidly
growing datafiles, and a smaller value if the data is growing at a slower rate.
• ASM disk groups as archive log destinations—The combination of Oracle ASM and HP 3PAR Thin Provisioning enables an
expanding ASM archive log destination. Oracle databases become inaccessible when the archive log destinations fill up. Put
Oracle archive logs on thin-provisioned ASM disk groups, which allows the underlying storage to self-tune, accommodating
unexpected increases in log switch activity. After a level 0 backup, remove the archive logs and use the Oracle ASM Storage
Reclamation Utility (ASRU, described later) to free up the storage that was allocated to the old logs.
HP 3PAR Thin Provisioning may not be the best option for the following:
• Datafiles residing on file systems—When an Oracle datafile is first created, Oracle initializes the data blocks with block headers and other metadata. Due to the small block size of most databases, there are usually no contiguous ranges of zero blocks for space reclamation. This causes a TPVV to provision all of its space, nullifying the value of thin provisioning.
• Systems with high file system utilization—If the file systems or ASM disk groups are full, then the benefits of thin provisioning
are reduced. Consider any ASM disk group or file systems with utilization rates above 80 percent as inappropriate for use with
thin provisioning. In this case, it may be more efficient to use fully provisioned virtual volumes to hold this data.
• Oracle datafiles that are not in “autoextend” mode—Oracle databases write format information to datafiles during tablespace
creation. This has the same effect as provisioning file systems with high utilization and may be inefficient depending upon the
ratio of provisioned storage to database size.
Microsoft Hyper-V
The virtual hard disk (VHD) and the new virtual hard disk (VHDX) formats of Microsoft Hyper-V are fully compatible with HP
3PAR Thin Provisioning. Both formats offer a choice of fixed or dynamic sizing. A fixed VHD has an allocated size that does
not change whereas a dynamic VHD will expand as data is written to it.
Although it may seem natural to pair a fully provisioned VV with a fixed VHD, fixed VHDs are in fact ideal candidates for thin
provisioning because the contiguous zeros written to the allocated space are reclaimed by HP 3PAR Thin Persistence software.
In the past, fixed VHDs were recommended for production instead of dynamic VHDs, as their I/O performance was higher.
With the new VHDX format introduced in Windows Server 2012, performance of dynamic VHDs has been significantly
improved and there is now little difference between the two VHD types.
In addition, VHDX disks report themselves to the guest OS as being “thin-provision capable.” This means that if the guest
OS is UNMAP-capable, it can send UNMAPs to the VHDX file; the freed block allocations within the VHDX file become
available for subsequent allocations, and the UNMAP requests are forwarded to the physical storage.
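As a brief illustration of the two sizing options, the following sketch uses the standard Hyper-V PowerShell cmdlet New-VHD; the paths and sizes are hypothetical:
New-VHD -Path D:\VMs\data-fixed.vhdx -SizeBytes 100GB -Fixed
New-VHD -Path D:\VMs\data-dynamic.vhdx -SizeBytes 100GB -Dynamic
The fixed VHDX consumes its full size in the file system at creation time, whereas the dynamic VHDX file grows as the guest writes data.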
SAP
By migrating data from traditional arrays to HP 3PAR StoreServ via Thin Conversion, legacy SAP systems can reduce the capacity
consumed in the storage environment by up to 80 percent. In an SAP system, data is regularly moved around or deleted within the
system storage volumes. HP 3PAR Thin Persistence ensures that the thin volumes used by SAP systems stay as efficient as possible by
reclaiming unused space associated with deleted data.
Typically, SAP databases store data in the form of a matrix (tables and indexes) consisting of rows and columns. Most of the
columns in the tables are of fixed length, and there are often leading or trailing zeroes in these columns. The HP 3PAR ASIC is able
to detect these zeros if they form a contiguous 16 KB block and prevent storage from being allocated. In testing with SAP ERP6
IDES, a LUN with HP 3PAR Thin Persistence software enabled consumed 10 percent less storage space than a traditional thin LUN
after the initial database creation. See the HP 3PAR StoreServ Storage for SAP Systems—technical white paper for further details.
VMware vSphere
VMware vSphere also has its own thin provisioning capabilities, so a decision has to be made about the layer at which to
implement thin provisioning.
When implementing HP 3PAR StoreServ TPVVs, administrators often ask whether implementing vSphere Thin Provisioning
for Virtual Machine Disk (VMDK) files makes any sense. In general, Thin Provisioning with HP 3PAR StoreServ and vSphere
accomplish the same end-result, albeit at different logical layers. With VMware vSphere Thin Provisioning, administrators
realize greater VM density at the Virtual Machine File System (VMFS) layer, at the cost of some CPU and disk I/O overhead as
the volume is incrementally grown on the ESXi hosts. By implementing HP 3PAR StoreServ TPVVs, the same VM density
levels are achieved; however, the thin provisioning CPU work is offloaded to the HP 3PAR StoreServ ASIC. If the goal is to
reduce storage costs, help maximize storage utilization, and maintain performance, then use HP 3PAR Thin Provisioning
software to provision VMFS volumes. If performance is not a concern but overprovisioning VMs
at the VMFS layer is important, then administrators can consider implementing both Thin Provisioning solutions. However,
administrators should realize that there are no additional storage savings from using vSphere Thin Provisioning on top
of HP 3PAR TPVVs; in fact, implementing both solutions adds management complexity to the environment.
When creating VMs, there are several options for the file system layout of the VMDK files. By default, VMDK files are created
with the “Thick Provision Lazy Zeroed” option, which means the VMDK is sparsely populated so not all the blocks are
immediately allocated. When a guest VM reads from the unallocated areas of the VMDK, the vSphere server detects this and
returns zeros rather than reading from disk. This VMware thin provisioning capability enables the oversubscription of the
sizes of the VMDK files within the datastore.
For performance-intensive environments, VMware recommends using “Thick Provision Eager Zeroed” (EZT) virtual disks.
These EZT disks have lower runtime overhead but require zeros to be written across all of the capacity of the VMDK at the
time of creation. On traditional arrays, this VMDK format would negate all the benefits of thinly provisioned LUNs as all of
the physical storage is allocated when the volume is zero-filled during creation. However, HP 3PAR Thin Persistence
software allows clients to retain thin provisioning benefits when using EZT VMDKs without sacrificing any of the
performance benefits offered by this VMDK option. In this case, Thin Persistence helps ensure that, when a new EZT volume
is created, the entire volume is not allocated from physical storage since all zeros that have been written to the VMDK were
intercepted and discarded by the HP 3PAR ASIC.
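As a hedged sketch of creating an EZT virtual disk from the ESXi command line (the datastore path, size, and file name are hypothetical):
# vmkfstools -c 40G -d eagerzeroedthick /vmfs/volumes/Datastore01/vm1/vm1_data.vmdk
With Thin Persistence enabled, the zero fill performed during creation is intercepted by the HP 3PAR ASIC, so no physical capacity is consumed.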
The default behavior of issuing UNMAPs can be disabled on a Windows 2012 server by setting the “disabledeletenotify”
parameter of the fsutil command. This will prevent reclaim operations from being issued against all volumes on the
server.
To disable reclaim operations, run the following command:
fsutil behavior set DisableDeleteNotify 1
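To check the current setting before changing it, the standard query form of the same command can be used (the output shown is illustrative):
fsutil behavior query DisableDeleteNotify
DisableDeleteNotify = 0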
The vmkfstools -y command creates a temporary “balloon” file in the datastore that can be up to the size of the free space
available in the datastore, and then it issues UNMAPs for the blocks of the balloon file. However, it is recommended that the
reclaim percentage is not more than 60 percent as the resulting balloon file temporarily uses space on the datastore that
could cause the deployment of new virtual disks to fail while the vmkfstools command is running. This is an important
consideration as there is no way of knowing how long a vmkfstools -y operation will take to complete. It can take
anywhere from a few minutes to several hours depending on the size of the datastore and the amount of space that needs to
be reclaimed.
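A minimal sketch of the balloon-file method on vSphere 5.0/5.1 (the datastore name and reclaim percentage are illustrative); the command is run from within the datastore directory and takes the percentage of free space to reclaim:
# cd /vmfs/volumes/Datastore01
# vmkfstools -y 60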
There is an alternative method that can be faster and takes advantage of the ASIC-based zero-detect capability of HP 3PAR
StoreServ. Run the vmkfstools command with the -d eagerzeroedthick option to create a zerofile that can then
be removed:
# cd /vmfs/volumes/<volume-name>
# vmkfstools -c <size of space to reclaim> -d eagerzeroedthick zerofile
# rm zerofile-flat
In vSphere 5.5, the vmkfstools -y command is deprecated; in its place, a new esxcli unmap
namespace has been added that allows deleted blocks to be reclaimed on thin-provisioned LUNs that support the UNMAP
primitive. The reclaim mechanism has been enhanced so that the reclaim size can be specified in blocks instead of a
percentage value, making it more intuitive to calculate, and the unused space is reclaimed in increments instead of all at
once. However, the reclaim operation is still manual.
To reclaim unused storage blocks on a vSphere 5.5 VMFS datastore for a thin-provisioned device, run the command:
# esxcli storage vmfs unmap -l volume_label | -u volume_uuid [-n number]
The datastore to operate on is determined by either using the -l flag to specify the volume label or -u to specify the
universal unique identifier (UUID). The optional -n flag sets the number of VMFS blocks to UNMAP per iteration. If it is not
specified, the command uses a default value of 200.
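For example, the following hedged illustration (the datastore label and block count are hypothetical) reclaims unused space from a datastore named Datastore01 in batches of 1200 VMFS blocks:
# esxcli storage vmfs unmap -l Datastore01 -n 1200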
When using VMDKs, the VMware hypervisor does not report disks to the guest OS as being thin-provision capable;
therefore, to reclaim thinly provisioned storage you must leverage the zero-detection capabilities of HP 3PAR StoreServ. This
means using standard file system tools (such as SDelete in Microsoft Windows or dd in UNIX®/Linux) to write zeros across
deleted and unused space in a VM’s file system. The zeros are autonomically detected by the HP 3PAR ASIC, and the disk
space they were consuming is freed up and returned to the thin-provisioned volume.
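As a hedged sketch of this zero-fill approach (the drive letter, mount point, and file name are hypothetical), SDelete can zero free space on a Windows guest, and dd can write a temporary zero file on a Linux guest:
C:\> sdelete -z C:
# dd if=/dev/zero of=/mnt/data/zerofile bs=1M
# rm /mnt/data/zerofile
The dd command stops with a "no space left on device" message once the file system fills; deleting the zero file then returns the space to the file system, while the zeroed blocks have already been detected and reclaimed by the array.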
If running VMware ESX 4.1, it is strongly recommended to install the HP 3PAR VAAI plug-in before creating EZT VMDKs as it
will speed up their creation by 10x to 20x.
This will cause RHEL 6.x to issue the UNMAP command, which in turn causes space to be released back to the array from
the TPVV volumes for any deletions in that ext4 file system. This does not apply to fully provisioned virtual volumes.
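The discard behavior described above requires the ext4 file system to be mounted with the discard option; a hedged example (the device and mount point are hypothetical):
# mount -t ext4 -o discard /dev/sdb1 /mnt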
In addition, the mke2fs, e2fsck, and resize2fs utilities also support discards, helping to ensure the TPVV volumes stay
thin when administration tasks are performed.
There is also batched discard support available using the fstrim command. This can be used on a mounted file system to
discard blocks that are not in use by the file system. It supports ext3 and ext4 file systems and can also re-thin a file
system that was not mounted with the discard option.
For example, to initiate storage reclaim on the /mnt file system, run:
# fstrim -v /mnt
/mnt: 21567070208 bytes were trimmed
The Linux swap code will also automatically issue discard commands for unused blocks on discard-enabled devices and
there is no option to control this behavior.
By default, the reclamation does not affect unmarked space, which is the unused space between subdisks. If a LUN has a lot
of physical space that was previously allocated, the space between the subdisks might be substantial. Use the -o full
option to reclaim the unmarked space.
# /opt/VRTS/bin/fsadm -V vxfs -R /<VxFS_mount_point>
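A hedged sketch of the full reclamation form referenced above, assuming the documented -o full option is combined with the same reclaim flag (the mount point is a placeholder):
# /opt/VRTS/bin/fsadm -V vxfs -R -o full /<VxFS_mount_point>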
The vxrelocd daemon tracks the disks that require reclamation. The schedule for reclamation can be controlled using the
vxdefault command. The reclaim_on_delete_wait_period parameter specifies the number of days after a
volume or plex is deleted before VxVM reclaims the storage space. The default value is 1, which means the space is reclaimed
the day after the deletion. A value of -1 indicates that the storage is reclaimed immediately, and a value of 367 indicates that the storage
space can only be reclaimed manually using the vxdisk reclaim command. The reclaim_on_delete_start_time
parameter specifies the time of day when VxVM starts the reclamation for deleted volumes; this defaults to 22:10.
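A hedged example of adjusting the schedule, assuming the vxdefault set syntax (the seven-day wait period and start time are arbitrary illustrations):
# vxdefault set reclaim_on_delete_wait_period 7
# vxdefault set reclaim_on_delete_start_time 01:00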
To completely disable thin-reclaim operations, add the parameter reclaim=off to the /etc/vxdefault/vxdisk file.
These changes result in unused ASM disk space that can build up over time; although this space is
available for reuse within ASM, it remains allocated on the storage array. The net result is that storage utilization on the
array eventually falls below desirable levels.
To solve this problem, HP and Oracle have partnered to improve storage efficiency for Oracle Database 10g and 11g
environments by reclaiming unused (but allocated) ASM disk space in thin-provisioned environments. The Oracle ASRU is a
stand-alone utility that works with HP 3PAR Thin Persistence software to reclaim storage in an ASM disk group that was
previously allocated but is no longer in use. Oracle ASRU compacts the ASM disks, writes blocks of zeros to the free space, and
resizes the ASM disks to their original size with a single command, online and non-disruptively. The HP 3PAR StoreServ, using the
zero-detect capability of the HP 3PAR ASIC, detects these zero blocks and reclaims any corresponding physical storage.
You can issue a SQL query to verify that ASM has free space available that can be reclaimed:
SQL> select name, state, type, total_mb, free_mb from v$asm_diskgroup where name = 'LDATA';
Run the Oracle ASRU utility as the Oracle user with the name of the disk group for which space should be reclaimed:
# bash ASRU LDATA
Checking the system ...done
Calculating the new sizes of the disks ...done
Writing the data to a file ...done
Resizing the disks...done
/u03/app/oracle/product/11.2.0/grid/perl/bin/perl -I /u03/app/oracle/product/
11.2.0/grid/perl/lib/5.10.0 /home/ora/zerofill 5 /dev/oracleasm/disks/LDATA2
129081 255996 /dev/oracleasm/disks/LDATA3 129070 255996 /dev/oracleasm/disks/
LDATA4 129081 255996 /dev/oracleasm/disks/LDATA1 129068 255996
126928+0 records in
126928+0 records out
133093654528 bytes (133 GB) copied, 2436.45 seconds, 54.6 MB/s
126915+0 records in
126915+0 records out
133080023040 bytes (133 GB) copied, 2511.25 seconds, 53.0 MB/s
126926+0 records in
126926+0 records out
133091557376 bytes (133 GB) copied, 2514.57 seconds, 52.9 MB/s
126915+0 records in
126915+0 records out
133080023040 bytes (133 GB) copied, 2524.14 seconds, 52.7 MB/s
Learn more at
hp.com/go/3PARStoreServ
© Copyright 2012-2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only
warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should
be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
Microsoft and Windows are U.S. registered trademarks of the Microsoft group of companies. Oracle is a registered trademark of Oracle and/or its affiliates.
UNIX is a registered trademark of The Open Group. Red Hat is a registered trademark of Red Hat Inc. in the United States and other countries. SAP is a
registered trademark of SAP AG in Germany and other countries.