AIX Micro Partitioning
Abstract
Starting in 2000, IBM announced the ability to partition pSeries systems. Initially a logical partition could only run on dedicated processors. The p5 series has since integrated new virtual engine system technologies into its hardware and software, which introduced several new features. Micro-Partitioning provides the ability to share physical processors among logical partitions. Virtual LAN lets you prioritize traffic on shared networks and allows secure communication between logical partitions without the need for a physical network adapter. Virtual I/O provides the ability to dedicate I/O adapters and devices to a virtual server; it allows a single physical I/O adapter to be used by multiple logical partitions on the same server, which consolidates I/O servers and minimizes the number of I/O adapters required. The p5 series implements new concepts of shared partitioning compared to the IBM mainframe. This paper provides a detailed overview of shared partitioning and its performance metrics.
INDEX
1 Introduction
2 Concepts
2.1 Power Hypervisor
2.2 HMC - Hardware Management Console
2.3 Virtual CPUs
2.4 Partition profiles
2.5 Micro-Partitioning
2.6 SMT - Simultaneous Multi-Threading
2.7 Logical CPUs
2.8 DLPAR - Dynamic Logical Partitioning
2.9 PLM - Partition Load Manager
3.1 lparstat
3.2 mpstat
3.3 vmstat
4 Performance Case
5 Conclusions
Bibliography
1 Introduction
Before measuring shared partition activity, it is essential for the performance analyst to understand the new terms introduced and how logical partitioning is configured and works in the p5 series environment. This paper discusses the following topics: Power Hypervisor, Hardware Management Console (HMC), partition profiles, Micro-Partitioning, entitled capacity, virtual CPUs, logical CPUs, Simultaneous Multi-Threading (SMT), Partition Load Manager (PLM), performance tools such as vmstat, lparstat and mpstat, and performance case examples.
2 Concepts
2.1 Power Hypervisor
The Power Hypervisor (Hypervisor in the following) is a component of system firmware that permits partitioning and dynamic resource movement across multiple operating systems. It performs the time slicing of the physical processors between the logical partitions (LPARs). Equivalent to the IBM mainframe PR/SM software, it acts as a hidden partition with no processors assigned to it. The software lies between the system hardware and the LPARs. An LPAR communicates with the Hypervisor through hypervisor calls issued by the operating system kernel running in the partition; these calls maintain LPAR integrity. The Hypervisor is scheduled to generate an interrupt every 10 ms as a timing mechanism for controlling the dispatch of physical processors between LPARs. Each LPAR is guaranteed to get its share of processor cycles during each 10 ms dispatch window. The number and type of hypervisor calls can be analysed with the lparstat utility.
2.2 HMC - Hardware Management Console
The HMC is a dedicated Linux black-box computer for configuring and managing the LPARs on a system. At the time of writing the HMC can manage up to 254 LPARs. It provides a set of functions to manage the LPARs, perform capacity on demand functions, create LPAR and system profiles, display system and LPAR status, act as a virtual console for each LPAR and provide cluster support. The HMC also supports dynamic partitioning (the ability to add or remove resources while the LPAR is running).
2.3 Virtual CPUs
A virtual CPU is the operating system's view of a physical processor. Every LPAR is assigned a minimum, desired and maximum number of virtual CPUs. The more virtual CPUs are defined, the more work can be executed in parallel. The capacity assigned to the LPAR divided by the number of virtual CPUs gives the amount of capacity each virtual CPU can use each time it is dispatched. At most 10 virtual CPUs can be allocated for each physical processor. The Hypervisor firmware schedules the virtual CPUs on the physical processors.
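The division described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and converting the per-virtual-CPU capacity into milliseconds assumes that the 10 ms dispatch window from section 2.1 is shared out in proportion to capacity.

```python
DISPATCH_WINDOW_MS = 10.0  # dispatch window length quoted in section 2.1


def capacity_per_virtual_cpu(assigned_capacity, virtual_cpus):
    """Capacity (in processor units) each virtual CPU can use per dispatch."""
    if virtual_cpus < 1:
        raise ValueError("an LPAR needs at least one virtual CPU")
    return assigned_capacity / virtual_cpus


def ms_per_dispatch_window(assigned_capacity, virtual_cpus):
    """Assumed translation of that capacity into time within one window."""
    return capacity_per_virtual_cpu(assigned_capacity, virtual_cpus) * DISPATCH_WINDOW_MS


# Example: 1.00 processor units spread over 2 virtual CPUs.
print(capacity_per_virtual_cpu(1.0, 2))
print(ms_per_dispatch_window(1.0, 2))
```

Defining more virtual CPUs for the same capacity lowers the share each one receives, which is the trade-off between parallelism and per-CPU capacity that the paragraph describes.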
2.4 Partition profiles
Partition profiles are created with the HMC. Each LPAR can have one or more partition profiles. Depending on the type of workload, you can define a profile that assigns the resources with which the LPAR should boot. For a changed profile to take effect, the operating system must be shut down and the LPAR deactivated. A partition profile assigns all the resources, such as virtual CPUs, memory, physical devices and virtual devices. When activating the LPAR you select the partition profile; otherwise the default profile is used. It is also possible to create a system profile, which can manage more than one partition profile. If you use a system profile, all its partition profiles are activated in the order specified by the system profile. At present, shared and dedicated processors cannot be mixed in the same LPAR, but both types of partition can exist on the same system: partitions with dedicated CPUs and partitions with virtualized processors that share a pool of physical processors (see the Micro-Partitioning section). Dedicated CPUs go into the shared pool when their partitions are not activated [1]. An LPAR supports a maximum of 64 virtual processors.
2.5 Micro-Partitioning
Micro-Partitioning provides the ability to share physical processors among LPARs. This feature allows LPARs to be allocated fractions of processors. The minimum amount of processing capacity that can be assigned to an LPAR is 1/10 of a physical processor, represented in the partition profile as 0.10. When creating a partition profile you define the maximum, desired and minimum processing capacity for each LPAR. When the LPAR boots it tries to obtain the desired amount; if that amount is not available, it obtains as much as it can between the minimum and the desired amount. The amount obtained is called the entitled capacity. If the sum of the entitled capacity of the active LPARs and the capacity required to start the next LPAR would over-commit the physical processors, the LPAR will not boot [2]. The entitled capacity is always available to the LPAR when it needs it, irrespective of the load on the system. The Hypervisor does not change the entitlement of an LPAR; it only controls the number of cycles an LPAR can use. The entitled capacity can, however, be changed by dynamic logical partitioning operations, either manually or by the PLM. The minimum and maximum capacity values set in the partition profile are the limits for dynamic partitioning operations. Unallocated CPU cycles are those cycles not included in any entitled capacity. Furthermore, an LPAR can use less than its entitled capacity and donate the unused CPU cycles to the Hypervisor. The sum of unallocated and unused capacity is the available capacity. An LPAR can be capped or uncapped.
[1] It is possible to set an option in the HMC to avoid this behaviour.
[2] The entitled capacity is guaranteed to the logical partition and cannot be used to start another partition.
A capped LPAR can only consume up to its entitled capacity. Each uncapped LPAR must be assigned a weight, which can range from 1 to 255. Micro-Partitioning gives requesting uncapped LPARs additional CPU capacity, taken from the available capacity in the shared pool, based on their weight. An uncapped LPAR is limited only by the number of virtual processors assigned to it and by the available capacity. The capacity used above the entitlement is known as the excess capacity. When several uncapped LPARs need additional capacity at the same time, the Hypervisor distributes the available CPU cycles according to the LPAR priority. The LPAR priority is calculated by dividing the current excess capacity (used capacity - entitled capacity) by the weight. A lower value indicates a higher priority. Table 1 shows an example of the LPAR priority calculation.
LPAR    Used capacity   Entitled capacity   Excess capacity   Weight   LPAR priority
Lpar1   0.8             0.5                 0.3               100      0.003
Lpar2   0.7             0.5                 0.2               200      0.001 (higher priority)
Table 1
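The priority rule behind Table 1 can be reproduced with a short Python sketch. The function name and the dictionary layout are hypothetical; only the formula (excess capacity divided by weight, lower value wins) comes from the text.

```python
def lpar_priority(used_capacity, entitled_capacity, weight):
    """Excess capacity divided by weight; a lower value means a higher priority."""
    excess = used_capacity - entitled_capacity
    return excess / weight


# Values from Table 1: (used capacity, entitled capacity, weight).
lpars = {
    "Lpar1": (0.8, 0.5, 100),
    "Lpar2": (0.7, 0.5, 200),
}

# Round to three decimals to hide floating-point noise.
priorities = {name: round(lpar_priority(*values), 3) for name, values in lpars.items()}

# The LPAR with the lowest value is served first.
first_served = min(priorities, key=priorities.get)

print(priorities)
print(first_served)
```

Doubling the weight of an LPAR halves its priority value, so the higher weight of Lpar2 outweighs the small difference in excess capacity.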
2.6 SMT - Simultaneous Multi-Threading
Simultaneous Multi-Threading (SMT) allows applications to increase their overall resource utilization by virtualizing virtual CPUs through the use of multi-threading. Two separate instruction streams can run concurrently on the same virtual CPU, and each thread is treated as a logical processor. The feature can be enabled or disabled dynamically or at the next reboot by means of the smtctl command or from the smit panel. The smtctl command also shows the threads allocated to the virtual processors. An LPAR can be booted in single-thread mode (ST) or multi-thread mode (SMT), which is the default. The mpstat utility reports the usage of each virtual CPU and its associated logical CPUs. It also shows the partition's entitled capacity, broken down first by virtual CPU and then by logical CPU. Setting smt_snooze_delay to 0 with the schedo command maximizes the speed of the processors, because processor capacity is given back to the hardware as soon as a thread is in an idle loop. The estimated performance increase is 30%. SMT supports both dedicated and shared CPUs and benefits applications where the number of transactions outweighs the importance of the speed of each one. Performance does not improve in every case: where workloads cannot tolerate the variability caused by resource sharing, it can be better to disable SMT and provide dedicated processors.
2.7 Logical CPUs
When SMT is enabled each virtual processor is assigned two threads, which the operating system sees as two logical CPUs. The number of logical CPUs seen by the operating system is therefore always double the number of virtual CPUs when SMT is enabled. The first output record produced by standard native utilities such as vmstat and mpstat always shows the number of logical CPUs, in the lcpu field.
2.8 DLPAR - Dynamic Logical Partitioning
Dynamic partitioning allows resources to be moved between LPARs. For dedicated processors it is only possible to add and remove whole processors. For shared LPARs it is possible to change the entitled capacity, the weight of the partition, the mode (capped or uncapped) and the number of virtual processors. The highest possible values for the entitled capacity and the number of virtual CPUs depend on the maximum values defined in the partition profile. The changes can be verified with the lparstat -i command.
2.9 PLM - Partition Load Manager
Partition Load Manager is part of the Advanced Virtualization feature. It helps maximize resource utilization for LPARs by exploiting dynamic logical partitioning to make automatic adjustments. Based on defined policies, resources such as memory and CPU are moved from LPARs with low demand to LPARs with higher demand. Partition Load Manager has to be installed on an LPAR, but that LPAR can be shared with other workloads. It uses resource monitoring to control and communicate with the managed partitions. To communicate with the HMC, an SSH (Secure Shell) connection needs to be configured. The policy defines the minimum and maximum resource thresholds [4] and other specific parameters for each LPAR. LPARs that have reached an upper threshold become resource requesters, and LPARs that have reached a minimum threshold become resource donors. Partition Load Manager uses load average values to decide when to add or remove resources. However, the resource increments and decrements [5] cannot go outside the minimum and maximum values defined in the HMC partition profile. An LPAR that needs a resource and has not yet reached its defined maximum can obtain it from one of the following:
1) a pool of free unallocated resources
2) a resource donor
3) a lower priority LPAR
If a request cannot be honoured, it is queued until resources become available. When PLM needs to distribute resources among several LPARs requesting them at the same time, it uses the partition's shares (a PLM parameter similar to the HMC weight) and the guaranteed entitlement (another PLM parameter, by default equal to the HMC desired entitlement) to determine the LPAR priority.
[4] These values have to lie within the HMC minimum and maximum values.
[5] The capacity moved is a fixed percentage defined as the delta value in the PLM policy.
The calculation is similar to the one shown in Table 1, but in this case the LPAR priority is calculated by dividing the current excess entitlement (used capacity - guaranteed capacity) by the shares. Here too a lower value indicates a higher priority. It is important to note that the Hypervisor uses the weights to distribute the excess CPUs in addition to each LPAR's entitled CPUs, while PLM uses the shares to increase or decrease each LPAR's entitled CPUs. As a consequence, the distribution of resources at peak times can differ between the Hypervisor and PLM. The following example is taken from one of the IBM Redbooks. On a system with 6 physical processors we have the following configuration:
LPAR name   Entitled capacity   Weight   Shares
Lpar1       1                   20       20
Lpar2       2                   10       10
Table 2
There are 3 excess CPUs that can be used (6 physical - 3 entitled = 3 excess). When PLM is not active, the number of excess CPUs assigned to each LPAR is calculated by dividing the LPAR weight by the sum of the weights and multiplying the result by the number of excess CPUs. The total number of CPUs each partition can get is:
Lpar1 - 1 entitled CPU + 2 excess CPUs (20/30*3)
Lpar2 - 2 entitled CPUs + 1 excess CPU (10/30*3)
When PLM is active, the number of entitled CPUs assigned to each LPAR is calculated by dividing the LPAR shares by the sum of the shares and multiplying the result by the total number of CPUs. In this case the total number of CPUs each partition can get is:
Lpar1 - 4 entitled CPUs (20/30*6)
Lpar2 - 2 entitled CPUs (10/30*6)
If the running workloads are highly variable, it may be necessary to define LPARs with many virtual processors (many more than the LPARs normally need), which can cause considerable overhead. One of the main advantages of PLM is the possibility of defining a small number of virtual processors and letting PLM adjust the virtual CPUs to the workload's needs. This gives higher utilization of the virtual CPUs and lower overhead.
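The two distribution rules in the example can be compared side by side with a small Python sketch. The function and variable names are hypothetical; the arithmetic follows the description above and uses the values from Table 2.

```python
def hypervisor_distribution(physical_cpus, lpars):
    """Entitled CPUs plus a weight-proportional share of the excess CPUs.

    lpars maps an LPAR name to (entitled_capacity, weight).
    """
    excess = physical_cpus - sum(ent for ent, _ in lpars.values())
    total_weight = sum(weight for _, weight in lpars.values())
    return {
        name: ent + weight * excess / total_weight
        for name, (ent, weight) in lpars.items()
    }


def plm_distribution(physical_cpus, shares):
    """Entitlement recomputed as a share-proportional part of all CPUs."""
    total_shares = sum(shares.values())
    return {name: s * physical_cpus / total_shares for name, s in shares.items()}


# Table 2: 6 physical processors, entitlements 1 and 2, weights/shares 20 and 10.
without_plm = hypervisor_distribution(6, {"Lpar1": (1, 20), "Lpar2": (2, 10)})
with_plm = plm_distribution(6, {"Lpar1": 20, "Lpar2": 10})

print(without_plm)
print(with_plm)
```

With the same 2:1 ratio, the weights only divide the 3 excess CPUs, while the shares divide all 6 CPUs, which is why Lpar1 ends up with more capacity under PLM.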
3
3.1 lparstat
The lparstat utility was introduced with the AIX 5.3 operating system. It displays current LPAR related parameters and Hypervisor information, as well as utilization statistics for the LPAR. The various options of lparstat are exclusive of each other. With no options, the lparstat command generates two rows of statistics. The first row, displayed once when the command starts and again whenever the system configuration changes, shows the System Configuration. The second row contains the Utilization Statistics, which are displayed at each interval; the values of these metrics are always deltas from the previous interval. If the -h option is specified, the report includes summary statistics related to the Hypervisor. If an Interval and a Count are specified, the report is repeated every Interval seconds for Count iterations. The following information is displayed in the system configuration row:
Statistic   Description
type        Partition type. Can be either dedicated or shared.
mode        Indicates whether the partition processor capacity is capped, or uncapped and allowed to consume idle cycles from the shared pool. A dedicated LPAR is implicitly capped.
smt         Indicates whether simultaneous multi-threading is enabled or disabled in the partition.
lcpu        Number of online logical processors.
mem         Online memory capacity.
psize       Number of online physical processors in the pool.
ent         Entitled processing capacity in processor units. This information is displayed only if the partition type is shared.
Table 3
The following statistics are displayed only when the partition type is shared [6]:
Statistic   Description
physc       Shows the number of physical processors consumed.
%entc       Shows the percentage of the entitled capacity consumed.
lbusy       Shows the percentage of logical processor utilization that occurred while executing at the user and system level.
app         Shows the available physical processors in the shared pool.
phint       Shows the number of phantom interruptions received (interruptions targeted to another shared partition in this pool).
Table 5
The following statistics are displayed only when the -h flag is specified:
Statistic   Description
%hypv       Shows the percentage of time spent in the Hypervisor.
Hcalls      Shows the number of Hypervisor calls executed.
Table 6
More detailed information about the system configuration can be obtained using the -i option. The fields are listed below:
Statistic                                     Description
Partition Name                                Logical partition name as assigned at the HMC.
Partition Number                              Number of this logical partition.
Online Virtual CPU                            Number of CPUs (virtual engines) currently online.
Maximum Virtual CPUs                          Maximum possible number of CPUs (virtual engines).
Online Memory                                 Amount of memory currently online.
Maximum Memory                                Maximum possible amount of memory.
Type                                          Indicates whether the LPAR is using dedicated or shared CPU resources.
Mode                                          Indicates whether the LPAR processor capacity is capped, or uncapped and allowed to consume idle cycles from the shared pool. A dedicated LPAR is implicitly capped.
Entitled Capacity                             The number of processing units this LPAR is entitled to receive.
Variable Capacity Weight                      The priority weight assigned to this LPAR, which controls how extra (idle) capacity is allocated to it. A weight of -1 indicates a soft cap is in place.
Minimum Capacity                              The minimum number of processing units this LPAR was defined to ever have. Entitled capacity can be reduced down to this value.
Maximum Capacity                              The maximum number of processing units this LPAR was defined to ever have. Entitled capacity can be increased to this value.
Capacity Increment                            The granule at which changes to Entitled Capacity can be made. A value in whole multiples indicates a dedicated LPAR.
Maximum Dispatch Latency                      The maximum number of nanoseconds between dispatches of the LPAR's virtual CPUs.
Maximum Physical CPUs in System               The maximum possible number of physical CPUs in the system containing this LPAR.
Active Physical CPUs in System                The current number of active physical CPUs in the system containing this LPAR.
Active CPUs in Pool                           The current number of physical CPUs in the shared processor pool being used by this LPAR (i.e. online physical processors in the pool).
Unallocated Capacity                          Number of processor units currently unallocated in the shared processor pool being used by this LPAR.
Physical CPU Percentage                       Fractional representation, relative to whole physical CPUs, that this LPAR's virtual CPUs equate to. This is a function of Entitled Capacity / Online CPUs. Dedicated LPARs would have a 100% Physical CPU Percentage. A 4-way virtual LPAR with an Entitled Capacity of 2 processor units would have a 50% Physical CPU Percentage.
Minimum Memory                                Minimum memory this LPAR was defined to ever have.
Minimum Virtual CPUs                          Minimum number of virtual CPUs this LPAR was defined to ever have.
Unallocated Weight                            Number of variable processor capacity weight units currently unallocated within the LPAR group.
Minimum Virtual Processor Required Capacity   The minimum entitled capacity required by the operating system for each online virtual CPU.
Partition Group ID                            LPAR group that this LPAR is a member of.
Shared Pool ID                                Identifier of the shared pool of physical processors that this LPAR is a member of.
Table 7

[6] To obtain the app field, the logical partition profile must be authorized to provide this information by means of the HMC.
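The Physical CPU Percentage entry in Table 7 is a simple ratio, sketched below in Python. The function name is hypothetical; the formula (Entitled Capacity / Online CPUs) and the 4-way example come from the table description.

```python
def physical_cpu_percentage(entitled_capacity, online_virtual_cpus):
    """Entitled capacity per online virtual CPU, expressed as a percentage."""
    if online_virtual_cpus < 1:
        raise ValueError("at least one online virtual CPU is required")
    return 100.0 * entitled_capacity / online_virtual_cpus


# The example from Table 7: a 4-way virtual LPAR entitled to 2 processor units.
print(physical_cpu_percentage(2.0, 4))
```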
The -H option provides detailed Hypervisor information: it displays the statistics for each of the Hypervisor calls. The details displayed by the -H option are listed below:
Statistic               Description
Call                    Hypervisor call type.
Number of calls         Number of Hypervisor calls made.
Total Time Spent        Percentage of total time spent in this type of call.
Hypervisor Time Spent   Percentage of Hypervisor time spent in this type of call.
Average Call Time       Average call time for this type of call in nanoseconds.
Maximum Call Time       Maximum call time for this type of call in nanoseconds.
Table 8
3.2 mpstat
The mpstat command collects and displays performance statistics for all logical CPUs in the system. Users can define both the number of times the statistics are displayed and the interval at which the data is updated. When the mpstat command is invoked, it displays two sections of statistics. The first row, displayed once when the command starts and again whenever the system configuration changes, shows the System Configuration. The second row contains the Utilization Statistics, which are displayed at each interval; the values of these metrics are always deltas from the previous interval. When the -s flag is specified, the mpstat command reports simultaneous multi-threading (SMT) utilization, if SMT is enabled. This report displays the utilization of each virtual CPU engine and of each thread (logical CPU) associated with it. Table 9 shows an example of virtual processor and logical CPU utilization obtained with the mpstat -s command on an SMT enabled LPAR. Proc0 and Proc1 are virtual CPUs, while cpu0, cpu1, cpu2 and cpu3 are logical CPUs.
Proc0    cpu0     cpu1     Proc1    cpu2     cpu3
0.27%    0.17%    0.10%    49.63%   3.14%    46.49%
Table 9
3.3 vmstat
The vmstat utility has been available on UNIX for many years. It has now been enhanced to support the Micro-Partitioning technology, and two new metrics are reported: the number of physical processors consumed and the percentage of entitled capacity consumed. The physical processors consumed are reported in the pc output column, and the percentage of entitlement consumed in the ec column. The percentage of entitlement consumed is calculated with the following formula:
pc = number of processors used
ent = number of entitled processors
ec = (pc / ent) * 100, the percentage of entitled capacity consumed
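The ec formula can be checked with a few lines of Python. The function name and the sample values are illustrative only; an uncapped partition consuming two physical processors against an entitlement of one yields a value above 100%.

```python
def percent_entitlement_consumed(pc, ent):
    """ec = (pc / ent) * 100, as defined for the vmstat ec column."""
    if ent <= 0:
        raise ValueError("entitled capacity must be positive")
    return pc / ent * 100


# Illustrative values: 2.00 physical processors consumed, 1.00 entitled.
print(percent_entitlement_consumed(2.00, 1.00))
# A partition using half of a 0.50 entitlement.
print(percent_entitlement_consumed(0.25, 0.50))
```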
4 Performance Case
To provide a better understanding of the system's behaviour we performed some tests on one of our logical partitions. A Perl program was written that executes a loop 100 million times. We submitted different groups, each of which ran several copies of this program in parallel in the background, timed the elapsed time of each execution and collected the system performance statistics. The name of the group represents the number of programs executed in parallel: for example, GROUP1 executed one program while GROUP2 executed two programs in parallel. The tests were performed with SMT enabled and disabled on an IBM 9117-570 rated at 54 rPerf. The operating system was AIX 5.3. The following sections describe the utilities we used and the results produced. The lparstat -i command provides the logical partition's characteristics:
Node Name                        test
Partition Name                   testlpar
Partition Number                 6
Type                             Shared-SMT
Mode                             Uncapped
Entitled Capacity                1.00
Partition Group-ID               32774
Shared Pool ID                   0
Online Virtual CPUs              2
Maximum Virtual CPUs             10
Minimum Virtual CPUs             1
Online Memory                    2048 MB
Maximum Memory                   2560 MB
Minimum Memory                   1536 MB
Variable Capacity Weight         250
Minimum Capacity                 0.80
Maximum Capacity                 1.20
Capacity Increment               0.01
Maximum Physical CPUs in system  12
Active Physical CPUs in system   12
Active CPUs in Pool              9
Unallocated Capacity             0.00
Physical CPU Percentage          50.00%
Table 10
The output above shows that the system hosting this logical partition has 12 physical processors, of which 9 are in the shared pool and 3 are dedicated. The partition has two virtual CPUs defined, with an entitled capacity of 1. The following lparstat output provides statistics for each group of processes executed with SMT enabled.
GROUP#   %user   %sys   %wait   %idle   physc   %entc   lbusy   vcsw   phint
GROUP1   93.4    0.2    -       6.4     1.01    100.9   25.0    219    3
GROUP2   93.8    0.1    -       6.1     2.00    199.9   50.1    135    7
GROUP3   96.6    0.2    -       3.2     2.00    200.0   74.8    93     11
GROUP4   99.9    0.1    -       -       2.00    199.9   100.0   233    14
Table 11
The output shows clearly that %entc jumps immediately to 100% for GROUP1, because the partition already reaches its entitled capacity of 1 CPU. This is confirmed by the physc field, which shows that only one physical processor is being used. The lbusy field shows the utilization of the logical CPUs (threads). As the reports show, the utilization increases in steps of 25% until all the available threads are in use. To show more clearly how the logical CPUs are working, the following table uses data from the mpstat utility to break down the virtual and logical CPU utilization for the execution of each group.
GROUP#   Proc0     cpu0     cpu1     Proc1     cpu2     cpu3
GROUP1   12.66%    11.82%   0.84%    88.95%    80.95%   8.00%
GROUP2   100.00%   6.26%    93.74%   100.00%   6.66%    93.34%
GROUP3   100.15%   12.65%   87.50%   100.00%   43.75%   56.25%
GROUP4   99.96%    49.86%   50.10%   99.96%    50.13%   49.84%
Table 12
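The 25% steps in lbusy noted above follow from a simple model, sketched here in Python. The model is an assumption made for illustration, not something lparstat documents: it supposes that each CPU-bound process keeps exactly one logical CPU fully busy and that idle logical CPUs contribute nothing. The function name is hypothetical.

```python
def expected_lbusy(running_processes, virtual_cpus, smt_enabled):
    """Approximate lbusy (%) under the one-process-per-logical-CPU assumption."""
    threads_per_virtual_cpu = 2 if smt_enabled else 1
    logical_cpus = virtual_cpus * threads_per_virtual_cpu
    busy = min(running_processes, logical_cpus)
    return 100.0 * busy / logical_cpus


# Two virtual CPUs with SMT enabled give four logical CPUs.
smt_on = [expected_lbusy(n, 2, True) for n in (1, 2, 3, 4)]
# With SMT disabled the same partition has two logical CPUs.
smt_off = [expected_lbusy(n, 2, False) for n in (1, 2)]

print(smt_on)
print(smt_off)
```

The measured lbusy values in Table 11 (25.0, 50.1, 74.8 and 100.0) stay close to what this simplified model predicts for four logical CPUs.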
The following output was produced by the vmstat utility. The results agree with those from the lparstat utility. Here one can also see the number of processes in the run queue (r) for each run.
GROUP#   r   b   avm      fre    re   pi   po   fr   sr   cy   in   sy    cs    us   sy   id    wa   pc     ec
GROUP1   1   0   200171   9493   0    0    0    0    0    0    13   217   212   94   0    6.0   -    1.01   100.6
GROUP2   2   0   200364   9297   0    0    0    0    0    0    0    86    185   94   0    6.0   -    2.00   200.1
GROUP3   3   0   200560   9076   0    0    0    0    0    0    1    103   175   97   0    3.0   -    2.00   200.0
GROUP4   4   0   200750   8881   0    0    0    0    0    0    1    31    180   99   0    -     -    2.00   200.0
Table 13
We performed a series of similar tests with SMT disabled [8]. The number of logical CPUs (lcpu) changed from 4 to 2 and the smt parameter changed from On to Off. The following table shows how the logical partition behaved when executing the processes after the change.
GROUP#   %user   %sys   %wait   %idle   physc   %entc   lbusy   vcsw   phint
GROUP1   99.7    0.2    -       -       1.00    100.2   50.0    40     3
GROUP2   99.7    0.3    -       -       2.00    200.0   100.0   1      6
Table 14
[8] When SMT is enabled or disabled, the mpstat utility reports that the change has taken place.
Table 14 shows the one-to-one relationship between virtual CPUs and logical CPUs when SMT is disabled. When the first process is executed lbusy is 50%; when both virtual CPUs are busy lbusy is 100%. The following table compares the elapsed times of the different groups with SMT enabled and disabled. With one or two processes the elapsed time is slightly better without SMT, but once the number of processes exceeds the number of virtual CPUs the elapsed time is much better with SMT enabled.
Number of processes   no SMT (elapsed)   SMT (elapsed)
1                     77                 88
2                     76                 82
4                     154                91
8                     306                167
16                    658                364
24                    947                576
Table 15
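The comparison in Table 15 is easier to read as a ratio of the two elapsed times. The Python sketch below computes it from the table values; the variable names are illustrative, and a ratio above 1 means the run with SMT enabled finished sooner.

```python
# Elapsed times copied from Table 15.
processes = [1, 2, 4, 8, 16, 24]
elapsed_no_smt = [77, 76, 154, 306, 658, 947]
elapsed_smt = [88, 82, 91, 167, 364, 576]

# Ratio of elapsed time without SMT to elapsed time with SMT.
speedup = {
    n: no_smt / smt
    for n, no_smt, smt in zip(processes, elapsed_no_smt, elapsed_smt)
}

# Process counts for which SMT shortened the elapsed time.
smt_wins = [n for n in processes if speedup[n] > 1]

for n in processes:
    print(f"{n:>2} processes: ratio {speedup[n]:.2f}")
print(smt_wins)
```

The ratio falls below 1 for one and two processes, where each process can have a virtual CPU to itself, and rises above 1 from four processes onwards.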
5 Conclusions
Micro-Partitioning technology has introduced many new concepts that must be understood before analysing shared logical partitions. The new technology has also brought changes to the well-known native utility vmstat and introduced a new utility, lparstat, which provides LPAR configuration, CPU and Hypervisor statistics.
Bibliography
Advanced POWER Virtualization on IBM System p5, IBM Redbooks, SG24-7940-01
AIX 5L Practical Performance Tools and Tuning Guide, IBM Redbooks, SG24-6478-00
AIX 5L on IBM eServer i5 Implementation Guide, IBM Redbooks, SG24-6455-00
IBM eServer p5 590 and 595 System Handbook, IBM Redbooks, SG24-9119-00
Introduction to pSeries Provisioning, IBM Redbooks, SG24-6389-00
Partitioning Implementation for IBM eServer p5 Servers, IBM Redbooks, SG24-7039-02