Tuesday, September 1, 2015

#vmworld DRS Advancements in vSphere 6, Advanced Concepts and Future Directions

Most customers run DRS in fully automated mode with affinity rules. Fewer than half use Resource Pools, although 99.8% use maintenance mode. This discussion focuses on the specifics of how DRS works.

A variety of stats and metrics are considered during Initial Placement (IP) and Load Balancing (LB). A few are key, such as reserved CPU and memory. The same holds true for VM performance statistics, for example active memory and CPU. All of these stats are taken into account to ensure the VM has sufficient resources.

In addition, the constraints within the cluster are considered. These include HA and its admission control policies, as well as affinity and anti-affinity rules. The number of concurrent vMotions and how long it would take to vMotion the VM are also taken into account. Datastore connectivity is also a factor when DRS considers load balancing. Finally, DRS looks at the vCPU to pCPU ratio, existing shares, and agent or special VMs (e.g. vShield Edge or Fault Tolerant (FT) VMs).
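
As a rough illustration of how two of these constraints (datastore connectivity and anti-affinity rules) narrow the set of candidate hosts, here is a minimal Python sketch. It is not how DRS is implemented; the `Host` type, the `eligible_hosts` function, and all host, VM, and datastore names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical host record (not a VMware API object)."""
    name: str
    datastores: set
    vms: set = field(default_factory=set)


def eligible_hosts(vm, vm_datastore, hosts, anti_affinity):
    """Return hosts that see the VM's datastore and run no anti-affinity peer."""
    # Collect every VM that shares an anti-affinity rule with this VM.
    peers = {other for group in anti_affinity if vm in group for other in group} - {vm}
    return [
        h.name for h in hosts
        if vm_datastore in h.datastores and not (h.vms & peers)
    ]


hosts = [
    Host("esx1", {"ds1", "ds2"}, {"web1"}),  # runs the anti-affinity peer
    Host("esx2", {"ds1"}, {"db1"}),          # sees ds1, no peer
    Host("esx3", {"ds2"}),                   # cannot see ds1
]
candidates = eligible_hosts("web2", "ds1", hosts, anti_affinity=[{"web1", "web2"}])
print(candidates)
```

Only hosts that survive this kind of filtering would then be compared on load and resource metrics.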

For every move DRS makes, a cost vs. benefit analysis is done. The general idea is that the benefit to the VM must be higher than the overall negative impact of moving it. The last consideration is the threshold setting configured by the vSphere administrator. VMware recommends not changing the default aggressiveness setting in most environments (it is set to 3 by default).
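
The cost vs. benefit gate can be pictured with a small Python sketch. This is an illustration only, not VMware's algorithm: the `Move` type, the `should_recommend` function, the numeric scale, and the way each threshold level maps to a minimum net gain are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Move:
    """A hypothetical candidate migration (illustrative fields only)."""
    vm: str
    benefit: float  # estimated gain for the VM on the target host
    cost: float     # estimated impact of performing the vMotion


# Assumption: a more aggressive threshold accepts a smaller net gain.
# The numbers are invented for this sketch.
MIN_NET_GAIN = {1: 4.0, 2: 3.0, 3: 2.0, 4: 1.0, 5: 0.5}


def should_recommend(move: Move, threshold: int = 3) -> bool:
    """Recommend a move only if benefit exceeds cost by the required margin."""
    return (move.benefit - move.cost) >= MIN_NET_GAIN[threshold]


moves = [Move("vm-a", benefit=5.0, cost=1.0), Move("vm-b", benefit=2.0, cost=1.5)]
recommended = [m.vm for m in moves if should_recommend(m)]
print(recommended)
```

With the default threshold of 3, only the move whose net gain clears the margin is recommended; raising the threshold to 5 in this sketch would also admit the marginal move.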

In vSphere 6.0 you can now specify a network bandwidth reservation on VMs, which DRS treats as one of its metrics. DRS is invoked if a pNIC becomes saturated or fails.

In addition, vSphere 6 introduced cross-vCenter (x-VC) vMotion placement. In this case a combined DRS and Storage DRS (SDRS) algorithm is run to make a unified host and datastore recommendation. All the same constraints are respected in an x-VC vMotion. To preserve affinity and anti-affinity rules, they are migrated along with the VM. This is referred to as rule migration.

vSphere 6 increased the cluster scale to 64 hosts and 8,000 VMs running DRS and HA. In general, operational throughput has increased by 66%. On vSphere 6, VMs power on, clone, and vMotion faster, and hosts transition to maintenance mode faster.

vSphere Update Manager uses DRS extensively to facilitate an upgrade. In addition, many other components of the SDDC leverage either DRS directly or the DRS algorithms.

If you want uniform distribution of VMs across all hosts, you can set either of these advanced options:

  • LimitVMsPerESXHost
  • LimitVMsPerESXHostPercent

Note: setting these options will not cause DRS to violate its algorithm for ensuring capacity and resources for the VM.
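
To give a feel for how a percentage-based cap could turn into a per-host VM limit, the Python sketch below assumes the cap is the mean number of VMs per host plus a percentage buffer, rounded down. That formula and the function name `per_host_vm_limit` are assumptions for illustration; consult the VMware documentation for the exact behavior of `LimitVMsPerESXHostPercent`.

```python
def per_host_vm_limit(total_vms: int, hosts: int, buffer_percent: int) -> int:
    """Assumed formula: mean VMs per host plus a percentage buffer, rounded down."""
    if hosts <= 0:
        raise ValueError("hosts must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    return (total_vms * (100 + buffer_percent)) // (hosts * 100)


# Example: 92 VMs across 4 hosts with a 50% buffer over the mean of 23.
limit = per_host_vm_limit(total_vms=92, hosts=4, buffer_percent=50)
print(limit)
```

Under this assumption a buffer of 0 would pin every host to the cluster mean, while larger buffers leave DRS more room to place VMs where resources are best.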

There are some best practice guidelines that VMware recommends:

  • Full connectivity to all storage pools for all hosts
  • Set BIOS power management on the host to “OS control” (note: OS control is a min BIOS setting; High Performance is a max but provides no power savings)
  • Make sure the power mode on ESXi is set to balanced
  • Fully automated is considered a best practice
  • Don’t dilute Resource Pool Shares by powering on too many VMs within the pools you create
  • Do not set CPU affinity, as it pins the VM to that core rather than guaranteeing it any resources
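
The share dilution point is easy to see with a little arithmetic. The Python sketch below assumes two sibling pools under contention and that all VMs inside a pool carry equal shares; the pool names, share values, and VM counts are made up for illustration.

```python
def per_vm_share(pool_shares: int, powered_on_vms: int) -> float:
    """Shares each VM effectively receives if VMs in the pool are weighted equally."""
    if powered_on_vms <= 0:
        raise ValueError("pool has no powered-on VMs")
    return pool_shares / powered_on_vms


# Hypothetical pools: (pool shares, powered-on VMs)
pools = {"prod": (8000, 40), "test": (4000, 4)}
effective = {name: per_vm_share(shares, vms) for name, (shares, vms) in pools.items()}
print(effective)
```

Even though the "prod" pool has twice the shares in this example, each of its VMs ends up with a fifth of what a "test" VM gets, which is the opposite of the intended priority.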

In the future DRS will support proactive High Availability. Proactive HA will trigger based on hardware health metrics. For example, if a host is partially degraded, DRS will quarantine it. This means that DRS will opportunistically evacuate its VMs and will not migrate VMs to it. If the host is fully degraded, the VMs will be proactively evacuated from it. As with DRS, there will be an aggressiveness setting that lets you throttle how DRS reacts with proactive HA. With tighter integration with NSX, flow IDs can be used to co-locate chatty VMs. This is not easy to do at significant scale, but with NSX the information is already available.
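
The proactive HA behavior described above can be summarized as a simple mapping from host health to placement policy. The Python sketch below only restates these notes in code; the `Health` enum, the `host_policy` function, and the dictionary keys are hypothetical and do not correspond to any VMware API.

```python
from enum import Enum


class Health(Enum):
    HEALTHY = "healthy"
    PARTIALLY_DEGRADED = "partially degraded"
    FULLY_DEGRADED = "fully degraded"


def host_policy(health: Health) -> dict:
    """Map host health to the placement behavior described in the session."""
    if health is Health.FULLY_DEGRADED:
        # Fully degraded: move VMs off the host right away.
        return {"accept_new_vms": False, "evacuate": "proactive"}
    if health is Health.PARTIALLY_DEGRADED:
        # Quarantine: no new VMs, evacuate when it is convenient.
        return {"accept_new_vms": False, "evacuate": "opportunistic"}
    return {"accept_new_vms": True, "evacuate": "none"}


for state in Health:
    print(state.value, host_policy(state))
```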

With integration with vRealize Operations, DRS will use the predictive demand algorithm to let the environment adjust ahead of expected demand. VMware is already running Hybrid DRS, which allows DRS to seamlessly burst into vCloud Air. This will be available in future releases of the solution as well.
