Isolated clusters for mission critical applications?
I’ve seen a theme at several customers who are virtualizing mission critical applications on vSphere: the isolated vSphere cluster used just for that application. I’ve seen organizations do this many times when virtualizing Exchange 2010, often dedicating two or three vSphere hosts just for Exchange and related components (domain controller, virtual load balancer, etc.).
There are several (understandable) reasons why organizations choose to go down this road. The most common I’ve heard are:
1) “I don’t want my Exchange/SQL/SharePoint guys messing with the rest of the VMs on my production cluster(s).”
2) “I don’t want any other workload taking resources away from my Exchange/SQL/SharePoint VMs.”
I can see why folks make both of these arguments and can understand their logic. Many vSphere admins live in the Hosts and Clusters view where they see all virtual machines rather than those organized into folders (where granular permissions can be set more easily). And some are concerned that setting reservations on virtual machines to guarantee access to resources will have a negative impact on available slots in a cluster.
I don’t believe that either of these arguments is enough to require a dedicated vSphere cluster. A few thoughts are below, starting with the two most common reasons I listed above.
1) VMs can be grouped into application specific folders, and granular permissions can be set to only grant access to those folders or VMs.
2) Reservations aren’t a bad thing or something to be scared of; they just need to be factored into any design. And using the “Percentage of Cluster Resources” policy gets around worrying about individual slot sizes (though total resource assignments still need to be considered and calculated).
3) Fewer hosts in a cluster give VMware features like DRS and HA fewer options when trying to initiate a failover or migrate virtual machines based on resource consumption. If an application is truly mission critical, I’d rather have more options to optimize availability and performance, not fewer.
4) Let’s be honest – vSphere isn’t free or cheap, so dedicating hosts to a specific application is likely to result in heavily underutilized hosts and higher costs. Aren’t underutilized servers and cost reduction some of the reasons why we got into this virtualization business in the first place? If you made dedicated clusters for all of your mission critical applications you would end up with application silos that greatly reduce the efficiency of virtualization.
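To make point 2 concrete, here is a rough sketch of the arithmetic behind a percentage-based admission control policy. It is a simplification and not VMware's implementation: it looks at a single resource (memory), ignores per-VM overhead and any default minimums HA applies, and the host sizes, reservations, and function names are all made up for illustration.

```python
def failover_capacity(total_gb, reservations_gb):
    """Fraction of cluster memory not claimed by VM reservations."""
    return (total_gb - sum(reservations_gb)) / total_gb


def can_power_on(total_gb, reservations_gb, new_reservation_gb, failover_pct):
    """True if the unreserved fraction stays at or above the configured
    failover percentage after adding the new VM's reservation."""
    remaining = failover_capacity(total_gb, reservations_gb + [new_reservation_gb])
    return remaining >= failover_pct / 100


# Hypothetical cluster: 4 hosts x 96 GB, 25% held back for failover.
cluster_gb = 4 * 96
existing = [32, 32, 32, 16, 16]  # reservations of VMs already running
print(can_power_on(cluster_gb, existing, 32, failover_pct=25))
```

The point of the sketch is that the calculation works on totals, so one large Exchange reservation does not inflate a per-VM slot size for every other VM in the cluster.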
At the end of the day a design decision like this should always come down to the requirements. There may be very valid requirements, like compliance, security, or extremely large applications, that make an isolated vSphere cluster a necessity. Assuming those requirements do not apply in your environment, I would strongly question whether there is really a reason to use a dedicated cluster for your mission critical applications.
If you agree/disagree/think I’m crazy - feel free to let me know in the comments.
Clearing up confusion regarding HA/vMotion support for Exchange 2010
As I discussed last week, Microsoft has updated their guidance and support regarding hypervisor high availability and live migration technologies with Exchange 2010. This is welcome news for anyone looking to virtualize Exchange 2010 on vSphere, as it now allows you to take advantage of vMotion and HA on all Exchange servers, even those in a Database Availability Group (DAG).
Microsoft’s language on this change in support is causing a little bit of confusion about exactly what they do and do not support. See the following section taken from the Exchange 2010 System Requirements page on TechNet (bolded emphasis mine):
Exchange server virtual machines, including Exchange Mailbox virtual machines that are part of a Database Availability Group (DAG), can be combined with host-based failover clustering and migration technology as long as the virtual machines are configured such that they will not save and restore state on disk when moved or taken offline. All failover activity must result in a cold start when the virtual machine is activated on the target node. All planned migration must either result in shut down and a cold start or an online migration that utilizes a technology such as Hyper-V live migration.
The bolded section seems to be the part that is confusing some people into thinking that some VMware features are not supported. But think about it for a second – does that description really sound like any VMware feature other than a simple Suspend? vMotion doesn’t save anything to disk during a migration and HA doesn’t save anything during a failure/restart of a VM. So what is Microsoft describing here?
Remember back when Hyper-V was released and Microsoft was touting a feature called Quick Migration to compete with VMware's vMotion? It was called Quick Migration and not Live Migration because the virtual machine was paused briefly and the state of the virtual machine was saved to disk during the migration. From the Quick Migration with Hyper-V (opens a .doc) whitepaper:
For a planned migration, quick migration saves the state of a running guest virtual machine (memory of original server to disk/shared storage), moves the storage connectivity from one physical server to another, and then restores the guest virtual machine onto the second server (disk/shared storage to memory on the new server).
It’s clear that the language in the TechNet article applies only to Hyper-V Quick Migration and not vMotion or even Hyper-V’s newer Live Migration feature. Microsoft probably felt they had a large enough installed base of Hyper-V RTM that clarifying this language was important.
Rumor has it that enough people are confused by this language that Microsoft is planning on clarifying it soon. At the end of the day this language doesn’t apply to those of us virtualizing Exchange 2010 on vSphere. Both HA and vMotion are now supported and that’s all that matters to us.
vSphere and Exchange admins can live in harmony – Microsoft finally supports HA and vMotion
This Saturday, Microsoft published a new white paper entitled “Best Practices for Virtualizing Exchange Server 2010 with Windows Server 2008 R2 Hyper-V” that provides a lot of great info that is applicable to VMware as well. One of the most important things in this entire document is a change in policy regarding supporting virtualized Exchange 2010 with Database Availability Groups (DAG) in combination with hypervisor high availability and live migration. Previously Microsoft did not support the use of high availability or live migration even on its own Hyper-V platform. In the VMware world this of course means HA and vMotion. The whitepaper states the following:
Exchange server virtual machines, including Exchange Mailbox virtual machines that are part of a Database Availability Group (DAG), can be combined with host-based failover clustering and migration technology as long as the virtual machines are configured such that they will not save and restore state on disk when moved or taken offline. All failover activity must result in a cold start when the virtual machine is activated on the target node. All planned migration must either result in shut down and a cold start or an online migration that utilizes a technology such as Hyper-V live migration.
This new document, as well as a post on the MS Exchange Team blog, confirms the new support stance. The TechNet page has been updated as well. Note that you must be running Exchange 2010 SP1 in order to support these features.
Folks may know that I’m a big proponent of virtualizing mission critical, tier-1 applications like Exchange 2010. I’ve written about it here, touched on it here, and commented to TechTarget on the subject here and most recently here. It’s clearly an important subject to me and I applaud Microsoft for introducing this change. I think this will help to convince organizations that it is safe to virtualize Exchange 2010 on all hypervisors.
But... there must always be a but.
Remember that just because Microsoft now officially supports something doesn’t actually change anything in terms of functionality. Did VMware HA and vMotion work properly in combination with Exchange 2010 and DAGs before this policy change? VMware HA – sure, it just wasn’t officially supported. VMware vMotion – umm, hang on a minute there.
The DAG ultimately relies on Windows Failover Clustering to work, and WFC is notoriously finicky about even brief drops in network connectivity and loss of heartbeat. When performing a live migration using vMotion there is usually at least one ping dropped, and in my experience that single drop is often enough to cause databases to fail over to other nodes in the DAG.
Does this mean that even though Microsoft now supports vMotion, you still can’t use it? Of course not, but it does require a slight change in your design: increasing the cluster heartbeat timeout value to allow for the brief network interruption.
The values that need to change are the following:
SameSubnetDelay: The interval (in milliseconds) between cluster heartbeats. By default, this value is 1,000 milliseconds.
SameSubnetThreshold: The number of missed heartbeats that will be tolerated before a failover event occurs. By default this value is 5, so combined with the above value, 5 seconds of lost heartbeats will result in a cluster failover.
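The relationship between the two settings is simple multiplication. A quick sketch of the arithmetic (the function name is mine and not part of any Microsoft tooling):

```python
def heartbeat_tolerance_ms(same_subnet_delay_ms, same_subnet_threshold):
    """How long heartbeats can be lost before the cluster declares a failure:
    the heartbeat interval multiplied by the number of misses tolerated."""
    return same_subnet_delay_ms * same_subnet_threshold


print(heartbeat_tolerance_ms(1000, 5))   # defaults described above
print(heartbeat_tolerance_ms(1000, 10))  # threshold raised to 10
```

Raising either value widens the window, so the same tolerance can be reached by different combinations of the two.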
Five seconds seems like more than enough time to ride out the brief interruption during a vMotion, but in practice I’ve seen databases fail over at multiple clients when using the default heartbeat values. Luckily you can change these values very easily by using PowerShell. The following commands show how to raise the timeout to 10 seconds (the Microsoft-recommended max) from the default of 5, taken directly from the Microsoft whitepaper:
Import-Module FailoverClusters
(Get-Cluster).SameSubnetThreshold=10
(Get-Cluster).SameSubnetDelay=1000
Depending on your environment you may not need to make this change, so always test first before implementing any cluster-wide change like this. Make sure you have enough bandwidth on your hosts to account for migrating an Exchange VM that may have 32GB of RAM or more. And of course always stick with configurations that are supported by Microsoft.
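For a rough sense of the bandwidth question, the sketch below estimates how long a single pass over a VM's memory takes at a given link speed. Treat it as a lower bound under idealized assumptions: it ignores protocol overhead and the extra passes needed to re-copy memory that changes during the migration, and the link speeds are just examples.

```python
def memory_copy_seconds(ram_gib, link_gbps):
    """Seconds to send ram_gib of memory once over a link of link_gbps,
    assuming the link is fully available and there is no overhead."""
    bits = ram_gib * 2**30 * 8
    return bits / (link_gbps * 1e9)


for gbps in (1, 10):
    print(f"{gbps} Gbps: {memory_copy_seconds(32, gbps):.0f} s")
```

Even in this best case a 32GB VM needs roughly four and a half minutes on a 1 Gbps link versus under half a minute on 10 Gbps, which is why the network deserves attention before you start moving large Exchange VMs around.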
I’m happy that Microsoft made this change, and hope that it signals a trend towards more virtualization-friendly licensing in the future.
Exploring the performance benefits of VAAI
Over the long Thanksgiving weekend I decided to do some testing of one of the coolest new features in vSphere 4.1 - vStorage APIs for Array Integration. My original thought was to see if the performance benefits of using VAAI would justify more heavily using the eagerzeroedthick VMDK format because of the faster deployment times. I'll get to the results of that testing in a second, but first some background.
VAAI is a technology that allows the ESX/ESXi host to offload certain storage functions directly to the storage array rather than processing the data itself. A typical operation such as deploying a VM from template requires the ESX/ESXi host to read the data from the template via whatever storage protocol is in use (Fibre Channel, iSCSI, etc.) and then write that data to the storage when cloning the VM. That isn't the most efficient use of resources, and it is compounded when cloning multiple VMs at once as those read/write operations become redundant.
By leveraging VAAI, those operations are offloaded to the storage array, which eliminates many of those redundant reads and writes. As a result these operations complete much faster and with reduced CPU overhead on the host. In order to use VAAI you'll need both vSphere Enterprise and a storage array that supports it. Although the number of supported arrays is small, that number will most certainly grow.
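A toy model of why this matters, counting only the clone data that has to travel through the host. The function and the numbers are hypothetical, and the offloaded case is modeled as zero data through the host, which ignores the small commands the host still sends to the array.

```python
def host_data_moved_gb(template_gb, clones, offloaded):
    """GB of clone data that passes through the ESX/ESXi host.

    Without offload the host reads the template and writes the copy for
    every clone. With offload the copy is modeled as staying on the array."""
    if offloaded:
        return 0
    return 2 * template_gb * clones


print(host_data_moved_gb(40, 5, offloaded=False))
print(host_data_moved_gb(40, 5, offloaded=True))
```

Cloning five VMs from a 40 GB template the traditional way pushes 400 GB through the host and its storage links, and that is the work the array takes over.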
For my testing I used a Dell EqualLogic PS5000E running the 5.0.2 firmware, which fully supports VAAI. Specifically, I wanted to see how much quicker deploying eagerzeroedthick VMDKs was with VAAI compared to without it. Using eagerzeroedthick disks helps with performance of the VM by zeroing out all of the blocks in advance instead of when they are first accessed. This format is required for VMware Fault Tolerance and is recommended for high I/O servers such as Exchange and SQL.
To the results:
Right or wrong, go with what’s supported…
There has been a lot of drama recently between Microsoft and VMware with respect to virtualizing Exchange 2010. The background and details are summarized well in this article (in which I'm quoted). I think this situation brings up some important points about virtualizing critical Tier 1 applications - vendor support is very important, and disregarding vendor requirements for support can lead to problems when you need support the most.
There are a number of seemingly strange support requirements from Microsoft and VMware, which often leave people confused about what is supported or how to configure their environments. A few of them are listed below:
Three (important) things you might not know about the vmkiscsid.log file
While working on an issue for a client recently I discovered a few things about the vmkiscsid.log file that I didn't know. I thought I'd share them in case others didn't know this information.
The vmkiscsid.log file, located in /var/log on your ESX host, maintains information and errors about iSCSI connections. We had a problem with our SANs where a NIC flapping issue caused errors to be written to that log file on a regular basis. That problem was resolved with a SAN firmware upgrade and things returned to normal.
While looking into another issue I noticed that the /var partition had filled up on all of the hosts (you're creating separate /var partitions for ESX installs, right....?). Taking a closer look revealed that the vmkiscsid.log file had grown to 3.7GB on all of the hosts before running out of space on /var. In the process of troubleshooting this situation, I learned a few important things about this log file.
1) The vmkiscsid.log file does not use automatic log rotation like other log files.
2) The vmkiscsid.log file is automatically deleted and recreated at bootup.
3) You cannot simply delete the vmkiscsid.log file to reclaim space on the partition. You must reboot.
I spoke with VMware support about this and a feature request has been created so that the log file does automatically rotate. For now I was told the only way to clear it is to reboot the host.
Is this something you need to be concerned about? Probably not - in most cases this log file remains very small and won't present a problem. It's likely only to be a problem if you're experiencing persistent problems with an iSCSI connection.
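If you do want early warning, a small check along these lines could flag the log before the partition fills. This is only a sketch: the thresholds and function names are my own, and whether a suitable Python interpreter is available on a given ESX host is an assumption you would need to verify.

```python
import os


def log_size_bytes(path):
    """Size of the log file in bytes, or 0 if it does not exist."""
    try:
        return os.path.getsize(path)
    except OSError:
        return 0


def partition_free_fraction(directory):
    """Fraction of the filesystem holding `directory` that is still free."""
    stats = os.statvfs(directory)
    if stats.f_blocks == 0:
        return 1.0
    return stats.f_bavail / stats.f_blocks


def needs_attention(log_path, max_bytes, min_free_fraction):
    """True if the log is larger than max_bytes or its partition is
    running lower on free space than min_free_fraction."""
    directory = os.path.dirname(log_path) or "."
    too_big = log_size_bytes(log_path) > max_bytes
    too_full = partition_free_fraction(directory) < min_free_fraction
    return too_big or too_full
```

For example, `needs_attention("/var/log/vmkiscsid.log", 500 * 1024**2, 0.10)` would flag a log over 500 MB or a partition with less than 10% free, both of which are arbitrary thresholds picked for illustration.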
Update 11/10/2010: VMware has released a Knowledge Base article that addresses the behavior I described above. They describe a process of looking for open files that have the deleted vmkiscsid.log file locked. For what it's worth I tried to go down this road and wasn't able to find the file so a reboot was necessary for me.
Update 3/6/2011: VMware Support sent me an email saying that they're going to change this behavior in vSphere 4.0 Update 3. The logs will now use automatic log rotation (.1, .2, etc.) just like other log files. There was no mention of when this will make it to vSphere 4.1, but I assume it will be included in the next update.
More vSphere 4.1 Enhancements – Welcome Back PVSCSI Driver!
As I keep digging into documents and KB articles I keep finding more and more things to like about vSphere 4.1. Today's find has to do with the PVSCSI driver.
With the release of vSphere 4.0, VMware added a new paravirtualized SCSI driver into the VMware Tools that provides better virtual disk performance than the standard LSI driver. The PVSCSI driver promised to deliver better performance and lower overall CPU utilization for workloads that had high I/O demands. Unfortunately the PVSCSI driver wasn't supported on virtual machine boot volumes, so folks held off on making this the default SCSI driver for all virtual machines.
After vSphere 4 Update 1 was released, VMware lifted the restriction and began supporting the PVSCSI driver on boot volumes. Folks began considering adopting the PVSCSI driver in all virtual machines, similar to how the VMXNET driver is a standard for nearly all virtual NICs. Soon afterwards VMware came out with a Knowledge Base article stating that virtual machines that did not have heavy I/O demands could actually experience worse performance using the PVSCSI driver. They recommended only using the driver for workloads that had I/O demands in excess of 2,000 IOPS.
With the release of vSphere 4.1 that is no longer a problem and you can use the PVSCSI driver in all circumstances. Want details? Read on!
Nice Addition to vSphere 4.1 Enterprise License
When vSphere 4.1 was released a lot of people were talking about all of the amazing new features available in the product. Along with introducing many new features, VMware also shuffled around some existing features into different license levels. The one that got the most press was that vMotion was now available in vSphere Standard as well as the Essentials Plus bundle.
In my opinion one of the biggest changes isn't being talked about much at all. In vSphere 4.1, VMware has made the use of 3rd party multipathing plug-ins available in vSphere Enterprise edition and higher. In vSphere 4 this was only available in the Enterprise Plus license, making it out of reach for smaller customers.
This link compares the features available in vSphere 4.1 and it states that vStorage APIs for Multipathing are supported in both Enterprise and Enterprise Plus. Multipathing plug-ins like EMC's PowerPath/VE and Dell's new EqualLogic Multipathing Extension Module allow you to have better integration and control over multipathing than VMware's native multipathing plugins. I challenge you to say multipathing more times in one sentence than I just did.
The other really important point about VMware making this feature available in the Enterprise license is the likelihood that the Enterprise license will persist going forward. VMware had originally planned on phasing out the Enterprise license in favor of Enterprise Plus but backed down on that plan after customer complaints. They stated at the time that they expected sales of Enterprise to decrease over time as customers purchased Enterprise Plus. Taking a feature from Enterprise Plus and making it available in Enterprise shows, at least to me, a commitment to keep that license level around for the foreseeable future.

