The Lowercase w
15 Apr 2013

New Book Coming Soon: Virtualizing Microsoft Business Critical Applications on VMware vSphere

I’m very pleased and excited to announce that the book I’m co-authoring, Virtualizing Microsoft Business Critical Applications on VMware vSphere, is available for pre-order on Amazon as well as in Rough Cuts format from VMware Press.  Though the online listings currently show a September release date, we’re working with VMware Press to try to get it released before VMworld 2013.

I’m co-authoring the book with Alex Fontana of VMware.  Alex is a great resource for virtualizing business critical applications and he’s helped produce many of the white papers and best practices for virtualizing applications like Microsoft Exchange 2010.  He and I also worked together to help VMware create materials for the Virtualizing Business Critical Applications partner competency back in 2011/2012.  Having Alex as a co-author adds so much to the book and I’m thrilled to have him.

The book will cover the latest enterprise products from Microsoft, including Exchange 2013, SQL 2012, SharePoint 2013, and Windows Server 2012.  In addition, the book will provide some higher level detail on benefits, risks, and strategies for success that are applicable to virtualizing almost any business critical application.  Even if your organization is only planning on virtualizing one or two applications from the book I’m confident you’ll still find value.

I’m very much looking forward to getting the book finished and released. I’ve really enjoyed the whole process of writing and I’m already thinking about what to write next.  As any other author will tell you writing a technology book is a lot of work but I find it to be very rewarding and a lot of fun.  Ok so maybe my idea of fun is a little warped since most of the content for this book was written between the hours of 10:00PM and 2:00AM.  After all of the effort we’ve put in I can’t wait to see this baby in print!

As we get a little closer to the release date of the book I’ll be giving away several copies to readers, so stay tuned for your chance to get your hands on the book for free. For now, though, feel free to head on over to Amazon and pre-order your copy today!

Amazon pre-order: http://www.amazon.com/Virtualizing-Microsoft-Business-Applications-Technology/dp/0321912039/

Safari “Rough Cuts” edition: http://my.safaribooksonline.com/book/-/9780133400373  (Only a few chapters are up so far but more will show up soon)

image

30 Jan 2013

Cloning Windows Server 2012 Domain Controllers on vSphere 5

Microsoft introduced a lot of virtualization awareness in Windows Server 2012, particularly for domain controllers.  These features are, for the most part, considered virtualization “safeguards” in that they protect against some of the classic problems of virtualizing domain controllers.  Historically, things like taking virtual machine snapshots, restoring from virtual machine image backups, or cloning domain controllers were either difficult or impossible.  With the introduction of the VM-GenerationID it is now safe to use virtual machine snapshots and even clone existing domain controllers.

How Does It Work?

The VM-GenerationID is a unique identifier exposed to the virtual machine by the hypervisor that helps to prevent issues with domain controller snapshots, cloning, etc.  When virtualizing Windows Server 2012 on vSphere, the VM-GenerationID is included as part of the virtual machine’s VMX file in the attribute vm.genid.  This attribute is present on all Windows Server 2012 VMs, not just those that are domain controllers.  See below for a few lines from a VMX file of a Windows Server 2012 VM with the vm.genid value highlighted.

evcCompatibilityMode = "TRUE"
softPowerOff = "FALSE"
vm.genid = "-5266020153200197717"
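If you want to check this value across many VMs without opening each VMX file by hand, a small script can pull it out.  What follows is only a minimal sketch in Python, assuming the VMX file uses the simple key = "value" layout shown above.  The function names parse_vmx and read_genid are my own for illustration and are not part of any VMware tooling.

```python
import re

# Matches VMX lines of the form: key = "value"
_VMX_LINE = re.compile(r'^\s*([^#\s=][^=]*?)\s*=\s*"(.*)"\s*$')


def parse_vmx(text):
    """Return a dict of key/value pairs found in the text of a VMX file."""
    settings = {}
    for line in text.splitlines():
        match = _VMX_LINE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def read_genid(text):
    """Return vm.genid as an int, or None if the key is not present."""
    value = parse_vmx(text).get("vm.genid")
    return int(value) if value is not None else None


# The three lines quoted from the VMX file above.
sample = '''evcCompatibilityMode = "TRUE"
softPowerOff = "FALSE"
vm.genid = "-5266020153200197717"
'''

print(read_genid(sample))
```

A VM whose VMX file has no vm.genid line simply returns None here, which is a quick hint that the VM is not getting a VM-GenerationID from the hypervisor.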

You need to be running a version of vSphere that supports VM-GenerationID.  That includes vSphere 5.0 Update 2 or vSphere 5.1 (if running vCenter 5.1, the ESXi hosts have to be running at least vSphere 5.0 Update 2).  You can tell if you’re running a compatible hypervisor by looking for “Hyper-V Generation Counter” in Device Manager of a Windows Server 2012 VM.

You’ll need to show hidden devices in Device Manager to see it.  Select View\show hidden devices:

Show Hidden Devices

Then expand System Devices and you’ll see Hyper-V Generation Counter.

Hyper-V Generation Counter

When a Windows Server 2012 VM is promoted to a domain controller, the unique value of the VM-GenerationID is stored in the msDS-GenerationID attribute on its copy of the Active Directory database.  Open up Active Directory Users and Computers, select View\Advanced Features, and then right click on your domain controller and select Properties.  You can see the msDS-GenerationID attribute in the Attribute Editor tab. Note that since this value is not replicated to other domain controllers, the value will appear as “<not set>” if you try to look at the attributes of a different domain controller.

image

Prerequisites And Overview

Now that we’ve confirmed that VM-GenerationID is supported how do we go about cloning our domain controllers?  It’s really a very simple process, though you’ll see it doesn’t work exactly like cloning other virtual machines.  First, the prerequisites:

1) The domain controller running the PDC Emulator role must be running Windows Server 2012 and must be online during the process.  It isn’t necessary for this domain controller to be virtualized but it certainly can be.

2) The hypervisor needs to support VM-GenerationID.  Though we already stated this, it’s important to re-state it in case you have a cluster with a mix of versions (or are in the process of upgrading).  If you’re using DRS in fully automated mode then vCenter will automatically pick the best host to power up your cloned DC and if that host does not support VM-GenerationID then your clone will fail.  Yet another reason to keep all hosts in a cluster at a consistent build.
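To make the second prerequisite concrete, here is a rough sketch in Python of checking a mixed cluster before you clone.  It assumes you have already collected each host’s version as a (major, minor, update) tuple by whatever means you prefer.  The host names and the helper names are hypothetical, and the only rule encoded is the one stated above: vSphere 5.0 Update 2 or later.

```python
# Minimum release described in this post: vSphere 5.0 Update 2.
MIN_SUPPORTED = (5, 0, 2)


def supports_vm_generation_id(version):
    """version is a (major, minor, update) tuple, for example (5, 0, 2)."""
    return tuple(version) >= MIN_SUPPORTED


def unsupported_hosts(hosts):
    """Return the sorted names of hosts that are too old to power on a cloned DC.

    hosts is a dict mapping a host name to its (major, minor, update) tuple.
    """
    return sorted(name for name, version in hosts.items()
                  if not supports_vm_generation_id(version))


# Hypothetical three-host cluster in the middle of an upgrade.
cluster = {"esx01": (5, 1, 0), "esx02": (5, 0, 2), "esx03": (5, 0, 1)}
print(unsupported_hosts(cluster))
```

If the list is not empty, either finish the upgrade first or make sure DRS cannot place the clone on one of those hosts.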

Now, on to a high level overview of the process:

1) Promote a server to a domain controller that will be used primarily for cloning.  This domain controller should not have any FSMO roles and should not be the primary/secondary DNS server for any servers on your network.  This step is not absolutely required, but you’ll see why I’m recommending this as we go through the cloning process.

2) Add the domain controller you just created to the “Cloneable Domain Controllers” group in Active Directory (located in the Users container).

3) Create a list of “allowable” software on the DC you’re cloning.

4) Create a configuration XML file that specifies the settings for the new domain controller.

Preparing the DC for cloning

At this point you’ve promoted your domain controller and made sure it holds no FSMO roles and does not act as the primary DNS server for any servers.  Now it’s time to add the VM to the Cloneable Domain Controllers group in AD.

image

It is recommended that you remove the DC from this group after you’re finished cloning.

Creating the Allowable Applications List

Windows maintains a list of the applications and services that are allowed to be running on a domain controller that is used as the source for a clone.  These are mostly familiar Windows services, and you can view the full list at c:\Windows\System32\DefaultDCCloneAllowList.XML.  A Windows Server 2012 domain controller that is used as a clone source cannot contain any applications or services that may not function properly if the server is cloned.  The check is intended to catch things like DHCP services that need to be authorized in AD and are better installed manually than cloned.  To see if any applications or services outside the default list exist on the server, issue the PowerShell command Get-ADDCCloningExcludedApplicationList.  Below is the output of that command on a domain controller in my lab:

PS C:\> Get-ADDCCloningExcludedApplicationList

Name                                                        Type
----                                                        ----
Microsoft Visual C++ 2008 Redistributable - x64 9.0.3072... Program
VMware Tools                                                Program
Microsoft Visual C++ 2008 Redistributable - x86 9.0.3072... WoW64Program
VMTools                                                     Service
vmvss                                                       Service
WLMS                                                        Service

You’ll notice some familiar applications on there, most notably the components of VMware Tools.  We’ve known for years that VMware Tools is fully compatible with cloning all virtual machines, so we simply need to “allow” these applications to be present on our domain controller before cloning by generating an XML file called CustomDCCloneAllowList.xml.  To generate the list, simply issue the same command with the -GenerateXml switch as shown below.  Note that the XML file will be created wherever you’ve stored your Active Directory database, or C:\Windows\NTDS by default.

PS C:\> Get-ADDCCloningExcludedApplicationList -GenerateXml
The inclusion list was written to 'C:\Windows\NTDS\CustomDCCloneAllowList.xml'.
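Before blindly allowing everything the cmdlet reports, it is worth a quick look at what is actually on the list.  The sketch below, in Python, parses the table output shown earlier and flags anything you have not already reviewed.  The “reviewed” prefixes are my own assumption for this lab and not an official list, and the parser assumes the two-column Name/Type layout in the output above.

```python
def parse_excluded_apps(output):
    """Parse the cmdlet's two-column table output into (name, type) tuples."""
    entries = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("PS "):
            continue  # blank line or the prompt line
        if line.split() == ["Name", "Type"]:
            continue  # header row
        if set(line.replace(" ", "")) == {"-"}:
            continue  # dashed rule under the header
        # The type is the last word on the line; the rest is the name.
        name, _, kind = line.rpartition(" ")
        entries.append((name.strip(), kind))
    return entries


def needs_review(entries, reviewed_prefixes):
    """Return names that do not start with any prefix you already trust."""
    return [name for name, _ in entries
            if not name.startswith(tuple(reviewed_prefixes))]


sample_output = """PS C:\\> Get-ADDCCloningExcludedApplicationList

Name                                                        Type
----                                                        ----
Microsoft Visual C++ 2008 Redistributable - x64 9.0.3072... Program
VMware Tools                                                Program
Microsoft Visual C++ 2008 Redistributable - x86 9.0.3072... WoW64Program
VMTools                                                     Service
vmvss                                                       Service
WLMS                                                        Service
"""

# Hypothetical list of entries already judged safe to clone in this lab.
reviewed = ["VMware Tools", "VMTools", "vmvss", "Microsoft Visual C++"]
print(needs_review(parse_excluded_apps(sample_output), reviewed))
```

Anything this prints deserves a second look before you generate CustomDCCloneAllowList.xml.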

Creating the Clone Configuration File

The next (and last) step you’ll need to perform is to create an XML file that contains the configuration details of your new domain controller.  This includes the new domain controller’s name, networking information, and AD Site (if required).  You can create the XML file using the PowerShell command New-ADDCCloneConfigFile.  As you can see below, you pass the necessary configuration parameters to the New-ADDCCloneConfigFile cmdlet to create the XML file:

PS C:\> New-ADDCCloneConfigFile -Static -IPv4Address "192.168.1.153" -IPv4DNSResolver "192.168.1.140" -IPv4SubnetMask "255.255.255.0" -IPv4DefaultGateway "192.168.1.1" -CloneComputerName "W12-DC5" -SiteName "NJ"

If everything worked you’ll see output similar to what you see below and a DCCloneConfig.xml file will be created in the same directory as your AD database:

image

What if you want to clone multiple domain controllers instead of just one at a time?  You can leave out all of the configuration information above and just provide the DNS server for the domain:

PS C:\windows\system32> New-ADDCCloneConfigFile -IPv4DNSResolver "192.168.1.120"

If you do not choose a server name for the clone, the name of the new domain controller will match the source domain controller with “-CLnnnn” added to the name. In our example, the source DC is named W12-DC4 so the new domain controller would be named W12-DC4-CL0001.  It is certainly possible to rename a domain controller after it has been promoted but it is a manual process that is different from renaming any other server.
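If you want to predict the names your clones will get, the convention is easy to reproduce.  This is a small Python sketch and not anything Windows runs.  One detail here is an assumption that goes beyond this post: per Microsoft’s documentation, only the first 8 characters of the source name are kept as the prefix, which is what keeps the result within the 15 character computer name limit.

```python
def default_clone_name(source_name, sequence):
    """Build the automatic clone name: source prefix plus -CLnnnn.

    Assumption (from Microsoft's documentation, not from this post): the
    prefix is limited to the first 8 characters of the source name.
    """
    if not 1 <= sequence <= 9999:
        raise ValueError("sequence must be between 1 and 9999")
    return "{}-CL{:04d}".format(source_name[:8], sequence)


print(default_clone_name("W12-DC4", 1))
```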

You’ll need to create a new DCCloneConfig.xml file each time you want to clone the domain controller.  The file is not reusable and becomes invalid after a reboot of the source domain controller whether a clone operation has occurred or not.
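Since the file has to be regenerated for every clone, I find it handy to keep the settings for each planned clone in one place and generate the command line from that.  Here is a minimal Python sketch that only assembles the text of the command.  It uses the same New-ADDCCloneConfigFile parameters shown earlier in this post and nothing else, and the function name is my own.

```python
def clone_config_command(name, ip, dns, mask, gateway, site=None):
    """Return the New-ADDCCloneConfigFile command line for one static-IP clone."""
    parts = [
        "New-ADDCCloneConfigFile",
        "-Static",
        '-IPv4Address "{}"'.format(ip),
        '-IPv4DNSResolver "{}"'.format(dns),
        '-IPv4SubnetMask "{}"'.format(mask),
        '-IPv4DefaultGateway "{}"'.format(gateway),
        '-CloneComputerName "{}"'.format(name),
    ]
    if site:
        # The AD site is optional, as noted above.
        parts.append('-SiteName "{}"'.format(site))
    return " ".join(parts)


# Reproduces the command used earlier in this post.
print(clone_config_command("W12-DC5", "192.168.1.153", "192.168.1.140",
                           "255.255.255.0", "192.168.1.1", site="NJ"))
```

Run the generated command on the source DC, clone it, and repeat with the next set of values for the next clone.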

Clone the VM

At this point you’re ready to clone the source domain controller.  You’ll need to power off the source domain controller before attempting to clone it.  Remember I mentioned above that you’ll want to create a dedicated DC for cloning?  This is why – even though cloning can be quick (especially with VAAI enabled arrays) it isn’t a great idea to take down a DC that other servers and workstations use as a primary or secondary DNS server or worse, one that has FSMO roles.  If you try to clone your DC without shutting it down first you’ll end up with a cloned DC that does not get customized and is an exact replica of your source DC.

Simply clone the virtual machine as you would any other VM that you clone, with one exception.  Do not try to apply a customization specification to customize the cloned VM.  The customization information is contained within the DCCloneConfig.xml file, so trying to use a guest customization from vSphere will result in a VM that does not boot.  I’m going to experiment with this a bit more but my initial testing with customization specifications was unsuccessful.

After the clone is completed, power on both the source domain controller as well as the newly cloned DC.  The cloning customization process will happen automatically.

image

That’s all there is to it! Now you’ve got a new fully functional cloned domain controller.

image

Have you tried out Windows Server 2012 yet?  I honestly believe that if people could look past the change in user interface they would see that Windows Server 2012 offers many significant improvements to the Windows Server platform.  It didn’t take me long to get used to the new interface and really enjoy working in Server Manager as a central management tool.  Give it a try!

17 Jan 2013

vSphere Home Lab Upgrade–Synology DS1812+

I built up my home lab back in late 2011 after finally deciding that I needed something that was completely mine and not a shared lab with others.  I built pretty much an identical lab to Jase McCarty’s (http://www.jasemccarty.com/blog/?p=1516) and have been very happy with it.  The only problem I’ve had is with my home lab storage.

A funny thing happened on the way to figuring out what storage to use in my home lab.  Faced with the prospect of using a home NAS with 4 SATA drives, I wanted to see if I could find something that would give me better performance.  I had the opportunity to get my hands on a server that had 6 x 146GB 10K RPM drives and I jumped at the chance.  That server ended up being an old DL380 G4 (possibly even G3, not sure).  It seemed so smart at the time – why use 7200 RPM consumer SATA drives when I could use 10K RPM enterprise SCSI drives and get better performance.  I didn’t factor in one important thing: cache, or lack thereof.

After seeing miserable performance I researched and bought some battery backed write cache – a whopping 128MB worth that had to be split between reads and writes.  Even with that, and using iSCSI software that let me create a RAM cache, I still had pretty bad performance.  How bad?  This bad.

Holy crap that's some high latency!

Yep, that’s over 4,000ms of latency.  It wasn’t consistently this bad but trying to do multiple operations at once, like rebooting two VMs at once, would cause it.  The server was old, not true 64-bit, and just not the right fit.  There were probably other contributing factors beyond the lack of cache as well.  Not to mention the electricity cost of running a true server class computer in my house.  I realized my mistake and knew I needed to replace it with dedicated NAS storage.

I know I could have used something like Nexenta Community Edition to get better performance out of the DL380.  For a variety of reasons that didn’t make sense in this situation.

After much research and quite a bit of unnecessary delaying on my part (with the appropriate amount of ribbing from @ChrisWahl and @Millardjk) I finally decided on the Synology DS1812+.  I loaded it up with 4 x SanDisk 240GB SSDs and 4 x Western Digital Red 2TB SATA drives and plan to use it for my home lab as well as for backing up my PC, pictures, videos, etc.

That's some sweet lab gear you got there Matty!


So how well does it work?  Is it a worthy replacement to the DL380? Seriously, what isn’t a worthy replacement to that old server?

I have been extremely impressed with the Synology DSM software and how easy it is to set up volumes, create iSCSI targets, and configure link aggregation (more on that in a bit).  It also has lots of great features to use it as a home NAS so I’m very happy with my choice.  Performance has been great both on the SSDs and on the Western Digital Red drives.  The days of seconds of latency are gone – as you can see in the screenshots from esxtop I’m able to push extremely high I/O (both reads and writes) with less than 3ms of latency.  I may do some more detailed testing with the I/O Analyzer fling but for now this is good enough for me.

Now we're talkin...

Not too shabby..

My only disappointment is that I cannot configure true 802.3ad Dynamic Link Aggregation.  Unfortunately the switch I use in my home lab, a Dell PowerConnect 2816, only supports static link aggregation and not dynamic.  There are many posts on the Synology forum complaining about this but it’s really Dell’s issue and not anything wrong with the DS1812+.  I consider that a “nice to have” for a home lab but certainly not worth investing hundreds of dollars in a new switch that supports the proper link aggregation configuration.

All in all I’m very happy with the addition of the Synology DS1812+ into my home lab.  The performance is great, the DSM software is very good, and there are some great things coming in the new DSM 4.2 (currently in beta).  I highly recommend any of the Synology models to folks who are looking to upgrade their home lab storage.

15 Jan 2013

Honored to be an EMC Elect

I found out this weekend that I was named as an EMC Elect in the program’s inaugural year.  The EMC Elect award is similar to the VMware vExpert award in that it honors community involvement and knowledge sharing.  Needless to say I’m thrilled and honored to have won the award, especially when I see my name associated with many other folks I respect.  Being rewarded for doing something we are all passionate about is what makes awards like the EMC Elect and vExpert all the more meaningful.

You can learn more about the EMC Elect award here and see the full list of winners here.  If you follow the virtualization community you’ll likely see many familiar names.

I didn’t even think I qualified for this award until Matthew Brender (@mjbrender) suggested that I nominate myself.  I ended up getting nominated by someone else in the community and subsequently won the award.  Thanks to Matt for introducing me to the program and recommending that I participate.  I still don’t feel like I truly deserve the award compared to some of the others who won, but I’m honored to be included among the winners.

Looking forward to an exciting first year in the EMC Elect program!

I won I won!!

7 Jan 2013

PVSCSI Bug Causing Exchange 2010 Jetstress to Crash

A few weeks back I was called in to help a customer who was experiencing problems completing Jetstress testing for an Exchange 2010 deployment. It wasn’t an issue of Jetstress reporting failed tests. Rather, they were unable to get through most of their tests without the Jetstress application actually crashing (JetstressWin.exe has stopped working). They would see the following after the Jetstress testing completed but before it could write any log files to disk.

image

The only Jetstress related error in the Application log was an ESE error with Event ID 482:

JetstressWin (3584) Instance3584.6: An attempt to write to the file “F:\DB\Jetstress006001.edb” at offset 63087017984 (0x0000000eb0478000) for 32768 (0x00008000) bytes failed over 0 seconds with system error 1117 (0x0000045d): “The request could not be performed because of an I/O device error.”. The write operation will fail with error –1022 (0xfffffc02). If this error persists then the file may be damaged and may need to be restored from a previous backup.
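The error message pairs every decimal number with its hex form, which helps when you search for the codes online.  As a quick sanity check, the Python snippet below reproduces those conversions.  The only thing I am assuming is that the negative ESE error code is shown as a 32-bit two’s complement value, which is what the hex in the message suggests.

```python
def to_unsigned32(value):
    """Show a negative error code the way a 32-bit field stores it."""
    return value & 0xFFFFFFFF


offset_bytes = 63087017984   # offset into the .edb file
write_size = 32768           # size of the failed write in bytes
system_error = 1117          # Windows system error (I/O device error)
ese_error = -1022            # ESE error returned to Jetstress

print(hex(offset_bytes))
print(hex(write_size))
print(hex(system_error))
print(hex(to_unsigned32(ese_error)))
```

The values line up with the message: a 32KB write failing with system error 0x45d, which ESE then reports as -1022.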

During the process of Jetstress completing a test run, it generates a large amount of I/O as it flushes anything in cache to disk. It was at this point that the Jetstress application was crashing. This behavior is normal but it’s an important clue because of the high disk I/O generated.

The customer was using vSphere 4.1 and the Exchange 2010 Mailbox servers were each configured with PVSCSI virtual SCSI controllers using VMDK files. As it turns out, they were hit with the PVSCSI bug described in this VMware KB:

Windows 2008 R2 virtual machine using a paravirtual SCSI adapter reports the error: Operating system error 1117 encountered  http://kb.vmware.com/kb/2004578

The interesting thing to note here is that although Exchange is specifically called out here in the KB, it doesn’t mention that it may cause the application (in this case Jetstress) to crash.  The crashing led the team to troubleshoot Jetstress initially, thinking something was wrong with Jetstress and the various DLLs it requires to run.

At the end of the day the issue was resolved by following the instructions in the KB and changing the virtual SCSI driver to LSI Logic SAS.  After making that change there were no subsequent issues with Jetstress.

In case you haven’t read the KB linked above, I want to note that fixes for this issue are available for all affected versions of vSphere from 4.1 to 5.0. You’ll need to install the updates described in the KB if you want to use the PVSCSI driver with vSphere 4.1 through 5.0 (the issue is already resolved in vSphere 5.1).

Hopefully this helps anyone who might be experiencing this issue. I also hope it doesn’t dissuade anyone from using the PVSCSI driver for their business critical applications, as it can deliver better performance with lower CPU utilization when high I/O workloads are virtualized.

5 Dec 2012

Windows 2012 Failover Clusters and vSphere 5.1

How many of you actually read through the release notes of a new vSphere release?  Ok, I know that Maish does.  If you read Maish’s post or read the vSphere 5.1 Release Notes you would see the following:

Windows Server 2012 Failover Clustering is not supported
If you try to create a cluster for Failover Clustering in Windows Server 2012, and select to run validation tests, the wizard completes the validation tests with warnings, and after that returns to running the validation tests again. The wizard in the Windows Server 2012 guest operating system does not continue to the cluster creation stage.

Workaround: None.

I admit I didn’t notice it myself and had it recently pointed out to me.  It’s unfortunate that this piece of information is buried in the Release Notes and not listed in the VMware Knowledgebase, the vSphere 5.1 Clustering Guide, etc.  Despite the fact that many in the industry despise virtualized Windows Failover Clusters, the fact remains that organizations use them and will continue to use them.  vSphere HA can sometimes serve as a replacement but not always.  More on my thoughts on this subject here.

To make sure everyone understands the issue here – when creating a Windows Failover Cluster, you can run through a validation process to make sure the cluster will function properly.  Microsoft will only support a cluster (virtual or physical) if it has been validated.  As you can see in the Release Notes, it appears that Windows 2012 Failover Clusters do not successfully complete the validation step.

I figured I’d give this a try and see if I can reproduce the issue and to my surprise I cannot.  I’ve tried the traditional method of clustering virtual machines by using RDMs in physical compatibility mode.  I’ve also tested using storage presented via the iSCSI initiator inside the virtual machine.  In both cases I was able to validate the cluster and receive a validated rating that is good enough to qualify for Microsoft support.

image

I’m not sure what the actual problem is with Windows 2012 Failover Clusters but I’m unable to reproduce the behavior listed in the Release Notes.  The behavior described in the Release Notes is what happens when the cluster fails the validation test and is actually expected.  If your cluster passes the validation tests, as mine did in the tests I ran, it proceeds to create the cluster without an issue.

Hopefully VMware will clarify this issue for us and/or change their support stance.  I firmly believe you should stick with what is supported when virtualizing business critical applications.  In this case I can’t even figure out why it isn’t supported in the first place, but do have to recommend that organizations hold off on virtualizing Windows 2012 Failover Clusters until this support statement is changed.

Anyone else tried to virtualize a Windows 2012 Failover Cluster and see the behavior that VMware describes?

19 Nov 2012

Licensing SQL 2012 in a VM with Hyper-threading

When I talk to customers about running SQL in a virtual machine I often refer to the SQL 2012 Licensing Guide to make sure I have the latest information.  This document has changed quite a few times over the last year and I often find something new that I didn’t notice in the prior version (or simply wasn’t there).

Last week I was looking at the SQL 2012 Virtualization Licensing Guide (opens a PDF) when I noticed something I hadn’t seen before with regards to using hyper-threading and virtualizing SQL 2012.  From the guide:

When hyper-threading is turned on, a core license is required for each thread supporting a virtual core. In the example below, hyper-threading is enabled for the physical processor supporting a VM. Since hyper-threading creates two hardware threads for each physical core, a total of 8 core licenses would be required in this scenario. A core license allows a single virtual core to be supported by a single hardware thread.

Hard as that is to believe, Microsoft is saying that if you use hyper-threading then you need to license both hardware threads rather than the individual core.
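To put numbers on it, here is the arithmetic as a small Python sketch.  The VM size is hypothetical (the excerpt above does not say how many virtual cores the example VM has), and the four-license minimum per VM is my understanding of the licensing guide rather than something quoted in this post, so treat both as assumptions and check the current guide before buying anything.

```python
def core_licenses_for_vm(virtual_cores, threads_per_virtual_core=1, minimum=4):
    """Estimate SQL 2012 core licenses needed when licensing a single VM.

    One license per hardware thread supporting each virtual core.
    Assumption: a minimum of 4 core licenses applies per VM.
    """
    return max(virtual_cores * threads_per_virtual_core, minimum)


# Hypothetical 4 vCPU VM, without and then with hyper-threading.
print(core_licenses_for_vm(4))
print(core_licenses_for_vm(4, threads_per_virtual_core=2))
```

With two hardware threads behind each virtual core, the same 4 vCPU VM goes from 4 licenses to 8.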

This is only really applicable if you’re licensing individual VMs rather than the entire server as many organizations do.  If you license all cores in a physical server with SQL 2012 Enterprise and maintain Software Assurance then you can virtualize an unlimited number of SQL virtual machines.  That is likely to be the preferred licensing model, and in fact Microsoft says as much in the same document:

This is especially relevant for private cloud scenarios with a large number of VMs being moved dynamically between different physical servers, when self-provisioning is enabled, or when hyper-threading is turned on.

I support Microsoft’s move to a per-core licensing model instead of a per-CPU model since I recognize this had to happen eventually.  On their decision to make organizations buy extra licenses if they use hyper-threading, though, I’m a bit lost.  It’s not as if a hyper-threaded core behaves exactly as a physical core and can provide double the performance.  This one makes me scratch my head.

At the end of the day it is likely not very cost effective to license individual VMs (rather than the entire host) with or without hyper-threading in most scenarios so this is probably not a big deal.  It’s yet another reminder that if you’re looking to perform any kind of SQL virtualization project, you should strongly consider the SQL 2012 Enterprise with SA license for maximum virtualization rights.

28 Sep 2012

Error 1711 When Upgrading vCenter to 5.1

I was recently going through the process of upgrading my home lab from vSphere 5.0 to 5.1.  Everything was going great until I got the following error:

Error 1711. An error occurred while writing installation information to disk.  Check to make sure enough disk space is available, and click Retry, or Cancel to end the installation.

Error 1711

I found a few references online to a bad MSI file when this happened to folks in previous versions of VMware products (not limited to vCenter apparently).  Then on Twitter, Shawn Bass (@shawnbass) told me he had this happen to him in the past and it also turned out to be a bad download.  I figured that was more credibility than a random forum post.

I was pretty skeptical that that would work, but I decided to download the 3GB ISO file for vCenter 5.1 again.  The result?

image

Thanks to Shawn and others for pointing me in this direction.  I was skeptical but it worked!  Hopefully this helps anyone searching for this error since I didn’t really find anything definitive when I went looking.

Quick update – some folks have asked if I verified the MD5 checksum after the download. Very good and valid question and the answer is no.  However, I still have both the “bad” ISO file as well as the newly downloaded ISO file that eventually worked.  In both cases the MD5 checksum was valid.

image

If I had to guess, the corruption happened at some point when copying files from my laptop to my NAS, from my NAS to my vCenter VM, from the vCenter VM to the ESXi datastore, etc.  I’m guessing at some point during that process (complicated lab setup, don’t ask) the file got corrupted in some way. On the second attempt I copied the ISO straight from my laptop, where it was downloaded, to the ESXi datastore.
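If your files take a multi-hop trip like mine did, checking the checksum after each copy (not just after the download) would catch this kind of problem early.  Here is a minimal Python sketch using the standard hashlib module.  The function names are my own, and the expected value would be the MD5 published on the download page.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, reading it in 1MB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a multi-GB ISO does not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(path, expected_md5):
    """Compare a file's MD5 against the published value, ignoring case."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

Run copy_is_intact against the ISO at every stop along the way and you will know exactly which copy went bad.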

My home lab setup is annoying, with only a weak wireless signal connecting my laptop/desktop to the lab downstairs.  Copying files from my laptop to the lab takes a while and I can’t shut it down until it’s done so I try to avoid doing that. Lesson learned for the future I guess.

26 Sep 2012

SQL 2012 AAG/FCI added to VMware “Supported Configurations” KB

Just a quick post to note that VMware has added SQL 2012 AlwaysOn Availability Groups and AlwaysOn Failover Cluster Instances to their “Microsoft Clustering on VMware vSphere: Guidelines for Supported Configurations” knowledgebase article.  This is a KB article that I refer to frequently when speaking with customers about virtualizing business critical applications and clustered servers. 

As expected, SQL 2012 AAGs are fully supported and have no vMotion/HA restrictions just like Exchange 2010 DAGs.  AAGs do not utilize shared storage and as such do not have the same requirements as a traditional Microsoft cluster.  This appears to be the direction that Microsoft is taking their biggest clustered applications which is very good news.

SQL 2012 AlwaysOn Failover Cluster Instances, which are similar to the traditional SQL clustering model, are supported just as they were in previous versions of SQL.  That is, up to 5 nodes per cluster (if running vSphere 5.1 and Windows 2008 SP2 or later, see this post for more info) and either RDMs for cluster across boxes or VMDKs for cluster in a box configurations.  As stated in my other post, the KB currently only lists 2 nodes but VMware is aware and will be updating that KB shortly.

Happy SQL virtualizing, everyone - it is, after all, the year of SQL virtualization!

Here is the table listing the supported configurations from the KB with the SQL 2012 AlwaysOn additions highlighted.

image

17 Sep 2012

New in vSphere 5.1–Support for five node Failover Clusters

As has been the case with the last few vSphere releases, VMware has crammed vSphere 5.1 full of new features and functionality. I won’t try to go into all of the details since there are many blog posts out there already.

Since virtualizing business critical applications is near and dear to my heart, one of the changes that immediately jumped out to me is a big change to the support for virtualized Windows Failover Clusters.  Since the first version of the Setup for Failover Clustering guide was released (probably back in the VI3 days or maybe even earlier), there has been one restriction: Virtualized clusters are limited to just two nodes.  That was true whether both were virtual, or one was physical and one was virtual.

The limitation was technically an on-paper limitation – nothing would stop you from adding more than 2 nodes to a cluster.  You could run into issues with SCSI locking with more than 2 node clusters so VMware didn’t support it.

With vSphere 5.1, the limitation has been raised to allow support for up to five node clusters provided you are running Windows 2008 SP2 or later.  If you are running an older version of Windows, you’re still limited to just 2 nodes.

Here’s a link to the above referenced Setup for Failover Clustering and Microsoft Cluster Service document updated for vSphere 5.1:  http://pubs.vmware.com/vsphere-51/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-51-setup-mscs.pdf

One thing to note – the well known and very handy “Microsoft Clustering on VMware vSphere: Guidelines for Supported Configurations” KB still lists the maximum node limit at 2.  I would expect VMware to update that KB soon (and I will be reaching out to them about it shortly).

This also only applies to what are known as “shared disk clusters” or clusters that share the same disk resource among active/passive nodes.  For solutions that leverage non-shared disk clusters, such as Exchange 2010 DAGs or SQL 2012 AAGs, there is no such limit and the only limit is whatever is supported by the application.
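The support rule described in this post boils down to a few lines of logic.  Here it is as a Python sketch, purely to summarize the rule and not as a replacement for the official guide.  Representing versions as tuples is my own convention: vSphere as (major, minor) and Windows as (NT major, NT minor, service pack), so Windows 2008 SP2 is (6, 0, 2) and Windows 2008 R2 is (6, 1, 0).

```python
def max_supported_nodes(vsphere, windows, shared_disk=True):
    """Return the supported node limit for a virtualized Windows cluster.

    Returns None for non-shared disk clusters (Exchange 2010 DAGs,
    SQL 2012 AAGs), where the limit is set by the application instead.
    """
    if not shared_disk:
        return None
    if tuple(vsphere) >= (5, 1) and tuple(windows) >= (6, 0, 2):
        return 5
    return 2


print(max_supported_nodes((5, 1), (6, 1, 0)))   # Windows 2008 R2 on vSphere 5.1
print(max_supported_nodes((5, 0), (6, 1, 0)))   # same guest on vSphere 5.0
```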

This is good news to those that still need to support virtualized clusters, or for migration/long term coexistence between physical and virtualized clusters.

Update 9/19/12 - Cormac Hogan has posted some more technical details on the change in support for virtualized clusters at his blog.  Read about it here: http://cormachogan.com/2012/09/19/vsphere-5-1-storage-enhancements-part-10-5-node-mscs-support/
