vSphere 6 Fault Tolerance

VMware vSphere Fault Tolerance (FT) provides continuous availability for applications on virtual machines.

FT creates a live clone of a virtual machine that is always up to date with the primary virtual machine. In the event of a host or hardware failure, vSphere Fault Tolerance automatically triggers a failover, ensuring zero downtime and no data loss. VMware vSphere Fault Tolerance uses heartbeats between the primary virtual machine and the live clone to ensure availability. After a failover, a new live clone is created automatically to deliver continuous protection for the VM.

The VMware vSphere Fault Tolerance FAQ can be found here.


At first glance, VMware vSphere Fault Tolerance seems like a great addition to vSphere HA clusters to ensure continuous availability within your VMware vSphere environment.

However, in vSphere 4.x and 5.x only one virtual CPU per protected virtual machine is supported. If your VM uses more than one virtual CPU, you will not be able to enable VMware vSphere Fault Tolerance on it. Obviously, this is an enormous shortcoming and explains why many companies are not using VMware’s FT capability.

So what’s new with vSphere 6 in regards to Fault Tolerance?

  • Up to 4 virtual CPUs per virtual machine
  • Up to 64 GB RAM per virtual machine
  • HA, DRS, DPM, SRM and VDS are supported
  • Protection for high performance multi-vCPU VMs
  • Faster check-pointing to keep primary and secondary VM in sync
  • VMs with FT enabled can now be backed up with vStorage APIs for Data Protection (VADP)

With the new features in vSphere 6, Fault Tolerance will surely gain much more traction, since you can finally enable FT on VMs with up to 4 vCPUs.

vSphere 6 NFSv4.1

As most of us know, VMware supports many storage protocols – FC, FCoE, iSCSI and NFS.
However, only NFSv3 was supported in vSphere 4.x and 5.x. NFSv3 has many limitations and shortcomings, such as:

  • No multipathing support
  • Proprietary advisory locking due to the protocol’s lack of proper locking
  • Limited security
  • Performance limited by the single server head

Starting with vSphere 6, VMware introduces NFSv4.1. Compared to NFSv3, v4.1 brings a bunch of new features:

  • Session Trunking/Multipathing (see the example after this list)
    • Increased performance from parallel access (load balancing)
    • Better availability from path failover
  • Improved Security
    • Kerberos, encryption and signing are supported
    • User authentication and non-root access become available
  • Improved Locking
    • In-band mandatory locks, no longer proprietary advisory locking
  • Better Error Recovery
    • Client and server are no longer stateless; context is recoverable after errors
  • Efficient Protocol
    • Less chatty, no file lock heartbeat
    • Session leases

Note: NFSv4.1 does not support SDRS, SIOC, SRM and vVOLs.
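
To see what session trunking looks like in practice, here is a minimal sketch of mounting an NFSv4.1 datastore through the new esxcli storage nfs41 namespace in vSphere 6. All IPs, the share and the volume name below are made up for illustration:

    # mount an NFSv4.1 datastore against two server IPs for multipathing/load balancing
    esxcli storage nfs41 add --hosts=192.168.1.10,192.168.1.11 --share=/export/ds01 --volume-name=ds01_v41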

Supportability of NFSv3 and NFSv4.1:

  • NFSv3 locking is not compatible with NFSv4.1
    • NFSv3 uses proprietary client-side locking
    • NFSv4.1 uses server-side locking
  • Single protocol access for a datastore
    • Use either NFSv3 or NFSv4.1 to mount the same NFS share across all ESXi hosts within a vSphere HA cluster
    • Mounting one NFS share as NFSv3 on one ESXi host and the same share as NFSv4.1 on another host is not supported! (see the check below)
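
If you are unsure which protocol a host currently uses for a share, you can check both mount tables from the ESXi shell – a quick sanity check before mounting the share on additional hosts in the cluster:

    # list NFSv3 mounts
    esxcli storage nfs list
    # list NFSv4.1 mounts
    esxcli storage nfs41 list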

Kerberos Support for NFSv4.1:

  • NFSv3 only supports AUTH_SYS
  • NFSv4.1 supports AUTH_SYS and Kerberos
  • Requires Microsoft AD for KDC
  • Supports RPC header authentication (rpc_gss_svc_none or krb5)
  • Only supports DES-CBC-MD5
    • Weaker but widely used
    • AES-HMAC not supported by many vendors

Implications of using Kerberos:

  • NFSv3 to NFSv4.1
    • Be aware of the uid and gid on the files
    • With NFSv3 the uid and gid will be root
    • Accessing files created with NFSv3 from an NFSv4.1 Kerberized client will result in permission-denied errors (see the example after this list)
  • Always use the same user on all hosts
    • vMotion and other features might fail if two hosts use different users
    • Host Profiles can be used to automate a consistent user configuration across hosts
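
A quick way to spot the ownership issue described above is to look at the files directly from the ESXi shell. The datastore name below is hypothetical:

    # files written through the NFSv3 client will typically show up as owned by root
    ls -l /vmfs/volumes/ds01_v41/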


Remove VIB – Device or resource busy

Today, I was playing around with some vSphere Installation Bundles (VIBs) and ran into an issue when I tried to remove one:

esxcli software vib remove -n vmware-esx-KoraPlugin 
 [InstallationError]
 Error in running rm /tardisks/:
 Return code: 1
 Output: rm: can't remove '/tardisks/': Device or resource busy

Even adding the --force flag did not help in this situation.

The following workaround worked for me:

  1. Stop hostd on the ESXi host – this is non-disruptive to your VMs
    /etc/init.d/hostd stop
  2. Run localcli to uninstall the VIB
    localcli software vib remove -n vmware-esx-KoraPlugin

    Note: We need to run localcli, since esxcli is not available while hostd is stopped

  3. Start hostd on the ESXi host
    /etc/init.d/hostd start
  4. Verify that vmware-esx-KoraPlugin no longer shows up
    esxcli software vib list | grep vmware-esx-KoraPlugin

You should no longer see the VIB installed on your ESXi host.
localcli is not widely known within the community and is mainly used by VMware Technical Support. It provides additional troubleshooting capabilities and works even when hostd is not running.
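
If you have never used localcli before: it mirrors the esxcli namespace layout, so most commands translate one-to-one. For example:

    # equivalent to 'esxcli software vib list', but works while hostd is down
    localcli software vib list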


Unmount VMware Datastore – Device Busy

Welcome back! I hope everyone had some time to relax and spend the Christmas holidays with their families. I was lucky enough to have some time off to play with my lab.

After playing around with some newly deployed NFS datastores, I tried to unmount them and got a device busy error; on the CLI, I got: Sysinfo error on operation returned status : Busy. Please see the VMkernel log for detailed error information.

Let me show you the steps I ran through:

  1. Mounted a new NFS datastore through the ESXCLI
    1. esxcli storage nfs add --host=nas-ip --share=share-name --volume-name=volume_name
  2. List all NFS shares
    1. ~ # esxcli storage nfs list
  3. Verify that all VMs on this datastore are either powered off or have been vMotioned to another datastore
  4. Try to unmount the datastore
    1. esxcli storage nfs remove -v b3c1
      Sysinfo error on operation returned status : Busy. Please see the VMkernel log for detailed error information
  5. Looking through the vmkernel.log doesn’t help much either. The only message printed there is
    1. 2015-01-12T23:10:09.357Z cpu2:49350 opID=abdf676b)WARNING: NFS: 1921: b3c1 has open files, cannot be unmounted
  6. After some searching, I found this article on VMware
  7. Basically, the issue seems to be that vSphere HA datastore heartbeats are enabled on this datastore, which is causing the device to be busy.
The solution for this problem is pretty simple. Open your vSphere Client, select your vSphere HA cluster and edit the vSphere HA settings. Within the settings, set the datastore heartbeating option to Use datastores only from the specified list and deselect the datastore that you are trying to unmount.


After changing the setting, I was able to successfully unmount the NFS share with the following command: esxcli storage nfs remove -v datastore_name
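
To double-check that the unmount actually went through, list the NFS mounts again – the b3c1 volume from my example should no longer show up:

    esxcli storage nfs list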

VMware Update Manager 5.5 Installation

Last week I started to set up VUM – VMware vSphere Update Manager 5.5 – to get my ESXi hosts updated and some 3rd-party software installed. If you have multiple ESXi hosts and need an easy way to keep your vSphere environment current, VUM is the way to go.

Additionally, VUM plays quite nicely with DRS (Distributed Resource Scheduler) to avoid any downtime for your VMs while applying patches to the hosts. DRS will migrate all active VMs off the host before it is put into maintenance mode, one host at a time.

For the people who do not know where to get VMware Update Manager from: it is part of the VMware vSphere 5.5 ISO. It took me a while to find it. Once you load the ISO, you can find VMware Update Manager under VMware vCenter Support Tools.


The installation is pretty straightforward.


Once you have launched the installation wizard, click Next and then accept the EULA – End User License Agreement.


On the next screen, you can select whether you want updates to be automatically downloaded from the default sources after the installation. By default, this option is enabled, but it can be disabled if you prefer to review the default sources first.


Next, you have to specify your vCenter Server address, port and credentials in order to register VUM with it.
Note: The VCSA – vCenter Server Appliance – is also supported since VUM 5.0. However, VUM itself needs to be installed on a Windows server.


After you have specified your vCenter credentials, you have to choose whether to install a Microsoft SQL Server 2008 R2 Express instance or to use an already existing database or a different database type. For smaller vSphere deployments it is fine to use SQL Server Express. However, if you plan to have more than 5 hosts or 50 VMs, you will be better off with a different database; more information can be found here.


On the next screen, you can specify whether VUM will be identified by IP address or DNS name. Personally, I always choose IP, since VUM would still be accessible even if the DNS server is down. Additionally, you can change the default ports for SOAP, Web and SSL.


Next, specify the installation folder for VMware vSphere Update Manager and the location for downloading patches.

Note: The patch location should have at least 20 GB free space.


Now, VUM will start to extract the executable for the Microsoft SQL Server.


For the actual Microsoft SQL Server installation you do not need to do anything; it is automated by VUM.


Last but not least, click Finish to complete the installation.


Before you can start using VUM, log into your vCenter Server and click on Home. Under Solutions and Applications, you should be able to see Update Manager.


If Update Manager does not show up, go to Plugins -> Manage Plugins and verify that the VMware vSphere Update Manager plug-in is enabled. You will need to install the VUM client on your local machine through the Plug-in Manager.
