Windows 2012 with e1000e could cause data corruption

A couple of days ago, I spent two hours setting up two Windows Server 2012 VMs on my ESXi 5.1 cluster to run some performance tests. When copying multiple ISOs across the network between those two VMs, I received an error that none of my five ISOs could be opened on the destination.

After checking the settings of my VMs, I saw that I had used the default e1000e vNICs. Apparently, this is a known issue with Windows Server 2012 VMs that use e1000e vNICs and run on top of VMware ESXi 5.0 or 5.1.

The scary part is that e1000e is VMware's default vNIC type when creating a new VM. This means that if you don't carefully select the correct vNIC type when creating your VM, you could run into the data corruption issue.
The easiest workaround is to change the vNIC type from e1000e to e1000 or VMXNET3. However, if you use DHCP, your VM will get a new IP address assigned, because the DHCP server will see the replacement as a new NIC.
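To see which VMs are still configured with e1000e, one option is to search the .vmx files from the ESXi shell. This is a minimal sketch, not a VMware-provided tool: the function name find_e1000e_vms is hypothetical, and it assumes the adapter type is stored in the .vmx file as a line of the form ethernetN.virtualDev = "e1000e".

```shell
#!/bin/sh
# Hypothetical helper: list .vmx files that declare an e1000e network adapter.
# Assumption: the adapter type appears as  ethernetN.virtualDev = "e1000e".
find_e1000e_vms() {
    # $1: directory to search (on an ESXi host this would usually be /vmfs/volumes)
    # The closing quote in the pattern keeps plain "e1000" adapters from matching.
    find "$1" -name '*.vmx' -exec grep -il 'virtualDev *= *"e1000e"' {} \;
}
```

On a host, it could be called as find_e1000e_vms /vmfs/volumes. It prints one path per matching .vmx file and nothing if no VM uses e1000e.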

If you prefer not to change the vNIC type, you can instead disable TCP segmentation offload on the Windows 2012 VMs.
Three adapter settings should be disabled:

  1. IPv4 Checksum Offload
  2. Large Send Offload (IPv4)
  3. TCP Checksum Offload (IPv4)

Further details can be found in VMware KBA 2058692.

 

SCSI UNMAP – VMware ESXi and Nimble Storage Array

With ESXi 5.0, VMware added the SCSI UNMAP primitive (VAAI Thin Provisioning Block Reclaim) to its VAAI feature set for thin provisioned LUNs. Initially, VMware automated the SCSI UNMAP process; starting with ESXi 5.0U1, however, it became a manual process. SCSI UNMAP also needs to be supported by your underlying SAN array. Nimble Storage supports SCSI UNMAP in Nimble OS version 1.4.3.0 and later.


What is the problem?

When you delete a file from your thin provisioned VMFS5 datastore, the usage reported by the datastore and by the underlying Nimble Storage volume will no longer match, because the Nimble Storage volume is not aware of the space freed within the VMFS5 datastore. This can be caused by deleting a single file such as an ISO, but also by deleting a whole virtual machine.

What version of VMFS is supported?

You can run SCSI UNMAPs against VMFS5 datastores, including datastores upgraded from VMFS3 to VMFS5.

What needs to be done on the Nimble Storage array?

SCSI UNMAP is supported by Nimble Storage arrays running version 1.4.3.0 or later.
There is nothing to be done on the array.

How do I run SCSI UNMAP on VMware ESXi 5.x?

  1. Establish an SSH session to an ESXi host which has the datastore mounted.
  2. Run esxcli storage core path list | grep -e 'Device Display Name' -e 'Target Transport Details' to get a list of volumes including the EUI identifier.
  3. Run the VAAI status command to verify that SCSI UNMAP (Delete Status) is supported for the volume.
    esxcli storage core device vaai status get -d eui.e5f46fe18c8acb036c9ce900c48a7f60
    eui.e5f46fe18c8acb036c9ce900c48a7f60
    VAAI Plugin Name:
    ATS Status: supported
    Clone Status: unsupported
    Zero Status: supported
    Delete Status: supported
  4. Change to the datastore directory.
    cd /vmfs/volumes/
  5. Run vmkfstools to trigger SCSI UNMAPs.
    vmkfstools -y
    For ESXi 5.5, use the following instead:
    esxcli storage vmfs unmap -l
    Note: the percentage value passed to vmkfstools -y has to be between 0 and 100. Generally, I recommend using 60 to start with.
  6. Wait until the ESXi host returns “Done”.
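Two of the checks in the steps above can be scripted before any space is reclaimed. The following is only a sketch: the function names unmap_supported and valid_reclaim_percentage are hypothetical, and the first one assumes the VAAI status output looks like the sample shown in step 3.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the VAAI status text on stdin
# contains the line "Delete Status: supported".
# Assumption: the output format matches the sample shown in step 3.
unmap_supported() {
    grep -q '^[[:space:]]*Delete Status: supported[[:space:]]*$'
}

# Hypothetical helper: succeeds when $1 is a whole number between 0 and 100,
# the range accepted for the vmkfstools -y percentage.
valid_reclaim_percentage() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    esac
    [ "$1" -ge 0 ] && [ "$1" -le 100 ]
}
```

For example, piping the output of esxcli storage core device vaai status get -d eui.e5f46fe18c8acb036c9ce900c48a7f60 into unmap_supported returns success only if the volume reports Delete Status as supported, and valid_reclaim_percentage 60 succeeds while valid_reclaim_percentage 101 fails.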

 

Further details for ESXi 5.0 and 5.1 can be found here, and for ESXi 5.5, please click here.