Replicating your VDisk stores with DFS-R

DFS-R_title_smallIn this blogpost I want to share some experiences and practical tips when using DFS-R to sync local VDisk stores between multiple Provisioning Servers. Lets begin with a quick intro :

If you are streaming your VDisks from the Provisioning Servers local storage, you often want to replicate the stores to other Provisioning servers to provide HA for the VDisks and target connections. I’m a big fan of running VDisks from local storage because :

– No CIFS layer between the Provisioning Servers and the VDisks, increasing performance and eliminating bottlenecks
– No CIFS single point of failure
– No expensive clustered file system needed to provide HA for the VDisks

Caching on the device local harddrive or in the device RAM is the best option when using local storage for your VDisk store, this way you can easily load balance and failover between Provisioning Servers. Of course there are some down sides when placing the VDisks on local storage :

– Need to sync VDisk stores between Provisioning Servers, resulting in higher network utilization during sync
– Double storage space needed
– Not an ideal solution when using a lot of private mode VDisks (VDisk is continuous in use and cannot sync). Luckily we now have the Personal VDisk option in XenDesktop so IMHO private VDisks aren’t really necessary anymore in a SBC or VDI deployment.

Because one size doesn’t fit all, you can always mix storage types for storing VDisks depending on your needs, but for Standard mode images combined with caching on the device hard drive or device RAM using the local storage of the Provisioning Server is a good option.

Since Provisioning Server version 6 Citrix added functionality regarding versioning, you can now also easily see if the Provisioning servers are in sync with each other, but you have to configure the replication mechanism yourself. I have worked with a lot of different replication solutions to replicate the VDisks between provisioning servers, from manual copy to scripts using robocopy and rsync running both scheduled and manual. Lately I use DFS-R more and more to get the job done. Because DFS-R provides a 2-way (full mesh) replication mechanism, it’s a great way to keep your VDisk folders in sync, but there are some caveats to deal with when using DFS-R. Below I will give you some practical tips and a scenario you can run into when using DFS-R :

Last Writer wins
This is one of the most important things to deal with, DFS-R uses the last writer wins mechanism to decide which file overrules others. It’s a fairly easy mechanism based on time stamps : whoever changes the file last wins the battle and will synchronize to the other members, it will overwrite existing (outdated) files!
If you hold this mechanism against the mechanism how Provisioning Server works you will quickly run into the following trap :

Imagine you’re environment looks like the image below.

DFS-R and PVS

Step 1:
Because you want to update the image, you connect to the Provisioning console on PVS01 and create a new VDisk version, this will create a maintenance (.avhd) file on PVS01. Because this file is initially very small it will quickly replicate to the other Provisioning servers.

Step 2:
You spin up the maintenance VM, at this point you don’t know from which Provisioning server the maintenance VM will boot (decided based on load balancing rules), so let’s say it boots from PVS03.
You make changes to the maintenance image and shut it down.
Now the fun part is going to start! Based on the changes you made and the size of the .avhd file, it can take some time to replicate the updated file to the other Provisioning Servers.

Step 3:
In the meantime, still connected to PVS01, you promote the VDisk to test or production.
When you promote the VDisk, the SOAP service will mount the VDisk and make changes to it for KMS activation etc.

Step 4:
You boot a test or production VM from the new version, and you don’t see you’re changes, further more they are lost!
What happened? Well you ran into the last writer wins mechanism trap of DFS-R :
The promote takes place on the Provisioning Server which you are connecting to, so in the example this is PVS01. PVS01 doesn’t have the updated .avhd from PVS03 yet, so you promoted the empty .avhd file created when you clicked on new version.
Because the promote action updates the time stamp of the .avhd file, it will replicate this file to the other Provisioning server (again quick because it’s empty) overwriting the one with your updates.

Here are 2 options how you can work around this behaviour :

Option 1 :
After you make changes wait till the replication is finished (watch the replication tab in the Provisioning console) promote the version when every Provisioning Server is in sync.

Option 2 :
If you can’t wait connect the console to the Provisioning Server where the update took place, promote the new version there, so you are sure you promote the right .avhd file

Below I will give some other practical tips when using DFS-R to replicate your VDisk stores.

1. Ensure you have always enough free space left for staging files and make your staging quotas big enough to replicate the whole VDisk (1,5 time the VDisk size for example)

2. Create multiple VDisk stores for your VDisks, this allows you to create multiple DFS-R replication folders, replication works better with multiple smaller folders then a very large one

3. Watch the event viewers for DFS-R related messages, DFS-R logs very informative events to the event log, keep an eye on high watermark events and other events related to replication issues

4. Check the DFS-R backlog to see what’s happening in the background and to check that there are no files stuck in the queue, you can use the dfsrdiag tool to watch the backlog, for example :

dfsrdiag backlog /receivingmember:PVS03 /rfname:PVS_Store_01 /rgname:PVS_Store_01 /sendingmember:PVS01

5. Exclude lok files from being replicated, they should not be the same on every Provisioning Server

6. Plan big DFS-R replica traffic during off-peak hours, when DFS-R is replicating booting up your targets will be slower, you can also limit the bandwidth used for DFS-R replica traffic

7. Before you start check your Active Directory scheme and domain functional level, if you want to use DFS-R your Active Directory scheme must be up-to-date and support the DFS-R replication objects. Also note that only DFS-R replication is necessary, no domain name spaces are needed.

Conclusion
I can be very short here, my conclusion is that DFS-R can be a very nice and convenient way to keep your VDisk stores in sync, but you must understand how DFS-R replica works and how it behaves when combined with Provisioning Server. Hopefully this blog post gave you a better understanding when using DFS-R in combination with Provisioning Server and keep above points in mind when you consider using DFS-R as the replication mechanism for your VDisk stores.

Please note that the information in this blog is provided as is without warranty of any kind.

Spinning up your Provisioning Services Environment

boot_headerSpinning up your Provisioning Services Environment

Just a quick blog before Christmas about options to spin up your Citrix Provisioning environment. As you might know you have different options when it comes to spinning up your Provisioning target devices. Which one to choose depends on your (network) setup and how much control you as a Citrix consultant\engineer\architect have in the customers environment. One of the most common boot scenario is using A: PXE or B: DHCP both in combination with TFTP.

A:
With PXE you don’t have to configure the DHCP options to provide the TFTP server to the targets, you can create a kind of redundant configuration by setting up multiple PXE and TFTP servers, but please note that this is not a real HA configuration because there is no logic involved which controls the way a TFTP server is provided to the targets. For example a broken or unavailable TFTP server can be provided to your targets, you can compare this a little with the way how DNS round robin works.

B:
With the DHCP options you can only provide one TFTP server to your targets, to enable HA here you can configure a load balancer (NetScaler for example) in front of the TFTP servers, this is more HA then option A because you can configure the load balancer to check the health of the TFTP servers and bypass TFTP servers that are currently in down state. But you will need to add multiple nodes in your load balancing configuration so you don’t have a single point of failure.

With both option A and B TFTP is used to deliver the bootstrap to your targets, but what if we can’t use PXE or TFTP because of network restrains or just because we want to eliminate the whole PXE and TFTP dependency… Yes we also have the option to create a bootable disk or ISO with the bootstrap embedded, it contains a list of the Provisioning Servers to provide HA for your VDISK.
To create the ISO we use the Provisioning Services Boot Device Manager which is part of the Provisioning Server installation.

Boot_Device_Manager

After configuring the Provisioning Servers, burn the ISO using the Citrix ISO Image Recorder :

Citrix_Image_recorder

Ok now we have the ISO what should we do with it?
We need to keep in mind that this ISO is now the crucial part for spinning up the targets, without it they simple won’t boot.

The following scenarios are possible to provide the ISO to your targets :

1: When using physical targets
You can burn the ISO to CD\DVD and put it permanently in the drive and boot from there,  another alternative is to create a bootable USB drive and put it in the back of the server.

2: When using virtual targets
Here we can leverage the Hypervisor to provide the ISO, because we need to ensure the availability of the ISO we don’t want to put it on a remote ISO (CIFS\NFS) share, because this would be a single point of failure again. If possible you can use the local storage of the Hypervisor to store the ISO.

With VMware ESX and Hyper-V it’s easy just put the ISO on the local disk or Data Store and attach the ISO to the VM, you can also create a template after this so if you use XenDesktop\PVS to automatically create VM’s for you the ISO is already connected when the VM is turned on.

But what about XenServer? We only have the option to create a CIFS or NFS based ISO repository from XenCenter and that’s something we don’t want in this case. Beneath is a procedure which you can use to create a local ISO repository in XenServer, so you can attach the ISO from there :

– Connect with WinSCP to the XenServer host
– Create the following folder :  /var/opt/xen/local_iso
– Connect to the XenServer console to access the command line interface
– Create the Local ISO Repository with the following command :

“xe sr-create name-label=”Local ISO” type=iso \device-config:location=/var/opt/xen/local_iso/ \device-config:legacy_mode=true content-type=iso”

– Copy the PVS boot ISO to the /var/opt/xen/local_iso folder
– Check the SR in XenCenter :

SR_Overview

And the content of the SR :

SR_Content

* Please note : Do not use the Local ISO repository to store other big ISO’s because it has limited space available.

– Finally attach the ISO to your VM’s and\or templates and check the result :

Booting_from_ISO

Conclusion :

Using a bootable ISO is a great way to overcome network related issues that can be the case when using TFTP and\or PXE. DHCP will always play an important role in your Provisioning Services environment, use split scopes or clustering there to guaranty the uptime of your Provisioning environment. Provisioning Services comes with a lot of “moving parts” compared to MCS, but with this boot option you can at least eliminate a few of them. It’s nice to see PVS is still alive in Excalibur so there is room for MCS to grow, the power is choice!

From here I wish everybody a merry Christmas and a happy new year!

Please note that the information in this blog is provided as is without warranty of any kind.