RDP protocol improvements in Windows 10

Intro
As you might know, the RDP protocol in Windows 10 consists of different types of codecs (both proprietary and standardized video compression codecs). They belong to a broader set of technologies known as RemoteFX. There are currently 2 types of codec configuration possible in Windows 10:

  • A combination of different codecs, one optimized for text and one for moving graphics (like video content)
  • The full screen AVC video codec

You can configure them with policies, and you can see which configuration is in use by looking at Event ID 162 in the following event log location:

Applications and Services Logs -> Microsoft -> Windows -> RemoteDesktopServices-RdpCoreTS -> Operational

  • Initial profile 2 means you are using the codec combination
  • Initial profile 2048 means you are using the full screen AVC codec
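
If you collect this value from several machines, a tiny lookup makes it readable. This is a minimal sketch, assuming you already have the initial profile number from Event ID 162 as an integer. The function name is my own invention, and only the two values mentioned above are mapped; anything else is reported as unknown rather than guessed:

```python
# Mapping based only on the two values described above.
# Other profile values may exist; they are deliberately not guessed here.
INITIAL_PROFILES = {
    2: "codec combination (text + moving graphics)",
    2048: "full screen AVC codec",
}


def describe_initial_profile(value: int) -> str:
    """Translate the 'Initial profile' number into a readable description."""
    return INITIAL_PROFILES.get(value, f"unknown profile {value}")


if __name__ == "__main__":
    for profile in (2, 2048, 7):
        print(profile, "->", describe_initial_profile(profile))
```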

Both configurations give a good out-of-the-box experience with a high level of quality. The full screen AVC codec implementation is pretty neat because Microsoft managed to leverage hardware encoders that normally only support 4:2:0 encoding to reach a 4:4:4 quality level. While 4:2:0 compression is ideal for video content, 4:4:4 quality is needed to make text and still images sharp without blurry side effects. The full screen AVC codec implementation operates best when encoding can be done in hardware (GPU). It can also work with software based encoding (emulated GPU), but that will result in increased CPU utilization. Good to know: the new HTML5 based web client always leverages the full screen AVC codec implementation.
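
To get a feeling for why 4:2:0 makes thin coloured details blurry, here is a small sketch in plain Python. It is an illustration only, under my own assumptions: the color plane is a list of lists with even width and height, each 2x2 block is averaged, and the result is stretched back by repeating samples. Real encoders and decoders may use other filters, and this says nothing about how RDP reaches 4:4:4 quality internally:

```python
def subsample_420(chroma):
    """Average each 2x2 block of a chroma plane (even width and height assumed)."""
    height, width = len(chroma), len(chroma[0])
    return [
        [
            (chroma[y][x] + chroma[y][x + 1]
             + chroma[y + 1][x] + chroma[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def upsample_nearest(small):
    """Stretch the quarter-size plane back to full size by repeating samples."""
    full = []
    for row in small:
        wide = [value for value in row for _ in (0, 1)]
        full.append(wide)
        full.append(list(wide))
    return full


if __name__ == "__main__":
    # A one-pixel-wide coloured stroke (think of a thin letter stem) on a 4x4 plane
    stroke = [[0, 255, 0, 0] for _ in range(4)]
    restored = upsample_nearest(subsample_420(stroke))
    print("original row:", stroke[0])
    print("after 4:2:0 :", restored[0])  # [127.5, 127.5, 0.0, 0.0]
```

The stroke that was one pixel wide at full intensity comes back two pixels wide at half intensity, which is the kind of smearing you see around coloured text.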

(v)GPU
You might have heard that RemoteFX vGPU has been deprecated in Server 2019. Times have changed and GPU virtualization technologies have matured, making API intercept based technologies (like RemoteFX vGPU) legacy technology. But no need to get sad about this, because we will get something nice in return: GPU Partitioning, or GPU-P for short. It’s still under development but sounds very promising. With this technology multiple virtual machines can leverage the GPU directly (and even load balance across multiple GPUs). By leveraging the GPU directly, Microsoft can move away from the man-in-the-middle role in which it needed to maintain the API intercept driver to support new graphics standards. For now we can only leverage the GPU directly by using DDA (GPU pass-through) or by using GPU virtualization technologies from other vendors.

Windows Virtual Desktop (WVD)
The new GPU-P technology also opens the door for Microsoft to implement this on Azure, which would be a very welcome feature for WVD (the new RDS infrastructure and multi-session Windows 10 edition hosted on Azure). Hopefully Microsoft will not support GPU-P only in Azure, like they do with the new multi-session Windows 10 edition for WVD. That would really isolate the technology and prevent broader use cases. I don’t think they will do this, because they are taking away RemoteFX vGPU and should provide an alternative for it.

What happened in a year’s time with the RDP protocol
With almost every new Windows 10 build the RDP graphics stack is updated. There is not much information to be found about these improvements, but they are certainly there.

While investigating different Windows 10 builds I noticed that the protocol version is matched with the client to enable support for the latest features (on both the client and the server side). You can find these version numbers in the same event log as described in the intro. They look like this:

The client supports version 0xA0400 of the RDP graphics protocol (Build 1709)
The client supports version 0xA0600 of the RDP graphics protocol (Build 1809)
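
If you want to compare these values across builds, you can pull the hexadecimal number out of the message text. Below is a minimal sketch, assuming the message has exactly the wording shown above. The function names are mine, and reading the upper 16 bits as a major number and the next 8 bits as a minor number is my own interpretation of the value, not something stated in the event itself:

```python
import re

# Matches the wording of the two example messages above.
VERSION_RE = re.compile(
    r"supports version (0x[0-9A-Fa-f]+) of the RDP graphics protocol"
)


def parse_gfx_version(message: str):
    """Return the graphics protocol version as an int, or None if not found."""
    match = VERSION_RE.search(message)
    return int(match.group(1), 16) if match else None


def split_version(value: int):
    """Assumption: upper 16 bits are the major part, the next 8 bits the minor part."""
    return value >> 16, (value >> 8) & 0xFF


if __name__ == "__main__":
    old = parse_gfx_version(
        "The client supports version 0xA0400 of the RDP graphics protocol"
    )
    new = parse_gfx_version(
        "The client supports version 0xA0600 of the RDP graphics protocol"
    )
    print(split_version(old), split_version(new))  # (10, 4) (10, 6)
    print("newer protocol on second build:", new > old)  # True
```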

Some of the improvements in the RDP protocol are:

  • Screen regions and content are better classified (to make optimal use of the right codec and compression algorithm)
  • Webcam redirection improvements leveraging H.264
  • Down-scaling for 4K resolutions
  • GPU-P technology (announced); the AVC codec will also benefit from this

Time for a test!
I decided to do a simple test using Remote Display Analyzer to look at the improvements and changes Microsoft made to the RDP protocol in a year’s time. To do this I used 2 different Windows 10 builds: the 1709 and the 1809 build (both without updates). This gives more or less an indication of the improvements over a one-year time frame.
Remote Display Analyzer now also supports WVD, but I did not use it in this test because the current WVD private preview only has its RD gateways in the US and it doesn’t make much sense to let traffic flow across the globe. I will do some more testing with WVD later when it’s GA. To check the differences in the RDP protocol between the Windows 10 builds I performed the following test:

  • A direct RDP connection to both builds
  • Connection over LAN using a Windows 10 1809 client
  • Used the out-of-the-box RDP configuration on both builds
  • Both builds running on the same infrastructure
  • The test consists of playing a short video (not full screen) and scrolling some text. Exactly the same has been done on both builds
  • Please note that this was a manual test and it’s always better to automate such tests (I recommend REX analytics for this)
  • These results come without warranty of any kind and are based on my own observations using my own infrastructure. They are only meant to give you an indication of the differences I observed while performing this test

The results are below:

On the left you see the results of running the test on the 1709 build and on the right the results of running the exact same test on the 1809 build. I observed the following:

  • The 1809 build used less bandwidth (almost half) while I didn’t perceive a noticeable difference in frame quality. The sent frames are more or less identical
  • The reported “available bandwidth detected” is different across the builds. I’m not sure what the reason for this is; the value of this counter looks a bit inconsistent, so I’m not relying too much on this one
  • Overall my perceived user experience on the 1809 build was better (more fluid and snappier screen updates)

Conclusion
While you don’t hear much about it, Microsoft still makes improvements to its remote graphics stack, and it should, because this is one of the most critical success factors of the upcoming WVD platform. The 1809 build performed much better on the LAN than the 1709 build, and the lower bandwidth is also great news for WAN scenarios. I’m expecting more protocol improvements in line with or shortly after the WVD release. I will certainly keep an eye on this and will write a new blog post when more information is (publicly) available. Thanks for reading!

Remote Display Analyzer 3.0 released

Hi all,

We are happy to let you know that Remote Display Analyzer (RDA) 3.0 has been released! As you might know, we extended the RDA team to invest more time in building new features in RDA and to continuously add support for new remote display protocol versions and settings.

What’s new in Remote Display Analyzer 3.0:

  • Logging feature: this is a much requested feature that makes it possible to (automatically) log to an external logfile, which you can then use to visualize the output with, for example, Graphs and Power BI
  • Support for the latest Remote Display protocol configurations from all supported vendors (Microsoft, Citrix, VMware)
  • New subscriber edition (besides the free community version) to offer support and advanced features

To download RDA 3.0 and for more information visit the RDA website.

A comparison between display protocols and codecs

Please note this article has been updated.

Intro

On 18 November I gave a presentation at E2Evc in Barcelona with Rasmus Raun-Nielsen (co-member of TeamRGE) about the differences between remote display protocols and codecs. The session was late in the afternoon so there were already some beers involved, which made it a very fun presentation. We summarized the major remote display protocols available in the market. Rasmus then explained what NVENC is (a way to offload the display encoder process to the GPU) and which Nvidia cards support this technology. I continued with an overview of the available codecs for each protocol and showed the latest Remote Display Analyzer (RDA) 2.0 version (which will be released soon!).

Before the presentation I performed some comparisons with RDA 2.0 in combination with the video codecs found in Microsoft RDP, Citrix HDX and VMware Blast. This blogpost is a small recap of the presentation. Please note that this comparison is an independent view and that conclusions are based on my own observations.

Before I continue: there is some fast progress being made in this space, so it’s important to include the version numbers which are used in this blogpost (I prefer to post updated blogs later instead of updating the same one). The versions used are in the table below:

Codecs used by these protocols

Basically all of the above display protocols consist of 2 types of codecs:

  • Bitmap codecs (JPG\PNG\BMP)
  • Video codecs (H.264\AVC & H.265)

While the bitmap codecs are great for text and static content, the video codecs are more efficient for more graphics-intensive workloads. The video codec is based on the industry standard H.264 codec (which we also find in video streaming services like Netflix and YouTube). The use of this codec brings 2 other great advantages:

  • The vast majority of clients have video codec decoding capabilities
  • GPU can be used to accelerate and optimize the encoding process (like Nvidia NVENC)

This results in a very bandwidth efficient way to deliver remote display content to the client, but it comes with a downside: the video codec is by default not optimized for text, because it uses 4:2:0 chroma subsampling. This reduces the quality of text, and users often perceive it as blurry (this depends on the user and the type of workload; I have seen users working with it every day without complaining). For text to become really sharp, either an optimization technique on top of 4:2:0 or a codec operating in pure 4:4:4 mode is needed. This is where the display protocols differ from each other, as you can see in the table below:

RDP
With RDP you basically have one option when it comes to the video codec, and that is the AVC codec in 4:4:4 mode. It’s enabled by default in the latest Windows versions and you can switch back to the previous codec combination with policies. You can’t really configure this AVC 4:4:4 codec; for example, you can’t switch to 4:2:0 mode or change quality levels manually. I must say that the latest RDP versions have improved a lot compared to the RDP versions found in previous Windows versions. While the protocol is still quite bandwidth intensive, it offers a very good remote experience. The AVC 4:4:4 mode really delivers a very sharp (near local) experience. RDP has its own hardware and software encoding technology and doesn’t support encoding capabilities like NVENC for its video codec implementation. Hopefully Microsoft will leverage capabilities offered by mainstream GPU manufacturers like Nvidia in the near future. RDP together with RDSH (and the upcoming modern infrastructure) is a very cost effective remote display solution, especially with the latest RDP improvements.

Blast
Blast is a fairly new protocol compared to the others and it’s impressive to see how quickly VMware is making progress with it. As you can see in the above table, they do not yet offer a technique to achieve 4:4:4 quality for their video codec. They might be waiting for GPUs to natively support 4:4:4 so they don’t have to build a software based optimization mechanism for this. They also have a bitmap encoder which you can switch on (Blast uses H.264 by default), but a combination of these codecs is not yet possible. Time will tell whether they focus on a combination of codecs or on optimizing their video codec for 4:4:4 quality. I think the latter makes more sense because currently their bitmap encoder isn’t as optimized as the others. Nevertheless I was impressed by the performance of their H.264 codec, as you can see in the comparison below.

HDX
As you can see in the above table, HDX has 2 combinations available to address the H.264 4:2:0 limitation. 1: The video codec with a text optimization technique (which is software based and done on the CPU). 2: A combination of the bitmap and video codec, also known as the “Use Video codec for active regions” setting. For Citrix this makes sense because their bitmap encoder has been heavily optimized over the years. In this combination of codecs the video codec is only used for parts of the screen that are changing rapidly, for example when playing a video. I wrote about the differences between the display modes available in HDX here. But there are some limitations: for example, you can’t use NVENC in combination with these optimization techniques as of today. I hope to see expanded support for this in the near future. In version 7.16 Citrix added support for the H.265 codec. This codec is awesome in many ways, as it can massively reduce bandwidth consumption by leveraging a new intra prediction mechanism. This first implementation of H.265 is 4:2:0 only for now (and only when you have a supported GPU), but I hope to see more options in the near future.
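
The idea behind using the video codec only for active regions can be sketched in a few lines. This is purely my own illustration and not how Citrix implements it: the tile names, the per-tile fingerprints and the 0.5 threshold are all hypothetical. Tiles that changed in at least half of the recent frames get the video codec, the rest stay on the bitmap codec:

```python
def count_changes(frames):
    """Count, per tile, how often its content differs between consecutive frames.

    `frames` is a list of dicts mapping a tile name to some fingerprint of
    that tile's pixels (hypothetical representation).
    """
    counts = {tile: 0 for tile in frames[0]}
    for previous, current in zip(frames, frames[1:]):
        for tile in counts:
            if previous[tile] != current[tile]:
                counts[tile] += 1
    return counts


def choose_codecs(change_counts, window, threshold=0.5):
    """Pick 'video' for tiles that change often, 'bitmap' for the rest."""
    return {
        tile: "video" if count / window >= threshold else "bitmap"
        for tile, count in change_counts.items()
    }


if __name__ == "__main__":
    # Five captured frames: the player tile changes every frame,
    # the text tile changes once (for example after scrolling).
    frames = [
        {"player": 1, "text": "a"},
        {"player": 2, "text": "a"},
        {"player": 3, "text": "a"},
        {"player": 4, "text": "b"},
        {"player": 5, "text": "b"},
    ]
    counts = count_changes(frames)
    print(choose_codecs(counts, window=len(frames) - 1))
    # {'player': 'video', 'text': 'bitmap'}
```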

All 3 protocols have ways to handle network congestion and changes in available bandwidth; vendors often refer to this with the term “adaptive transport”. They have all standardized on UDP by now as the primary transport protocol for sending remote graphics down the wire.

Video codec comparison between RDP, HDX & Blast

Ok, let’s continue with the comparison. As you might understand from reading the above, it’s not easy to perform direct comparisons. You could just compare the default settings with each other and say how good one is and the other not, or vice versa. But in my opinion a comparison only makes sense when you compare apples with apples. Because this comparison only focuses on the video codec, I decided to compare the AVC\H.264 codecs with each other. In this comparison the H.264 codec is used for the entire screen and offloaded to the GPU (NVENC). There is one exception and that’s RDP, because we can’t turn off the AVC 4:4:4 mode. Please find the video codecs that are compared in the below table:

Important aspects of the test environment:

  • Agents on Windows 10 with identical specs running on the same hardware placed in a Datacenter in The Netherlands
  • Connection over internet directly to the agent (no connection brokers or gateways)
  • Connection from the same client with same internet conditions and latest client versions
  • All 3 protocols are UDP enabled
  • HDX and Blast use NVENC (Entire screen H.264)
  • Default encoder quality levels are used
  • Test is performed multiple times to verify results
  • Remote Display Analyzer is used to verify display settings and behavior

I separated the RDP comparison movie from the others to highlight even more that the operating mode differs between the video codecs (HDX\Blast operate in 4:2:0 mode in this comparison and RDP operates in 4:4:4 mode).
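
To give an idea of why the two modes aren’t an apples with apples comparison, you can count the raw samples the encoder has to deal with per frame. This is simple arithmetic and not a measurement: it only looks at the uncompressed input (one luma plane plus two chroma planes), assumes even frame dimensions, and says nothing about the compressed bitrate a specific protocol ends up using. The function name is my own:

```python
def samples_per_frame(width: int, height: int, mode: str) -> int:
    """Raw samples in one frame: one luma plane plus two chroma planes."""
    luma = width * height
    if mode == "4:4:4":
        # Both chroma planes are kept at full resolution
        chroma = 2 * luma
    elif mode == "4:2:0":
        # Both chroma planes are halved horizontally and vertically
        chroma = 2 * (width // 2) * (height // 2)
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return luma + chroma


if __name__ == "__main__":
    full = samples_per_frame(1920, 1080, "4:4:4")
    sub = samples_per_frame(1920, 1080, "4:2:0")
    print(full, sub, full / sub)  # 6220800 3110400 2.0
```

So before any compression happens, a 4:4:4 frame carries twice as many samples as the same frame in 4:2:0.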

Please note that this video codec comparison was made for indication purposes only; in reality users don’t often play full screen videos (although they may do so from time to time). The comparison is based on a manual testing approach using screen recording software, which also applies its own compression to the end result. This affects the frame quality of the resulting movie. It is not comparable with a professional way of benchmarking with, for example, frame grabbers. For this type of benchmark I recommend using REX Analytics (developed by co-members of TeamRGE). REX Analytics also leverages Remote Display Analyzer for remote display protocol insights.

The results of the above comparison are displayed below (please note that the Nvidia GPU utilization counter was turned off in this build of RDA, which is why it shows 0%. The total CPU time consumed is the cumulative %CPU usage of the encoder process).

 

My observations and conclusions from the HDX and Blast (pure H.264) video codec comparison:

  • The available bandwidth detected varies between the display protocols. It seems they use different mechanisms for calculating the available bandwidth. It would be interesting to dive deeper into this. This also made me think about adding an average available bandwidth counter to RDA, so there is an overall indication of the average available bandwidth during a specific run time
  • I observed higher frame compression while playing a full screen movie with HDX compared to Blast. This might be one of the reasons for the lower bandwidth usage compared to Blast and the higher number of frames sent to the client (although the difference is not really shocking over a full screen movie lasting 1:30 minutes)
  • With the default encoder quality settings, I observed a sharper frame quality with Blast, but at the cost of more bandwidth and CPU resources compared to HDX
  • Based on the above observations it’s key to compare frame quality to each other and also compare this with a local situation!
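
An average available bandwidth counter like the one mentioned in the first point could be as simple as a running mean that is updated with every new reading. This is only a sketch of the idea and not how RDA does or will do it; the class name is mine and the sample values are made up for the example, they are not measurements from this test:

```python
class RunningAverage:
    """Keeps a mean over all samples seen so far without storing them."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def add(self, sample: float) -> float:
        """Add one reading and return the updated average."""
        self.count += 1
        # Incremental mean: move the average towards the new sample
        self.mean += (sample - self.mean) / self.count
        return self.mean


if __name__ == "__main__":
    average = RunningAverage()
    for reading_mbps in (80.0, 120.0, 100.0):  # made-up readings
        print("average so far:", average.add(reading_mbps))
```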

Below is an indication movie of the RDP AVC Video codec operating in 4:4:4 mode:

You can find the result below:

My observations and conclusions from the RDP video codec in the above test:

  • RDP can’t really be compared with the others in this video codec comparison because it operates in 4:4:4 mode by default. Perceived quality was very good, but high bandwidth usage and some lagging were observed, as you can see in the above movie
  • Really sharp images and text; I perceived a near local experience with this video codec implementation. Great for office work with a lot of text reading
  • RDP leverages as much bandwidth as needed to achieve the best quality. I’m not sure what is best: a protocol that by itself uses as little bandwidth as possible to give the user the best experience, or a protocol that uses as much bandwidth as needed to achieve the best experience. There should be some form of bandwidth limit or throttling per RDP session to prevent pipe exhaustion or noisy neighbors. Scalability tests should point this out; an interesting topic for future testing
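
To make the idea of a per-session bandwidth limit a bit more concrete: a common general technique for this is a token bucket. The sketch below is my own illustration of that technique and not something RDP offers as far as I know; the class name, the rate and the burst size are hypothetical, and the time is passed in explicitly to keep the example predictable:

```python
class TokenBucket:
    """Allows bursts up to `burst_bytes`, refilled at `rate_bytes_per_s`."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0            # timestamp of the previous call, in seconds

    def allow(self, size_bytes: float, now: float) -> bool:
        """Return True if a frame of `size_bytes` may be sent at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never above the bucket capacity
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


if __name__ == "__main__":
    session = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
    print(session.allow(1200, now=0.0))  # True, the bucket starts full
    print(session.allow(1200, now=0.5))  # False, only 800 tokens available
    print(session.allow(1200, now=1.0))  # True, refilled to 1300 tokens
```

A frame that is not allowed would have to wait or be sent at lower quality; which of the two is better for the user experience is exactly the kind of question scalability tests should answer.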

Overall conclusion

The above remote protocols aren’t built for remote streaming and remote gaming scenarios (although video and graphical content is increasing). They are built to accommodate a mix of different types of user profiles. That’s why I think it’s very important to have options for these different types of users, because one size doesn’t fit all. The display protocol is becoming more intelligent: it can switch codecs and settings by itself based on the content. It would be interesting to see if this will also happen based on the user type or even per application. Because this is not the case for now, it’s good to have options!

For now I would say HDX has an advantage over the others because it has the most options available to accommodate a lot of different remote display scenarios. As you can see in the above results, the difference between HDX and Blast is not that big when you compare the pure H.264 codec implementations with each other. They both performed really well in this test. The video codec implementation in RDP didn’t disappoint me either; the bandwidth usage did, but the experience was really good.

It will be very interesting to see how they all keep progressing in the near future. I expect to do more 4:4:4 mode comparisons when this becomes more mainstream. Remote display protocols are still awesome technology with a lot of advantages, possibilities and use cases.

This blogpost comes without warranty of any kind and observations are based on my own interpretation. Always test and analyze by yourself with a workload that is similar to reality for your environment!