A comparison between display protocols and codecs

Please note this article has been updated.

Intro

18 November I did a presentation on E2Evc in Barcelona with Rasmus Raun-Nielsen (co-member of TeamRGE) about the differences between remote display protocols and codecs. The session was late in the afternoon so there was already some beers involved which made it a very fun presentation. We summarized the major remote display protocols available in the market. Rasmus then explained what NVENC is (a way to offload the display encoder process to the GPU) and which Nvidia cards supports this technology. I continued with an overview of the available codecs for each protocol and showed the latest Remote Display Analyzer (RDA) 2.0 version (which will be released soon!).

Before the presentation I performed some comparisons with RDA 2.0 in combination with the video codecs found in Microsoft RDP, Citrix HDX and VMware Blast. This blogpost is a small recap of the presentation. Please note that this comparison is an independent view and that conclusions are based on my own observations.

Before I continue: there is some fast progress being made in this space, so it’s important to include the version numbers which are used in this blogpost (I prefer to post updated blogs later instead of updating the same one). The versions used are in the table below:

Codecs used by this protocols

Basically all of the above display protocols consists of 2 types of codecs:

  • Bitmap codecs (JPG\PNG\BMP)
  • Video codecs (H.264\AVC & H.265)

While the bitmap codecs are great for text and static content, the video codecs are more efficient for the more graphic intense workloads. The video codec is based on the industry standard H.264 codec (which we also find in video streaming services like Netflix and YouTube). The use of this codec brings 2 other great advantages:

  • The fast majority of clients have video codec decoding capabilities
  • GPU can be used to accelerate and optimize the encoding process (like Nvidia NVENC)

This results in a very bandwidth efficient way to deliver remote display content to the client, but it comes with a down side: The video codec is by default not optimized for text, because it uses a 4:2:0 chroma subsampling algorithm. This breaks down the quality of text and users often perceive this as blurry (this depends on the user and type of workload, I have seen users working with it every day without complaining). But for text to become really sharp an optimization technique on top of 4:2:0 is needed or a codec operating in pure 4:4:4 mode. This is where the display protocols differentiate from each other like you can see in the below table:

RDP
With RDP you basically have one option when it comes down to the video codec and that is the AVC codec in 4:4:4 mode. It’s enabled by default in the latest Windows versions and you can switch back to the previous codec combination with policies. You can’t really configure this AVC 4:4:4 codec, for example you can’t switch to 4:2:0 mode or change quality levels manually. I must say that the latest RDP versions improved a lot compared to the older RDP versions found in previous Windows versions. While the protocol is still quite bandwidth intense, it offers a very good remote experience. The AVC 4:4:4 mode really delivers a very sharp (near local) experience. RDP has it’s own hardware and software encoding technology and don’t support encoding capabilities like NVENC for their video codec implementation, hopefully they will leverage capabilities offered by mainstream GPU manufactures like Nvidia in the near future. RDP together with RDSH (and the upcoming modern infrastructure) is a very cost effective remote display solution, especially with the latest RDP improvements.

Blast
Blast is a fairly new protocol compared to the others and it’s impressive to see how quick they are making progress with it. As you can see in the above table they do not yet offer a technique to achieve 4:4:4 quality for their video codec. They might be waiting for GPU’s to natively support 4:4:4 so they don’t have to build a software based optimization mechanism for this. They also have a bitmap encoder which you can switch on (Blast uses H.264 by default), but a combination of this codecs is not yet possible. Time will tell if they are focusing on a combination of codecs or optimizing their video codec for 4:4:4 quality. I think the latter makes more sense because currently their bitmap encoder isn’t that optimized compared to the others. Nevertheless I was impressed by the performance of their H.264 codec, as you can see in the comparison below.

HDX
As you can see in the above table HDX has 2 combinations available to address the H.264 4:2:0 limitation. 1: The video codec with a text optimization technique (which is software based and done in CPU). 2: A combination of the bitmap and video codec, this is also known as the “Use Video codec for active regions” setting. For Citrix this makes sense because their bitmap encoder is really optimized throughout the years. In this combination of codecs the video codec is only used for parts of the screen that are changing rapidly, like playing a video for example. I wrote about the differences between the display modes available in HDX here. But there are some limitations, for example you can’t use NVENC in combination with this optimization techniques as of today. I hope to see expanded support for this in the near future. In version 7.16 Citrix has added support for the H.265 codec, this codec is so awesome in many ways as it can massively reduce bandwidth consumption by leveraging a new intra prediction mechanism. This first implementation of H.265 is 4:2:0 only for now (and only when you have a supported GPU), but hope to see more available options in the near future.

All 3 protocols have ways to handle network congestion’s and changes in available bandwidth, they often refer to this with the term “adaptive transport”. They all standardized on UDP by now as the primary transport protocol for sending remote graphics down the wire.

Video codec comparison between RDP, HDX & Blast

Ok let’s continue with the comparison. As you might understand by reading the above it’s not easy to perform direct comparisons. You can just compare the default settings to each other and say how good one is and the other not or vice versa. But in my opinion a comparison only makes sense when you compare apples with apples. Because this comparison will only focus on the video codec I decided to compare the AVC\H.264 codecs to each other. In this comparison the H.264 codec is used for the entire screen and offloaded to the GPU (NVENC). There is one exception and that’s for RDP because we can’t turn off the AVC 4:4:4 mode. Please find the video codecs that are compared in the below table:

Important aspects of the test environment:

  • Agents on Windows 10 with identical specs running on the same hardware placed in a Datacenter in The Netherlands
  • Connection over internet directly to the agent (no connection brokers or gateways)
  • Connection from the same client with same internet conditions and latest client versions
  • All 3 protocols are UDP enabled
  • HDX and Blast use NVENC (Entire screen H.264)
  • Default encoder quality levels are used
  • Test is performed multiple times to verify results
  • Remote Display Analyzer is used to verify display settings and behavior

I separated the RDP comparison movie from the others to highlight even more that the operating mode is different between the video codecs (HDX\Blast operates in 4:2:0 mode in this comparison and RDP operates in 4:4:4 mode).

Please note that this video codec comparison was made for indication purposes only, it’s not reality that users play full screen videos often (while they can do this from time to time). The comparison is based on a manual testing approach while using screen recording software which will also do it’s own compression on the end result. This will affect the frame quality of the resulting movie. This is not comparable with a professional way of benchmarking with for example frame grabbers. For this type of benchmarks I recommend using REX Analytics (developed by co-members of TeamRGE). REX Analytics also leverages Remote Display Analyzer for remote display protocol insights.

The results of the above comparison is displayed below (please note that the Nvidia GPU utilization counter was turned off in this build of RDA, so that’s the reason it shows 0%. The total CPU time consumed is a cumulative of the %CPU usage of the encoder process).

 

My observations and conclusions from the HDX and Blast (pure H.264) video codec comparison:

  • The available bandwidth detected varies between the display protocols. It seems they use a different mechanism of calculating the available bandwidth. It’s interesting to dive deeper into this. This also made me think to add an average available bandwidth counter in RDA so there is an overall indication of the average available bandwidth during a specific run time
  • Observed higher frame compression while playing a full screen movie with HDX compared to Blast, this might be one of the reasons the lower bandwidth usage compared to Blast and the higher amount of frames send to the client (although the difference is not really shocking over a 1:30 minute during full screen movie)
  • With the default encoder quality settings, I observed a sharper frame quality with Blast. But at the cost of more bandwidth and CPU resources compared to HDX
  • Based on the above observations it’s key to compare frame quality to each other and also compare this with a local situation!

Below is an indication movie of the RDP AVC Video codec operating in 4:4:4 mode:

You can find the result below:

My observations and conclusions from the RDP video codec in the above test:

  • RDP can’t really be compared with the others in this video codec comparison because it operates in 4:4:4 mode by default. Perceived quality was very good but high bandwidth usage and some lagging was observed as you can see in the above movie
  • Really sharp images and text, perceived a near local experience with this video codec implementation. Great for office work with a lot of text reading
  • RDP leverages as much bandwidth as needed to achieve the best quality. Not sure what is best: A protocol that’s from itself as low on bandwidth as possible to give the user the best experience or a protocol that uses as much bandwidth needed to achieve the best experience. There should be some form of bandwidth limit or throttling per RDP session to prevent pipe exhaustion or noisy neighbors.  Scalability tests should point out, interesting topic for future testing

Overall conclusion

The above remote protocols aren’t build for remote streaming and remote gaming scenarios (while video and graphical content is increasing). They are build to accommodate a mix of different type of user profiles. That’s why I think it’s very important to have options based on this different type of users, because ones size doesn’t fit all. The display protocol is becoming more intelligent, it can switch codecs and settings by itself based on different content, it would be interesting to see if this will also happen based on the user type or even per application. Because this is not the case for now it’s good to have options!

For now I would say HDX has an advantage over the others because it has the most options available to accommodate a lot of different remote display scenarios. As you can see in the above results the difference between HDX and Blast is not that big when you compare the pure H.264 codec implementations to each other. They both performed really well in this test. The video codec implementation in RDP didn’t disappoint me either, well the bandwidth usage does but the experience was really good.

It’s very interesting to see how they all keep progressing in the near future, expecting to do more 4:4:4 mode comparisons in the future when this is becoming more mainstream. Remote Display protocols are still awesome technology with a lot of advantages, possibilities and use cases.

This blogpost comes without warranty of any kind and observations are based on my own interpretation. Always test and analyze by yourself with a workload that is similar to reality for your environment!