This is a very good HTPC video set up guide from Mai Sun’s blog:
It explains very clearly the key points of setting up HTPC video output correctly. For playback of YUV 4:2:0 encoded video, …
For an HTPC-based home theatre system, I personally prefer to divide the video calibration process into the following two steps:
- HTPC calibration: find the HTPC settings that give optimal (or at least correct) output to the display device.
- Display calibration: find the display device settings that give the best picture quality based on the input from the HTPC.
In this article I’ll try to cover HTPC calibration, that is, how to get proper output from our HTPC. There are several reasons why we want to calibrate our HTPC before our display device:
- To make sure that the input to our display is correct and to avoid over-adjusting our display.
- To avoid unnecessary conversion/adjustment. As an example, when feeding 16-235 to a 0-255 display, we need to adjust brightness/contrast to get the correct image, which is obviously not an optimal solution.
When we play a video on an HTPC, several things in the HTPC video pipeline can have an impact on picture quality:
- Source splitter
- Video decoder
- Video renderer
- Graphics card
The goal of the HTPC calibration step is to find the correct settings for all the components listed above so that they give the best possible output to our display device. I’ll first try to explain some of the theory behind the scenes and then focus on how to test the “correctness” of the output from our HTPC.
Movies are normally encoded in YUV 4:2:0 with luma range 16-235, but most graphics cards work in the RGB 4:4:4 colour space with luma range 0-255, so conversion of both chroma and luma is required before sending the video signal to our display. Besides chroma and luma resampling, resizing and de-interlacing are also important factors when comparing different decoders/renderers, and I will try to describe them from a more practical perspective. Some theory behind smooth playback and different pulldown techniques is discussed at the end of this section.
1. Luma resampling – Why are WTW and BTB important?
Luma resampling refers to expanding video levels (16-235) to PC levels (0-255). As we can easily see, the procedure produces fractional numbers, which even HDMI 1.4 cannot carry. In order to send luma through HDMI, fractional numbers are rounded to integers, as the following example shows:
| Original Luma | Expanded Luma | Rounded Result |
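As a rough sketch of the expansion and rounding described above, the usual linear mapping from 16-235 to 0-255 can be written in a few lines of Python. The function names are my own, and real decoders/renderers may round differently; this is only meant to show where the fractional values and the clipping come from.

```python
def expand_luma(y):
    """Map video-level luma (16-235) to PC level (0-255), unrounded."""
    return (y - 16) * 255 / 219


def expand_and_round(y):
    """Expand, round to the nearest integer, and clip to the 8-bit range."""
    return max(0, min(255, round(expand_luma(y))))


# Print a few rows: original luma, expanded luma, rounded result.
for y in (16, 17, 18, 19, 20, 235):
    print(f"{y:>3} | {expand_luma(y):7.3f} | {expand_and_round(y):>3}")
```

With this mapping, 19 rounds to 3 and 20 rounds to 5, so the value 4 never appears in the output, and anything below 16 or above 235 is clipped to 0 or 255.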
Obviously there are several problems with such an algorithm:
- BTB (blacker-than-black) and WTW (whiter-than-white) are cut off after luma is expanded.
- Banding is introduced (as the screenshot below shows). In the example above, the transition from 19 to 20 is mapped to 3 and 5, skipping 4. To reduce banding, we have to use a process called “dithering”, which generates artificial pixels with luma 4 in the areas between luma 3 and luma 5 in our example. Dithering is good in that it reduces banding, but bad in that it introduces artificial information which doesn’t exist in the original video.
- Some graphics cards later convert 0-255 back to 16-235 when using HDMI output, which potentially causes even more information loss.
The luma information outside the video range (below 16, referred to as BTB, and above 235, referred to as WTW) is important even though we don’t normally see it in a movie. IMO it provides the following value:
- It gives us a baseline when we calibrate the brightness and contrast of our display.
- It shows that the video luma is not cut off/expanded along the video pipeline.
- Some movies have content with luma values above 235, so it’s recommended to preserve white up to 240 in order to see it.
Finally my suggestions are:
- Avoid luma expansion/compression if possible. Ideally our HTPC outputs 0-255 without expanding luma, and our display cuts BTB and WTW.
- If luma conversion cannot be avoided (which is most cases), make sure that dithering is enabled to reduce banding. We can use FFDShow with dithering and HQ RGB conversion enabled, or MadVR, to achieve this.
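To illustrate how dithering trades banding for a little noise, here is a minimal sketch using plain random dithering. This is an assumption made for illustration only: I don’t know which dithering method FFDShow or MadVR actually use, and the helper name `dither` is mine.

```python
import math
import random


def dither(value, rng):
    """Round `value` up with probability equal to its fractional part."""
    base = math.floor(value)
    return base + (1 if rng.random() < value - base else 0)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
expanded = (20 - 16) * 255 / 219  # video-level 20 expands to about 4.66
pixels = [dither(expanded, rng) for _ in range(10_000)]

# A flat area becomes a mix of 4s and 5s whose average stays close to 4.66.
print(sorted(set(pixels)), sum(pixels) / len(pixels))
```

Plain rounding would turn the whole area into 5; with dithering the individual pixels are still integers, but their average keeps the fractional level.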
2. Chroma upsampling – Why are more bits good?
Chroma upsampling generates artificial colour information which doesn’t exist in the original video, so the results and quality differ considerably between renderers. Common renderers using different chroma algorithms are compared by Madshi here, and the results are summarized below:
MadVR is unique in using a 16-bit processing pipeline; the other renderers use only 10 or 8 bits. For example, ATI’s internal video processing pipeline uses 10 bits. Below is Madshi’s statement on why more bits matter:
“I’ve seen many comments about HDMI 1.3 DeepColor being useless, about 8bit being enough (since even Blu-Ray is only 8bit to start with), about dithering not being worth the effort etc. Is all of that true?
It depends. If a source device (e.g. a Blu-Ray player) decodes the YCbCr source data and then passes it to the TV/projector without any further processing, HDMI 1.3 DeepColor is mostly useless. Not totally, though, because the Blu-Ray data is YCbCr 4:2:0 which HDMI cannot transport (not even HDMI 1.3). We can transport YCbCr 4:2:2 or 4:4:4 via HDMI, so the source device has to upsample the chroma information before it can send the data via HDMI. It can either upsample it in only one direction (then we get 4:2:2) or into both directions (then we get 4:4:4). Now a really good chroma upsampling algorithm outputs a higher bitdepth than what you feed it. So the 8bit source suddenly becomes more than 8bit. Do you still think passing YCbCr in 8bit is good enough? Fortunately even HDMI 1.0 supports sending YCbCr in up to 12bit, as long as you use 4:2:2 and not 4:4:4. So no problem.
But here comes the big problem: Most good video processing algorithms produce a higher bitdepth than you feed them. So if you actually change the luma (brightness) information or if you even convert the YCbCr data to RGB, the original 8bit YCbCr 4:2:0 mutates into a higher bitdepth data stream. Of course we can still transport that via HDMI 1.0-1.2, but we will have to dumb it down to the max HDMI 1.0-1.2 supports.
For us HTPC users it’s even worse: The graphics cards do not offer any way for us developers to output untouched YCbCr data. Instead we have to use RGB. Ok, e.g. in ATI’s control panel with some graphics cards and driver versions you can activate YCbCr output, *but* it’s rather obvious that internally the data is converted to RGB first and then later back to YCbCr, which is usually not a good idea if you care about max image quality. So the only true choice for us HTPC users is to go RGB. But converting YCbCr to RGB increases bitdepth. Not only from 8bit to maybe 9bit or 10bit. Actually YCbCr -> RGB conversion gives us floating point data! And not even HDMI 1.3 can transport that. So we have to convert the data down to some integer bitdepth, e.g. 16bit or 10bit or 8bit. The problem is that doing that means that our precious video data is violated in some way. It loses precision. And that is where dithering comes to the rescue. Dithering allows to “simulate” a higher bitdepth than we really have. Using dithering means that we can go down to even 8bit without losing too much precision. However, dithering is not magic, it works by adding noise to the source. So the preserved precision comes at the cost of increased noise. Fortunately thanks to film grain we’re not too sensitive to fine image noise. Furthermore the amount of noise added by dithering is so low that the noise itself is not really visible. But the added precision *is* visible, at least in specific test patterns (see image comparisons above).
So does dithering help in real life situations? Does it help with normal movie watching?
Well, that is a good question. I can say for sure that in most movies in most scenes dithering will not make any visible difference. However, I believe that in some scenes in some movies there will be a noticeable difference. Test patterns may exaggerate, but they rarely lie. Furthermore, preserving the maximum possible precision of the original source data is for sure a good thing, so there’s not really any good reason to not use dithering.
So what purpose/benefit does HDMI DeepColor have? It will allow us to lower (or even totally eliminate) the amount of dithering noise added without losing any precision. So it’s a good thing. But the benefit of DeepColor over using 8bit RGB output with proper dithering will be rather small.”
Besides MadVR, which provides superb chroma upsampling quality, the YV upchroma shader inside MPC-HC seems to produce a very similar result. To use it, we need to feed NV12 (a YUV 4:2:0 pixel format with a luma plane followed by an interleaved chroma plane; despite the name it is not specific to Nvidia) to MPC-HC from our video decoder and choose EVR as the renderer.
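To see why chroma upsampling pushes the data beyond 8 bits, as the quote above argues, here is a minimal sketch that doubles the width of one chroma row with simple linear interpolation. It is not the algorithm used by MadVR or the MPC-HC shader, it ignores chroma siting, and the function name is my own.

```python
def upsample_chroma_row(row):
    """Double the width of one chroma row using linear interpolation."""
    out = []
    for i, sample in enumerate(row):
        nxt = row[min(i + 1, len(row) - 1)]  # repeat the last sample at the edge
        out.append(float(sample))            # original sample is kept
        out.append((sample + nxt) / 2)       # new sample halfway to the neighbour
    return out


# 8-bit integer input, but the interpolated samples fall between integers.
print(upsample_chroma_row([100, 101, 104]))
```

The new sample between 100 and 101 is 100.5, which cannot be stored in 8 bits without rounding or dithering.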
3. Resizing algorithms
There are also several resizing/scaling algorithms available, depending on the renderer: for example bicubic in EVR or VMR9, and nearest neighbour in overlay and VMR7. In general, bicubic provides better quality than the other scaling algorithms, which gives the EVR renderer an advantage over the VMR/overlay renderers.
A comparison of different scaling algorithms can be found here, and some results are as follows:
Since EVR, Haali and MadVR provide superior scaling algorithms, I suggest using these renderers instead of the others. If for some reason we have to stick with overlay or VMR, we should consider using FFDShow to do the scaling instead.
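As a rough illustration of how scaling algorithms differ, the sketch below resizes a single row of pixels with nearest neighbour and with a cubic (Catmull-Rom) filter. Catmull-Rom is only one member of the bicubic family and I am assuming it here for simplicity; the renderers above may use different coefficients. All function names are mine.

```python
import math


def _clamped(row, i):
    """Fetch row[i], repeating the edge samples outside the row."""
    return row[max(0, min(len(row) - 1, i))]


def resize_nearest(row, n_out):
    """Resize a row to n_out (> 1) samples by picking the nearest source pixel."""
    scale = (len(row) - 1) / (n_out - 1)
    return [_clamped(row, int(j * scale + 0.5)) for j in range(n_out)]


def resize_cubic(row, n_out):
    """Resize a row to n_out (> 1) samples with Catmull-Rom interpolation."""
    scale = (len(row) - 1) / (n_out - 1)
    out = []
    for j in range(n_out):
        x = j * scale
        i = math.floor(x)
        t = x - i
        p0, p1, p2, p3 = (_clamped(row, i + k) for k in (-1, 0, 1, 2))
        out.append(0.5 * (2 * p1 + (p2 - p0) * t
                          + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                          + (3 * p1 - p0 - 3 * p2 + p3) * t ** 3))
    return out


edge = [0, 0, 100, 100]  # a hard edge from dark to bright
print(resize_nearest(edge, 7))
print(resize_cubic(edge, 7))
```

Nearest neighbour keeps the edge as a hard step, while the cubic filter inserts an intermediate value of 50 at the edge and slightly overshoots on either side of it, which is where the sharper but sometimes “ringing” look of cubic scalers comes from.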
4. De-interlacing
De-interlacing is the process of converting interlaced video, such as common analog television signals or 1080i format HDTV signals, into a non-interlaced form. More information about it can be found here. I don’t have much knowledge in this area, and the ATI hardware de-interlacing satisfies my requirements.
5. Smooth 24P playback – How to avoid judder?
The FPS (frames per second) of different video files might not be the same; for example, BBS content is normally in 25P. However, most movies are in 24P, which is in fact 23.976 frames per second (23.976 comes from 24/1.001). In order to play such video at a 60Hz refresh rate, 3:2 pulldown is introduced. 3:2 pulldown shows the first frame 3 times and the second frame 2 times, so every 2 source frames become 5 displayed frames (24/60 = 2/5). The potential problem with 3:2 pulldown is that some frames stay on the screen longer than others, which causes noticeable judder. True 24P playback doesn’t need 3:2 pulldown, and the playback should be much smoother. In order to enable true 24P playback we need to make sure that:
- Our display device accepts 1080/24P input
- We can choose either a 23Hz or a 24Hz refresh rate on the graphics card.
When our TV or projector receives a 24P signal, it normally either does 5:5 pulldown (displays each frame 5 times) or generates intermediate frames (4 new frames between every 2 source frames). Personally I prefer creative frame generation, which is available in my Panasonic projector, but both options should give us smooth playback compared to 3:2 pulldown.
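The difference between the two cadences can be sketched in a few lines of Python. This only models whole-frame repetition as described above (not field-based telecine), and the function name `pulldown` is my own.

```python
from itertools import cycle


def pulldown(frames, cadence):
    """Repeat each source frame according to a repeating cadence."""
    out = []
    for frame, repeats in zip(frames, cycle(cadence)):
        out.extend([frame] * repeats)
    return out


film = ["A", "B", "C", "D"]
print(pulldown(film, (3, 2)))  # 60Hz display: frames alternate 3 and 2 refreshes
print(pulldown(film, (5,)))    # 5:5 pulldown: every frame gets 5 refreshes
```

With the 3:2 cadence, frames A and C stay on screen for 3 refreshes while B and D get only 2, which is the uneven timing we perceive as judder. With 5:5, every frame is held for the same time.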
I normally perform the following tests to check the output from my HTPC, and running through these tests helps me to find potential software configuration errors. Before we do these tests, we need to know the following:
- The test videos can be downloaded here. We probably need both the mp4 video files and the blu-ray versions because they normally use different video pipelines. I normally check the files first and then check blu-ray (PowerDVD) to make sure levels and colours are consistent in both pipelines. Blu-ray test discs such as DVE HD Basics can also be used for this purpose (and they come with colour filters as well).
- It’s important that we undo all brightness, contrast and colour adjustments made to the display device and use the most accurate picture mode available on it.
- Tests 1-4 are correctness tests while 5-7 are more like quality checks which can be skipped based on our requirements. For example, if we always play 1080P on 1080P display, there is no need to check resizing quality.
- The purpose of the test is to find the optimal configuration of our HTPC (by spotting obvious video problems), and it is never intended to be 100% accurate after this step. We will still need to calibrate our display device later.
1. Check pixel mapping
For a 1080P display device, it is important to make sure we get 1:1 pixel mapping from our HTPC output. We can use the single pixel patterns available in section B2 or C (see screenshots below) from “Misc Patterns” to check whether any of our devices resizes the image.
If we don’t get 1:1 pixel mapping, we need to check the resolution and overscan/underscan settings of our graphics card and display device. Test pattern 5 (see screenshot below) in “Basic Settings” can be used to detect overscan.
2. Check luma range
Our HTPC may give us different luma ranges for desktop and video depending on the settings of decoder, renderer and graphics card. Here I’d like to summarize some of the common combinations from my ATI graphics card in the table below:
| Case | Desktop | Video (main content) | BTB/WTW | Black/white | Notes |
|---|---|---|---|---|---|
| 1 | 0-255 | 0-255 | No | OK | Video expanded to 0-255. |
| 2 | 0-255 | 16-235 | Yes | Washed | Video is not expanded. |
| 3 | 16-235 | 16-235 | No | OK | Video is expanded to 0-255 RGB first, then everything is compressed to YCbCr. |
| 4 | 16-235 | 2X-22X | Yes | Washed | Video range is not expanded to 0-255 before compression. |
| 5 | 16-235 | 16-235 | Yes | OK | Desktop is compressed to YCbCr while video is output directly in YCbCr colour space without modification, which preserves BTB and WTW. |
Some notes from the table above:
- The ATI HDMI adaptor never cuts BTB/WTW; instead it only compresses the range when necessary.
- Desktop can never be more compressed than video range.
- We get washed-out black/white when video is not expanded to 0-255.
- Case 5 is the best because it gives an untouched luma range without banding while preserving BTB/WTW.
- Cases 1 and 3 give correct results but without BTB/WTW. For case 1 we need to configure our display to accept 0-255 input.
- Some users, including me, never get 0-255 output for desktop no matter what pixel format is chosen.
In this most important step we need to make sure the luma output from our HTPC is correct and the levels we get for desktop and videos are consistent (case 1, 3 or 5). For this purpose we use “Grayscale Ramp” and/or “Grayscale Steps” patterns in section A of “Misc Patterns” (see screenshots below).
For the video part, we’d like to make sure that we can see most colour transitions or steps between “Reference Black” and “Reference White”, and we should not see BTB or most of WTW (see the theory part). The following settings can be checked when we don’t get the optimal result:
- Input/output settings of video decoder;
- Output settings of renderer;
- Dynamic Range setting of CCC if DXVA is used;
- Input range selector of display device.
Besides video, we also need to check the desktop luma range to make sure it’s consistent with the video range. Any grayscale diagram like the following can help us with this check:
3. Check banding issue
We can use the same “Grayscale Ramp” video to check whether banding is introduced by the video pipeline. The following images show the result with banding (left) and the result without banding (right).
Banding is normally introduced by luma conversion, and it can be solved either by 1) eliminating luma conversion or 2) adding dithering to the luma conversion. Sadly, in most cases we have to go for (and live with) the latter option. MadVR applies dithering automatically when expanding YV12 to RGB, while FFDShow requires dithering to be enabled manually; to my knowledge both solutions work fine.
4. Check colours
This step checks that colour conversion between RGB and YCbCr is carried out correctly. For historical reasons, SD and HD follow different conversion standards: ITU-R BT.709 for HD and ITU-R BT.601 for SD. If we sometimes play SD material, we probably also need to run the colour test with ITU-R BT.601 encoded test patterns. To test the correctness of the primary colours (red, green and blue), we need either colour filters or a waveform monitor on the input signal of the display device. The test pattern “Flashing Primary Colours” from A4 in “Misc Patterns” is used in this test scenario (see screenshot below), and the idea is to look through the colour filters and make sure that we don’t see anything flashing. In practice we may still notice flashing even though the colour output from our HTPC is correct, because our display device is not calibrated yet.
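To show why the choice of matrix matters, here is a minimal sketch of the YCbCr/RGB conversion using the published BT.601 and BT.709 luma coefficients. The function names are my own, the values are left unrounded, and real decoders/renderers work with fixed-point or shader code rather than anything like this.

```python
BT601 = (0.299, 0.114)    # (Kr, Kb) luma coefficients for SD
BT709 = (0.2126, 0.0722)  # (Kr, Kb) luma coefficients for HD


def rgb_to_ycbcr(r, g, b, coeffs):
    """Full-range RGB (0-255) to video-range YCbCr, unrounded."""
    kr, kb = coeffs
    r, g, b = r / 255, g / 255, b / 255
    y = kr * r + (1 - kr - kb) * g + kb * b
    pb = (b - y) / (2 * (1 - kb))
    pr = (r - y) / (2 * (1 - kr))
    return 16 + 219 * y, 128 + 224 * pb, 128 + 224 * pr


def ycbcr_to_rgb(y, cb, cr, coeffs):
    """Video-range YCbCr to full-range RGB (0-255), unrounded."""
    kr, kb = coeffs
    yn, pb, pr = (y - 16) / 219, (cb - 128) / 224, (cr - 128) / 224
    r = yn + 2 * (1 - kr) * pr
    b = yn + 2 * (1 - kb) * pb
    g = (yn - kr * r - kb * b) / (1 - kr - kb)
    return 255 * r, 255 * g, 255 * b


red_hd = rgb_to_ycbcr(255, 0, 0, BT709)   # pure red encoded as HD
print(ycbcr_to_rgb(*red_hd, BT709))       # right matrix: back to pure red
print(ycbcr_to_rgb(*red_hd, BT601))       # wrong matrix: the colour shifts
```

Decoding HD material with the SD matrix (or the other way round) does not break the picture, but the primaries come out visibly wrong, which is what the flashing pattern helps us detect.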
5. Check resizing quality
To check resizing quality, we can upscale an SD video to 1080P and see whether the result is acceptable. Different resizing algorithms give different results, as discussed in the theory part, and I think which algorithm to prefer is really a matter of personal taste. Normally resizing/scaling is controlled by the renderer, and some renderers such as MadVR or EVR CP provide configuration options so that we can choose the algorithm we prefer. If the renderer we’re using is not configurable, we can consider using a decoder like FFDShow, which supports resizing/scaling and gives us configuration options.
6. Check tearing and judder
Tearing and judder are two different issues, but both can be tested with a video that contains a lot of camera movement. Players like MPC-HC even provide a built-in test pattern for tearing which we can use for the same purpose.
If we see constant tearing in video playback, we need to check whether Windows Aero is enabled (don’t laugh, Aero does remove tearing). Playing video in D3D full screen mode (with vertical sync on) can also solve this problem if the player supports it.
Judder, on the other hand, is often caused by 1) a mismatch between refresh rate and video FPS or 2) a dual-screen setup. It is always a good idea to use the right refresh rate for videos with different FPS. For example, with my display being able to accept 24P, 50P and 60P signals, I choose 23.976Hz (23Hz in CCC) when I play 24P, 50Hz when I play 25P, and 59.94Hz (59Hz in CCC) when I play 30P or 30i, and all of this material gives me smooth playback without any noticeable judder. In general, we get smooth playback when we set the refresh rate to a whole multiple of the FPS. If we can’t find a suitable refresh rate for a certain FPS, we should consider using ReClock to slow down or speed up the FPS.
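The “whole multiple” rule above can be sketched as a small helper that picks, from the refresh rates a display accepts, the one closest to a whole multiple of the video FPS. The function name and the list of rates are assumptions for illustration; the rates offered by a real graphics card driver will differ.

```python
def best_refresh_rate(fps, available):
    """Pick the available refresh rate closest to a whole multiple of fps."""
    def mismatch(rate):
        multiple = max(1, round(rate / fps))
        return abs(rate - multiple * fps)
    return min(available, key=mismatch)


rates = [23.976, 24.0, 50.0, 59.94, 60.0]  # hypothetical modes of a display
for fps in (23.976, 25.0, 29.97):
    print(fps, "->", best_refresh_rate(fps, rates))
```

For these inputs the helper picks 23.976Hz for 24P film, 50Hz for 25P and 59.94Hz for 29.97 FPS material, which matches the choices described above.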
7. Check lip-sync
In the HD world, video processing is normally more expensive and time-consuming than audio processing, which can put the two out of sync; we call this the lip-sync problem. To check lip-sync we can play a movie with a lot of conversation, and watch and listen carefully to see whether lip movement is out of sync with the voice. A small difference between video and audio can easily be corrected with the audio delay setting in our AVR or HTPC, but if the difference is more than 0.5 second, we should probably check the configuration of the audio/video decoder and the CPU/GPU usage.
Recommended decoder and renderer combinations
After trying many different combinations of decoders and renderers, I would like to recommend the following combinations, which give me no problems passing the calibration tests above:
- FFDShow+MadVR: Among all available renderers, MadVR produces the best chroma upsampling and scaling quality. To use this renderer, we need a player that supports it (like MPC-HC) and a software video decoder (MadVR doesn’t support DXVA yet). In FFDShow we need to choose only YV12 output and enable subtitles if we need them. The drawbacks of this combination are 1) we will not get BTB and WTW out of it; 2) it consumes a lot of CPU (FFDShow) and GPU (MadVR) resources; 3) it is not commonly supported.
- FFDShow(HQ YV12 to RGB conversion+Dithering)+EVR: FFDShow itself can also provide high quality RGB output if we enable the High quality YV12 to RGB conversion and Dithering options. HQ YV12 to RGB conversion uses an 11-bit pipeline, and dithering removes banding from the final result. The scaling of FFDShow can also be configured to give results very similar to MadVR. We’re able to use this combination in most players and media centres, but it still won’t give us BTB and WTW.
- FFDShow(YV12 output)+Overlay: On my old ATI HD4850 this combination gives me untouched YCbCr output with BTB/WTW, and the quality is the same as the protected path used in PowerDVD disc mode (case 5 in the previous table). No banding is introduced, which suggests there is no luma conversion anywhere in the pipeline, and resizing works extremely well, which was a big surprise to me. This combination is definitely my first choice; however, I can’t get it to work with my HD5770 graphics card, so I only recommend it to HD4XXX users.