Monday, May 26, 2014

On several mainstream hardware decoding methods

Honestly, there isn't much to say about hardware decoding: a beginner can just close their eyes and blanket-apply pure DXVA hardware decoding to everything. But there are always a few people who agonize over whether that is high-end enough, which is what prompted the write-up below. Thanks also to ASkara for correcting parts of it.
On the mainstream hardware decoding methods:
1. Prefer native hardware decoding (this includes LAV's DXVA2 native and PotPlayer's DXVA). Native decoding is fast, power-efficient, and runs cool: it is pure hardware decoding with no copy-back step, so efficiency is high. As long as you hit no compatibility problems and none of the restrictions below apply to you, use native decoding, no questions asked.
In practice, compatibility problems are rare these days (the technology keeps maturing), and native also has the lowest resource usage and power draw while being very fast. Its drawback is the long list of restrictions; whenever you can choose it, choose it.
A few of those restrictions: under the DXVA 2.0 spec, XP is not supported; on Vista and later you cannot use a non-EVR renderer (madVR later added native hardware decoding support); and you cannot use post-processing filters. A sketch of what "native" means in API terms follows this list.
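
To make the "no copy-back" point concrete, here is a minimal sketch of what native DXVA2 decoding looks like through FFmpeg's hwaccel API. This is my own illustration, not LAV's or PotPlayer's actual code, and the helper names (pick_dxva2, open_native_dxva2) are made up; the point is only that decoded frames stay in video memory as DXVA2 surfaces.

```c
/* Minimal sketch, assuming FFmpeg's hwaccel API (not LAV/PotPlayer code):
 * with native DXVA2 the decoder is handed a DXVA2 device, and decoded
 * frames stay in VRAM as AV_PIX_FMT_DXVA2_VLD surfaces -- no copy-back. */
#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext.h>

/* Hypothetical helper: tell the decoder to prefer the DXVA2 surface format. */
static enum AVPixelFormat pick_dxva2(AVCodecContext *ctx,
                                     const enum AVPixelFormat *fmts)
{
    for (const enum AVPixelFormat *p = fmts; *p != AV_PIX_FMT_NONE; p++)
        if (*p == AV_PIX_FMT_DXVA2_VLD)  /* GPU surface, never leaves VRAM */
            return *p;
    return fmts[0];                      /* no DXVA2 offered: software path */
}

/* Hypothetical helper: attach a DXVA2 device to an opened codec context. */
int open_native_dxva2(AVCodecContext *dec_ctx)
{
    AVBufferRef *hw_dev = NULL;
    /* Fails on XP or without a capable driver -- the restrictions above. */
    int err = av_hwdevice_ctx_create(&hw_dev, AV_HWDEVICE_TYPE_DXVA2,
                                     NULL, NULL, 0);
    if (err < 0)
        return err;
    dec_ctx->hw_device_ctx = av_buffer_ref(hw_dev);
    dec_ctx->get_format    = pick_dxva2;
    av_buffer_unref(&hw_dev);
    return 0;
}
```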

Next, when to pick each of the three hybrid hardware decoding modes, and their pros and cons:
2. copyback. On high-bitrate, low-framerate video its decode speed is roughly on par with native, and in a few cases even slightly faster, but in every other respect it loses to native, and its CPU usage and performance cost are somewhat higher. Hybrid decoding exists in the first place because of pure hardware decoding's embarrassing restrictions, above all the ban on post-processing filters. You have probably never played with ffdshow's powerful post-processing filters, and now that everyone is hooked on madVR, post-processing filters get little attention anyway. Many people now doubt whether copyback really is fast and power-saving, because enabling it feels the same as not enabling it: CPU usage barely drops. The reason copyback feels so underwhelming is its copy-back stage (PotPlayer's Chinese UI calls it 回写储存器): after DXVA decoding, the frame in video memory is copied back to system memory, which costs a great deal of efficiency (see the sketch after this item). Also, because it is VLD-based hardware decoding, old AMD cards cannot use it for MPEG-2; its compatibility is comparatively poor; and finally, it does not support XP...
Quoting ASkara (23:41:29):
"It was the last one to be made; NEV only grudgingly implemented it back then because AMD users demanded it so strongly."
Pros: it is free of native DXVA's many restrictions, so you can use non-EVR renderers, post-processing filters, and so on. In raw decode speed it sometimes even beats native, something neither NVIDIA's hybrid decoding nor intel quicksync can match. Cons: higher CPU usage and performance cost, and it struggles with high-framerate video. Compatibility is relatively poor (though, as noted above, compatibility problems rarely come up nowadays, so don't worry about it).
Restriction: no XP support.
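
And to show exactly where copyback loses its efficiency, here is that one stage in isolation, again sketched with FFmpeg's API under the assumption that the decoder produced a DXVA2 surface as in the sketch under item 1 (copy_back is my name, not LAV's):

```c
/* Sketch of the copy-back stage only (FFmpeg API; LAV's internals differ).
 * The GPU surface is read back over the bus into an NV12 frame in system
 * memory so that software filters can touch it -- this transfer is the
 * efficiency cost described above. */
#include <libavutil/frame.h>
#include <libavutil/hwcontext.h>

int copy_back(const AVFrame *hw_frame, /* AV_PIX_FMT_DXVA2_VLD surface   */
              AVFrame *sw_frame)       /* receives NV12 in system memory */
{
    sw_frame->format = AV_PIX_FMT_NV12;
    /* The 回写储存器 ("write back to memory") step: VRAM -> system RAM. */
    return av_hwframe_transfer_data(sw_frame, hw_frame, 0);
}
```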

3. NVIDIA hybrid decoding (PotPlayer's CUDA and LAV's CUVID) greatly reduces the CPU's load and is one of the few hardware decoding paths that still supports XP. Compatibility is better than copyback's. Decode speed depends mainly on how fast the NVIDIA VP engine is; overall it is slower than Intel's Quick Sync (not to be confused with the quicksync mode below) but faster than AMD's UVD. If you run the odd combination of a Xeon plus an NVIDIA card, with no iGPU, so the NVIDIA card is your only way to hardware-decode video, it is worth considering; likewise when you use ffdshow raw video filter plus SVP frame interpolation. According to ASkara, CUDA is a project NVIDIA is pushing hard, and cuvid is the one of the three hybrid modes that NEV values most.
Pros: good compatibility, XP support, and the lowest CPU load of the three hybrid modes, without the other two modes' weakness on high-framerate video, which is why so many SVP users prefer NVIDIA hybrid decoding (the NVIDIA and Intel paths are contrasted in a sketch after item 4). Cons: more heat and higher power consumption.
Restriction: NVIDIA discrete GPUs only.

4.intel quicksync
intel quicksync is, in effect, the other fallback for Intel hardware decoding. If your iGPU supports it and you need to break native pure hardware decoding's restrictions, say, to hang external post-processing filters, quicksync is the one to consider. It does not carry copyback's heavy performance cost, though of course it also cannot match those occasional cases where copyback decodes even faster than native; what it does better is the cost side. On high-bitrate, low-framerate video, quicksync consumes far less CPU than copyback.
Also, please don't mix up the quicksync hardware decoding mode and Intel Quick Sync: one is quicksync (a hardware decoding mode in PotPlayer and LAV), the other is Quick Sync (a decoding technology developed by Intel).
Personally, I see intel quicksync as a compromise between copyback's speed and its cost: on high-framerate video its hybrid decoding performs about the same as copyback's, while on low-framerate, high-bitrate video its CPU usage and performance cost are clearly lower than copyback's, yet it lacks copyback's raw speed. So don't go claiming quicksync is outright better than copyback; rather, each has its strengths.
Restrictions: no XP support; Intel iGPUs only, HD Graphics 2000 and above; and only the last few official Intel drivers are supported (so get into the habit of updating your iGPU driver promptly).
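
Since items 3 and 4 both end up in the same decode-then-copy-back flow, one last sketch, again my own FFmpeg-based illustration rather than PotPlayer's or LAV's code: the NVIDIA and Intel hybrid paths differ mainly in which hardware device type you open.

```c
/* Sketch (FFmpeg API, my assumption -- PotPlayer/LAV have their own code
 * paths): the NVIDIA and Intel hybrid decoders are opened the same way,
 * only the device type changes; the copy-back flow afterwards is shared. */
#include <libavutil/hwcontext.h>

/* Hypothetical helper returning a decode device, or NULL if unsupported. */
AVBufferRef *open_hybrid_device(int use_nvidia)
{
    AVBufferRef *dev = NULL;
    enum AVHWDeviceType type = use_nvidia
        ? AV_HWDEVICE_TYPE_CUDA  /* NVDEC/CUVID: NVIDIA discrete GPUs only */
        : AV_HWDEVICE_TYPE_QSV;  /* Quick Sync: Intel iGPU + recent driver */
    if (av_hwdevice_ctx_create(&dev, type, NULL, NULL, 0) < 0)
        return NULL;             /* no suitable GPU or driver present */
    return dev;
}
```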


Note: the native pure hardware decoding discussed above assumes the acceleration level is maxed out at VLD; anything below VLD is out of scope.

Saturday, May 24, 2014

A roundup of remarks by the madVR author (compiled)

I've seen many comments about HDMI 1.3 DeepColor being useless, about 8bit being enough (since even Blu-Ray is only 8bit to start with), about dithering not being worth the effort etc. Is all of that true?
It depends. If a source device (e.g. a Blu-Ray player) decodes the YCbCr source data and then passes it to the TV/projector without any further processing, HDMI 1.3 DeepColor is mostly useless. Not totally, though, because the Blu-Ray data is YCbCr 4:2:0 which HDMI cannot transport (not even HDMI 1.4). We can transport YCbCr 4:2:2 or 4:4:4 via HDMI, so the source device has to upsample the chroma information before it can send the data via HDMI. It can either upsample it in only one direction (then we get 4:2:2) or into both directions (then we get 4:4:4). Now a really good chroma upsampling algorithm outputs a higher bitdepth than what you feed it. So the 8bit source suddenly becomes more than 8bit. Do you still think passing YCbCr in 8bit is good enough? Fortunately even HDMI 1.0 supports sending YCbCr in up to 12bit, as long as you use 4:2:2 and not 4:4:4. So no problem.
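
To make his bitdepth point concrete, a toy example of my own (not madshi's): the simplest 4:2:0 to 4:2:2 upsample interpolates a new chroma sample between two existing 8bit ones, and the midpoint already needs more than 8 bits.

```c
/* Toy illustration (mine, not madshi's): interpolating between two 8bit
 * chroma samples produces a value that no longer fits in 8 bits. */
#include <stdio.h>

int main(void)
{
    unsigned char a = 100, b = 101;      /* neighbouring 8bit Cb samples  */
    double mid = (a + b) / 2.0;          /* new sample halfway in between */
    printf("interpolated chroma  = %.1f (not representable in 8bit)\n", mid);
    printf("exact on a 9bit scale = %d of 510\n", a + b);  /* 201: lossless */
    return 0;
}
```
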
But here comes the big problem: Most good video processing algorithms produce a higher bitdepth than you feed them. So if you actually change the luma (brightness) information, or if you even convert the YCbCr data to RGB, the original 8bit YCbCr 4:2:0 mutates into a higher bitdepth data stream. Of course we can still transport that via HDMI 1.0-1.2, but we will have to dumb it down to the maximum bitdepth HDMI 1.0-1.2 supports.
For us HTPC users it's even worse: The graphics cards do not offer any way for us developers to output untouched YCbCr data. Instead we have to use RGB. Ok, e.g. in ATI's control panel with some graphics cards and driver versions you can activate YCbCr output, *but* it's rather obvious that internally the data is converted to RGB first and then later back to YCbCr, which is usually not a good idea if you care about max image quality. So the only true choice for us HTPC users is to go RGB. But converting YCbCr to RGB increases bitdepth. Not only from 8bit to maybe 9bit or 10bit. Actually YCbCr -> RGB conversion gives us floating point data! And not even HDMI 1.4 can transport that. So we have to convert the data down to some integer bitdepth, e.g. 16bit or 10bit or 8bit. The problem is that doing so means our precious video data is violated in some way. It loses precision. And that is where dithering comes to the rescue. Dithering allows us to "simulate" a higher bitdepth than we really have. Using dithering means that we can go down even to 8bit without losing too much precision. However, dithering is not magic; it works by adding noise to the source. So the preserved precision comes at the cost of increased noise. Fortunately, thanks to film grain, we're not too sensitive to fine image noise. Furthermore, the amount of noise added by dithering is so low that the noise itself is not really visible. But the added precision *is* visible, at least in specific test patterns (see image comparisons above).
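
Here is a small sketch of mine illustrating both halves of that paragraph: the YCbCr to RGB matrix (standard BT.709 limited-range coefficients, not madVR's code) yields floating point values, and a crude +-0.5 LSB random dither, far simpler than anything madVR ships, preserves some of that precision when rounding back to 8bit.

```c
/* Illustration (mine): YCbCr -> RGB gives floats; random dithering keeps
 * some precision when quantizing back to 8bit integers. BT.709 limited
 * range coefficients; the dither is the crudest "random" kind. */
#include <stdio.h>
#include <stdlib.h>

static unsigned char dither_to_8bit(double v)
{
    double noise = (double)rand() / RAND_MAX - 0.5;  /* +-0.5 LSB of noise */
    double d = v + noise;
    if (d < 0.0)   d = 0.0;
    if (d > 255.0) d = 255.0;
    return (unsigned char)(d + 0.5);
}

int main(void)
{
    int y = 120, cb = 100, cr = 150;  /* one 8bit limited-range YCbCr pixel */
    double r = 1.1644 * (y - 16)                       + 1.7927 * (cr - 128);
    double g = 1.1644 * (y - 16) - 0.2132 * (cb - 128) - 0.5329 * (cr - 128);
    double b = 1.1644 * (y - 16) + 2.1124 * (cb - 128);
    printf("exact RGB = %.3f %.3f %.3f  <- floating point, as he says\n",
           r, g, b);
    printf("dithered  = %u %u %u\n",
           dither_to_8bit(r), dither_to_8bit(g), dither_to_8bit(b));
    return 0;
}
```
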
So does dithering help in real life situations? Does it help with normal movie watching?
Well, that is a good question. I can say for sure that in most movies in most scenes dithering will not make any visible difference. However, I believe that in some scenes in some movies there will be a noticeable difference. Test patterns may exaggerate, but they rarely lie. Furthermore, preserving the maximum possible precision of the original source data is for sure a good thing, so there's not really any good reason to not use dithering.

So what purpose/benefit does HDMI DeepColor have? It will allow us to lower (or even totally eliminate) the amount of dithering noise added without losing any precision. So it's a good thing. But the benefit of DeepColor over using 8bit RGB output with proper dithering will be rather small.

I do sympathize with your general sentiment. However, I think I need to put this a bit into perspective:
AFAIK, before madVR existed, nobody in the HTPC world used dithering *at all* for video. madVR introduced this concept. And still today, I believe most consumer electronics devices don't use dithering. The Lumagen Radiance is the only exception I know, and it applies simple random dithering. Even the eeColor calibration box does *not* do dithering at all. Practically this means even the simplest and most ugly form of dithering is already significantly better than what you'd get by using typical consumer electronics devices.
Skip forward to v0.87.6 and not only do you get simple random dithering, but you get dynamic ordered dithering with a specially optimized 32x32 dither pattern. You also get error diffusion, which to the best of my knowledge nobody else on this planet is using for video playback. And you don't just get normal Floyd-Steinberg error diffusion with its worm artifacts, but specially optimized algorithms which totally get rid of worm artifacts while still keeping noise down nicely. And you get "colored noise", which is a special algorithm to reduce luma noise. And you get dynamic error diffusion, which is also something nobody else uses, AFAIK. So basically the dithering algorithms you get in v0.87.6 are already miles ahead of every other video playback product.
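
For reference, this is the textbook Floyd-Steinberg he is measuring against: a plain sketch of my own (complete with the worm-artifact tendency his optimized variants remove), quantizing a 16bit grayscale plane to 8bit while diffusing each pixel's rounding error onto its unprocessed neighbours.

```c
/* Plain Floyd-Steinberg error diffusion (the classic algorithm, not
 * madVR's optimized variants): quantize 16bit grayscale to 8bit and
 * push each pixel's rounding error onto neighbours not yet processed. */
#include <stdint.h>
#include <stdlib.h>

void floyd_steinberg_16to8(const uint16_t *src, uint8_t *dst, int w, int h)
{
    float *buf = malloc(sizeof(float) * w * h);  /* image + carried error */
    if (!buf)
        return;
    for (int i = 0; i < w * h; i++)
        buf[i] = src[i] / 257.0f;                /* 0..65535 -> 0..255 */

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            float old = buf[y * w + x];
            int   q   = old < 0.0f ? 0 : old > 255.0f ? 255 : (int)(old + 0.5f);
            float err = old - q;                 /* precision we would lose */
            dst[y * w + x] = (uint8_t)q;
            /* classic 7/16, 3/16, 5/16, 1/16 distribution */
            if (x + 1 < w)     buf[y * w + x + 1]       += err * 7 / 16;
            if (y + 1 < h) {
                if (x > 0)     buf[(y + 1) * w + x - 1] += err * 3 / 16;
                               buf[(y + 1) * w + x]     += err * 5 / 16;
                if (x + 1 < w) buf[(y + 1) * w + x + 1] += err * 1 / 16;
            }
        }
    }
    free(buf);
}
```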

Now we're talking about linear light dithering, which is another improvement over what v0.87.6 already offers. It's a concept I've rarely seen addressed on the internet yet. So basically we're not resting on what is already better than what everybody else uses, but we continue chasing the best possible solution, which is good.
But here comes the catch: With real video content in many scenes many users already see no difference with dither turned on/off. And that difference is much bigger than the difference between random dithering and error diffusion. And that difference again is much bigger than the difference between gamma light error diffusion and linear light error diffusion. And that difference again is much bigger than the difference between various linear light error diffusion curves. Basically every step we take increases image quality by a much smaller amount than the previous step. So we're really talking about incredibly small differences here, especially at 8bit. So forgive me if I'm not willing to buy a 0.000000000000000001% quality improvement with a lot of extra development time, a more complicated settings dialog and the average user becoming confused by all the weird options. Usability is important to me, too.