
It's Time to Retire PSNR

Virtually all experts on video quality metrics agree that the peak signal-to-noise ratio (PSNR) metric is a poor predictor of subjective quality. However, PSNR comparisons are included in almost all codec comparisons, most recently in the excellent IEEE white paper, "Comparing VVC, HEVC and AV1 Using Objective and Subjective Assessments." It's time to retire PSNR, at least for these types of analyses.

I was reminded of PSNR's poor performance, yet again, when I started reviewing test results for a new video quality metric from the ITU called ITU-T Rec. P.1204. Taking a step back, the reason streaming producers use video quality metrics is to help make encoding decisions that improve subjective quality. For this reason, the most critical performance feature for any metric is how accurately it predicts how human eyes will rate the same video.

To assess this, researchers compile databases of videos and subjective ratings from multiple viewers. Then they score the same videos with the metric and see how those scores compare to the subjective scores. You can see three such comparisons in the graphic: on the left for P.1204.3, in the middle for PSNR, and on the right for Video Multimethod Assessment Fusion (VMAF). In all three, the vertical axis presents the results of subjective ratings, while the horizontal axis is the metric score.

If the metric score matched the subjective rating perfectly, you'd see a solid line from the lower left to the upper right. No metric is perfect, so you never see a solid line. However, the more closely packed the data points are around that line, the more accurate the predictions. Looking at the charts, P.1204.3 is the most accurate, with VMAF next, and PSNR the least accurate by far. You can see a similar graph for SSIMPLUS.

The pattern in the graphs is verified by Pearson's correlation coefficient (PCC) for each dataset. Briefly, PCC measures the linear correlation between two variables: X and Y. According to Statistics Solutions, "If the coefficient value lies between ± 0.50 and ± 1, then it is said to be a strong correlation." So despite the seeming randomness in the PSNR plot, a mathematician would say that the correlation is strong. Still, when more accurate metrics like VMAF, SSIMPLUS, and P.1204 are available, PSNR as a measure of quality is a waste of time and space.
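To make the PCC idea concrete, here is a minimal sketch in Python of how a coefficient like this could be computed for one metric against subjective ratings. The function name `pearson_cc` and the sample numbers are illustrative assumptions, not data from the tests discussed above.

```python
import math


def pearson_cc(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical values: one metric score and one mean subjective
    # rating per test clip. These are made up for illustration only.
    metric_scores = [31.2, 34.8, 36.1, 38.9, 41.5, 43.0]
    subjective_scores = [2.1, 3.4, 2.9, 3.8, 4.4, 4.1]
    print(f"PCC = {pearson_cc(metric_scores, subjective_scores):.3f}")
```

A value near 1 means the points hug the diagonal line described earlier; a value closer to 0 means the cloud is more scattered. Run per metric over the same clips, this gives a single number for comparing how well each metric tracks the subjective scores.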

Interestingly, PSNR has a "canary in a coal mine" utility, which is to identify VMAF hacking methods. That is, hacking techniques like pre-encoding sharpening and contrast adjustments can send VMAF scores through the roof, but will also send PSNR scores through the floor. If you see VMAF scores that seem excessively high, you should run a quick PSNR test to verify the hack. Even that use is waning, however, as Netflix recently introduced a no-hacking model that all codec testers should explore.
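The canary check described above can be sketched as a simple comparison of two encodes of the same clip. This is only an illustration: the function name `looks_like_vmaf_hack`, the dictionary layout, and the threshold values are assumptions chosen for the example, not published guidance, and the scores would come from whatever measurement tool you already use.

```python
def looks_like_vmaf_hack(baseline, candidate, min_vmaf_gain=3.0, min_psnr_drop=0.5):
    """Flag an encode whose VMAF rose sharply while its PSNR fell.

    `baseline` and `candidate` are dicts with "vmaf" and "psnr" (dB) keys.
    The default thresholds are arbitrary placeholders, not recommendations.
    """
    vmaf_delta = candidate["vmaf"] - baseline["vmaf"]
    psnr_delta = candidate["psnr"] - baseline["psnr"]
    return vmaf_delta >= min_vmaf_gain and psnr_delta <= -min_psnr_drop


if __name__ == "__main__":
    # Hypothetical scores for the same clip at the same bitrate.
    plain_encode = {"vmaf": 88.0, "psnr": 40.2}
    sharpened_encode = {"vmaf": 94.5, "psnr": 37.9}
    if looks_like_vmaf_hack(plain_encode, sharpened_encode):
        print("VMAF up, PSNR down: inspect the encode before trusting the VMAF score.")
```

A flag from a check like this is a prompt to look at the video itself, not proof of manipulation.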

Keep your eye out for more on P.1204.3, which is a "no-reference" metric that can compute a score without comparing the encoded file to the source. This makes it much more convenient than full-reference metrics like PSNR and VMAF. If P.1204.3 finds its way into tools like FFmpeg and the Moscow State University Video Quality Measurement Tool, this will provide even greater justification for dropping PSNR from codec comparisons.

Related Articles

How to Choose a Video Quality Metric

Jan Ozer discusses the pros and cons of three key objective quality metric tools: Moscow State University, SSIMplus, and Hybrik (Dolby).

When it Comes to Video Quality Measurements, Average Won't Cut It

Average scores can be deceiving, so be sure you're using a tool that gives you a more accurate assessment of your video quality.

Video: How to Measure Picture Quality

Tektronix Applications Engineer Andrew Scott discusses objective and subjective methods and tools for picture quality measurement in this clip from his presentation at Streaming Media West 2018.