Page 289 - Journal of Software (《软件学报》), 2020, No. 11

3604                                Journal of Software  软件学报 Vol.31, No.11, November 2020

                 Abstract:    Fast block-wise motion estimation based on the translational model alleviates the problem of high computational
                 complexity to some extent, but it sacrifices motion-compensation quality, whereas higher-order motion models still suffer from
                 computational inefficiency and unstable convergence. Experiments show that about 56.21% of video blocks contain zoom
                 motion, which indicates that zoom is one of the most important forms of motion in video apart from translation. Therefore,
                 a zoom coefficient is introduced into the conventional block-wise translational model by bilinear interpolation, and the
                 motion-compensated error is modeled as a quadratic function of the zoom coefficient. Subsequently, an approach is derived
                 through Vieta's theorem to compute the optimal zoom coefficient for 1D zoom motion, and it is further extended to 2D zoom
                 motion with equal proportion in both directions. On this basis, a fast block-matching motion estimation algorithm optimized
                 by the adaptive zoom coefficient is presented. It first uses the diamond search (DS) to compute the translational motion
                 vector, and then uses the adaptive zoom coefficient to determine an optimal matching block for the block to be predicted.
                 Experimental results on 33 standard test video sequences show that the motion-compensated peak signal-to-noise ratio (PSNR)
                 of the proposed algorithm is 0.11 dB and 0.64 dB higher than that of the full search (FS) and the DS based on the block-wise
                 translational model, respectively. Its computational complexity is 96.02% lower than that of the FS and slightly higher than
                 that of the DS. Compared with motion estimation based on the zoom model, the average PSNR of the proposed algorithm is
                 0.62 dB lower than that of the 3D full search, but 0.008 dB higher than that of the fast 3D diamond search, while its
                 computational complexity amounts to only 0.11% and 3.86% of that of the 3D full search and the 3D diamond search,
                 respectively. Meanwhile, the proposed algorithm achieves self-synchronization between the encoder and the decoder without
                 transmitting zoom vectors, so it does not increase the side-information overhead. Additionally, the proposed adaptive zoom
                 coefficient computation can be combined with state-of-the-art fast block-wise motion estimation algorithms other than the
                 diamond search to improve their motion-compensation quality.
                 Key words:    video coding; motion estimation; block matching; zoom model; adaptive zoom coefficient
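                 As an illustration of the quadratic error model mentioned in the abstract, the following is a minimal sketch for the 1D
                 case only. It does not reproduce the paper's closed-form derivation via Vieta's theorem. Instead, it assumes the error is
                 exactly quadratic in the zoom coefficient, samples it at three coefficients, and takes the vertex of the interpolating
                 parabola. The function names (sample_linear, zoom_error, estimate_zoom), the sampling step h, and the ramp test signal are
                 hypothetical choices made for this sketch.

```python
import math


def sample_linear(ref, pos):
    """Linearly interpolate the 1D signal `ref` at fractional position `pos`."""
    x0 = math.floor(pos)
    frac = pos - x0
    if frac == 0.0:
        return float(ref[x0])
    return ref[x0] * (1.0 - frac) + ref[x0 + 1] * frac


def zoom_error(ref, cur, centre, z):
    """Sum of squared errors between `cur` and `ref` zoomed by (1 + z) about `centre`."""
    half = len(cur) // 2
    total = 0.0
    for i, value in enumerate(cur):
        d = i - half  # signed offset of sample i from the block centre
        pred = sample_linear(ref, centre + (1.0 + z) * d)
        total += (value - pred) ** 2
    return total


def estimate_zoom(ref, cur, centre, h=0.5):
    """Vertex of the parabola through E(-h), E(0), E(h) (assumed quadratic model)."""
    e_minus = zoom_error(ref, cur, centre, -h)
    e_zero = zoom_error(ref, cur, centre, 0.0)
    e_plus = zoom_error(ref, cur, centre, h)
    curvature = e_minus + e_plus - 2.0 * e_zero
    if curvature <= 0.0:
        return 0.0  # no usable minimum: fall back to pure translation
    return -h * (e_plus - e_minus) / (2.0 * curvature)


# Hypothetical test signal: a linear ramp, for which the error is exactly quadratic.
ref = [10.0 * x for x in range(32)]
true_z = 0.25
centre = 16  # position assumed to come from a prior translational search
cur = [sample_linear(ref, centre + (1.0 + true_z) * d) for d in range(-4, 5)]

z_hat = estimate_zoom(ref, cur, centre)
print(z_hat)  # recovers true_z for this ramp signal
```

                 On real video content the error is only approximately quadratic, so a sampled fit of this kind is an approximation, and the
                 block positions read by sample_linear must stay inside the reference signal.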

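                    Motion estimation is a temporal differential prediction method adopted by video encoders such as AVS, H.265/HEVC, and
                 MPEG, and it contributes the largest share of the coding gain in the closed-loop "differential prediction + transform"
                 coding architecture [1,2]. However, after quantitatively analyzing the computational load of each coding stage,
                 Refs. [1,3,4] found that motion estimation accounts for more than 40% of the computational resources required by the whole
                 encoder. If the encoder enables advanced modes such as variable-block-size 1/8-pixel motion estimation and adaptive motion
                 vector prediction, motion estimation can consume as much as 80% of the encoder's total computational resources. Under these
                 circumstances, to achieve a more reasonable rate-distortion-complexity (R-D-C) performance, existing video coding standards
                 all adopt block-matching algorithms based on the translational model to remove the temporal redundancy produced by
                 translational object motion, and seven classes of fast video motion estimation algorithms have emerged [5].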
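                    (1)  Motion estimation based on candidate-vector subsampling: according to some principle (e.g., the center-biased
                        principle), a small number of motion vectors in the search window are selected as the candidate set, and the
                        candidate with the minimum compensation error is taken as the motion vector. Examples include UMHexagonS [6],
                        TZSearch [7], EPZS [8], and parabolic search [9].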
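                    (2)  Motion estimation based on pixel subsampling: the macroblock to be matched is downsampled with some sampling matrix
                        (e.g., a hierarchical sampling matrix [10,11], a quincunx sampling matrix [12], or an adaptive sampling
                        matrix [13]); the compensation error of each candidate vector in the search window is then computed for the
                        size-reduced macroblock to be predicted, which yields the best motion vector.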
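                    (3)  Motion estimation based on pixel pre-sorting: when computing the motion-compensated error of the macroblock to be
                        matched, a prediction strategy is used to identify the pixels in the macroblock that are likely to produce large
                        frame differences, and their motion-compensated errors are accumulated first. If the accumulated error exceeds the
                        prediction error of the current best vector, the candidate vector can be ruled out as the best motion vector. An
                        example is the PDS algorithm [14,15].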
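                    (4)  Motion estimation based on low-complexity matching functions: operations such as XOR, comparison, and absolute
                        value replace the subtraction, multiplication, and square-root operations of the mean-square-error function, which
                        reduces the number of CPU clock cycles and the hardware cost needed to compute the motion-compensated error, as in
                        Refs. [16,17].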
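                    (5)  Motion estimation based on low-bit-depth pixels: a bit-depth mapping function converts pixels of higher bit depth
                        (e.g., 12 bit and 8 bit) into low-bit-depth pixels, and the low-bit-depth representations of multiple pixels are
                        then packed into one machine word. When combined with the low-complexity matching functions of class (4), the
                        compensation errors of multiple pixels can be obtained in a single operation. Examples include 1-bit motion
                        estimation [18] and 2-bit motion estimation [19,20].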
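                    (6)  Motion estimation based on hash tables: a hash function maps the pixel values of the block to be matched to a hash
                        value, and the best matching block is then looked up through a hash table [21−23], which avoids repeated computation
                        of matching errors and can reduce the time complexity of motion estimation from quadratic order to linear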