
软件学报 ISSN 1000-9825, CODEN RUXUEW    E-mail: jos@iscas.ac.cn
Journal of Software, 2024, 35(4): 2039−2054 [doi: 10.13328/j.cnki.jos.006837]    http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved.    Tel: +86-10-62562563



Multi-person 3D Pose Estimation Using Human-and-scene Contexts*

HE Jian-Hang, SUN Jun-Yao, LIU Qiong

(School of Software Engineering, South China University of Technology, Guangzhou 510006, China)
Corresponding author: LIU Qiong, E-mail: liuqiong@scut.edu.cn

Abstract: Depth ambiguity is a major challenge for multi-person 3D pose estimation from single-frame images, and extracting image contexts has great potential for alleviating it. Most top-down approaches model key point relationships on the basis of human detection; since the human bounding box is coarse-grained and contains a large proportion of background noise, key points easily shift or mismatch, and the reliability of absolute depth estimated from the human scale factor is also degraded. Bottom-up approaches directly detect the human key points in an image and then recover each 3D human pose one by one; although they obtain the scene context explicitly, they are at a disadvantage in relative depth estimation. This study proposes a new two-branch network, in which the top-down branch extracts human contexts from key point region proposals and the bottom-up branch extracts scene contexts from 3D space. A noise-suppressed human context extraction method is proposed: a human target is described by modeling "key point region proposals", and pose-associated dynamic sparse key point relationships are modeled to prune weak connections and reduce noise propagation. A method for extracting scene contexts from a bird's-eye view is proposed: image depth features are modeled and mapped onto the bird's-eye-view plane to obtain the layout of human positions in 3D space, and a human-and-scene context fusion network is designed to predict the absolute depth of each person. Experimental results on the public datasets MuPoTS-3D and Human3.6M show that, compared with state-of-the-art models of the same kind, the proposed model HSC-Pose improves the relative and absolute 3D key point position accuracy by at least 2.2% and 0.5%, respectively, and reduces the average root key point position error by at least 4.2 mm.
Keywords: 3D pose estimation in multi-person scenes; key point region proposal; human context; scene context; human absolute depth
CLC number: TP391
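To make the two-branch design summarized in the abstract more concrete, the following minimal Python/PyTorch sketch (written for this summary, not taken from the authors' implementation; all module names, feature dimensions, and tensor layouts are assumptions) shows how a top-down human-context branch over pooled key point region features, a bottom-up scene-context branch over a bird's-eye-view layout, and a fusion head regressing absolute root depth could be wired together.

# Minimal illustrative sketch (not the authors' released code) of the two-branch
# idea summarized in the abstract: a top-down branch pools human context from
# per-person key point region proposals, a bottom-up branch summarizes scene
# context from a bird's-eye-view (BEV) layout of human positions, and a fusion
# head regresses each person's absolute root depth. All module names, feature
# sizes, and tensor layouts are assumptions made for illustration only.
import torch
import torch.nn as nn

class HumanContextBranch(nn.Module):
    """Top-down branch: aggregate features pooled from one person's key point
    region proposals into a single human-context vector."""
    def __init__(self, feat_dim=256, num_keypoints=17):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim * num_keypoints, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU())

    def forward(self, kp_region_feats):                 # (B, K, C)
        return self.mlp(kp_region_feats.reshape(kp_region_feats.size(0), -1))

class SceneContextBranch(nn.Module):
    """Bottom-up branch: encode a BEV occupancy map of human positions into a
    scene-context vector shared by every person in the image."""
    def __init__(self, bev_channels=1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(bev_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))

    def forward(self, bev_layout):                      # (B, 1, H, W)
        return self.encoder(bev_layout).flatten(1)      # (B, 64)

class AbsoluteDepthHead(nn.Module):
    """Fusion head: concatenate human and scene contexts and regress the
    absolute depth of the person's root key point."""
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(256 + 64, 128), nn.ReLU(),
                                  nn.Linear(128, 1))

    def forward(self, human_ctx, scene_ctx):
        return self.head(torch.cat([human_ctx, scene_ctx], dim=1))

if __name__ == "__main__":
    human_branch, scene_branch, head = (
        HumanContextBranch(), SceneContextBranch(), AbsoluteDepthHead())
    kp_feats = torch.randn(4, 17, 256)   # pooled key point region features
    bev = torch.randn(4, 1, 64, 64)      # BEV layout of human positions
    depth = head(human_branch(kp_feats), scene_branch(bev))
    print(depth.shape)                   # torch.Size([4, 1])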

Citation (Chinese): 何建航, 孙郡瑤, 刘琼. 基于人体和场景上下文的多人3D姿态估计. 软件学报, 2024, 35(4): 2039–2054. http://www.jos.org.cn/1000-9825/6837.htm
Citation (English): He JH, Sun JY, Liu Q. Multi-person 3D Pose Estimation Using Human-and-scene Contexts. Ruan Jian Xue Bao/Journal of Software, 2024, 35(4): 2039–2054 (in Chinese). http://www.jos.org.cn/1000-9825/6837.htm

                 Multi-person 3D Pose Estimation Using Human-and-scene Contexts
                 HE Jian-Hang, SUN Jun-Yao, LIU Qiong
                 (School of Software Engineering, South China University of Technology, Guangzhou 510006, China)
Abstract: Depth ambiguity is an important challenge for multi-person three-dimensional (3D) pose estimation of single-frame images, and extracting contexts from an image has great potential for alleviating depth ambiguity. Current top-down approaches usually model key point relationships based on human detection, which not only easily results in key point shifting or mismatching but also affects the reliability of absolute depth estimation using the human scale factor, because of the coarse-grained human bounding box with large background noise. Bottom-up approaches directly detect human key points from an image and then restore the 3D human poses one by one. However, these approaches are at a disadvantage in relative depth estimation, although the scene context can be obtained explicitly. This study proposes a new two-branch network, in which human contexts based on key point region proposals and scene contexts based on 3D space are extracted by the top-down and bottom-up branches, respectively. A human context extraction method with noise resistance is proposed to describe the human by modeling key point region proposals, and the dynamic sparse key point relationship for pose association is modeled to eliminate weak connections and reduce noise propagation. A scene context extraction method from a bird's-eye view is proposed: the human position layout in 3D space is obtained by modeling the image's depth features and mapping them onto a bird's-eye-view plane. A human-and-scene context fusion network is designed to predict the absolute depth of the humans.
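As a rough illustration of the bird's-eye-view mapping step described above, the sketch below (hypothetical code, not the paper's implementation; the function name bev_layout, the grid size, and the camera intrinsics are assumptions) back-projects each person's estimated root depth into camera coordinates and scatters it onto a discretized X-Z grid to obtain a coarse layout of human positions in 3D space.

# Hypothetical helper (an assumption for illustration, not the paper's code):
# back-project each person's estimated root (u, v, depth) into camera X and
# scatter it onto a discretized X-Z grid, giving a coarse bird's-eye-view
# layout of human positions in 3D space.
import numpy as np

def bev_layout(roots_uvz, fx, cx, grid=64, x_range=8.0, z_range=16.0):
    """roots_uvz: (N, 3) array of (u, v, depth in metres) per person.
    Returns a (grid, grid) occupancy map over lateral X and depth Z."""
    layout = np.zeros((grid, grid), dtype=np.float32)
    for u, _, z in roots_uvz:
        x = (u - cx) * z / fx                            # back-project to camera X
        ix = int((x + x_range / 2) / x_range * grid)     # lateral (X) bin
        iz = int(z / z_range * grid)                     # depth (Z) bin
        if 0 <= ix < grid and 0 <= iz < grid:
            layout[iz, ix] = 1.0                         # mark an occupied cell
    return layout

# Example: three people at different image positions and depths.
people = np.array([[320.0, 240.0, 3.0], [500.0, 260.0, 6.5], [150.0, 250.0, 9.0]])
print(bev_layout(people, fx=1150.0, cx=320.0).sum())     # 3.0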


* Funding: Natural Science Foundation of Guangdong Province (2021A1515011349); National Natural Science Foundation of China (61976094)
  Received 2022-05-31; revised 2022-08-16 and 2022-09-26; accepted 2022-11-22; published online in JOS 2023-07-28; published online in CNKI 2023-08-01