Journal of Software (《软件学报》), 2024, Issue 4

Wan Changxuan et al.: A domain topic hierarchy model with shared topic aspects (p. 1811)


... a topic hierarchy with explicit hierarchical and associative relations.
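This paper uses a level-wise topic-aspect sharing mechanism to change how the nCRP construction method generates the topic tree. On this basis it proposes the nCRP+ hierarchy construction method and the rHDP hierarchical topic model, which mine associated subtopics under different topics. Using domain category information, it defines a voting-based method for computing domain membership. This method guides the topic assignment of the word set at each level, makes the mapping between topics and domains explicit, and builds the hierarchical relations among domain topics. The semantic relatedness between words and domain topics guides the topic-word assignment process, so that semantically similar words are assigned to the same topic and the meaning of each domain topic becomes more cohesive. In addition, a hierarchical topic-word contribution is defined from the domain relevance between a word and the topics on its branch of the topic tree, which makes explicit how associated subtopics differ across domains in their topic words.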
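
As background for the tree construction discussed above, the following is a minimal sketch of drawing document paths from the standard nested Chinese restaurant process (nCRP) prior. It is not the nCRP+ method or the rHDP model proposed in this paper. The names `Node`, `crp_choose`, `sample_path`, and the concentration parameter `gamma` are assumptions introduced only for illustration.

```python
import random


class Node:
    """A topic node in the tree; count is the number of documents routed through it."""

    def __init__(self):
        self.count = 0
        self.children = []


def crp_choose(counts, gamma, rng):
    """Return an existing index with probability proportional to its count,
    or len(counts) (a new child) with probability proportional to gamma."""
    total = sum(counts) + gamma
    r = rng.random() * total
    acc = 0.0
    for i, c in enumerate(counts):
        acc += c
        if r < acc:
            return i
    return len(counts)


def sample_path(root, depth, gamma, rng):
    """Draw one root-to-leaf path of fixed depth, updating node counts in place."""
    root.count += 1
    path = [root]
    node = root
    for _ in range(depth - 1):
        idx = crp_choose([c.count for c in node.children], gamma, rng)
        if idx == len(node.children):
            node.children.append(Node())  # open a new branch
        node = node.children[idx]
        node.count += 1
        path.append(node)
    return path


rng = random.Random(0)
root = Node()
paths = [sample_path(root, depth=3, gamma=1.0, rng=rng) for _ in range(50)]
```

In this prior each document follows exactly one path, so a subtopic belongs to a single parent. The sharing mechanism summarized above is what relaxes that restriction.
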
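Combining the voting-based domain membership, the semantic relatedness between words and domain topics, and the hierarchical topic-word contribution, this paper designs a formal description of domain knowledge and improves the hierarchical sampling process. The result is rHDP_DK, a general hierarchical topic model that incorporates domain knowledge. It constructs the hierarchical relations among domain topics and the sharing relations among associated subtopics, and it extracts domain topic words.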
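
The general idea of a voting-based membership score can be illustrated with a small, hypothetical sketch. It is a simplification and not the formula defined in this paper: each word casts one vote for the domain it is most related to, and the membership of a word set in a domain is that domain's share of the votes. The function name `domain_membership` and the input `word_domain_scores` (a word-to-domain relatedness table) are assumptions made for this example.

```python
from collections import Counter


def domain_membership(words, word_domain_scores):
    """Hypothetical voting scheme: each word votes for its highest-scoring domain.

    Returns a dict mapping each domain to its fraction of the votes cast.
    Words without any score are skipped and cast no vote.
    """
    votes = Counter()
    for w in words:
        scores = word_domain_scores.get(w)
        if not scores:
            continue
        best = max(scores, key=scores.get)
        votes[best] += 1
    total = sum(votes.values())
    if total == 0:
        return {}
    return {domain: n / total for domain, n in votes.items()}


# Illustrative relatedness values, invented for this example only.
scores = {
    "loan": {"finance": 0.9, "sports": 0.1},
    "bank": {"finance": 0.8, "sports": 0.2},
    "goal": {"finance": 0.2, "sports": 0.7},
}
membership = domain_membership(["loan", "bank", "goal", "unknown"], scores)
```

A topic could then be mapped to the domain with the largest membership value. How ties and low-confidence votes are handled is left open in this sketch.
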
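Future work will study domain topic hierarchies built on time-varying information, so that changes over time in the subtopics under each domain topic, and in their topic words, can be analyzed.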
