Journal of Software (软件学报), 2021, No. 12

Journal of Software (软件学报), ISSN 1000-9825, CODEN RUXUEW                   E-mail: jos@iscas.ac.cn
         Journal of Software, 2021,32(12):3929−3944 [doi: 10.13328/j.cnki.jos.006101]   http://www.jos.org.cn
         © Institute of Software, Chinese Academy of Sciences. All rights reserved.      Tel: +86-10-62562563


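         Privacy-Preserving Frequent Pattern Mining Based on Grouping Randomization ∗

         GUO Yu-Hong¹, TONG Yun-Hai², SU Yan-Qing¹
         1 (School of Cyber Science and Engineering, University of International Relations, Beijing 100091, China)
         2 (Department of Machine Intelligence, Peking University, Beijing 100871, China)
         Corresponding author: GUO Yu-Hong, E-mail: yhguo@uir.cn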

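         Abstract: Existing randomization methods for privacy-preserving frequent pattern mining ignore differences in
         privacy requirements: they apply one uniform randomization parameter to all individuals and protect everyone
         equally, so they cannot satisfy individual privacy preferences. This paper proposes a privacy-preserving frequent
         pattern mining method based on grouping randomization (grouping-based randomization for privacy preserving
         frequent pattern mining, GR-PPFM for short). The method groups individuals by their privacy requirements and
         sets, for each group of data, a distinct privacy protection level and a matching randomization parameter.
         Experiments on synthetic and real data show that, compared with the uniform single-parameter randomization of
         MASK, the grouped multi-parameter randomization of GR-PPFM not only meets the diverse privacy requirements of
         different groups but also improves the accuracy of the mining results at the same overall degree of privacy
         protection.
         Key words: grouping; randomization; personalization; privacy preserving; frequent pattern mining
         CLC number: TP309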

         Citation format (Chinese): 郭宇红,童云海,苏燕青.分组随机化隐私保护频繁模式挖掘.软件学报,2021,32(12):3929−3944.
         http://www.jos.org.cn/1000-9825/6101.htm
         Citation format (English): Guo YH, Tong YH, Su YQ. Privacy preserving frequent pattern mining based on grouping
         randomization. Ruan Jian Xue Bao/Journal of Software, 2021,32(12):3929−3944 (in Chinese).
         http://www.jos.org.cn/1000-9825/6101.htm
         Privacy Preserving Frequent Pattern Mining Based on Grouping Randomization

         GUO Yu-Hong¹, TONG Yun-Hai², SU Yan-Qing¹
         1 (School of Cyber Science and Engineering, University of International Relations, Beijing 100091, China)
         2 (Department of Machine Intelligence, Peking University, Beijing 100871, China)
         Abstract: Existing randomization methods for privacy-preserving frequent pattern mining apply a uniform randomization
         parameter to all individuals, without considering differences in privacy requirements. Such equal protection cannot
         satisfy individual privacy preferences. This study proposes a privacy-preserving frequent pattern mining method based
         on grouping randomization (referred to as GR-PPFM). In this method, individuals are grouped according to their privacy
         protection requirements, and each group of data is assigned its own privacy protection level and a corresponding
         randomization parameter. Experimental results on both synthetic and real-world data show that, compared with the
         uniform single-parameter randomization of MASK, the grouped multi-parameter randomization of GR-PPFM not only meets
         the diverse privacy protection needs of different groups, but also improves the accuracy of the mining results at the
         same overall level of privacy protection.
         Key words:    grouping; randomization; personalization; privacy preserving; frequent pattern mining
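
The grouping idea summarized in the abstract can be illustrated with a small sketch. It is not the GR-PPFM algorithm itself, whose details are not given on this page. It assumes a MASK-style bit-flip distortion in which each 0/1 entry of a transaction is kept with probability p and flipped otherwise, and it estimates the support of a single item by inverting the flip within each group and averaging by group size. The function names (`randomize_group`, `estimate_item_support`, `grouped_support`), the parameter values 0.7 and 0.9, and the generated data are all hypothetical.

```python
import random


def randomize_group(transactions, p, rng):
    """Keep each 0/1 bit with probability p, flip it with probability 1 - p."""
    return [[bit if rng.random() < p else 1 - bit for bit in row]
            for row in transactions]


def estimate_item_support(distorted, p, item):
    """Invert the flip for one item, using observed = p*s + (1 - p)*(1 - s)."""
    if p == 0.5:
        raise ValueError("p = 0.5 destroys all information; support is not recoverable")
    observed = sum(row[item] for row in distorted) / len(distorted)
    return (observed - (1 - p)) / (2 * p - 1)


def grouped_support(groups, item):
    """Size-weighted average of the per-group estimates.

    groups: list of (distorted_transactions, p) pairs, one pair per privacy group.
    This combination rule is an assumption made for illustration.
    """
    total = sum(len(rows) for rows, _ in groups)
    return sum(len(rows) * estimate_item_support(rows, p, item)
               for rows, p in groups) / total


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: one item whose true support is about 0.3 in both groups.
    strict = [[1 if rng.random() < 0.3 else 0] for _ in range(5000)]
    relaxed = [[1 if rng.random() < 0.3 else 0] for _ in range(5000)]
    # The stricter group gets a p closer to 0.5, i.e. more distortion.
    groups = [(randomize_group(strict, 0.7, rng), 0.7),
              (randomize_group(relaxed, 0.9, rng), 0.9)]
    print(round(grouped_support(groups, 0), 3))
```

In this sketch a value of p nearer to 0.5 gives stronger distortion and a noisier estimate, so giving a larger p to the group with weaker privacy requirements is one way the per-group parameters could trade privacy against accuracy.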

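             Frequent pattern mining has many applications. For example, medical researchers want to analyze medical census
         data to discover associations between diseases and to obtain medical knowledge such as complications [1]; for
         instance, people with diabetes often also have coronary heart disease and hypertension. However, during a data
         census, out of privacy concerns, many people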

            ∗  Foundation item: National Natural Science Foundation of China (60403041); Fundamental Research Funds for the
         Central Universities (3262017T48, 3262018T02)
               Received 2019-08-28; Revised 2019-12-24, 2020-04-04; Accepted 2020-06-16