Journal of Software (软件学报), 2021, Issue 11

3560                                Journal of Software  软件学报 Vol.32, No.11, November 2021

                 6    Conclusion

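                    Transaction data is an important data type with a wide range of applications, such as recommender systems,
                 shopping analysis, and user behavior analysis. Because transaction data is generated by users' real shopping or
                 browsing behavior, it contains a large amount of private information, and people are increasingly concerned about
                 such information. It is therefore essential to study how to collect user data while protecting user privacy. This
                 paper proposes a local differential privacy method for collecting transaction data for frequent itemset mining.
                 Based on the condensed local differential privacy model, it designs a new distance metric function and a new
                 sampling method; the method is theoretically well founded and outperforms PrivSet in the experiments. In addition,
                 because privacy parameters are difficult to set, this paper proposes a heuristic parameter-setting strategy based
                 on maximum posterior confidence, so that the privacy parameter can be chosen under semantic guidance. Future work
                 will follow four directions: (1) applying the method to the privacy-preserving collection of trajectory data;
                 (2) analyzing and comparing the essential differences between local differential privacy and condensed local
                 differential privacy; (3) the experiments show that TDC_CLDP is suited to scenarios where (d,m) is large, mainly
                 because different distance functions lead to different error bounds, and the reason needs further theoretical
                 analysis; (4) the score function designed in this paper is a Manhattan distance, whose intuitive meaning is
                 dissimilarity, i.e., the number of items that differ. Sampling based on this score introduces extra noise:
                 items that are absent (entries equal to 0) are also treated as present (i.e., all entries equal to 0 are turned
                 into 1). Future work needs to solve this problem to improve the utility of the algorithm.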
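The dissimilarity reading of the Manhattan distance in direction (4) can be illustrated with a minimal sketch. It assumes that each transaction is encoded as a 0/1 vector over an item domain of size d; the function names `to_bit_vector` and `manhattan_distance` and the example transactions are hypothetical and are not taken from the paper. The sketch covers only the distance itself, not the sampling step of TDC_CLDP.

```python
from typing import List, Sequence, Set


def to_bit_vector(transaction: Set[int], d: int) -> List[int]:
    """Encode a transaction (a set of item ids in 0..d-1) as a 0/1 vector of length d."""
    return [1 if i in transaction else 0 for i in range(d)]


def manhattan_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Manhattan (L1) distance between two equal-length 0/1 item vectors."""
    if len(a) != len(b):
        raise ValueError("both vectors must cover the same item domain")
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical example: an item domain of size 6 and two transactions.
d = 6
t1 = {0, 2, 3}
t2 = {2, 4}

dist = manhattan_distance(to_bit_vector(t1, d), to_bit_vector(t2, d))

# On 0/1 vectors the distance is the number of items on which the two
# transactions disagree, i.e. the size of their symmetric difference.
assert dist == len(t1 ^ t2)
print(dist)
```

Under this encoding, an item absent from both transactions contributes 0 to the distance, while an item present in exactly one of them contributes 1.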

                 References:
                 [1]    Ye QQ, Meng XF, Zhu MJ, Huo Z. Survey on local differential privacy. Ruan Jian Xue Bao/Journal of Software, 2018,29(7):
                     1981−2005 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5364.htm [doi: 10.13328/j.cnki.jos.005364]
                 [2]    Wang SW, Huang LS, Nie YW, et al. PrivSet: Set-valued data analyses with local differential privacy. In: Proc. of the IEEE Conf.
                     on Computer Communications. Honolulu: IEEE, 2018. 1088−1096.
                 [3]    Sweeney L. k-anonymity: A model for protecting privacy. Int’l Journal of Uncertainty, Fuzziness and Knowledge-Based Systems,
                     2002,10(5):557−570.
                 [4]    Machanavajjhala A, Kifer D, Gehrke J, et al. ℓ-diversity: Privacy beyond k-anonymity. ACM Trans. on Knowledge Discovery from
                     Data (TKDD), 2007,1(1):1−3.
                 [5]    He Y, Naughton JF. Anonymization of set-valued data via top-down, local generalization. Proc. of the VLDB Endowment, 2009,
                     2(1):934−945.
                 [6]    Acs G, Achara JP, Castelluccia C. Probabilistic k^m-anonymity: Efficient anonymization of large set-valued datasets. In: Proc. of the
                     2015 IEEE Int’l Conf. on Big Data (Big Data). IEEE, 2015. 1164−1173.
                 [7]    Liu J. Publishing set-valued data against realistic adversaries. Journal of Computer Science and Technology, 2012,27(1):24−36.
                 [8]    Wang S, Tsai Y, Kao H, et al. Anonymizing set valued social data. In: Proc. of the Green Computing and Communications. New
                     York: IEEE, 2010. 809−812.
                 [9]    Chen R, Fung BCM, Mohammed N, et al. Privacy-preserving trajectory data publishing by local suppression. Information Sciences,
                     2013,231(10):83−97.
                [10]    Ghinita G, Tao Y, Kalnis P. On the anonymization of sparse high-dimensional data. In: Proc. of the 25th Int’l Conf. on Data
                     Engineering. Piscataway: IEEE, 2008. 715−724.
                [11]    Cao J, Karras P, Raïssi C, et al. ρ-uncertainty: Inference-proof transaction anonymization. Proc. of the VLDB Endowment, 2010,
                     3(1-2):1033−1044.
                [12]    Terrovitis M, Mamoulis N, Kalnis P. Privacy-preserving anonymization of set-valued data. Proc. of the VLDB Endowment, 2008,
                     1(1):115−125.
                [13]    Xu Y, Wang K, Fu A, et al. Anonymizing transaction databases for publication. In: Proc. of the 14th ACM SIGKDD Int’l Conf. on
                     Knowledge Discovery and Data Mining. New York: ACM, 2008. 767−775.
                [14]    Chen R, Fung BCM, Desai BC. Differentially private trajectory data publication. In: Proc. of the 18th ACM SIGKDD Int’l Conf. on
                     Knowledge Discovery and Data Mining. New York: ACM, 2012. 213−221.
                [15]    Ouyang J, Yin J, Liu SP, Liu YB. An effective differential privacy transaction data publication strategy. Journal of Computer
                     Research and Development, 2014,51(10):2195−2205 (in Chinese with English abstract).