Optimal Univariate Microaggregation for Privacy Preservation in Data Mining

Jane Varamani Sulekha

Abstract

In recent years, with the rapid growth of the Internet and of data collection and data warehousing technologies, privacy preservation has become one of the major concerns in data mining. For this reason, several data mining algorithms that integrate privacy-preserving techniques have been developed to prevent the disclosure of sensitive information during knowledge discovery. A number of effective methods for Privacy Preserving Data Mining (PPDM) have been proposed in the literature. In this paper, we briefly introduce the different kinds of microaggregation techniques, with their merits and demerits, and propose Optimal noise addition based Univariate Microaggregation for anonymizing individual records. Experimental results validate that the proposed technique prevents the disclosure of sensitive data without degrading data utility. We conclude by discussing future work and promising directions for privacy preservation in data mining.
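
For readers unfamiliar with the term, the basic idea of univariate microaggregation can be illustrated with a minimal sketch. The Python below is not the technique proposed in this paper: it assumes a simple fixed-size grouping heuristic (sort the values, form consecutive groups of at least k records, replace each value with its group mean) and omits the noise addition step entirely. The function name `microaggregate` and the parameter `k` are hypothetical, chosen only for illustration.

```python
from statistics import mean


def microaggregate(values, k):
    """Illustrative fixed-size univariate microaggregation (an assumed heuristic).

    Every returned value is the mean of a group of at least k original
    values, so no record can be told apart from its group mates.
    Results are returned in the original record order.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n = len(values)
    if n == 0:
        return []
    if n < k:
        raise ValueError("need at least k values to form one group")

    # Indices of the records, ordered by attribute value.
    order = sorted(range(n), key=lambda i: values[i])

    # Consecutive groups of k records along the sorted order.
    groups = [order[i:i + k] for i in range(0, n, k)]

    # A short final group is merged into the previous one so that
    # every group keeps at least k members.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)

    # Replace each value with the mean of its group.
    result = [0.0] * n
    for group in groups:
        centroid = mean([values[i] for i in group])
        for i in group:
            result[i] = centroid
    return result


if __name__ == "__main__":
    incomes = [30, 10, 20, 50, 40, 60, 70]
    print(microaggregate(incomes, 3))
```

Larger values of `k` hide each record among more neighbours but move the released values further from the originals, which is the privacy and utility trade-off the abstract refers to.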
