BIG DATA PRIVACY PRESERVATION USING K-ANONYMIZATION AND L-DIVERSITY |
Author(s): |
Priyanka Gawali |
Keywords: |
Data anonymization; l-diversity; privacy preservation; Hadoop. |
Abstract |
Classification is a fundamental problem in data analysis, and training a classifier requires a large collection of data. Releasing person-specific data in its most specific form, however, threatens individual privacy. This paper presents a practical and efficient algorithm for producing a generalized version of the data that masks sensitive information yet remains useful for building classification models. Generalization is carried out by specializing, that is, progressively detailing, the level of information in a top-down and bottom-up manner until any further step would violate the minimum privacy requirement. This combined top-down and bottom-up specialization handles both categorical and continuous attributes efficiently. The method exploits the fact that data usually contains redundant structures for classification: while generalization may remove some structures, others emerge to take their place. The work has broad applications in both the public and private sectors, where organizations share information for mutual benefit and productivity. Experiments on real-life data show that classification quality can be preserved even under highly restrictive anonymity requirements. |
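The privacy requirement named in the title can be illustrated with a small sketch. The Python below is not the paper's algorithm: it assumes a hypothetical toy table with one quasi-identifier (`age`) and one sensitive attribute (`disease`), uses distinct l-diversity (each group must contain at least l different sensitive values), and walks a fixed list of age-range widths from coarse to fine, stopping just before a specialization would violate k-anonymity or l-diversity. All function names and the data are assumptions made for illustration; classification-aware scoring and the Hadoop implementation are left out.

```python
from collections import defaultdict

# Hypothetical toy table: "age" is the quasi-identifier, "disease" is sensitive.
RECORDS = [
    {"age": 23, "disease": "flu"},
    {"age": 27, "disease": "asthma"},
    {"age": 34, "disease": "flu"},
    {"age": 38, "disease": "diabetes"},
    {"age": 41, "disease": "asthma"},
    {"age": 46, "disease": "flu"},
]


def generalize_age(age, width):
    """Replace an exact age with a range label, e.g. 23 -> '20-29' for width 10."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymize(records, width):
    """Return a copy of the table with every age generalized to the given width."""
    return [
        {"age": generalize_age(r["age"], width), "disease": r["disease"]}
        for r in records
    ]


def equivalence_classes(records, quasi_ids):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[q] for q in quasi_ids)].append(rec)
    return groups


def satisfies(records, quasi_ids, sensitive, k, l):
    """True if every group has >= k records and >= l distinct sensitive values."""
    for group in equivalence_classes(records, quasi_ids).values():
        if len(group) < k:
            return False  # k-anonymity violated
        if len({r[sensitive] for r in group}) < l:
            return False  # distinct l-diversity violated
    return True


def top_down_age_width(records, widths, k, l):
    """Try widths from coarse to fine; keep the last one that stays private.

    Returns None if even the coarsest width fails the requirement.
    """
    chosen = None
    for width in widths:
        if not satisfies(anonymize(records, width), ["age"], "disease", k, l):
            break  # the next specialization would violate the requirement
        chosen = width
    return chosen


if __name__ == "__main__":
    width = top_down_age_width(RECORDS, [100, 20, 10, 5], k=2, l=2)
    print("chosen age-range width:", width)
    for row in anonymize(RECORDS, width):
        print(row)
```

With this toy data and k = 2, l = 2, the search stops at a width of 10: splitting further into 5-year ranges would leave some records alone in their group. A realistic implementation would choose among specializations of several attributes by their benefit to classification rather than follow a fixed list of widths.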
Other Details |
Paper ID: IJSARTV; Published in: Volume 2, Issue 11; Publication Date: 11/4/2016 |