direkt zum Inhalt springen

direkt zum Hauptnavigationsmenü

Sie sind hier

TU Berlin

Inhalt des Dokuments

The Effect of Homogeneity on the Complexity of Anonymizing Data

André Nichterlein (TU Berlin) [1]

Motivated by data privacy applications [3], the NP-hard k-Anonymity problem asks, given an n × m matrix M over a fixed alphabet and an integer s > 0, whether M can be made k-anonymous by suppressing (blanking out) at most s entries. A matrix M is said to be k-anonymous if for each row r in M there are at least k − 1 other rows in M which are identical to r. Complementing previous work [1, 2], we introduce two new “data-driven” parameterizations for k-Anonymity—the number in of different input rows and the number out of different output rows—for modeling aspects of data homogeneity. We show that k-Anonymity is fixed-parameter tractable for the parameter in, and it is NP-hard even for out = 2 and alphabet size four. Notably, our fixed-parameter tractability result implies that k-Anonymity can be solved in linear  time when in is a constant. Our results also extend to some interesting generalizations of k-Anonymity such as ℓ-diversity and anonymization using the so-called “domain generalization hierarchies”.

04.05.2011 14:15
André Nichterlein
FR 6510

* Language is depending on the audience.

Back to overview. [2]

P. Bonizzoni, G. D. Vedova, R. Dondi, and Y. Pirola. Parameterized complexity of k-anonymity: Hardness and tractability. In Proc. 21st IWOCA, volume 6460 of LNCS, pages 242–255. Springer, 2010.
P. A. Evans, T. Wareham, and R. Chaytor. Fixed-parameter tractability of anonymizing data by suppressing entries. J. Comb. Optim., 18(4):362–375, 2009.
L. Sweeney. Uniqueness of simple demographics in the U.S. population. Technical report, Carnegie Mellon University, School of Computer Science, Laboratory for International Data Privacy, 2000.
------ Links: ------