TU Berlin

Algorithmics and Computational Complexity Research GroupTalk 10.05.2012


Page Content

to Navigation

Entity Disambiguation by Partitioning Under Heterogeneity Constraints

Falk Hüffner (Postdoc TU Berlin)

We consider disambiguation of entries using a given partition of the entries into groups, where we know that no two entries in one group can refer to the same entity. One application is in correcting inaccurate interlanguage links in Wikipedia, and another is matching user profiles from different social networks. This is formalized by a simple and elegant graph-theoretic model: Given an undirected vertex-colored graph, delete a minimum number of edges such that none of the resulting connected components contain two vertices of the same color. This graph partitioning problem is NP-hard. We present methods based on mathematical programming that allow for an exact solution in real-world Wikipedia interlanguage link networks with millions of vertices. Moreover, these methods perform well in experiments on realistic random data. Finally, we demonstrate that a greedy heuristic provides solutions very close to optimal ones and scales well.

10.05.2012 16:15
Falk Hüffner
FR 6510

Back to the research colloquium site.


Quick Access

Schnellnavigation zur Seite über Nummerneingabe