direkt zum Inhalt springen

direkt zum Hauptnavigationsmenü

Sie sind hier

TU Berlin

Inhalt des Dokuments

Es gibt keine deutsche Übersetzung dieser Webseite.

Entity Disambiguation by Partitioning Under Heterogeneity Constraints

Falk Hüffner (Postdoc TU Berlin)

We consider disambiguation of entries using a given partition of the entries into groups, where we know that no two entries in one group can refer to the same entity. One application is in correcting inaccurate interlanguage links in Wikipedia, and another is matching user profiles from different social networks. This is formalized by a simple and elegant graph-theoretic model: Given an undirected vertex-colored graph, delete a minimum number of edges such that none of the resulting connected components contain two vertices of the same color. This graph partitioning problem is NP-hard. We present methods based on mathematical programming that allow for an exact solution in real-world Wikipedia interlanguage link networks with millions of vertices. Moreover, these methods perform well in experiments on realistic random data. Finally, we demonstrate that a greedy heuristic provides solutions very close to optimal ones and scales well.

10.05.2012 16:15
Falk Hüffner
FR 6510

Back to the research colloquium site.

Zusatzinformationen / Extras