Advancing the discovery of unique column combinations
Main author: | |
---|---|
Other authors: | |
Format: | UnknownFormat |
Language: | eng |
Published: | Potsdam: Univ.-Verl. Potsdam, 2011 |
Series: | Technische Berichte des Hasso-Plattner-Instituts für Softwaresystemtechnik an der Universität Potsdam, 51 |
Subjects: | |
Online access: | Content description |
Summary: | Unique column combinations of a relational database table are sets of columns that contain only unique values. Discovering such combinations is a fundamental research problem and has many different data management and knowledge discovery applications. Existing discovery algorithms are either brute force or have a high memory load and can thus be applied only to small datasets or samples. In this paper, the well-known GORDIAN algorithm and "Apriori-based" algorithms are compared and analyzed for further optimization. We greatly improve the Apriori algorithms through efficient candidate generation and statistics-based pruning methods. A hybrid solution HCA-GORDIAN combines the advantages of GORDIAN and our new algorithm HCA, and it significantly outperforms all previous work in many situations. |
---|---|
Physical description: | 25 pages, diagrams |
ISBN: | 9783869561486 978-3-86956-148-6 |
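
To illustrate the definition in the summary, the sketch below checks whether a set of columns contains only unique value combinations and then searches level by level, from small column sets to larger ones, for the minimal unique combinations. It is a naive bottom-up search with simple superset pruning, not the report's HCA, GORDIAN, or HCA-GORDIAN algorithms. The function names `is_unique` and `minimal_uniques` and the sample table are assumptions made up for this example.

```python
from itertools import combinations


def is_unique(rows, cols):
    """Return True if no two rows share the same values in all of `cols`."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in cols)
        if key in seen:
            return False
        seen.add(key)
    return True


def minimal_uniques(rows, columns):
    """Naive level-wise search for minimal unique column combinations.

    Hypothetical helper: it enumerates candidates by increasing size and
    skips any candidate that contains an already-found unique, because
    such a superset is unique but not minimal.
    """
    found = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(columns, size):
            if any(set(u) <= set(combo) for u in found):
                continue
            if is_unique(rows, combo):
                found.append(combo)
    return found


# Made-up sample table: no single column is unique, but every pair is.
rows = [
    {"first": "Ann", "last": "Lee", "city": "Potsdam"},
    {"first": "Ann", "last": "Kim", "city": "Berlin"},
    {"first": "Bob", "last": "Lee", "city": "Berlin"},
    {"first": "Bob", "last": "Kim", "city": "Potsdam"},
]
result = minimal_uniques(rows, ["first", "last", "city"])
print(result)
```

This brute-force approach examines up to exponentially many column sets, which is the kind of cost the summary says limits existing algorithms to small datasets or samples.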