Clustering with Tangles: Algorithmic Framework and Theoretical Guarantees
Solveig Klepper, Christian Elbracht, Diego Fioravanti, Jakob Kneip, Luca Rendsburg, Maximilian Teegen, Ulrike von Luxburg; Volume 24, Issue 190, Pages 1-56, 2023.
Abstract
Tangles were originally developed as a mathematical tool in graph theory to prove the graph minor theorem. In this paper, we demonstrate the practical applications of tangles in machine learning. By aggregating cuts of a dataset, tangles provide a way to identify dense structures and characterize clusters through consistent pointers. This flexible approach can be used in various scenarios, such as questionnaires, community detection in graphs, and clustering in metric spaces. The resulting output is hierarchical and generates a soft dendrogram, enabling the exploration of a dataset’s cluster structure. The computational complexity of aggregating the cuts is linear in the number of data points, with the generation of cuts being the main bottleneck. However, simple and fast algorithms can suffice for this step. This paper presents the algorithmic framework for clustering with tangles, provides theoretical guarantees in different settings, and includes extensive simulations and use cases. The Python code for the framework is available on GitHub.
[abs]