In our research, we focus on graph clustering in the Stochastic Block Model (SBM) where there are both large clusters and small, unrecoverable clusters. Previous methods for exact recovery either do not allow small clusters of size less than the square root of the total number of nodes (n), or require a size difference between the smallest recovered cluster and the largest non-recovered cluster.

To address this limitation, we propose an algorithm based on semidefinite programming (SDP) that can recover large clusters without any restrictions on the remaining cluster sizes. However, the analysis becomes more complex when dealing with mid-sized clusters as they are sensitive to small noise perturbations and cannot be solved using a closed-form solution.

To overcome these challenges, we develop innovative techniques, including a leave-one-out-style argument that controls the correlation between SDP solutions and noise vectors even when a single row of noise can significantly impact the SDP solution. We also improve the bounds for eigenvalue perturbations, which may have independent applications.

By using our gap-free clustering procedure, we achieve efficient algorithms for clustering with a faulty oracle, with significantly better query complexities. Even in the presence of numerous small clusters, we achieve a sample complexity of less than n^2, which is a significant improvement. Additionally, our gap-free clustering procedure leads to improved algorithms for recursive clustering.

Furthermore, our results extend to certain heterogeneous probability settings that are particularly challenging for other algorithms.