Community detection is a fundamental problem in network analysis: identifying groups of nodes that are more densely connected internally than to the rest of the network. In our recent paper, presented at the Learning on Graphs Conference (LoG 2025), we propose a novel approach that combines Graph Neural Networks (GNNs) with Stochastic Block Models (SBMs) to create a differentiable, architecture-agnostic framework for community detection.
Our Approach: SBM-Based Loss Functions for GNNs
Traditional community detection methods like Louvain and spectral clustering are effective but don’t leverage the representation learning capabilities of Graph Neural Networks. Meanwhile, existing GNN approaches for community detection often use heuristic loss functions that may not directly optimize for community structure quality.
Stochastic Block Models as Loss Functions
Our key insight is that Stochastic Block Models (SBMs) provide a principled way to evaluate partition quality through their likelihood functions. SBMs are generative models that describe how random graphs are created based on community structure. Since SBM likelihood functions are:
- Well-defined: They measure how well a partition explains the observed graph structure
- Differentiable: They can be used as loss functions for gradient-based optimization
- Theoretically grounded: They’re based on statistical principles rather than heuristics
We can use them directly as loss functions for training GNNs in an unsupervised manner.
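Concretely, for a soft assignment matrix the Bernoulli SBM log-likelihood has a closed form that is differentiable with respect to the assignments. Below is a minimal PyTorch sketch under simplifying assumptions of our own (a dense adjacency matrix, the plain degree-free Bernoulli SBM, and plug-in maximum-likelihood block densities); the likelihood used in the paper, e.g. a degree-corrected variant, may differ:

```python
import torch

def sbm_log_likelihood(A, S, eps=1e-9):
    """Bernoulli SBM log-likelihood of a soft partition.

    A: dense symmetric adjacency matrix (n x n).
    S: row-stochastic soft community assignments (n x k).
    Returns a scalar; negate it to obtain a minimizable loss.
    """
    sizes = S.sum(dim=0)                      # expected nodes per block
    m = S.t() @ A @ S                         # expected edge mass between block pairs
    n_pairs = torch.outer(sizes, sizes)       # expected node pairs between block pairs
    # Plug-in maximum-likelihood estimate of each block-pair edge density.
    p = (m / (n_pairs + eps)).clamp(eps, 1 - eps)
    ll = m * torch.log(p) + (n_pairs - m) * torch.log(1 - p)
    return ll.sum()
```

Negating the returned value yields a loss for gradient-based training; the `eps` clamp keeps the logarithms finite when a block is empty or a density hits 0 or 1.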
Architecture-Agnostic Framework
Our approach is architecture-agnostic: it works with any GNN that outputs node embeddings. The training process proceeds as follows:
1. The GNN produces node embeddings from the input graph
2. The embeddings are mapped to soft community assignments
3. The SBM likelihood evaluates the quality of these assignments
4. Gradients flow back through the network to improve the embeddings
This framework allows different GNN architectures (GCN, GAT, GraphSAINT, etc.) to be trained for community detection without modifying their core structure.
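The steps above can be sketched end to end. Everything here is an illustrative stand-in rather than the paper's code: `CommunityGNN` is a toy two-layer GCN-style encoder (any embedding backbone could be swapped in), and the loss is a plain Bernoulli SBM log-likelihood, negated:

```python
import torch
import torch.nn as nn

class CommunityGNN(nn.Module):
    """Toy GCN-style encoder with a soft-assignment head (hypothetical backbone)."""
    def __init__(self, n_feats, hidden, k):
        super().__init__()
        self.lin1 = nn.Linear(n_feats, hidden)
        self.lin2 = nn.Linear(hidden, k)

    def forward(self, A_norm, X):
        h = torch.relu(A_norm @ self.lin1(X))            # one propagation step
        return torch.softmax(A_norm @ self.lin2(h), dim=1)  # soft assignments

def neg_sbm_likelihood(A, S, eps=1e-9):
    """Negated Bernoulli SBM log-likelihood with plug-in block densities."""
    sizes = S.sum(dim=0)
    m = S.t() @ A @ S
    n_pairs = torch.outer(sizes, sizes)
    p = (m / (n_pairs + eps)).clamp(eps, 1 - eps)
    return -(m * torch.log(p) + (n_pairs - m) * torch.log(1 - p)).sum()

def train(A, X, k, epochs=200, lr=0.01):
    # Simple row normalization; a full GCN would add self-loops and
    # use symmetric normalization instead.
    A_norm = A / A.sum(dim=1).clamp(min=1).unsqueeze(1)
    model = CommunityGNN(X.shape[1], 16, k)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = neg_sbm_likelihood(A, model(A_norm, X))
        loss.backward()
        opt.step()
    return model(A_norm, X).argmax(dim=1)  # hard labels after training
```

Because the loss touches only the assignment matrix, replacing `CommunityGNN` with any other embedding GNN leaves the rest of the loop unchanged, which is the sense in which the framework is architecture-agnostic.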
Results
Our experiments across multiple datasets show that SBM-based loss functions achieve results competitive with existing community detection methods, while providing several benefits:
- End-to-end training: No separate clustering step needed
- Scalability: Leverages mini-batching and GPU acceleration
- Flexibility: Works with various GNN architectures
- Interpretability: Loss directly measures partition quality via statistical likelihood
Why This Matters
Community detection is crucial for understanding network structure in domains ranging from social networks to biological systems. By combining the representation learning power of GNNs with the theoretical foundation of SBMs, we create a framework that’s both principled and practical.
The code and full experimental details accompany our LoG 2025 paper.