
Graham Mueller

Applied mathematics refugee, PhD economist, interested in time series, machine learning, and graph theory.


Community detection is a fundamental problem in network analysis: identifying groups of nodes that are more densely connected internally than to the rest of the network. In our recent paper presented at the Learning on Graphs Conference (LoG 2025), we propose a novel approach that combines Graph Neural Networks (GNNs) with Stochastic Block Models (SBMs) to create a differentiable, architecture-agnostic framework for community detection.

Our Approach: SBM-Based Loss Functions for GNNs

Traditional community detection methods like Louvain and spectral clustering are effective but don’t leverage the representation learning capabilities of Graph Neural Networks. Meanwhile, existing GNN approaches for community detection often use heuristic loss functions that may not directly optimize for community structure quality.

Stochastic Block Models as Loss Functions

Our key insight is that Stochastic Block Models (SBMs) provide a principled way to evaluate partition quality through their likelihood functions. SBMs are generative models that describe how random graphs are created based on community structure. Since SBM likelihood functions are:

  1. Well-defined: They measure how well a partition explains the observed graph structure
  2. Differentiable: With a soft (relaxed) assignment of nodes to communities, they can be used as loss functions for gradient-based optimization
  3. Theoretically grounded: They’re based on statistical principles rather than heuristics

We can use them directly as loss functions for training GNNs in an unsupervised manner.
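To make this concrete, here is a minimal sketch of a Bernoulli SBM negative log-likelihood evaluated on a soft partition. The function name, the tiny four-node graph, and the fixed block matrix are illustrative choices for this post, not code from the paper:

```python
import numpy as np

def sbm_soft_nll(A, Z, B, eps=1e-9):
    """Negative log-likelihood of a Bernoulli SBM under soft assignments.

    A : (n, n) binary adjacency matrix (undirected, no self-loops)
    Z : (n, K) soft community assignments (each row sums to 1)
    B : (K, K) matrix of within/between-community edge probabilities
    """
    # Expected edge probability for each node pair under the soft partition
    P = Z @ B @ Z.T
    P = np.clip(P, eps, 1 - eps)
    # Bernoulli log-likelihood over the upper triangle (each pair once)
    iu = np.triu_indices_from(A, k=1)
    ll = A[iu] * np.log(P[iu]) + (1 - A[iu]) * np.log(1 - P[iu])
    return -ll.sum()

# Toy graph with two clear communities: {0, 1} and {2, 3}
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Z_good = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
Z_bad = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)
B = np.array([[0.9, 0.1],
              [0.1, 0.9]])

# The partition that matches the graph structure gets the lower loss
assert sbm_soft_nll(A, Z_good, B) < sbm_soft_nll(A, Z_bad, B)
```

Because every operation here is a smooth function of `Z`, gradients of this loss can flow back into whatever network produced the assignments.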

Architecture-Agnostic Framework

Our approach is architecture-agnostic: it works with any GNN that outputs node embeddings. The training process has four steps:

  1. GNN produces node embeddings from the input graph
  2. Embeddings are mapped to soft community assignments
  3. SBM likelihood evaluates the quality of these assignments
  4. Gradients flow back through the network to improve embeddings

This framework allows different GNN architectures (GCN, GAT, GraphSAINT, etc.) to be trained for community detection without modifying their core structure.
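The four steps above can be sketched end to end on a toy graph. The single propagation layer, the fixed block matrix, and the finite-difference gradients below are all simplifications for illustration; a real implementation would plug in an actual GNN (GCN, GAT, etc.) and rely on automatic differentiation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: two triangles joined by one bridge edge (illustrative only)
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

X = np.eye(6)                      # featureless graph: identity features
A_hat = A + np.eye(6)              # add self-loops
A_norm = A_hat / A_hat.sum(axis=1, keepdims=True)  # row-normalized propagation

def softmax(H):
    e = np.exp(H - H.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

B = np.array([[0.9, 0.1],          # assortative block matrix, fixed here
              [0.1, 0.9]])         # for simplicity (it could also be learned)

def loss(W, eps=1e-9):
    # Steps 1-2: one propagation layer, then soft community assignments
    Z = softmax(A_norm @ X @ W)
    # Step 3: Bernoulli SBM negative log-likelihood of the soft partition
    P = np.clip(Z @ B @ Z.T, eps, 1 - eps)
    iu = np.triu_indices(len(A), k=1)
    return -(A[iu] * np.log(P[iu]) + (1 - A[iu]) * np.log(1 - P[iu])).sum()

# Step 4: gradient descent on the GNN weights; central finite differences
# stand in for the autodiff a real framework would provide
W = rng.normal(scale=0.1, size=(6, 2))
loss_before = loss(W)
for _ in range(200):
    G = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        dW = np.zeros_like(W)
        dW[idx] = 1e-5
        G[idx] = (loss(W + dW) - loss(W - dW)) / 2e-5
    W -= 0.1 * G
loss_after = loss(W)

# Read off hard communities from the trained soft assignments
communities = softmax(A_norm @ X @ W).argmax(axis=1)
```

Note that the loss only ever sees the assignment matrix `Z`, which is why the GNN that produces it is interchangeable.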

Results

Our experiments across multiple datasets show that SBM-based loss functions produce competitive results compared to existing community detection methods, while providing the benefits of:

  • End-to-end training: No separate clustering step needed
  • Scalability: Leverages mini-batching and GPU acceleration
  • Flexibility: Works with various GNN architectures
  • Interpretability: Loss directly measures partition quality via statistical likelihood

Why This Matters

Community detection is crucial for understanding network structure in domains ranging from social networks to biological systems. By combining the representation learning power of GNNs with the theoretical foundation of SBMs, we create a framework that’s both principled and practical.

The code and experiments are available in our LoG 2025 paper.