Skip to main content

MA 415G course notes

Section 5.4 Intro to Modularity Maximization in Networks and Community Detection

Detecting communities within networks, especially social networks, is a key goal for network scientists and graph theorists. While community detection has a long history, a major driver of this subject was the publication in 2004 of the paper "Finding and Evaluating Community Structure in Networks" by Newman and Girvan https://journals.aps.org/pre/abstract/10.1103/PhysRevE.69.026113, which introduced the concept of modularity maximization. Newman and Girvan discuss in their paper an example of a karate club studied by Wayne Zachary in the 1970s (An Information Flow Model for Conflict and Fission in Small Groups, Wayne W. Zachary, Journal of Anthropological Research 1977 33:4, 452-473). The network represents friendships between 34 members of the club, which eventually split into two factions due to a conflict between the club’s administrator and instructor.
Figure 5.4.1. Zachary’s karate club network
The goal of community detection algorithms is to identify such factions (or communities) based solely on the network structure, without any knowledge of the actual social dynamics of the agents involved. Using the modularity maximization approach, one can obtain the following partition of Zachary’s karate club network, which mirrors the split in the club that occurred in reality.
Figure 5.4.2. Zachary’s karate club network split into communities.
As another example of a network that is frequently studied using community detection algorithms, here is the network of characters in Victor Hugo’s novel "Les Misérables".
Figure 5.4.3. "Les Misérables" character network, divided into communities using modularity maximization, image from Newman and Girvan 2004 https://journals.aps.org/pre/abstract/10.1103/PhysRevE.69.026113.
Our goal in the following sections is to learn how to use the modularity maximization technique to detect communities within a network. While modularity maximization is not the only approach to this problem, it is one that connects well with other topics we have seen in this course, and which is widely used.