PART 3
Exploring the World of Graphs
# Printing the subgraph of communities
subgraph = graph.subgraph(nodes_list)
nx.draw(subgraph, with_labels=True)
plt.show()
All cliques of four: [[0, 1, 2, 3, 13], [0, 1, 2, 3, 7],
[33, 32, 8, 30], [33, 32, 23, 29]]
Found these communities: [[0, 1, 2, 3, 7, 13],
[32, 33, 29, 23], [32, 33, 8, 30]]
The example begins by extracting just the nodes in the karate club dataset that
have four or more connections, and then prints the cliques with a minimum size
of four. Of course, you can set any level of connections needed to obtain the desired
resource. Perhaps you consider a clique a community in which each node has
twenty connections, but other people might see a clique as a community where
each node has just three connections.
The list of cliques doesn’t really help you much, though, if you want to see the
communities. To see them, you need to rely on specialized and complex
algorithms to merge overlapping cliques and find clusters, such as the clique
percolation method described at
https://gaplogs.net/2012/04/01/simple-
community-detection-algorithms/
. The NetworkX package offers
k_clique_
communities
, an implementation of the clique percolation algorithm, which
results in the union of all the cliques of a certain size (the k parameter). These
cliques of a certain size share k-1 elements (that is, they differ by just one
component, a truly strict rule).
Clique percolation provides you with a list of all the communities found. In this
example, one clique revolves around the karate instructor and another revolves
around the president of the club. In addition, you can extract all the nodes that are
part of a community into a single set, which helps you create a subgraph made of
just communities.
Finally, you can draw the subgraph and display it. Figure 10-2 shows the output
of this example, which displays the ensemble of cliques with four or more
connections.
Finding cliques in graphs is a complex problem requiring many computations (it’s
a difficult problem) that an algorithm solves using a brute-force search, which
means trying all possible subsets of vertexes to determine whether they’re cliques.
With some luck, because some randomization is needed for the algorithm to
succeed, you can find a large clique using a simple approach whose complexity is
O(n+m), where n is the number of vertexes and m the edges. The following steps
describe this process.
CHAPTER 10
Do'stlaringiz bilan baham: |