Understanding a graph by peeling away nodes

John

from John D. Cook on 2016-01-25 20:14 (#11TN9)

One way to get some understanding of a graph is to peel away nodes, starting with the least connected. The k-core of a graph is the maximal subgraph such that every vertex has degree at least k. The k-shell is the set of vertices that are part of the k-core but not part of the (k+1)-core.

For example, the 0-core of a graph is simply the entire graph since every vertex has at least zero edges; a vertex can't have a negative number of edges. The 1-core of a graph contains the vertices that are connected to other vertices. You form the 1-core by throwing away the 0-shell, the set of isolated vertices. You move from the 1-core to the 2-core by removing nodes of degree 1 until everything that's left has degree at least 2.

NB: You do not form the k-core simply by discarding nodes of degree less than k. For example, consider the star graph below.

If you peel off the the nodes of degree 1, eventually you have nothing left. There is no 2-core, even though the original graph had a node of degree 5. Or going back to the definition, there is no subgraph such that every vertex has degree 2, and so no maximal subgraph. The node of degree 5 only has degree five because it is connected to nodes of degree 1.

The k-cores of a complete graph of degree n are simply the entire graph, for k a n. The k-shells are empty for k < n.

Keyword overlap

Let's see what the k-cores and k-shells look like for some more complex graphs. First, let's go back to the post looking at the overlap in keywords. The edges of the graph are the 200 search terms that have lead visitors to the site at least twice. We connect two search terms if they share a word. The 56-core of the graph has 57 vertices, and the 57-core is empty. So the 56-core is a complete graph. Here's the distribution of the k-shell sizes.

Random graphs

Next let's look at a random graph. As in the post on spectra of random graphs, we look at an ErdAs-Ri(C)nyi graph G_n,p. First we start with n = 1000 and p = 0.01. So we start with 1000 nodes, and there's a 0.01 that any particular pair of nodes is connected. Here are the core sizes for one such random graph:

[1000 1000 999 999 990 970 914 793]

So the 7-core has 793 nodes, then the 8-core is empty. I ran this ten times and the 7-core sizes ranged from size 709 to 849. But every time the 8-core was empty. The largest value of k such that the k-core is non-empty is called the degeneracy of the graph, so we could say the degeneracy was 7 every time.

I reran with n = 2000. The degeneracy was 14 every time. The 14-cores were in the range 1723 to 1810.

Finally I set n = 4000 and ran again. The degeneracy was 29 three times and 30 seven times. The final core sizes ranged from 3451 to 3714.

The experiments above suggest that the k-core size abruptly drops to zero, at a predictable value of k, and with a fairly predictable core size. There are papers on k-core size of random graphs, but I have not yet read any. (I have one in my to-read queue that someone let me know about after posting this article.)

* * *

Consulting for complex networks

Source	RSS or Atom Feed
Feed Location	http://feeds.feedburner.com/TheEndeavour?format=xml
Feed Title	John D. Cook
Feed Link	https://www.johndcook.com/blog