A random walk means that you pick a starting node and then randomly pick an edge to move to that node’s neighbor.
You will repeat this step until you get a walk of some predefined length.
For example, here’s the result of performing 1 random walk of length 20 on each node in our example graph presented as a 13 x 20 matrix:The i-th row refers to a random walk performed starting at node i.
For example, the 8-th row reads:This means that the random walk began at node 8, then it walked to node 9, then to node 10 then back to node 9 again, and then back to node 10, and then to node 11, and so on until in ended at node 9.
In NLP terms, we have 13 sentences and each sentence has 20 words in it.
Our words are 1, 2, …, 13 and the size of the vocabulary is 13.
Now that we have a corpus, we can apply SkipGram to learn the embeddings of each node.
To understand why the resulting embeddings will encode the graph’s community structure, suppose for simplicity’s sake that we will create the embeddings using a window of size 1 i.
the context of a word are the words immediately to its left and right.
For example:In the diagram above I have highlighted the context for the word “2”.
Notice that the context for the word “2” tend to be “1”, “3” and/or “6”.
Therefore, the SkipGram embeddings for “1”, “2”, “3”, and “6” will be close to each other which is exactly what we want since these nodes are connected to each other.
In practice, DeepWalk will perform multiple random walks on each node to generate a large corpus.
The following result is generated using DeepWalk with the following hyperparameters:Number of random walks per node: 80Length of 1 random walk: 40Window size: 10Embedding dimension: 128The above diagram is the plot of the embeddings for each node using PCA (default arguments) to reduce the embedding dimension from 128 to 2.
Notice that the embeddings clearly capture the community structure of the graph since nodes 1, 2, 3, 4, 5, 6 together form a distinct cluster from nodes 8, 9, 10, 11, 12, 13.
As node 7 was in the middle of this two “communities”, it makes sense that its embedding shows that it is somewhere in between these two clusters.
For comparison, here is what the embeddings look like using TSNE (perplexity = 5 since this is a small dataset):The results are consistent with PCA.
ConclusionIn this article, we have seen that we can use SkipGram to capture the community structure of a graph by viewing the graph as a kind of language where:Each node is a unique word in the languageRandom walks of finite length on the graph constitutes a sentenceI hope this has improved your understanding of how DeepWalk works.
Let me know in the comments if you have any questions.
ReferencesDeepWalk: Online Learning of Social Representations; Perozzi et al.
2014Word2Vec Tutorial — The Skip-Gram Model; McCormick.
2016.. More details