且构网

Louvain community detection in R using igraph: format of edges and vertices

Updated: 2022-12-26 09:54:08

I believe that cluster_louvain did exactly what it should do with your data. The problem is your graph. Your code included the line get.edgelist(test2), which must produce a lot of output. Instead, try this:

vcount(test2)
ecount(test2)

Since you say that your correlation matrix is 400x400, I expect that you will get that vcount gives 400 and ecount gives 79800 = 400 * 399 / 2. As you have constructed it, every node is directly connected to all other nodes. Of course there is only one big community.
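To make the edge count concrete, here is a minimal sketch on made-up data (10 random variables, not your 400). The names X, C, D and G are my own, and I am assuming the distance is computed as sqrt(2 * (1 - r)), which is my understanding of what cor2dist does. Since essentially no entry of the distance matrix is exactly zero, every pair of vertices gets an edge.

```r
library(igraph)

set.seed(1)
n <- 10                                   # hypothetical number of variables
X <- matrix(rnorm(50 * n), ncol = n)      # made-up data: 50 observations
C <- cor(X)                               # dense correlation matrix
D <- sqrt(pmax(2 * (1 - C), 0))           # assumed correlation-to-distance transform

# Every nonzero off-diagonal entry becomes a weighted edge
G <- graph_from_adjacency_matrix(D, mode = "undirected",
                                 weighted = TRUE, diag = FALSE)

vcount(G)                 # number of vertices
ecount(G)                 # number of edges
n * (n - 1) / 2           # edge count of a complete graph on n vertices
```

If the last two numbers agree, the graph is complete, and that is the situation I expect for your 400x400 matrix.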

I suspect that what you are trying to do is group variables that are correlated. If the correlation is near zero, the variables should be unconnected. What seems less clear is what to do with variables with correlation near -1. Do you want them to be connected or not? We can do it either way.

You do not provide any data, so I will illustrate with the Ionosphere data from the mlbench package. I will try to mimic your code pretty closely, but will change a few variable names. Also, for my purposes, it makes no sense to write the edges to a file and then read them back again, so I will just directly use the edges that are constructed.

First, assume that you want variables with correlation near -1 to be connected.

library(igraph)
library(mlbench)    # for Ionosphere data
library(psych)      # for cor2dist
data(Ionosphere)

correlationmatrix = cor(Ionosphere[, which(sapply(Ionosphere, class) == 'numeric')])
distancematrix <- cor2dist(correlationmatrix)

DM1 <- as.matrix(distancematrix)
## Zero out connections where there is low (absolute) correlation
## Keeps connection for cor ~ -1
## You may wish to choose a different threshold
DM1[abs(correlationmatrix) < 0.33] = 0

## graph_from_adjacency_matrix is the current name for graph.adjacency
G1 <- graph_from_adjacency_matrix(DM1, mode = "undirected", weighted = TRUE, diag = TRUE)
vcount(G1)
## [1] 32
ecount(G1)
## [1] 140

Not a fully connected graph! Now let's find the communities.

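clusterlouvain <- cluster_louvain(G1)
## Use one color per community found, rather than hard-coding the count;
## Louvain has a random component, so the number of communities can vary
plot(G1, vertex.color = rainbow(length(clusterlouvain), alpha = 0.6)[membership(clusterlouvain)])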
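Besides the plot, it can help to look at the communities as plain lists of variable names. Here is a small self-contained sketch on made-up data: two groups of four variables, each group driven by its own hidden factor (f1 and f2). All of the names (X, C, D, G, cl) are hypothetical, and the distance formula is again my assumption about what cor2dist computes.

```r
library(igraph)

set.seed(42)
n  <- 500
f1 <- rnorm(n)                            # hidden factor behind a1..a4
f2 <- rnorm(n)                            # hidden factor behind b1..b4
X  <- cbind(sapply(1:4, function(i) f1 + 0.5 * rnorm(n)),
            sapply(1:4, function(i) f2 + 0.5 * rnorm(n)))
colnames(X) <- c(paste0("a", 1:4), paste0("b", 1:4))

C <- cor(X)
D <- sqrt(pmax(2 * (1 - C), 0))           # assumed correlation-to-distance transform
D[abs(C) < 0.33] <- 0                     # drop weakly correlated pairs

G  <- graph_from_adjacency_matrix(D, mode = "undirected",
                                  weighted = TRUE, diag = FALSE)
cl <- cluster_louvain(G)

sizes(cl)                                     # vertices per community
split(V(G)$name, as.vector(membership(cl)))   # variable names per community
modularity(cl)                                # quality score of the partition
```

Because the two groups share no edges after thresholding, variables from different groups should never land in the same community.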

If, instead, you do not want variables with negative correlation to be connected, just get rid of the absolute value above. The resulting graph should be much less connected.

DM2 <- as.matrix(distancematrix)
## Zero out connections where there is low correlation
DM2[correlationmatrix < 0.33] = 0

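## graph_from_adjacency_matrix is the current name for graph.adjacency
G2 <- graph_from_adjacency_matrix(DM2, mode = "undirected", weighted = TRUE, diag = TRUE)
clusterlouvain <- cluster_louvain(G2)
## Use one color per community found, rather than hard-coding the count
plot(G2, vertex.color = rainbow(length(clusterlouvain), alpha = 0.6)[membership(clusterlouvain)])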
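The difference between the two rules is easiest to see on a tiny matrix. This is only an illustration with made-up correlation values for four hypothetical variables v1..v4; it uses base R and nothing from your data.

```r
# Made-up correlations, chosen only to show the two thresholding rules
C <- matrix(c( 1.0,  0.9, -0.8,  0.1,
               0.9,  1.0, -0.7,  0.0,
              -0.8, -0.7,  1.0,  0.2,
               0.1,  0.0,  0.2,  1.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("v", 1:4), paste0("v", 1:4)))
threshold <- 0.33

keep_abs    <- abs(C) >= threshold    # strong negative correlations stay connected
keep_signed <- C >= threshold         # only strong positive correlations stay connected
diag(keep_abs)    <- FALSE            # ignore self-correlations
diag(keep_signed) <- FALSE

sum(keep_abs[upper.tri(keep_abs)])        # pairs kept with abs(): 3
sum(keep_signed[upper.tri(keep_signed)])  # pairs kept without abs(): 1
```

With abs(), v3 stays attached to v1 and v2 even though it moves opposite to them; without it, only the v1-v2 pair survives.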