Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

more efficient Matrix computation for large monoplex networks #11

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

TimBMK
Copy link

@TimBMK TimBMK commented May 2, 2023

I've found that, for large monoplex networks, the current implementation of the matrix transformation becomes quite inefficient, as it uses a lot (hundres of GB) of RAM. The culprit seems to be this line of code in compute.adjacency.matrix():
offdiag <- (delta/(L-1))*Idem_Matrix

However, as far as I can see, this step (and everything related to it) is not necessary for monoplex networks. I assume, however, that it is relevant for multiplex networks, but as I am not currently working with those, I had no way of testing this. Therefore, I slightly adjusted the code to skip this step for monoplex networks, and leave as-is for multiplex networks.

In all my tests (both with a toy example and a larger dataset) the results for monoplex datasets were identical. See the following reprex, where the adjusted function is called compute.adjacency.matrix_2():

library(RandomWalkRestartMH)
library(igraph)

compute.adjacency.matrix_2 <- function(x,delta = 0.5)
{
  if (!isMultiplex(x) & !isMultiplexHet(x)) {
    stop("Not a Multiplex or Multiplex Heterogeneous object")
  }
  if (delta > 1 || delta <= 0) {
    stop("Delta should be between 0 and 1")
  }
  
  N <- x$Number_of_Nodes_Multiplex
  L <- x$Number_of_Layers
  
  ## We impose delta=0 in the monoplex case.
  if (L==1){
    delta = 0
  }
  
  Layers_Names <- names(x)[seq(L)]
  
  ## IDEM_MATRIX.
  Idem_Matrix <- Matrix::Diagonal(N, x = 1)
  
  counter <- 0 
  Layers_List <- lapply(x[Layers_Names],function(x){
    
    counter <<- counter + 1;    
    if (is_weighted(x)){ 
      Adjacency_Layer <-  as_adjacency_matrix(x,sparse = TRUE, 
                                              attr = "weight")
    } else {
      Adjacency_Layer <-  as_adjacency_matrix(x,sparse = TRUE)
    }
    
    Adjacency_Layer <- Adjacency_Layer[order(rownames(Adjacency_Layer)),
                                       order(colnames(Adjacency_Layer))]
    colnames(Adjacency_Layer) <- 
      paste0(colnames(Adjacency_Layer),"_",counter)
    rownames(Adjacency_Layer) <- 
      paste0(rownames(Adjacency_Layer),"_",counter)
    Adjacency_Layer
  })
  
  MyColNames <- unlist(lapply(Layers_List, function (x) unlist(colnames(x))))
  MyRowNames <- unlist(lapply(Layers_List, function (x) unlist(rownames(x))))
  names(MyColNames) <- c()
  names(MyRowNames) <- c()
  SupraAdjacencyMatrix <- (1-delta)*(Matrix::bdiag(unlist(Layers_List)))
  colnames(SupraAdjacencyMatrix) <-MyColNames
  rownames(SupraAdjacencyMatrix) <-MyRowNames
  
  
  if (L > 1) {
    offdiag <- (delta/(L-1))*Idem_Matrix
    
    i <- seq_len(L)
    Position_ini_row <- 1 + (i-1)*N
    Position_end_row <- N + (i-1)*N
    j <- seq_len(L)
    Position_ini_col <- 1 + (j-1)*N
    Position_end_col <- N + (j-1)*N
    
    for (i in seq_len(L)){
      for (j in seq_len(L)){
        if (j != i){
          SupraAdjacencyMatrix[(Position_ini_row[i]:Position_end_row[i]),
                               (Position_ini_col[j]:Position_end_col[j])] <- offdiag
        }    
      }
    }
  }
  
  SupraAdjacencyMatrix <- as(SupraAdjacencyMatrix, "dgCMatrix")
  return(SupraAdjacencyMatrix)
}





m1 <- igraph::graph(c(1,2,3,4,5,6,1,3,2,3,5,2), directed = FALSE)


m1 <- set_edge_attr(m1,"weight",E(m1), 
                    value = c(1,1,1,1,-1,1))

m1_MultiplexObject <- create.multiplex(list(m1=m1))
                                   
    
AdjMatrix_m1 <- compute.adjacency.matrix(m1_MultiplexObject)

AdjMatrix_m1_2 <- compute.adjacency.matrix_2(m1_MultiplexObject)

identical(AdjMatrix_m1, AdjMatrix_m1_2)

While this should not cause any problems for multiplex networks, a second look and potentially more testing would be appreciated.

@TimBMK TimBMK changed the title more efficient implementation for large monoplex networks more efficient Matrix computation for large monoplex networks May 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant