Parallelization in R

how-to
How to run stuff in parallel in R
Author

Temi

Published

September 3, 2023

Introduction

When running computationally heavy tasks in R, it can be useful to parallelize your code. In that vein, this is a really good blog post to read to understand when and how to parallelize. And here is another one.

R offers many ways to do this. Usually, I prefer a few libraries: doParallel, foreach, and parallel.

show code
library(doParallel)
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
show code
library(foreach)
library(parallel)

Use case

Assume we want to apply a function over the rows of a matrix. This function takes each row, divides the values in that row by the row's index, and the results are combined into a new matrix of the same shape.

show code
set.seed(2022)
myMatrix <- matrix(sample(1:50000, size=300*500, replace=TRUE), nrow=300, ncol=500)

dim(myMatrix)
[1] 300 500

lapply

First, I will time this function using lapply. lapply ships with base R and is a functional (but still sequential) form of a regular for loop.

show code
lapply_fxn <- function(){
  applyMatrix <- lapply(1:nrow(myMatrix), function(i){
    out <- myMatrix[i, ] / (i)
    return(out)
  })
  applyMatrix <- do.call('rbind', applyMatrix)
  return(applyMatrix)
}

system.time(lapply_fxn())
   user  system elapsed 
  0.004   0.000   0.004 

mclapply

Next, let’s take advantage of multiple cores, this time using mclapply from the parallel package.

show code
mclapply_fxn <- function(){
  applyMatrix <- parallel::mclapply(1:nrow(myMatrix), function(i){
    out <- myMatrix[i, ] / (i)
    return(out)
  }, mc.cores = 7) # using 7 cores
  
  applyMatrix <- do.call('rbind', applyMatrix)
  return(applyMatrix)
}

system.time(mclapply_fxn())
   user  system elapsed 
  0.043   0.084   0.032 

We see that lapply is faster. This is because mclapply has to distribute the runs across processes and collect their results, and for a task this small that overhead dominates the computation itself.
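To see mclapply pay off, each task has to be expensive enough to amortize that overhead. Here is a minimal sketch, assuming a made-up heavy_task helper whose Sys.sleep stands in for real per-row work (note that mclapply relies on forking, so mc.cores > 1 is not supported on Windows):

show code
# hypothetical helper: pretend each row costs ~50 ms of real computation
heavy_task <- function(i){
  Sys.sleep(0.05)
  myMatrix[i, ] / i
}

# sequential: roughly 20 * 0.05 = 1 second elapsed
system.time(lapply(1:20, heavy_task))

# forked across 4 cores: elapsed time should drop to roughly a quarter of that
system.time(parallel::mclapply(1:20, heavy_task, mc.cores = 4))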

foreach & %do%

Here, I will use foreach, but without a parallel back-end. With the %do% operator this is a sequential run, just like lapply or a regular for loop.

show code
system.time({
  outputMatrix <- foreach::foreach(i=1:nrow(myMatrix), .combine='rbind', .inorder=TRUE) %do% {
    out <- myMatrix[i, ] / i
    return(out)
  }
})
   user  system elapsed 
  0.071   0.005   0.078 

foreach & %dopar%

Here, I will use foreach with a parallel back-end. The parallel back-end is a cluster of worker processes. If you are familiar with multiprocessing in Python, it is roughly equivalent to multiprocessing.Pool.
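For comparison, the closest base-R analogue of multiprocessing.Pool.map is parallel::parLapply over an explicit cluster. A quick sketch (the cluster size here is arbitrary; clusterExport is needed because the workers start as empty R sessions):

show code
cl <- parallel::makeCluster(4)          # a pool of 4 worker processes
parallel::clusterExport(cl, 'myMatrix') # ship the data to the workers
res <- parallel::parLapply(cl, 1:nrow(myMatrix), function(i) myMatrix[i, ] / i)
parallel::stopCluster(cl)               # always release the workers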

Back to foreach. First, we need to register a parallel back-end.

We can query how many cores are available on this computer:

show code
parallel::detectCores()
[1] 8
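Note that detectCores() counts logical CPUs. A common pattern is to register one fewer than the total so the machine stays responsive; a sketch of that pattern (on this machine it works out to 7):

show code
# leave one core free for the operating system and the R session itself
num_clusters <- max(1, parallel::detectCores() - 1)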

I will register 7 cores:

show code
num_clusters <- 7
doParallel::registerDoParallel(num_clusters)

cat('Registering', num_clusters, 'clusters for a parallel run\n')
Registering 7 clusters for a parallel run
show code
system.time({
  outputMatrix <- foreach::foreach(i=1:nrow(myMatrix), .combine='rbind', .inorder=TRUE) %dopar% { # .inorder=TRUE keeps row i at index i when combining
    out <- myMatrix[i, ] / i
    return(out)
  }
})
   user  system elapsed 
  0.085   0.070   0.081 
show code
# stop the cluster
doParallel::stopImplicitCluster()

Here, again, we see that lapply is much faster. The per-row computation is so cheap that the overhead of dispatching tasks to the workers and combining their results swamps any gain from running in parallel; parallelization only pays off when each task is expensive enough to amortize that overhead.
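As an aside, this particular task does not need a loop at all. Because R recycles vectors column-wise, a single vectorized division does the job and is typically far faster than any of the versions above:

show code
# element [i, j] is divided by i, thanks to column-wise recycling
vecMatrix <- myMatrix / seq_len(nrow(myMatrix))

# the same operation, spelled out with sweep()
sweepMatrix <- sweep(myMatrix, 1, seq_len(nrow(myMatrix)), '/')

all.equal(vecMatrix, sweepMatrix)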

Next, I will show how to use the various makeCluster() options.
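As a preview, here is a minimal sketch of that workflow (the cluster size and type are just examples):

show code
cl <- parallel::makeCluster(4, type = 'PSOCK') # 'FORK' is also available on Unix-alikes
doParallel::registerDoParallel(cl)

# ... foreach(...) %dopar% { ... } goes here ...

parallel::stopCluster(cl)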