Parallel Examples

Author: Mitch Richling
Updated: 2022-06-04 16:17:46

Copyright 2020-2021 Mitch Richling. All rights reserved.

1. Metadata

The home for this HTML file is: https://richmit.github.io/ex-R/parallelBasics.html

Files related to this document may be found on GitHub: https://github.com/richmit/ex-R

Directory contents:

src - The org-mode file that generated this HTML document
docs - This HTML document
data - Data files
tangled - Tangled R code from this document

2. Starting a cluster

2.1. Load the library

library(parallel)

2.2. Calculate the number of instances based on core count

instancesInMyComputeCluster <- detectCores()
[1] 8
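
detectCores() can return NA when the core count cannot be determined, and by default it counts logical rather than physical cores. A minimal sketch of a more defensive calculation follows; the name workerCount and the policy of leaving one core free are assumptions for illustration, not part of the original example.

```r
library(parallel)

# detectCores() may return NA on platforms where the count is unknown
detectedCores <- detectCores()

# Hypothetical policy: fall back to 1 on NA, otherwise leave one core free
workerCount <- if (is.na(detectedCores)) 1L else max(1L, detectedCores - 1L)
workerCount
```

Leaving a core free is only one possible choice; using every detected core, as the example above does, is also common.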

2.3. Start up the cluster

myComputeCluster <- makeCluster(instancesInMyComputeCluster)
socket cluster with 8 nodes on host 'localhost'

3. Using the cluster

3.1. Run a function in each instance (you can add arguments after the function name)

unlist(clusterCall(myComputeCluster, 'Sys.getpid'))
[1] 25704 17520 15828 18808 21660 21896 26024 12048
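
To illustrate passing arguments after the function, here is a small sketch that calls round on every worker with one positional and one named argument. The two-node cluster demoCluster is an assumption made so the snippet runs on its own.

```r
library(parallel)

# Small cluster used only for this illustration
demoCluster <- makeCluster(2)

# Everything after the function is passed to it on each worker,
# so every worker evaluates round(pi, digits = 2)
rounded <- unlist(clusterCall(demoCluster, round, pi, digits = 2))
rounded

stopCluster(demoCluster)
```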

3.2. Run an expression in each instance

unlist(clusterEvalQ(myComputeCluster, 2*Sys.getpid()))
[1] 51408 35040 31656 37616 43320 43792 52048 24096
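
Because clusterEvalQ evaluates its expression in each worker's global environment, it can also be used to set up state on the workers. A minimal sketch, assuming a two-node cluster demoCluster and a hypothetical variable scaleFactor:

```r
library(parallel)

# Small cluster used only for this illustration
demoCluster <- makeCluster(2)

# The assignment runs on every worker; scaleFactor is created in each
# worker's global environment, not in the main session
invisible(clusterEvalQ(demoCluster, scaleFactor <- 3))

# Later expressions evaluated on the workers can use that variable
scaled <- unlist(clusterEvalQ(demoCluster, scaleFactor * 2))
scaled

stopCluster(demoCluster)
```

The same pattern is often used to load a package on every worker, for example clusterEvalQ(demoCluster, library(stats)).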

3.3. Using data across the cluster

Notice that aVar need not be exported: it is passed as the second argument of parSapply, which sends its elements to the instances.

aVar <- 1:10
parSapply(myComputeCluster, aVar, sin)
#parLapply(myComputeCluster, aVar, sin)
[1]  0.8414710  0.9092974  0.1411200 -0.7568025 -0.9589243 -0.2794155  0.6569866  0.9893582  0.4121185 -0.5440211
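
The commented-out parLapply call differs from parSapply only in the shape of the result: parLapply returns a list, while parSapply simplifies it to a vector when it can. A short sketch of the difference, assuming a two-node cluster demoCluster and illustrative input values:

```r
library(parallel)

# Small cluster used only for this illustration
demoCluster <- makeCluster(2)

inputValues <- 1:4

# parSapply simplifies the per-element results into a numeric vector
asVector <- parSapply(demoCluster, inputValues, sqrt)

# parLapply keeps one list element per input element
asList <- parLapply(demoCluster, inputValues, sqrt)

is.list(asList)
all.equal(asVector, unlist(asList))

stopCluster(demoCluster)
```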

3.4. Explicitly exporting data across the cluster

While aVar need not be exported, bVar must be: the function refers to it as a global variable instead of receiving it through the arguments of parSapply, so each instance needs its own copy.

bVar <- 10
clusterExport(myComputeCluster, 'bVar')
parSapply(myComputeCluster, aVar, function (x) bVar*x)
[1]  10  20  30  40  50  60  70  80  90 100
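
An alternative to clusterExport is to hand the value to the function as an extra argument, since parSapply forwards additional arguments to the function on each instance. A minimal sketch, assuming a two-node cluster demoCluster and the hypothetical names multiplier and k:

```r
library(parallel)

# Small cluster used only for this illustration
demoCluster <- makeCluster(2)

multiplier <- 10

# No clusterExport here: multiplier is evaluated in the main session
# and shipped to the workers as the argument k
scaled <- parSapply(demoCluster, 1:10, function(x, k) k * x, k = multiplier)
scaled

stopCluster(demoCluster)
```

Exporting is still convenient when many calls need the same large object, because the object is sent once and then stays on the instances.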

4. Performance

4.1. Create some big data and put it in 'cVar'

cVar <- rnorm(instancesInMyComputeCluster*2^14)

4.2. Export 'cVar' to each instance

clusterExport(myComputeCluster, "cVar")

4.3. Compute in serial

system.time(b<-sapply(cVar, function (x) for(i in 1:500) sin(x)))
   user  system elapsed
   4.81    0.00    4.82

4.4. Compute in parallel

system.time(b<-parSapply(myComputeCluster, cVar, function (x) for(i in 1:500) sin(x)))
   user  system elapsed
   0.07    0.00    1.12
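
The two elapsed times can be turned into a speedup and a per-instance efficiency. The calculation below is a worked example using the figures reported above (4.82 s serial, 1.12 s parallel, 8 instances); the variable names are chosen for illustration.

```r
# Elapsed times copied from the two system.time() runs
serialElapsed <- 4.82
parallelElapsed <- 1.12
workerCount <- 8

# Speedup: how many times faster the parallel run finished
speedup <- serialElapsed / parallelElapsed

# Efficiency: fraction of the ideal 8x speedup actually achieved
efficiency <- speedup / workerCount

round(speedup, 2)     # 4.3
round(efficiency, 2)  # 0.54
```

A speedup of about 4.3 on 8 instances is well short of the ideal 8, which suggests that the cost of starting work on the instances and moving data to and from them is noticeable at this problem size.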

5. Shut down cluster

stopCluster(myComputeCluster)
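
A cluster is shut down with stopCluster. One way to make sure that happens even when a computation fails is to tie the shutdown to on.exit inside a function. The helper below, withTemporaryCluster, is a hypothetical name used for this sketch and is not part of the parallel package.

```r
library(parallel)

# Hypothetical helper: run a computation on a temporary cluster and
# always stop the workers, even if the computation signals an error
withTemporaryCluster <- function(workerCount, xs, fun) {
  tempCluster <- makeCluster(workerCount)
  on.exit(stopCluster(tempCluster), add = TRUE)
  parSapply(tempCluster, xs, fun)
}

squares <- withTemporaryCluster(2, 1:5, function(x) x^2)
squares
```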
