Parallel Examples
Author: | Mitch Richling |
Updated: | 2022-06-04 16:17:46 |
Copyright 2020-2021 Mitch Richling. All rights reserved.
1. Metadata
The home for this HTML file is: https://richmit.github.io/ex-R/parallelBasics.html
Files related to this document may be found on GitHub: https://github.com/richmit/ex-R
Directory contents:
src | The org-mode file that generated this HTML document |
docs | This HTML document |
data | Data files |
tangled | Tangled R code from this document |
2. Starting a cluster
2.1. Load the library
library(parallel)
2.2. Calculate the number of instances based on core count
instancesInMyComputeCluster <- detectCores()
instancesInMyComputeCluster
[1] 8
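detectCores() can return NA when the core count cannot be determined, and starting one instance per core can leave little capacity for anything else on the machine. A minimal sketch of a more defensive count, assuming one spare core is wanted; the names detectedCores and workerCount are hypothetical and do not appear in the original document:

```r
library(parallel)

# detectCores() may return NA when the core count cannot be determined
detectedCores <- detectCores()

# Hypothetical policy: leave one core free, but never go below one worker
workerCount <- if (is.na(detectedCores)) 1L else max(1L, detectedCores - 1L)
workerCount
```

Whether to reserve a core is a judgment call; the rest of this document simply uses every detected core.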
2.3. Start up the cluster
myComputeCluster <- makeCluster(instancesInMyComputeCluster)
myComputeCluster
socket cluster with 8 nodes on host 'localhost'
3. Using the cluster
3.1. Run a function in each instance (you can add arguments after the function name)
unlist(clusterCall(myComputeCluster, 'Sys.getpid'))
[1] 25704 17520 15828 18808 21660 21896 26024 12048
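To illustrate the note about adding arguments after the function name, here is a small sketch. It assumes a separate two-worker cluster named demoCluster (a hypothetical name) so that it can run on its own:

```r
library(parallel)

# Assumption: a small two-worker cluster, created just for this illustration
demoCluster <- makeCluster(2)

# Arguments after the function are passed to that function on every worker;
# here each worker evaluates paste("worker", "ready")
greetings <- unlist(clusterCall(demoCluster, paste, "worker", "ready"))
greetings

stopCluster(demoCluster)
```

Every worker receives the same arguments, so the result has one identical element per worker.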
3.2. Run an expression in each instance
unlist(clusterEvalQ(myComputeCluster, 2*Sys.getpid()))
[1] 51408 35040 31656 37616 43320 43792 52048 24096
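The expression given to clusterEvalQ is evaluated on each worker rather than in the calling session, so it can also be used to set up state on the workers. A minimal sketch, assuming a separate two-worker cluster; demoCluster, workerPid, and pidMatches are hypothetical names:

```r
library(parallel)

# Assumption: a small two-worker cluster, created just for this illustration
demoCluster <- makeCluster(2)

# The assignment runs in each worker's global environment, not in this session
invisible(clusterEvalQ(demoCluster, workerPid <- Sys.getpid()))

# A later call on the same workers can read the value set above
pidMatches <- unlist(clusterEvalQ(demoCluster, workerPid == Sys.getpid()))
pidMatches

stopCluster(demoCluster)
```

The same pattern is commonly used to load a package on every worker before running code that depends on it.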
3.3. Using data across the cluster
Notice that aVar need not be exported: it is the second argument of parSapply, which sends it to the instances automatically.
aVar <- 1:10
parSapply(myComputeCluster, aVar, sin)
#parLapply(myComputeCluster, aVar, sin)
[1] 0.8414710 0.9092974 0.1411200 -0.7568025 -0.9589243 -0.2794155 0.6569866 0.9893582 0.4121185 -0.5440211
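The commented-out parLapply call computes the same values but returns them as a list instead of a simplified vector. A short sketch of the difference, assuming a separate two-worker cluster; demoCluster, asVector, asList, and sameValues are hypothetical names:

```r
library(parallel)

# Assumption: a small two-worker cluster, created just for this illustration
demoCluster <- makeCluster(2)

aVar <- 1:10

# parSapply simplifies the results to a numeric vector
asVector <- parSapply(demoCluster, aVar, sin)

# parLapply always returns a list with one element per input value
asList <- parLapply(demoCluster, aVar, sin)

# The values themselves agree once the list is flattened
sameValues <- isTRUE(all.equal(asVector, unlist(asList)))
sameValues

stopCluster(demoCluster)
```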
3.4. Explicitly exporting data across the cluster
While aVar need not be exported, bVar must be exported because it is not the second argument of parSapply, so the instances would otherwise have no bVar to look up.
bVar <- 10
clusterExport(myComputeCluster, 'bVar')
parSapply(myComputeCluster, aVar, function (x) bVar*x)
[1] 10 20 30 40 50 60 70 80 90 100
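An alternative to exporting, shown here only as a sketch and not as the approach this document takes, is to hand the value to the function as an extra argument; parSapply forwards arguments given after the function to every call. The cluster name demoCluster and the parameter name k are hypothetical:

```r
library(parallel)

# Assumption: a small two-worker cluster, created just for this illustration
demoCluster <- makeCluster(2)

aVar <- 1:10
bVar <- 10

# Extra arguments after the function are sent to the workers with the task,
# so 'bVar' does not need to be exported; 'k' is a hypothetical parameter name
scaled <- parSapply(demoCluster, aVar, function(x, k) k * x, k = bVar)
scaled

stopCluster(demoCluster)
```

Exporting remains convenient when many functions need the same variable, since it is then sent to the workers only once.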
4. Performance
4.1. Create some big data and put it in 'cVar'
cVar <- rnorm(instancesInMyComputeCluster*2^14)
4.2. Export 'cVar' to each instance
clusterExport(myComputeCluster, "cVar")
4.3. Compute in serial
system.time(b<-sapply(cVar, function (x) for(i in 1:500) sin(x)))
   user  system elapsed
   4.81    0.00    4.82
4.4. Compute in parallel
system.time(b<-parSapply(myComputeCluster, cVar, function (x) for(i in 1:500) sin(x)))
   user  system elapsed
   0.07    0.00    1.12
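The two elapsed times can be turned into a speedup and a per-instance efficiency. The arithmetic below is a sketch that copies the elapsed values and the instance count of 8 from the output shown in this document; the variable names are hypothetical:

```r
# Elapsed times copied from the timings in this document (seconds)
serialElapsed   <- 4.82
parallelElapsed <- 1.12

# Number of instances, from the detectCores() output in this document
workerCount <- 8

speedup    <- serialElapsed / parallelElapsed  # about 4.3
efficiency <- speedup / workerCount            # about 0.54

round(c(speedup = speedup, efficiency = efficiency), 2)
```

A speedup below the instance count is not surprising: sending data to the instances and collecting results takes time, and these timings come from a single run on one machine.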
5. Shut down cluster
stopCluster(myComputeCluster)
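If the cluster is created inside a function, one possible way to make sure it is always shut down, even when the computation fails, is to register stopCluster with on.exit. This is a minimal sketch; withTemporaryCluster and its parameters are hypothetical names, not part of the parallel package:

```r
library(parallel)

# Hypothetical helper: applies 'fun' to each element of 'x' on a temporary cluster
withTemporaryCluster <- function(x, fun, workers = 2) {
  tempCluster <- makeCluster(workers)
  # on.exit() runs when the function returns or fails,
  # so the worker processes are not left running
  on.exit(stopCluster(tempCluster), add = TRUE)
  parSapply(tempCluster, x, fun)
}

withTemporaryCluster(1:4, sqrt)
```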