</translate>
=== doParallel and foreach === <!--T:40-->
====Usage==== <!--T:42-->
foreach can be considered a unified interface to all backends (e.g. doMC, doMPI, doParallel, doRedis). It works on every platform on which the chosen backend works. doParallel acts as an interface between foreach and the parallel package and can be loaded on its own. There are known efficiency issues when foreach is used to run a very large number of very small tasks. The ''test_foreach.R'' script in the Running section is therefore not an example of optimized use of foreach(); its function was deliberately kept minimal for demonstration purposes.
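One common way to reduce the per-task overhead mentioned above is to give each worker a block of work instead of a single tiny task. The following is a minimal sketch of that idea, not a tuned recipe: the two workers, the input vector <code>x</code> and the use of <code>sqrt()</code> as the "small task" are illustrative assumptions.

```r
library(doParallel)  # also attaches foreach and parallel

# Assumption for illustration: two workers, set by hand.
registerDoParallel(cores = 2)

x <- 1:1000                    # 1000 very small tasks
nchunks <- getDoParWorkers()   # one chunk per registered worker

# splitIndices() (from the parallel package) partitions 1..length(x)
# into nchunks contiguous groups of nearly equal size.
chunks <- parallel::splitIndices(length(x), nchunks)

# Each parallel task now handles a whole chunk with vectorised code,
# so foreach dispatches nchunks tasks instead of length(x) tasks.
result <- foreach(idx = chunks, .combine = c) %dopar% {
  sqrt(x[idx])
}

stopImplicitCluster()
```

With this layout the number of dispatched tasks equals the number of workers, regardless of how many elements <code>x</code> contains.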
You must register the backend and tell it the number of cores available. If no backend is registered, foreach assumes that the number of cores is 1 and runs the iterations serially.
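The effect of registration can be checked from within R. The sketch below, run in a fresh R session, inspects the worker count before and after registering doParallel; the value of 2 cores is hard-coded only for illustration.

```r
library(doParallel)  # also attaches foreach

# Nothing is registered yet, so a %dopar% loop would fall back to
# sequential execution on a single worker.
registered_before <- getDoParRegistered()
workers_before <- getDoParWorkers()

# Register the backend; 2 is an illustrative value only.
registerDoParallel(cores = 2)
registered_after <- getDoParRegistered()
workers_after <- getDoParWorkers()

print(c(before = workers_before, after = workers_after))

stopImplicitCluster()
```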
The general procedure for using foreach is to:
# load both foreach and the backend package;
# register the backend;
# call foreach(), keeping it on the same line as the %do% (serial) or %dopar% (parallel) operator.
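The three steps above can be sketched as follows. This is a minimal illustration that assumes two cores and a trivial loop body; it runs the same loop with %do% and %dopar% so that the results can be compared.

```r
# Step 1: load the backend package (doParallel attaches foreach as well).
library(doParallel)

# Step 2: register the backend; 2 cores is an illustrative value.
registerDoParallel(cores = 2)

# Step 3: call foreach() with the operator on the same line.
squares_serial <- foreach(i = 1:4, .combine = c) %do% { i^2 }
squares_parallel <- foreach(i = 1:4, .combine = c) %dopar% { i^2 }

print(squares_serial)
print(squares_parallel)

stopImplicitCluster()
```

Both loops return the squares of 1 to 4 in the same order; only the way the iterations are executed differs.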
====Running==== <!--T:44-->
<!--T:45-->
1. Place your R code in a script file; in this example the file is called ''test_foreach.R''.
<!--T:46-->
{{File
|name=test_foreach.R
|lang="r"
|contents=
# library(foreach) # optional: loading doParallel also loads foreach
library(doParallel)
# a very simple function
test_func <- function(var1, var2) {
    return(var1*var2)
}
# we will iterate over two sets of values; you can modify these to explore the mechanism of foreach
var1.v <- 1:8
var2.v <- seq(0.1, 1, length.out = 8)
# Use the environment variable SLURM_NTASKS to set the number of cores.
# This is for SLURM. Replace SLURM_NTASKS by the proper variable for your system.
# Avoid manually setting a number of cores.
# Sys.getenv() returns a string, so convert it to an integer.
ncores <- as.integer(Sys.getenv("SLURM_NTASKS"))
registerDoParallel(cores=ncores)
print(ncores) # the number of cores available, i.e. the number you requested
getDoParWorkers() # the number of workers actually registered, for comparison
# be careful! foreach() and %dopar% must be on the same line!
foreach(var1=var1.v, .combine=rbind) %:% foreach(var2=var2.v, .combine=rbind) %dopar% {test_func(var1=var1, var2=var2)}
}}
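When the script is run outside a Slurm job, for example while testing interactively, SLURM_NTASKS is not set and the core count cannot be read from it. The sketch below shows one possible way to fall back to a single core in that case. The helper name <code>cores_from_env</code> and the default of 1 are hypothetical choices made for this illustration; they are not part of foreach or doParallel.

```r
# Hypothetical helper: read a core count from an environment variable,
# falling back to `default` when the variable is unset or not a positive integer.
cores_from_env <- function(var = "SLURM_NTASKS", default = 1L) {
  value <- Sys.getenv(var, unset = NA)      # NA if the variable is not set
  n <- suppressWarnings(as.integer(value))  # NA if the value is not numeric
  if (is.na(n) || n < 1L) default else n
}

ncores <- cores_from_env()
print(ncores)
```

The returned value can then be passed to <code>registerDoParallel(cores=ncores)</code> as in the script above.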
<!--T:47-->
2. Copy the following content into a job submission script called ''job_foreach.sh'':
<!--T:48-->
{{File
|name=job_foreach.sh
|lang="bash"
|contents=
#!/bin/bash
#SBATCH --account=def-someacct   # replace this with your own account
#SBATCH --ntasks=4               # number of processes
#SBATCH --mem-per-cpu=2048M      # memory; default unit is megabytes
#SBATCH --time=0-00:15           # time (DD-HH:MM)
#SBATCH --mail-user=yourname@someplace.com # send email updates to you or someone else
#SBATCH --mail-type=ALL          # send an email in all cases (job started, job ended, job aborted)
module load r/3.4.3
export R_LIBS=~/local/R_libs/
R CMD BATCH --no-save --no-restore test_foreach.R
}}
<!--T:49-->
3. Submit the job with:
<!--T:50-->
<source lang="bash">
sbatch job_foreach.sh
</source>
<!--T:51-->
For more on submitting jobs, see the [[Running jobs]] page.