of worker processes listening ''via'' sockets for commands from the master is called a 'cluster' of 
nodes."<ref>Core package "parallel" vignette, https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf</ref>
=== doParallel and foreach === <!--T:39-->
{{Command|sbatch job_makecluster.sh}}
For more information on submitting jobs, see [[Running jobs]].
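For orientation, a minimal sketch of the doParallel/foreach pattern used in this section (the worker count and loop body are illustrative):
<source lang="r">
library(doParallel)                  # also attaches foreach and parallel
cl <- makeCluster(4)                 # start 4 workers
registerDoParallel(cl)               # register them as the foreach backend
res <- foreach(i = 1:100, .combine = c) %dopar% sqrt(i)
stopCluster(cl)
</source>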
=== Rmpi === <!--T:23-->
Note that the following instructions do not work on [[Cedar]], so any use of Rmpi should be restricted to the other clusters.
====Installing==== <!--T:24-->
This next procedure installs [https://cran.r-project.org/web/packages/Rmpi/index.html Rmpi], an interface (wrapper) to MPI routines that allows R to run in parallel.
<!--T:25-->
1. See the available R modules by running:
<source lang="bash">
module spider r
</source>
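To see the dependencies of a particular version (a standard Lmod query, not shown in the original steps), name the version explicitly:
<source lang="bash">
module spider r/4.2.1
</source>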
<!--T:26-->
2. Select the R version and load the matching Open MPI module. This example uses Open MPI version 4.1.4, which is recent enough for Rmpi to spawn processes correctly.
<source lang="bash">
module load gcc/11.3.0
module load r/4.2.1
module load openmpi/4.1.4
</source>
<!--T:27-->
3. Download [https://cran.r-project.org/web/packages/Rmpi/index.html the latest Rmpi version]; adjust the version number in the commands below to match the release you want.
<source lang="bash">
wget https://cran.r-project.org/src/contrib/Rmpi_0.6-9.2.tar.gz
</source>
<!--T:28-->
4. Specify the directory where you want to install the package files; you must have write permission for this directory. The directory name can be changed if desired.
<source lang="bash">
mkdir -p ~/local/R_libs/
export R_LIBS=~/local/R_libs/
</source>
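Note that this <code>export</code> lasts only for the current shell session. As an alternative sketch (not part of the original steps), the path can be set persistently in ''~/.Renviron'', which R reads at startup:
<source lang="bash">
echo 'R_LIBS=${HOME}/local/R_libs/' >> ~/.Renviron
</source>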
<!--T:29-->
5. Run the install command.
<source lang="bash">
R CMD INSTALL --configure-args="--with-Rmpi-include=$EBROOTOPENMPI/include  --with-Rmpi-libpath=$EBROOTOPENMPI/lib --with-Rmpi-type='OPENMPI' " Rmpi_0.6-9.2.tar.gz
</source>
<!--T:30-->
Again, if a package fails to install, read the error message carefully and load any missing modules it mentions, to ensure that all your packages are successfully installed.
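To confirm that the installation worked (a quick sketch; this assumes <code>R_LIBS</code> is still set as in step 4), load the package once under MPI:
<source lang="bash">
mpirun -np 1 Rscript -e 'library(Rmpi); mpi.quit()'
</source>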
====Running==== <!--T:31-->
<!--T:32-->
1. Place your R code in a script file; in this example the file is called ''test.R''.
<!--T:33-->
{{File
  |name=test.R
  |lang="r"
  |contents=
# Report the size of the MPI universe, then spawn one slave per remaining task
library("Rmpi")
sprintf("TEST mpi.universe.size() = %i", mpi.universe.size())
ns <- mpi.universe.size() - 1
sprintf("TEST attempt to spawn %i slaves", ns)
mpi.spawn.Rslaves(nslaves=ns)
# Tell all slaves to return a message identifying themselves
mpi.remote.exec(paste("I am", mpi.comm.rank(), "of", mpi.comm.size()))
mpi.remote.exec(paste(mpi.comm.get.parent()))
# Send execution commands to the slaves
x <- 5
# Without a parallel RNG setup, the slaves' random draws may be correlated
# (see the note after this file)
x <- mpi.remote.exec(rnorm, x)
length(x)
x
mpi.close.Rslaves()
mpi.quit()
}}
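As the comment in ''test.R'' warns, slaves drawing random numbers without coordinated seeding can produce correlated streams. Rmpi's <code>mpi.setup.rngstream()</code> distributes independent L'Ecuyer RNG streams to the slaves; a minimal sketch (the seed value is an arbitrary example), to be called after <code>mpi.spawn.Rslaves()</code>:
<source lang="r">
# Give each slave an independent RNG stream before sampling
mpi.setup.rngstream(iseed=123)
x <- mpi.remote.exec(rnorm, 5)   # draws on each slave are now independent
</source>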
<!--T:34-->
2. Copy the following content into a job submission script called ''job.sh'':
<!--T:35-->
{{File
  |name=job.sh
  |lang="bash"
  |contents=
#!/bin/bash
#SBATCH --account=def-someacct  # replace with your supervisor's account
#SBATCH --ntasks=5              # number of MPI processes
#SBATCH --mem-per-cpu=2048M     # memory; default unit is megabytes
#SBATCH --time=0-00:15          # time (DD-HH:MM)
<!--T:78-->
module load gcc/11.3.0
module load r/4.2.1
module load openmpi/4.1.4
export R_LIBS=~/local/R_libs/
mpirun -np 1 R CMD BATCH test.R test.txt  # launch only the master; Rmpi spawns the slaves
}}
<!--T:36-->
3. Submit the job with:
<!--T:37-->
<source lang="bash">
sbatch job.sh
</source>
<!--T:38-->
For more on submitting jobs, see [[Running jobs]].
</translate>