QGIS
QGIS is a free and open-source cross-platform desktop geographic information system (GIS) application that supports viewing, editing, and analysis of geospatial data.
!!IMPORTANT!! Never make intense use of QGIS on the login nodes! Submit jobs using the command line whenever possible and if you must visualize your data using the GUI, please do so on an interactive node. Using parallel rendering on the shared login nodes will result in the termination of your session.
Command-line QGIS[edit]
You will first need to load gcc
[name@server ~]$ module load gcc/5.4.0
Then, you will need to load the QGIS module; there could potentially be several versions available and you can see a list of all of them using the command
[name@server ~]$ module spider qgis
You can load a particular QGIS module using a command like
[name@server ~]$ module load qgis/2.18.24
You might also have to load various other modules depending on the packages you need to install. For example, "rgdal" will require that you load a module called "gdal", which itself requires that you load nixpkgs and gcc. Nixpkgs should already be loaded by default. You can ensure that it is by running
[name@server ~]$ module list
If nixpkgs is not listed, you can load it by running
[name@server ~]$ module load nixpkgs/16.09
If any package fails to install, be sure to read the error message carefully, as it might give you some details concerning some additional modules you need to load. You can also find out if a module is dependent on any other module by running
[name@server ~]$ module spider gdal/2.2.1
To execute R scripts, use the Rscript front-end with the file containing the R commands as an argument:
[name@server ~]$ Rscript computation.R
Using the GUI[edit]
Running[edit]
1. Place your R code in a script file, in this case the file is called test.R.
#Tell all slaves to return a message identifying themselves.
library("Rmpi")
sprintf("TEST mpi.universe.size() = %i", mpi.universe.size())
ns <- mpi.universe.size() - 1
sprintf("TEST attempt to spawn %i slaves", ns)
mpi.spawn.Rslaves(nslaves=ns)
mpi.remote.exec(paste("I am",mpi.comm.rank(),"of",mpi.comm.size()))
mpi.remote.exec(paste(mpi.comm.get.parent()))
#Send execution commands to the slaves
x<-5
#These would all be pretty correlated one would think
x<-mpi.remote.exec(rnorm,x)
length(x)
x
mpi.close.Rslaves()
mpi.quit()
2. Copy the following content in a job submission script called job.sh:
#!/bin/bash
#SBATCH --account=def-someacct # replace this with your own account
#SBATCH --ntasks=5 # number of MPI processes
#SBATCH --mem-per-cpu=2048M # memory; default unit is megabytes
#SBATCH --time=0-00:15 # time (DD-HH:MM)
module load r/3.4.0
module load openmpi/1.10.7
export R_LIBS=~/local/R_libs/
mpirun -np 1 R CMD BATCH test.R test.txt
3. Submit the job with:
sbatch job.sh
For more on submitting jobs, see the Running jobs page.
doParallel and foreach[edit]
Usage[edit]
Foreach can be considered as a unified interface for all backends (i.e. doMC, doMPI, doParallel, doRedis, etc.). It works on all platforms, assuming that the backend works. doParallel acts as an interface between foreach and the parallel package and can be loaded alone. There are some known efficiency issues when using foreach to run a very large number of very small tasks. Therefore, keep in mind that the following code is not the best example of an optimized use of the foreach() call but rather that the function chosen was kept at a minimum for demonstration purposes.
You must register the backend by feeding it the number of cores available. If the backend is not registered, foreach will assume that the number of cores is 1 and will proceed to go through the iterations serially.
The general method to use foreach is:
- to load both foreach and the backend package;
- to register the backend;
- to call foreach() by keeping it on the same line as the %do% (serial) or %dopar% operator.
Running[edit]
1. Place your R code in a script file, in this case the file is called test_foreach.R.
# library(foreach) # optional if using doParallel
library(doParallel) #
# a very simple function
test_func <- function(var1, var2) {
return(var1*var2)
}
# we will iterate over two sets of values, you can modify this to explore the mechanism of foreach
var1.v = c(1:8)
var2.v = seq(0.1, 1, length.out = 8)
# Use the environment variable SLURM_NTASKS to set the number of cores.
# This is for SLURM. Replace SLURM_NTASKS by the proper variable for your system.
# Avoid manually setting a number of cores.
ncores = Sys.getenv("SLURM_NTASKS")
registerDoParallel(cores=ncores)# Shows the number of Parallel Workers to be used
print(ncores) # this how many cores are available, and how many you have requested.
getDoParWorkers()# you can compare with the number of actual workers
# be careful! foreach() and %dopar% must be on the same line!
foreach(var1=var1.v, .combine=rbind) %:% foreach(var2=var2.v, .combine=rbind) %dopar% {test_func(var1=var1, var2=var2)}
2. Copy the following content in a job submission script called job_foreach.sh:
#!/bin/bash
#SBATCH --account=def-someacct # replace this with your own account
#SBATCH --ntasks=4 # number of processes
#SBATCH --mem-per-cpu=2048M # memory; default unit is megabytes
#SBATCH --time=0-00:15 # time (DD-HH:MM)
#SBATCH --mail-user=yourname@someplace.com # Send email updates to you or someone else
#SBATCH --mail-type=ALL # send an email in all cases (job started, job ended, job aborted)
module load r/3.4.3
export R_LIBS=~/local/R_libs/
R CMD BATCH --no-save --no-restore test_foreach.R
3. Submit the job with:
[name@server ~]$ sbatch job_foreach.sh
For more on submitting jobs, see the Running jobs page.