Talk:R
Remarks (stubbsda)
- I think the best idea here would be to tell people to run the install.packages command on the login node (which is presumably assumed but should be made explicit).
- I suspect the issue with rogue R jobs creating too many threads arises less from users writing their own parallelized R scripts than from users running R code found elsewhere (installed via install.packages, cloned from GitHub, etc.), where those scripts or packages spawn threads greedily.
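To illustrate the thread-count remark above, here is a minimal sketch of how a script could cap its worker count at the job's allocation rather than at whatever the node offers. It assumes the job runs under Slurm and was submitted with --cpus-per-task, so that the SLURM_CPUS_PER_TASK environment variable is set; the helper name allowed_cores is hypothetical, not something from the R page. Note that parallel::detectCores() reports the cores of the whole machine, which is one way third-party code ends up oversubscribing a shared node.

```r
# Hypothetical helper: choose a worker count that respects the Slurm allocation.
allowed_cores <- function(default = 1L) {
  # SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested
  # (assumption: the job was submitted that way).
  n <- suppressWarnings(
    as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = NA))
  )
  # Fall back to a conservative default when the variable is missing or invalid.
  if (is.na(n) || n < 1L) default else n
}

n_workers <- allowed_cores()

# mc.cores is the option parallel::mclapply() reads for its default core count.
options(mc.cores = n_workers)
```

This only helps for code that honours mc.cores or takes an explicit worker count; packages that spawn threads by other means would need their own settings.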
Changes wanted
- Refer to "Utiliser des modules" (Using modules) instead of giving a lengthy explanation of modules
- R on login node should only be for (1) small tests and (2) package installation
- sbatch for non-interactive work, salloc for interactive, refer to Running jobs
- install.packages() will fail from Graham compute nodes
- Recommend CRAN mirrors? http://cran.utstat.utoronto.ca/, http://cran.stat.sfu.ca/, http://cloud.r-project.org/ ?
- Is https: preferred to http: in this context?
- Could we open paths to these from Graham compute?
- In "Exploiting parallelism":
- Acknowledge variety of solutions at https://cran.r-project.org/web/views/HighPerformanceComputing.html
- Don't assume the reader has prior understanding of MPI, threading, "multicore", etc.
- Mention library(parallel)
- Acknowledge terminology clash: "Following snow, a pool of worker processes listening via sockets for commands from the master is called a 'cluster' of nodes."
- I think a
- Ross Dickson (talk) 20:25, 20 December 2018 (UTC)
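As a possible starting point for the library(parallel) item and the terminology note above, here is a minimal sketch in which "cluster" means a pool of R worker processes on the local machine, not the HPC cluster. The worker count of 2 and the squaring function are arbitrary illustrations, not recommendations for the page.

```r
library(parallel)

# "Cluster" in the snow sense: a pool of worker R processes that receive
# commands from the master over sockets. It is unrelated to the HPC cluster.
cl <- makeCluster(2)  # 2 workers, chosen only for illustration

# Apply a function to each element, distributing the work across the pool.
squares <- parLapply(cl, 1:4, function(x) x^2)

# Always shut the workers down so no stray processes are left behind.
stopCluster(cl)

print(unlist(squares))
```

In a real job the worker count would presumably come from the scheduler's allocation rather than a hard-coded number.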