(Marked this version for translation) |
No edit summary |
||
Line 7: | Line 7: | ||
<!--T:2--> | <!--T:2--> | ||
Even though R was not developed for high performance computing (HPC), its popularity with scientists from a variety of disciplines, including engineering, mathematics, statistics, bioinformatics, etc. makes it an essential tool on HPC installations dedicated to academic research. Features such as C extensions, byte-compiled code and | Even though R was not developed for high-performance computing (HPC), its popularity with scientists from a variety of disciplines, including engineering, mathematics, statistics, bioinformatics, etc. makes it an essential tool on HPC installations dedicated to academic research. Features such as C extensions, byte-compiled code and parallelization allow for reasonable performance in single-node jobs. Thanks to R’s modular nature, users can customize the R functions available to them by installing packages from the Comprehensive R Archive Network ([https://cran.r-project.org/ CRAN]) into their home directories. | ||
<!--T:83--> | <!--T:83--> | ||
Line 206: | Line 206: | ||
<!--T:71--> | <!--T:71--> | ||
The processors on our clusters are quite ordinary. | The processors on our clusters are quite ordinary. | ||
What makes these supercomputers | What makes these supercomputers <i>super</i> is that you have access to thousands of CPU cores with a high-performance network. | ||
In order to take advantage of this hardware you must run code "in parallel." | In order to take advantage of this hardware, you must run code "in parallel." However, note that prior to investing a lot of time and effort | ||
in parallelizing your R code, you should first ensure that your serial implementation is as efficient as possible. Because R is an interpreted | in parallelizing your R code, you should first ensure that your serial implementation is as efficient as possible. Because R is an interpreted | ||
language, loops, and especially nested loops, are a significant performance bottleneck. Whenever possible you | language, loops, and especially nested loops, are a significant performance bottleneck. Whenever possible you | ||
Line 225: | Line 225: | ||
<!--T:73--> | <!--T:73--> | ||
<b>A note on terminology:</b> In most of our documentation the term 'node' refers | |||
to an individual machine, also called a 'host', and a collection of such nodes makes up a 'cluster'. | to an individual machine, also called a 'host', and a collection of such nodes makes up a 'cluster'. | ||
In much of the R documentation, however, the term 'node' refers to a worker process and a 'cluster' is a | In much of the R documentation, however, the term 'node' refers to a worker process and a 'cluster' is a | ||
collection of such processes. As an example, consider the following quote, "Following | collection of such processes. As an example, consider the following quote, "Following <b>snow</b>, a pool | ||
of worker processes listening ''via'' sockets for commands from the master is called a 'cluster' of | of worker processes listening ''via'' sockets for commands from the master is called a 'cluster' of | ||
nodes."<ref>Core package "parallel" vignette, https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf</ref>. | nodes."<ref>Core package "parallel" vignette, https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf</ref>. |
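The R sense of these terms can be illustrated with a minimal sketch that uses the core <code>parallel</code> package. It assumes a single machine, and the worker count of 2 is an arbitrary choice for illustration, not a recommendation for real jobs. The 'cluster' created here is a pool of worker processes on one host, not a collection of machines.

```r
library(parallel)  # core package shipped with R

# In R's terminology this "cluster" is a pool of 2 worker processes ("nodes"),
# all started on the current machine. The count of 2 is an assumption made
# for this illustration only.
cl <- makeCluster(2)

# Ask each worker for its process ID and for the name of the host it runs on.
worker_pids  <- unlist(clusterCall(cl, Sys.getpid))
worker_hosts <- unlist(clusterCall(cl, function() Sys.info()[["nodename"]]))

# Always shut the workers down when finished.
stopCluster(cl)

print(worker_pids)   # one process ID per worker
print(worker_hosts)  # host name reported by each worker
```

Each worker reports its own process ID, while the host name is the same for all of them, which shows that an R 'cluster' of 'nodes' can live entirely inside one node in the hardware sense.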