Python: Difference between revisions

161 bytes added, 4 years ago

m

bit more explicit on multi-node

@@ Line 105: / Line 105: @@
 Parallel filesystems such as those used on our clusters are very good at reading or writing large chunks of data, but can perform poorly under intensive use of small files. Launching software and loading libraries, such as starting Python and activating a virtual environment, can be slow for this reason.
-As a workaround for this kind of slowdown, and especially for single-node Python jobs, you can create your virtual environment inside of your job, using the compute node's local disk. It may seem counter-intuitive to recreate your environment for every job, but it can be faster than running from the parallel filesystem, and will give you some protection against some filesystem performance issues. This can be achieved using the following submission script example :
+As a workaround for this kind of slowdown, especially for single-node Python jobs, you can create your virtual environment inside your job, on the compute node's local disk. It may seem counter-intuitive to recreate your environment for every job, but it can be faster than running from the parallel filesystem, and it gives you some protection against filesystem performance issues. A node-local virtualenv must be created on each node in the job, since it is only accessible on the node where it was created. The following job submission script demonstrates how to do this for a single-node job:
 <!--T:37-->
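
As a rough illustration of the node-local idea, here is a minimal Python sketch that builds an environment on the local disk using the standard-library `venv` module. It is not the cluster's submission script. It assumes the scheduler exposes the job's local scratch directory in an environment variable named `SLURM_TMPDIR` and falls back to a temporary directory otherwise. The helper names `node_local_dir` and `create_local_env` are hypothetical.

```python
import os
import tempfile
import venv
from pathlib import Path


def node_local_dir() -> Path:
    """Return a directory on the compute node's local disk.

    Assumption: the scheduler exports the job's local scratch path in
    SLURM_TMPDIR. If it is unset, fall back to a fresh temp directory.
    """
    base = os.environ.get("SLURM_TMPDIR")
    if base:
        return Path(base)
    return Path(tempfile.mkdtemp(prefix="job-local-"))


def create_local_env(name: str = "env") -> Path:
    """Create a virtual environment under the node-local directory."""
    env_dir = node_local_dir() / name
    # with_pip=False keeps this sketch fast and offline; a real job
    # would go on to install its packages into the environment.
    venv.EnvBuilder(with_pip=False, clear=True).create(env_dir)
    return env_dir


if __name__ == "__main__":
    print(create_local_env())
```

Because the environment lives on one node's local disk, a multi-node job would need to run this step once on every node.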