Tuning Lustre

When a parallel program needs to read a small file (< 1 MB), such as a configuration file, it is best to put this file on one disk (stripe count = 1), read it with the master rank only, and send its contents to the other ranks using <tt>MPI_Bcast</tt> or <tt>MPI_Scatter</tt>.
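
The read-once-then-broadcast pattern can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name <tt>read_small_file</tt> is hypothetical, and <tt>comm</tt> is assumed to be an mpi4py-style communicator providing <tt>Get_rank()</tt> and <tt>bcast()</tt>.

```python
def read_small_file(comm, path, root=0):
    """Read a small file on one rank and broadcast its bytes to all ranks.

    `comm` is assumed to behave like an mpi4py communicator
    (for example MPI.COMM_WORLD); `root` is the rank that touches the disk.
    """
    data = None
    if comm.Get_rank() == root:
        # Only the root rank opens the file, so the filesystem sees
        # a single open/read/close instead of one per rank.
        with open(path, "rb") as handle:
            data = handle.read()
    # Every rank calls bcast; non-root ranks receive the root's bytes.
    return comm.bcast(data, root=root)
```

With mpi4py installed, each rank would call something like <tt>read_small_file(MPI.COMM_WORLD, "config.ini")</tt> after <tt>from mpi4py import MPI</tt>; the file name here is only a placeholder.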


When working with large files, it is usually best to use a stripe count as large as the number of MPI ranks. The stripe size should match the size of the buffer that each rank reads or writes at a time. For example, if each rank reads 1 MB of data at a time, the ideal stripe size will likely be 1 MB. If you don't know what size to use, your best bet is to keep the default value, which has been optimized for large files. '''Note that the stripe size must always be an integer multiple of 1 MB'''.
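
As an illustration of the multiple-of-1-MB rule, the sketch below rounds a per-rank buffer size up to the next whole megabyte and builds a matching <tt>lfs setstripe</tt> argument list. The helper names are hypothetical, and two assumptions are made: that 1 MB means 1,048,576 bytes here, and that your Lustre client accepts <tt>-c</tt> for the stripe count and <tt>-S</tt> for the stripe size (check <tt>lfs help setstripe</tt> on your system).

```python
MIB = 1024 * 1024  # assumption: "1 MB" in the text means 1,048,576 bytes


def stripe_size_for_buffer(buffer_bytes):
    """Round a per-rank buffer size up to a whole multiple of 1 MiB."""
    if buffer_bytes <= 0:
        raise ValueError("buffer size must be positive")
    # Ceiling division, then scale back to bytes.
    return -(-buffer_bytes // MIB) * MIB


def setstripe_command(path, stripe_count, buffer_bytes):
    """Build (but do not run) an lfs setstripe command as an argument list."""
    size_mib = stripe_size_for_buffer(buffer_bytes) // MIB
    return ["lfs", "setstripe",
            "-c", str(stripe_count),   # one stripe per MPI rank, per the text
            "-S", f"{size_mib}M",      # stripe size in whole MiB
            path]
```

The returned list can be passed to <tt>subprocess.run</tt> on a machine with a Lustre client; it must be applied before the file is written, since striping is set when the file is created.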


In general, you want to reduce the number of open/close operations on the filesystem. It is therefore best to concatenate all data into a single file rather than writing many small files. It is also best to open the file once at the beginning of the program and close it once at the end, rather than opening and closing it each time you want to add new data.
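
A minimal sketch of the open-once pattern, using a hypothetical <tt>write_records</tt> helper: all records go into one file through a single open/close pair, instead of one file (or one open) per record.

```python
def write_records(path, records):
    """Write every record to a single file with one open and one close.

    `records` is any iterable of bytes objects; returns the bytes written.
    """
    total = 0
    # The file is opened once here and closed once when the block exits.
    with open(path, "wb") as out:
        for record in records:
            total += out.write(record)
    return total
```

In a long-running program the same idea applies: keep the file handle open for the lifetime of the run and write to it as data becomes available.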