Getting the most out of our clusters for machine learning applications takes special care. A cluster is a complicated beast, very different from the local machine you use for prototyping. Notably, a cluster uses a distributed filesystem that seamlessly links many storage devices. Accessing a file on <tt>/project</tt> ''feels the same'' as accessing one stored on the current node, but under the hood these two I/O operations have very different performance implications. In short, you need to [[#Managing_your_datasets|choose wisely where to put your data]].