AI and Machine Learning

From Alliance Doc
Revision as of 18:50, 16 July 2019 by Lemc2220 (talk | contribs) (Add section about large collections of files)


This article is a draft

This is not a complete article: This is a draft, a work in progress that is intended to be published into an article, which may or may not be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.



Python

Python is very popular in the field of machine learning. If you (plan to) use it on our clusters, please refer to our documentation about Python to get important information about Python versions, virtual environments on login or on compute nodes, multiprocessing, Anaconda, Jupyter, etc.

Useful information about software packages

Please refer to the page of your machine learning package of choice for useful information about installation, common pitfalls, etc.

Datasets containing lots of small files (e.g. image datasets)

In machine learning, it is common to have to manage very large collections of files, meaning hundreds of thousands or more. The individual files may be fairly small, e.g. less than a few hundred kilobytes. In these cases, problems arise:

  • Filesystem quotas on Compute Canada clusters limit the number of filesystem objects;
  • Your software could be significantly slowed down by streaming lots of small files from /project (or /scratch) to a compute node.
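
To get a rough idea of how many filesystem objects a dataset consumes, you can count them before copying the data to a cluster. The following is a minimal sketch using only the Python standard library; the function name count_filesystem_objects is hypothetical, and the assumption that every file and every directory counts as one object should be checked against the quota documentation of your cluster.

```python
import os
import tempfile


def count_filesystem_objects(root):
    """Return (number of files, number of directories) found below root.

    Assumption: each file and each directory counts as one object
    towards the quota. The root directory itself is not counted.
    """
    n_files = 0
    n_dirs = 0
    # os.walk does not follow symbolic links to directories by default.
    for _dirpath, dirnames, filenames in os.walk(root):
        n_dirs += len(dirnames)
        n_files += len(filenames)
    return n_files, n_dirs


if __name__ == "__main__":
    # Self-contained demonstration on a throwaway directory tree.
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "class_a"))
        for i in range(4):
            with open(os.path.join(tmp, "class_a", f"img_{i}.bin"), "wb") as f:
                f.write(b"\x00" * 16)
        files, dirs = count_filesystem_objects(tmp)
        print(f"{files} files, {dirs} directories")
```

In practice you would pass the path of your own dataset directory instead of the temporary one used in the demonstration.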

On a distributed filesystem, data should be stored in large single-file archives. On this subject, please refer to Handling large collections of files.
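
As an illustration of the single-file archive approach, the sketch below packs a directory of small files into one uncompressed tar archive and then reads the files back sequentially without extracting them to disk. It uses only the tarfile module from the Python standard library. The choice of tar is an assumption made for this example (the page referenced above may recommend other formats), and the function names pack_directory and iter_files are hypothetical.

```python
import os
import tarfile
import tempfile


def pack_directory(src_dir, archive_path):
    """Store everything under src_dir in a single uncompressed tar file."""
    with tarfile.open(archive_path, "w") as tar:
        # arcname keeps member paths relative to the dataset directory name.
        tar.add(src_dir, arcname=os.path.basename(os.path.normpath(src_dir)))


def iter_files(archive_path):
    """Yield (member name, content as bytes) for each regular file."""
    with tarfile.open(archive_path, "r") as tar:
        for member in tar:
            if member.isfile():
                with tar.extractfile(member) as f:
                    yield member.name, f.read()


if __name__ == "__main__":
    # Self-contained demonstration on a throwaway dataset.
    with tempfile.TemporaryDirectory() as tmp:
        dataset = os.path.join(tmp, "images")
        os.mkdir(dataset)
        for i in range(5):
            with open(os.path.join(dataset, f"img_{i}.bin"), "wb") as f:
                f.write(bytes([i]) * 10)

        archive = os.path.join(tmp, "images.tar")
        pack_directory(dataset, archive)

        for name, data in iter_files(archive):
            print(name, len(data))
```

With this layout, the shared filesystem holds one object instead of one per sample, and a job reads one large file instead of opening many small ones.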