38,757
edits
(Importing a new version from external source) |
(Importing a new version from external source) |
||
Line 1: | Line 1: | ||
In certain domains, notably [[AI and Machine Learning]], it is common to have to manage very large collections of files, meaning hundreds of thousands or more. | In certain domains, notably [[AI and Machine Learning]], it is common to have to manage very large collections of files, meaning hundreds of thousands or more. The individual files may be fairly small, e.g. less than a few hundred kilobytes. In these cases, a problem arises due to [[Storage_and_file_management#Filesystem_quotas_and_policies|filesystem quotas]] on Compute Canada clusters that limit the number of filesystem objects. So how can a user or group of users store these necessary datasets on the cluster? In this page we will present a variety of different solutions, each with its own pros and cons, so you may judge for yourself which is appropriate for you. |