Translations:Handling large collections of files/1/en: Difference between revisions

Importing a new version from external source
Line 1: Line 1:
In certain domains, notably [[AI and Machine Learning]], it is common to have to manage very large collections of files, meaning hundreds of thousands or more. The individual files may be fairly small, e.g. less than a few hundred kilobytes. In these cases, a problem arises due to [[Storage_and_file_management#Filesystem_quotas_and_policies|filesystem quotas]] on Compute Canada clusters that limit the number of filesystem objects. So how can a user or group of users store these necessary data sets on the cluster?  In this page we will present a variety of different solutions, each with its own pros and cons, so you may judge for yourself which is an appropriate one for you.
In certain domains, notably [[AI and Machine Learning]], it is common to have to manage very large collections of files, meaning hundreds of thousands or more. The individual files may be fairly small, e.g. less than a few hundred kilobytes. In these cases, a problem arises due to [[Storage_and_file_management#Filesystem_quotas_and_policies|filesystem quotas]] on Compute Canada clusters that limit the number of filesystem objects. So how can a user or group of users store these necessary datasets on the cluster?  In this page we will present a variety of different solutions, each with its own pros and cons, so you may judge for yourself which is appropriate for you.