In certain domains, notably [[AI and Machine Learning]], it is common to manage very large collections of files, often hundreds of thousands or more. The individual files may be fairly small, e.g. under a few hundred kilobytes. This causes a problem on Compute Canada clusters, where [[Storage_and_file_management#Filesystem_quotas_and_policies|filesystem quotas]] limit the number of filesystem objects. So how can a user or group of users store such data sets on the cluster? This page presents a variety of solutions, each with its own pros and cons, so that you can judge which one is appropriate for you.