Handling large collections of files: Difference between revisions

1 byte removed, 5 years ago

no edit summary

@@ Line 1: / Line 1: @@
 {{Draft}}
-In certain domains, notably machine learning, it is common to have to manage very large collections of files, meaning hundreds of thousands or more.  The individual files may be fairly small, e.g. less than a few hundred kilobytes.  In these cases, a problem arises due to [[/Storage_and_file_management#Filesystem_quotas_and_policies|filesystem quotas]] on Compute Canada clusters that limit the number of filesystem objects.  So how can a user or group of users store these necessary data sets on the cluster?  In this page we will present a variety of different solutions, each with its own pros and cons, so you may judge for yourself which is an appropriate one for you.
+In certain domains, notably machine learning, it is common to have to manage very large collections of files, meaning hundreds of thousands or more.  The individual files may be fairly small, e.g. less than a few hundred kilobytes.  In these cases, a problem arises due to [[Storage_and_file_management#Filesystem_quotas_and_policies|filesystem quotas]] on Compute Canada clusters that limit the number of filesystem objects.  So how can a user or group of users store these necessary data sets on the cluster?  In this page we will present a variety of different solutions, each with its own pros and cons, so you may judge for yourself which is an appropriate one for you.
 <!-- This text should not appear -->