AI and Machine Learning: Difference between revisions

(Add section about large collections of files)
{{Draft}}
= Python =


[[Python]] is very popular in the field of machine learning. If you use it, or plan to use it, on our clusters, see [[Python|our documentation about Python]] for important information about Python versions, virtual environments on login and compute nodes, multiprocessing, Anaconda, Jupyter, etc.


= Useful information about software packages =


Refer to the page for your machine learning package of choice for installation instructions, common pitfalls, and other useful information:
* [[XGBoost]]


= Datasets containing lots of small files (e.g. image datasets) =


Machine learning workloads often involve very large collections of files, on the order of hundreds of thousands or more. The individual files may be fairly small, e.g. less than a few hundred kilobytes each. In these cases, problems arise: