Comet.ml: Difference between revisions

From Alliance Doc
(Add translation tags)
No edit summary
 
(6 intermediate revisions by 2 users not shown)
Line 1: Line 1:
<languages />
<languages />
[[Category:AI and Machine Learning]]
<translate>
<translate>


[https://comet.ml Comet] is a "meta machine learning platform" designed to help AI practitioners and teams build reliable machine learning models for real-world applications by streamlining the machine learning model lifecycle. By using Comet, users can track, compare, explain and reproduce their machine learning experiments. Comet can also greatly accelerate hyperparameter search, by providing a [https://www.comet.ml/parameter-optimization module for the Bayesian exploration of hyperparameter space].
<!--T:1-->
[https://comet.ml Comet] is a meta machine learning platform designed to help AI practitioners and teams build reliable machine learning models for real-world applications by streamlining the machine learning model lifecycle. By using Comet, users can track, compare, explain and reproduce their machine learning experiments. Comet can also greatly accelerate hyperparameter search, by providing a [https://www.comet.ml/parameter-optimization module for the Bayesian exploration of hyperparameter space].


== Using Comet on Compute Canada clusters ==
== Using Comet on our clusters == <!--T:2-->


=== Availability ===
=== Availability === <!--T:3-->


<!--T:4-->
Since it requires an internet connection, Comet has restricted availability on compute nodes, depending on the cluster:
Since it requires an internet connection, Comet has restricted availability on compute nodes, depending on the cluster:


<!--T:5-->
{| class="wikitable"
{| class="wikitable"
|-
|-
! Cluster !! Availability !! Note
! Cluster !! Availability !! Note
|-
|-
| Béluga || Yes ✅ || Comet can be used after loading the <tt>httpproxy</tt> module: <tt>module load httpproxy</tt>
| Béluga || rowspan=2| Yes ✅ || rowspan=2|  Comet can be used after loading the <code>httpproxy</code> module: <code>module load httpproxy</code>
|-
|-
| Cedar || Yes ✅ || Internet access is enabled
| Narval
|-
|-
| Graham || No ❌ || Internet access is disabled on compute nodes
| Cedar || Yes ✅ || internet access is enabled
|-
| Graham || No ❌ || internet access is disabled on compute nodes. Workaround: [https://www.comet.ml/docs/python-sdk/offline-experiment/ Comet OfflineExperiment]
|}
|}


=== Best practices ===
=== Best practices === <!--T:6-->


* Avoid logging metrics (e.g. loss, accuracy) at a high frequency. This can cause Comet to throttle your experiment, which can make your job duration harder to predict. As a rule of thumb, please log metrics (or request new hyperparameters) at an interval >= 10 minutes.
<!--T:7-->
* Avoid logging metrics (e.g. loss, accuracy) at a high frequency. This can cause Comet to throttle your experiment, which can make your job duration harder to predict. As a rule of thumb, please log metrics (or request new hyperparameters) at an interval >= 1 minute.


</translate>
</translate>

Latest revision as of 22:39, 25 July 2023


Comet is a meta machine learning platform designed to help AI practitioners and teams build reliable machine learning models for real-world applications by streamlining the machine learning model lifecycle. With Comet, users can track, compare, explain and reproduce their machine learning experiments. Comet can also speed up hyperparameter search by providing a module for Bayesian exploration of the hyperparameter space.

Using Comet on our clusters

Availability

Comet requires an internet connection, so its availability on compute nodes depends on the cluster:

{| class="wikitable"
|-
! Cluster !! Availability !! Note
|-
| Béluga || rowspan=2| Yes ✅ || rowspan=2| Comet can be used after loading the <code>httpproxy</code> module: <code>module load httpproxy</code>
|-
| Narval
|-
| Cedar || Yes ✅ || internet access is enabled
|-
| Graham || No ❌ || internet access is disabled on compute nodes. Workaround: Comet OfflineExperiment
|}
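
The availability table can be encoded in a training script so that the same code runs on every cluster. The following is a minimal sketch, not an official recipe: the names <code>CometMode</code>, <code>comet_mode</code> and <code>make_experiment</code> are hypothetical helpers introduced here for illustration, and the cluster name is assumed to be passed in by the caller. <code>make_experiment</code> assumes the <code>comet_ml</code> package is installed and, for online use, that an API key is already configured (for example through Comet's configuration file or environment).

```python
import unicodedata
from enum import Enum


class CometMode(Enum):
    ONLINE = "online"                      # compute nodes have internet access
    ONLINE_VIA_PROXY = "online_via_proxy"  # run `module load httpproxy` first
    OFFLINE = "offline"                    # no internet access on compute nodes


# Mirrors the availability table above (hypothetical lookup, keyed by
# lower-case, accent-free cluster name).
_CLUSTER_MODES = {
    "beluga": CometMode.ONLINE_VIA_PROXY,
    "narval": CometMode.ONLINE_VIA_PROXY,
    "cedar": CometMode.ONLINE,
    "graham": CometMode.OFFLINE,
}


def _normalize(name: str) -> str:
    """Lower-case the name and strip accents, so 'Béluga' becomes 'beluga'."""
    decomposed = unicodedata.normalize("NFKD", name.strip().lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def comet_mode(cluster: str) -> CometMode:
    """Return how Comet can be used on the given cluster."""
    key = _normalize(cluster)
    if key not in _CLUSTER_MODES:
        raise ValueError(f"Unknown cluster: {cluster!r}")
    return _CLUSTER_MODES[key]


def make_experiment(cluster: str, project_name: str,
                    offline_directory: str = "comet_offline"):
    """Create an online or offline Comet experiment depending on the cluster."""
    import comet_ml  # imported lazily so comet_mode() works without the package

    if comet_mode(cluster) is CometMode.OFFLINE:
        # Results are written to a local directory for upload at a later time.
        return comet_ml.OfflineExperiment(
            project_name=project_name,
            offline_directory=offline_directory,
        )
    return comet_ml.Experiment(project_name=project_name)
```

On Béluga and Narval, the job script still needs to run <code>module load httpproxy</code> before starting Python; the helper only selects the experiment class.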

Best practices

  • Avoid logging metrics (e.g. loss, accuracy) at a high frequency. Doing so can cause Comet to throttle your experiment, which makes your job duration harder to predict. As a rule of thumb, log metrics (or request new hyperparameters) no more often than once per minute.
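
One way to follow the rule of thumb above is to wrap the logging call in a small rate limiter. The sketch below is an illustration under stated assumptions, not part of Comet: <code>ThrottledLogger</code> is a hypothetical helper that forwards a metric to any callable of the form <code>log_fn(name, value, step=...)</code> at most once per interval and silently drops the calls in between.

```python
import time
from typing import Callable, Dict, Optional


class ThrottledLogger:
    """Forward each metric to `log_fn` at most once per `min_interval_s` seconds."""

    def __init__(self,
                 log_fn: Callable[..., None],
                 min_interval_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._log_fn = log_fn
        self._min_interval_s = min_interval_s
        self._clock = clock  # injectable so the behaviour can be tested
        self._last_sent: Dict[str, float] = {}  # metric name -> last send time

    def log(self, name: str, value: float, step: Optional[int] = None) -> bool:
        """Log the metric if enough time has passed; return True if it was sent."""
        now = self._clock()
        last = self._last_sent.get(name)
        if last is not None and now - last < self._min_interval_s:
            return False  # too soon since the last value of this metric
        self._log_fn(name, value, step=step)
        self._last_sent[name] = now
        return True
```

With Comet, <code>log_fn</code> could be the <code>log_metric</code> method of an experiment object, e.g. <code>ThrottledLogger(experiment.log_metric)</code>, followed by <code>logger.log("loss", loss_value, step=step)</code> inside the training loop. Values that arrive between two sends are discarded, so if every interval should be summarized, an average would have to be accumulated separately.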