Weights & Biases (wandb): Difference between revisions

Line 3: Line 3:
<translate>
<translate>
<!--T:1-->
<!--T:1-->
[https://wandb.ai Weights & Biases (wandb)] is a "meta machine learning platform" designed to help AI practitioners and teams build reliable machine learning models for real-world applications by streamlining the machine learning model lifecycle. By using wandb, users can track, compare, explain and reproduce their machine learning experiments.
[https://wandb.ai Weights & Biases (wandb)] is a <i>meta machine learning platform</i> designed to help AI practitioners and teams build reliable machine learning models for real-world applications by streamlining the machine learning model lifecycle. By using wandb, you can track, compare, explain and reproduce machine learning experiments.


== Using wandb on Alliance clusters == <!--T:2-->
== Using wandb on Alliance clusters == <!--T:2-->
Line 11: Line 11:


<!--T:4-->
<!--T:4-->
Since it requires an internet connection, wandb has restricted availability on compute nodes, depending on the cluster:
Since it requires an Internet connection, wandb has restricted availability on compute nodes, depending on the cluster:


<!--T:5-->
<!--T:5-->
Line 30: Line 30:


<!--T:41-->
<!--T:41-->
While it is possible to upload basic metrics to Weights&Biases during a job on Béluga, the wandb package automatically uploads information about the user's environment to a Google Cloud Storage bucket, resulting in a crash during or at the very end of a training run. It is not currently possible to disable this behaviour. Uploading artifacts to W&B with <tt>wandb.save()</tt> also requires access to Google Cloud Storage, which is not available on Béluga's compute nodes.
While it is possible to upload basic metrics to Weights & Biases during a job on Béluga, the wandb package also automatically tries to upload information about your environment to a Google Cloud Storage bucket. Because Google Cloud Storage is not reachable from Béluga's compute nodes, this causes a crash during or at the very end of a training run. It is not currently possible to disable this behaviour. Uploading artifacts to W&B with <code>wandb.save()</code> requires access to Google Cloud Storage as well, so it does not work on Béluga's compute nodes either.


<!--T:42-->
<!--T:42-->
Users can still use wandb on Béluga by enabling the [https://docs.wandb.ai/library/cli#wandb-offline <tt>offline</tt>] or [https://docs.wandb.ai/library/init#save-logs-offline <tt>dryrun</tt>] modes. In these two modes, wandb will write all metrics, logs and artifacts to the local disk and will not attempt to sync anything to the Weights&Biases service on the internet. After their jobs finish running, users can sync their wandb content to the online service by running the command [https://docs.wandb.ai/ref/cli#wandb-sync <tt>wandb sync</tt>] on the login node.
You can still use wandb on Béluga by enabling the [https://docs.wandb.ai/library/cli#wandb-offline <tt>offline</tt>] or [https://docs.wandb.ai/library/init#save-logs-offline <tt>dryrun</tt>] modes. In these two modes, wandb writes all metrics, logs and artifacts to the local disk and does not attempt to sync anything to the Weights & Biases service on the Internet. After your jobs finish running, you can sync their wandb content to the online service by running the command [https://docs.wandb.ai/ref/cli#wandb-sync <tt>wandb sync</tt>] on the login node.
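
The offline workflow described above can be sketched as follows. This is a minimal illustration, not the job script from this page: it assumes that setting the <code>WANDB_MODE</code> environment variable to <code>offline</code> has a similar effect to running <code>wandb offline</code>, that offline runs are stored in the default <code>wandb/offline-run-*</code> directories, and that <code>train.py</code> is a hypothetical training script. The commands that need wandb or a training script are left commented out.

```shell
#!/bin/bash
# Inside the job script (compute node): keep wandb from contacting the network.
# Assumption: WANDB_MODE=offline behaves like running `wandb offline`.
export WANDB_MODE=offline

# Hypothetical training command; replace with your own script.
# python train.py

# After the job has finished, from a login node with Internet access:
# (assumed default location of offline run directories)
# wandb sync wandb/offline-run-*
```

Setting the mode through an environment variable keeps the choice inside the job script, so the same training code can run unchanged on a cluster where online logging is permitted.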


<!--T:46-->
<!--T:46-->
Line 63: Line 63:


<!--T:43-->
<!--T:43-->
### Save your wandb API key in your .bash_profile or replace $API_KEY with your actual API key. Uncomment the line below and comment out 'wandb offline'. if running on Cedar ###
### Save your wandb API key in your .bash_profile or replace $API_KEY with your actual API key. If running on Cedar, uncomment the line below and comment out <code>wandb offline</code>. ###


<!--T:44-->
<!--T:44-->