Weights & Biases (wandb): Difference between revisions

no edit summary
(Marked this version for translation)
No edit summary
Line 11: Line 11:
<!--T:4-->
Because wandb requires an internet connection, its availability on compute nodes varies by cluster:


<!--T:5-->
Line 18: Line 17:
! Cluster !! Availability !! Note
|-
| Béluga || No ❌  || Wandb requires access to Google Cloud Storage, which is not available on Béluga
| Béluga || No ❌  || Wandb requires access to Google Cloud Storage, which is not accessible from the compute nodes
|-
| Cedar || Yes ✅ || Internet access is enabled
Line 28: Line 27:


<!--T:41-->
While it is possible to upload basic metrics to Weights & Biases during a job on Béluga, the wandb package automatically uploads information about the user's environment to a Google Cloud Storage bucket. It is not currently possible to disable this behaviour. Uploading artifacts to W&B with <tt>wandb.save()</tt> also requires access to Google Cloud Storage, which is not available on Béluga's compute nodes.
While it is possible to upload basic metrics to Weights & Biases during a job on Béluga, the wandb package automatically uploads information about the user's environment to a Google Cloud Storage bucket, resulting in a crash during or at the very end of a training run. It is not currently possible to disable this behaviour. Uploading artifacts to W&B with <tt>wandb.save()</tt> also requires access to Google Cloud Storage, which is not available on Béluga's compute nodes.


<!--T:42-->
Users can still use wandb on Béluga by enabling the [https://docs.wandb.ai/library/cli#wandb-offline <tt>offline</tt>] or [https://docs.wandb.ai/library/init#save-logs-offline <tt>dryrun</tt>] mode. In either mode, wandb writes all metrics, logs and artifacts to the local disk and does not attempt to sync anything to the Weights & Biases service on the internet. After their jobs finish, users can sync their wandb content to the online service by running the command [https://docs.wandb.ai/ref/cli#wandb-sync <tt>wandb sync</tt>] on the login node.
Note that [[Comet.ml]] is a product very similar to Weights & Biases, and works on Béluga.
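
The offline workflow described above can be sketched in Python. This is a minimal sketch rather than the official procedure: it assumes that wandb reads the <tt>WANDB_MODE</tt> environment variable and that offline runs are stored in directories named <tt>offline-run-*</tt> under a local <tt>wandb</tt> directory. The helper names <tt>enable_offline_mode</tt> and <tt>build_sync_commands</tt> are hypothetical and are not part of the wandb package.

```python
import os
from pathlib import Path


def enable_offline_mode(env=None):
    """Set WANDB_MODE=offline in the given mapping (default: os.environ).

    Assumption: wandb checks this variable when a run is initialized,
    so it must be set before the training script calls wandb.init().
    """
    env = os.environ if env is None else env
    env["WANDB_MODE"] = "offline"
    return env


def build_sync_commands(wandb_dir="wandb"):
    """Return one `wandb sync <run_dir>` command per offline run directory.

    Assumption: offline runs live in sub-directories whose names start
    with "offline-run-". The commands are only built here, not executed,
    because syncing needs internet access (for example on a login node).
    """
    root = Path(wandb_dir)
    if not root.is_dir():
        return []
    run_dirs = sorted(p for p in root.glob("offline-run-*") if p.is_dir())
    return [["wandb", "sync", str(p)] for p in run_dirs]


if __name__ == "__main__":
    # Inside the job: switch wandb to offline mode before training starts.
    enable_offline_mode()
    # After the job: list the sync commands to run where internet is available.
    for command in build_sync_commands():
        print(" ".join(command))
```

Calling <tt>enable_offline_mode()</tt> at the top of a training script, before <tt>wandb.init()</tt>, should keep the job from contacting the online service. The printed commands can then be run on the login node once the job has finished.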


=== Example === <!--T:6-->