TensorFlow/fr: Difference between revisions

Revision as of 16:46, 5 October 2021

1,454 bytes added , 3 years ago

Updating to match new version of source page

FuzzyBot

@@ Line 444: / Line 444: @@
 }}
+==Creating Model Checkpoints==
+Whether or not you expect your code to run for a long time, it is good practice to create checkpoints during training. A checkpoint is a snapshot of your model at a given point in the training process (after a certain number of iterations or epochs) that is saved to disk and can be loaded later. Checkpoints let you split a job that is expected to run for a very long time into several shorter jobs, which may be scheduled on the cluster more quickly. They also protect you from losing progress if your code hits an unexpected error or a node fails.
+===With Keras===
+To create a checkpoint when training with <code>keras</code>, we recommend using the <code>callbacks</code> parameter of the <code>model.fit()</code> method. The following example shows how to instruct TensorFlow to create a checkpoint at the end of every training epoch:
+ callbacks = [tf.keras.callbacks.ModelCheckpoint(filepath="./ckpt", save_freq="epoch")]  # Make sure the directory where the checkpoint will be written exists
+ model.fit(dataset, epochs=10, callbacks=callbacks)
+For more information, please refer to the [https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/ModelCheckpoint official TensorFlow documentation].
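The snippet above assumes that <code>model</code> and <code>dataset</code> already exist. As a minimal, self-contained sketch of the same idea, the example below trains a tiny model on synthetic data, writes a weights-only checkpoint at the end of each epoch, and then loads it into a fresh model. The data, the <code>build_model</code> helper, the temporary directory and the <code>ckpt.weights.h5</code> file name are all assumptions made for illustration; the file suffix was chosen because recent Keras releases are stricter about checkpoint file names than the version this page was written for.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Synthetic data standing in for a real training set (illustrative only).
features = np.random.rand(64, 4).astype("float32")
labels = np.random.rand(64, 1).astype("float32")
dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(16)


def build_model():
    """Hypothetical helper: a one-layer regression model."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")
    return model


# The checkpoint directory must exist before training starts.
ckpt_dir = tempfile.mkdtemp()
ckpt_path = os.path.join(ckpt_dir, "ckpt.weights.h5")

# Save only the weights, once at the end of every epoch.
callbacks = [
    tf.keras.callbacks.ModelCheckpoint(
        filepath=ckpt_path, save_weights_only=True, save_freq="epoch"
    )
]

model = build_model()
model.fit(dataset, epochs=2, callbacks=callbacks, verbose=0)

# A later job would rebuild the same architecture and load the saved weights.
restored = build_model()
restored.load_weights(ckpt_path)
```

In a real job the checkpoint path would point to persistent storage rather than a temporary directory, so that a follow-up job can find it.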
+===With a Custom Training Loop===
+Please refer to the [https://www.tensorflow.org/guide/checkpoint#writing_checkpoints official TensorFlow documentation].
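As a rough sketch of what the linked guide describes, the example below uses <code>tf.train.Checkpoint</code> and <code>tf.train.CheckpointManager</code> to save state every few steps and to resume from the latest checkpoint in a second run. The toy objective, the <code>make_state</code> and <code>train</code> helpers, the learning rate and the save interval are assumptions chosen for illustration, not a recommended setup.

```python
import tempfile

import tensorflow as tf


def make_state():
    """Hypothetical helper: a trainable parameter and a step counter."""
    return tf.Variable(0.0), tf.Variable(0, dtype=tf.int64)


def train(weight, step, manager, total_steps, save_every=5):
    """Minimise (weight - 3)^2 by plain gradient descent, saving periodically."""
    while int(step.numpy()) < total_steps:
        with tf.GradientTape() as tape:
            loss = tf.square(weight - 3.0)
        grad = tape.gradient(loss, weight)
        weight.assign_sub(0.1 * grad)
        step.assign_add(1)
        if int(step.numpy()) % save_every == 0:
            manager.save(checkpoint_number=int(step.numpy()))


ckpt_dir = tempfile.mkdtemp()

# First job: train for 10 steps, writing checkpoints along the way.
weight, step = make_state()
ckpt = tf.train.Checkpoint(weight=weight, step=step)
manager = tf.train.CheckpointManager(ckpt, ckpt_dir, max_to_keep=3)
train(weight, step, manager, total_steps=10)

# Second job: start from fresh variables, restore the latest checkpoint,
# then continue training up to step 20.
new_weight, new_step = make_state()
new_ckpt = tf.train.Checkpoint(weight=new_weight, step=new_step)
new_manager = tf.train.CheckpointManager(new_ckpt, ckpt_dir, max_to_keep=3)
new_ckpt.restore(new_manager.latest_checkpoint)
resumed_from = int(new_step.numpy())
train(new_weight, new_step, new_manager, total_steps=20)
```

Saving the step counter together with the parameters is what allows the second job to pick up where the first one stopped instead of starting again from step 0.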
 ==Custom Operators==