User contributions for Stubbsda
7 October 2019
- 15:33, 7 October 2019 +52 N Translations:Tutoriel Apprentissage machine/44/en Created page with "===Important elements of a <tt>sbatch</tt> script===" current
- 15:33, 7 October 2019 0 Tutoriel Apprentissage machine/en Created page with "You must submit your jobs using a script in conjunction with the <tt>sbatch</tt> command, so that they can be entirely automated as a batch process. Interactive jobs are just..."
- 15:33, 7 October 2019 +213 N Translations:Tutoriel Apprentissage machine/43/en Created page with "You must submit your jobs using a script in conjunction with the <tt>sbatch</tt> command, so that they can be entirely automated as a batch process. Interactive jobs are just..."
- 15:32, 7 October 2019 −4 Tutoriel Apprentissage machine/en Created page with "==Step 3: Preparing your job submission script=="
- 15:32, 7 October 2019 +48 N Translations:Tutoriel Apprentissage machine/42/en Created page with "==Step 3: Preparing your job submission script=="
- 15:31, 7 October 2019 +2 Tutoriel Apprentissage machine/en No edit summary
- 15:31, 7 October 2019 +2 Translations:Tutoriel Apprentissage machine/38/en No edit summary
- 15:31, 7 October 2019 +2 Tutoriel Apprentissage machine/en Created page with "The job submission script will look like this:"
- 15:31, 7 October 2019 +46 N Translations:Tutoriel Apprentissage machine/56/en Created page with "The job submission script will look like this:" current
- 15:30, 7 October 2019 +244 N Translations:Tutoriel Apprentissage machine/61/en Created page with "# Get most recent checkpoint (this example is for PyTorch *.pth checkpoint files) export CHECKPOINTS=~/scratch/checkpoints/ml-test LAST_CHECKPOINT=$(find . -maxdepth 1 -name "..." current
- 15:30, 7 October 2019 −4 Tutoriel Apprentissage machine/en Created page with "# Prepare data"
- 15:30, 7 October 2019 +14 N Translations:Tutoriel Apprentissage machine/60/en Created page with "# Prepare data" current
- 15:30, 7 October 2019 −3 Tutoriel Apprentissage machine/en Created page with " # Prepare virtualenv"
- 15:30, 7 October 2019 +21 N Translations:Tutoriel Apprentissage machine/59/en Created page with " # Prepare virtualenv" current
- 15:29, 7 October 2019 +33 N Translations:Tutoriel Apprentissage machine/58/en Created page with "module load python/3.6 cuda cudnn" current
- 15:29, 7 October 2019 +148 N Translations:Tutoriel Apprentissage machine/57/en Created page with "{{File |name=ml-test-chain.sh |lang="bash" |contents= #!/bin/bash #SBATCH --array=1-10%1 # 10 is the number of jobs in the chain #SBATCH ..." current
- 15:12, 7 October 2019 +303 N Translations:Tutoriel Apprentissage machine/62/en Created page with "# Start training if [ -n "$LAST_CHECKPOINT" ]; then # $LAST_CHECKPOINT is null; start from scratch python $SOURCEDIR/train.py --write-checkpoints-to $CHECKPOINTS ... e..." current
- 15:11, 7 October 2019 −6 Tutoriel Apprentissage machine/en Created page with "'''Now is a good time to verify that your job reads and writes as much as possible on the compute node's local storage (<tt>$SLURM_TMPDIR</tt>) and as little as possible on th..."
- 15:11, 7 October 2019 +227 N Translations:Tutoriel Apprentissage machine/41/en Created page with "'''Now is a good time to verify that your job reads and writes as much as possible on the compute node's local storage (<tt>$SLURM_TMPDIR</tt>) and as little as possible on th..."
- 15:10, 7 October 2019 −19 Tutoriel Apprentissage machine/en Created page with "* Create and activate a virtual environment in <tt>$SLURM_TMPDIR</tt> (this variable points to a directory on the local dis..."
- 15:10, 7 October 2019 +720 N Translations:Tutoriel Apprentissage machine/40/en Created page with "* Create and activate a virtual environment in <tt>$SLURM_TMPDIR</tt> (this variable points to a directory on the local dis..."
- 15:03, 7 October 2019 −8 Tutoriel Apprentissage machine/en No edit summary
- 15:03, 7 October 2019 −8 Translations:Tutoriel Apprentissage machine/39/en No edit summary
- 15:02, 7 October 2019 +10 Tutoriel Apprentissage machine/en No edit summary
- 15:02, 7 October 2019 +10 Translations:Tutoriel Apprentissage machine/39/en No edit summary
- 15:00, 7 October 2019 0 Tutoriel Apprentissage machine/en No edit summary
- 15:00, 7 October 2019 0 Translations:Tutoriel Apprentissage machine/34/en No edit summary
- 14:59, 7 October 2019 −18 Tutoriel Apprentissage machine/en Created page with "We recommend that you try running your job in an interactive job before submitting it using a script (discussed in the following section). Yo..."
- 14:59, 7 October 2019 +407 N Translations:Tutoriel Apprentissage machine/39/en Created page with "We recommend that you try running your job in an interactive job before submitting it using a script (discussed in the following section). Yo..."
- 14:55, 7 October 2019 −10 Tutoriel Apprentissage machine/en Created page with "==Step 2: Prepare your virtual environment =="
- 14:55, 7 October 2019 +45 N Translations:Tutoriel Apprentissage machine/38/en Created page with "==Step 2: Prepare your virtual environment =="
- 14:55, 7 October 2019 −14 Tutoriel Apprentissage machine/en Created page with "The above command does not compress the data. If you believe that this is appropriate, you can use <tt>tar czf</tt>."
- 14:55, 7 October 2019 +116 N Translations:Tutoriel Apprentissage machine/37/en Created page with "The above command does not compress the data. If you believe that this is appropriate, you can use <tt>tar czf</tt>." current
- 14:54, 7 October 2019 −13 Tutoriel Apprentissage machine/en Created page with "Assuming that the files which you need are in the directory <tt>mydataset</tt>:"
- 14:54, 7 October 2019 +79 N Translations:Tutoriel Apprentissage machine/35/en Created page with "Assuming that the files which you need are in the directory <tt>mydataset</tt>:" current
- 14:53, 7 October 2019 −110 Tutoriel Apprentissage machine/en Created page with "The filesystems on Compute Canada clusters are designed for a small number of extremely large files. Make sure that the data set which you need for your training is an archive..."
- 14:53, 7 October 2019 +639 N Translations:Tutoriel Apprentissage machine/34/en Created page with "The filesystems on Compute Canada clusters are designed for a small number of extremely large files. Make sure that the data set which you need for your training is an archive..."
- 14:49, 7 October 2019 −17 Tutoriel Apprentissage machine/en Created page with "== Step 1: Archiving a data set =="
- 14:49, 7 October 2019 +34 N Translations:Tutoriel Apprentissage machine/33/en Created page with "== Step 1: Archiving a data set =="
- 14:48, 7 October 2019 −42 Tutoriel Apprentissage machine/en Created page with "This page is a beginner's manual concerning how to port a machine learning job to a Compute Canada cluster."
- 14:48, 7 October 2019 +107 N Translations:Tutoriel Apprentissage machine/32/en Created page with "This page is a beginner's manual concerning how to port a machine learning job to a Compute Canada cluster."
- 14:47, 7 October 2019 −5 Translations:Tutoriel Apprentissage machine/Page display title/en No edit summary current
4 October 2019
- 15:26, 4 October 2019 +200 Handling large collections of files No edit summary
3 October 2019
- 12:59, 3 October 2019 +34 Tutoriel Apprentissage machine No edit summary
- 12:57, 3 October 2019 +205 Tutoriel Apprentissage machine No edit summary
2 October 2019
- 18:39, 2 October 2019 +3 SSH security improvements No edit summary
- 18:38, 2 October 2019 +9 SSH security improvements No edit summary
1 October 2019
- 18:28, 1 October 2019 +96 VNC Marked this version for translation
- 18:27, 1 October 2019 −24 VNC No edit summary
- 18:27, 1 October 2019 −12 VNC No edit summary