TensorFlow

Revision as of 15:45, 18 July 2017 by Diane27 (talk | contribs) (Created page with "TensorFlow")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Other languages:

Installing Tensorflow

These instructions install Tensorflow into your home directory using Compute Canada's pre-built Python wheels. Custom Python wheels are stored in /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/. To install Tensorflow's wheel we will use the pip command and install it into a Python virtual environments. The below instructions install for Python 3.5.2 but you can also install for Python 3.5.Y or 2.7.X by loading a different Python module.

Load modules required by Tensorflow:

 
[name@server ~]$ module load cuda cudnn python/3.5.2

Create a new python virtual environment:

 
[name@server ~]$ virtualenv tensorflow

Activate your newly created python virtual environment:

 
[name@server ~]$ source tensorflow/bin/activate

Install the numpy and Tensorflow wheels into your newly created virtual environment

 
[name@server ~]$ pip install tensorflow

Submitting a Tensorflow job

Once you have the above setup completed you can submit a Tensorflow job as

 
[name@server ~]$ sbatch tensorflow-test.sh

The job submission script has the contents

File : tensorflow-test.sh

#!/bin/bash
#SBATCH --gres=gpu:1              # request GPU "generic resource"
#SBATCH --mem=4000M               # memory per node
#SBATCH --time=0-05:00            # time (DD-HH:MM)
#SBATCH --output=%N-%j.out        # %N for node name, %j for jobID

module load cuda cudnn python/3.5.2
source tensorflow/bin/activate
python ./tensorflow-test.py


while the Python script has the form,

File : tensorflow-test.py

import tensorflow as tf
node1 = tf.constant(3.0, dtype=tf.float32)
node2 = tf.constant(4.0) # also tf.float32 implicitly
print(node1, node2)
sess = tf.Session()
print(sess.run([node1, node2]))


Once the above job has completed (should take less than a minute) you should see an output file called something like cdr116-122907.out with contents similar to the following example,

File : cdr116-122907.out

2017-07-10 12:35:19.489458: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: Tesla P100-PCIE-12GB
major: 6 minor: 0 memoryClockRate (GHz) 1.3285
pciBusID 0000:82:00.0
Total memory: 11.91GiB
Free memory: 11.63GiB
2017-07-10 12:35:19.491097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-07-10 12:35:19.491156: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y
2017-07-10 12:35:19.520737: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla P100-PCIE-12GB, pci bus id: 0000:82:00.0)
Tensor("Const:0", shape=(), dtype=float32) Tensor("Const_1:0", shape=(), dtype=float32)
[3.0, 4.0]