Using TensorFlow on GPU Nodes
Although the T2B has GPUs, they are not supported by TensorFlow.
Others have reported that with the kind of GPUs we have, the results given by TensorFlow might be wrong!
We therefore recommend not using TensorFlow on our GPUs.
Setting up your environment
In order to use TensorFlow, you need to be connected to one of the machines containing GPUs. You can do this either interactively or via the queue. If you are developing, we recommend connecting directly to the machine.
To connect to a GPU machine, you first need to log in to the cluster. Add the -A option (SSH agent forwarding) to your ssh command, so that you can hop on to the GPU machine afterwards. For example:
ssh -o ServerAliveInterval=100 -AX mshort.iihe.ac.be
When you connect to the cluster in this way, you can tunnel further to the GPU machine.
Next, declare the GPUs in your environment:
$ source /swmgrs/icecubes/set_gpus.sh
To make sure that it worked, try something like this:
$ echo $CUDA_VISIBLE_DEVICES
0,1
In this case, we have 2 GPU devices at our disposal.
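You can also inspect the declared devices from Python. The following is a minimal sketch, assuming CUDA_VISIBLE_DEVICES has been set by set_gpus.sh as shown above:

```python
import os

# CUDA_VISIBLE_DEVICES is a comma-separated list of device IDs,
# e.g. "0,1"; it is empty or unset if no GPUs were declared.
devices = os.environ.get("CUDA_VISIBLE_DEVICES", "")
gpu_ids = [d for d in devices.split(",") if d]
print("Visible GPUs:", len(gpu_ids), gpu_ids)
```

With the example above, this reports two visible devices.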
To easily get a ready-to-use software environment for TensorFlow with GPU support, we make use of a Singularity container.
$ singularity shell --nv -B /swmgrs -B /cvmfs -B /scratch /swmgrs/nonvo/singularity/osgvo-tensorflow-gpu.simg
Testing your environment
When launching the previous command, you are in a shell belonging to the osgvo-tensorflow-gpu image. This image is Ubuntu based, so in that respect it differs from the rest of our cluster, which is Red Hat based. However, the main functionality is not changed by this.
Let us now test a small program. Paste the following code into a file named TF.py:
import tensorflow as tf

matrix1 = tf.constant([[3., 3.]])
matrix2 = tf.constant([[2.], [2.]])
product = tf.matmul(matrix1, matrix2)

with tf.Session() as sess:
    with tf.device("/gpu:0"):
        result = sess.run(product)
        print(result)
This program defines two matrices, multiplies them using the first GPU, and then prints the result.
To run it, issue:
$ python TF.py
The end result should be [[12.]].
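To verify the arithmetic independently of TensorFlow, the same product can be computed with plain Python lists (a sanity check only, not part of the job):

```python
# Same matrices as in TF.py, multiplied with plain Python lists.
matrix1 = [[3., 3.]]   # 1x2
matrix2 = [[2.], [2.]] # 2x1

# Textbook matrix multiplication: result[i][j] = sum_k A[i][k] * B[k][j]
result = [[sum(matrix1[i][k] * matrix2[k][j] for k in range(2))
           for j in range(1)] for i in range(1)]
print(result)  # 3*2 + 3*2 = 12
```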
Running on the cluster
If you want to run jobs on the cluster, create the following batch script as job.sh and make it executable (chmod +x job.sh).
#!/bin/bash
source /swmgrs/icecubes/set_gpus.sh
hostname
env | grep CUDA
echo $CUDA_VISIBLE_DEVICES
singularity exec --nv -B /swmgrs -B /cvmfs -B /scratch /swmgrs/nonvo/singularity/osgvo-tensorflow-gpu.simg python TF.py
Notice the "singularity exec" command, which executes "python TF.py" within our Singularity container.
This job can be submitted to the gpu queue in the following way:
qsub -q gpu job.sh
When the job is completed, the file "job.sh.o<jobnr>" will contain the result of your calculations.