MXNet

From Alliance Doc
Revision as of 15:21, 13 July 2022 by Coulombc (talk | contribs) (Marked this version for translation)
Jump to navigation Jump to search
Other languages:

Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity. At its core, MXNet contains a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient. MXNet is portable and lightweight, scalable to many GPUs and machines.

Available wheels

You can list available wheels using the avail_wheels command.

Question.png
[name@server ~]$ avail_wheels mxnet
name    version    python    arch
------  ---------  --------  ------
mxnet   1.9.1      cp39      avx2
mxnet   1.9.1      cp38      avx2
mxnet   1.9.1      cp310     avx2

Installing in a Python virtual environment

1. Create and activate a Python virtual environment.

[name@server ~]$ module load python/3.10
[name@server ~]$ virtualenv --no-download ~/env
[name@server ~]$ source ~/env/bin/activate


2. Install MXNet and its Python dependencies.

Question.png
(env) [name@server ~] pip install --no-index mxnet

3. Validate it.

Question.png
(env) [name@server ~] python -c "import mxnet as mx;print((mx.nd.ones((2, 3))*2).asnumpy());"
[[2. 2. 2.]
 [2. 2. 2.]]

Running a job

A single Convolution layer:

File : mxnet-conv-ex.py

#!/bin/env python

import mxnet as mx
import numpy as np

num_filter = 32
kernel = (3, 3)
pad = (1, 1)
shape = (32, 32, 256, 256)

x = mx.sym.Variable('x')
w = mx.sym.Variable('w')
y = mx.sym.Convolution(data=x, weight=w, num_filter=num_filter, kernel=kernel, no_bias=True, pad=pad)

device = mx.gpu() if mx.context.num_gpus() > 0 else mx.cpu()

# On CPU will use MKLDNN, or will use cuDNN
exe = y.simple_bind(device, x=shape)

exe.arg_arrays[0][:] = np.random.normal(size=exe.arg_arrays[0].shape)
exe.arg_arrays[1][:] = np.random.normal(size=exe.arg_arrays[1].shape)

exe.forward(is_train=False)
o = exe.outputs[0]
t = o.asnumpy()
print(t)


2. Edit the following submission script according to your needs.

File : mxnet-conv.sh

#!/bin/bash

#SBATCH --job-name=mxnet-conv
#SBATCH --account=def-someprof    # adjust this to match the accounting group you are using to submit jobs
#SBATCH --time=01:00:00           # adjust this to match the walltime of your job
#SBATCH --cpus-per-task=2         # adjust this to match the number of cores
#SBATCH --mem=20G                 # adjust this according to the memory you need

# Load modules dependencies
module load python/3.10

# Generate your virtual environment in $SLURM_TMPDIR
virtualenv --no-download ${SLURM_TMPDIR}/env
source ${SLURM_TMPDIR}/env/bin/activate

# Install MXNet and its dependencies
pip install --no-index mxnet==1.9.1

# Will use MKLDNN
python mxnet-conv-ex.py


File : mxnet-conv.sh

#!/bin/bash

#SBATCH --job-name=mxnet-conv
#SBATCH --account=def-someprof    # adjust this to match the accounting group you are using to submit jobs
#SBATCH --time=01:00:00           # adjust this to match the walltime of your job
#SBATCH --cpus-per-task=2         # adjust this to match the number of cores
#SBATCH --mem=20G                 # adjust this according to the memory you need
#SBATCH --gres=gpu:1              # adjust this to match the number of GPUs, unless distributed training, use 1

# Load modules dependencies
module load python/3.10

# Generate your virtual environment in $SLURM_TMPDIR
virtualenv --no-download ${SLURM_TMPDIR}/env
source ${SLURM_TMPDIR}/env/bin/activate

# Install MXNet and its dependencies
pip install --no-index mxnet==1.9.1

# Will use cuDNN
python mxnet-conv-ex.py


3. Submit the job to the scheduler.

Question.png
[name@server ~]$ sbatch mxnet-conv.sh