Overview

Easy parallelization over multiple GPUs can be accomplished in TensorFlow 2 using the ‘MirroredStrategy’ approach, especially when using Keras through the TensorFlow integration. It serves as a replacement for the deprecated ‘multi_gpu_model’ utility in Keras. There are a few caveats (bugs) when using this on TF 2.0 (see below).

An example illustrating its use is shown below, where two GPU devices are selected.

import tensorflow as tf

# Log the device each operation is placed on, to verify multi-GPU use.
tf.debugging.set_log_device_placement(True)
# Mirror the model's variables across the two selected GPU devices.
mirrored_strategy = tf.distribute.MirroredStrategy(devices=["/gpu:3", "/gpu:4"])

# Load MNIST and scale pixel values to the range [0, 1].
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Build and compile the model inside the strategy scope so its variables are mirrored.
with mirrored_strategy.scope():
    model = tf.keras.models.Sequential([
      tf.keras.layers.Flatten(input_shape=(28, 28)),
      tf.keras.layers.Dense(128, activation='relu'),
      # renorm must remain False under MirroredStrategy; see Caveats below.
      tf.keras.layers.BatchNormalization(renorm=False),
      tf.keras.layers.Dropout(0.2),
      tf.keras.layers.Dense(10)
    ])

    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

    model.compile(optimizer='adam',
                  loss=loss_fn,
                  metrics=['accuracy'])

# Training runs outside the scope; each batch is split across the replicas.
model.fit(x_train, y_train, epochs=5)

model.evaluate(x_test, y_test, verbose=2)
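
Note that the batch size passed to ‘fit’ is the global batch size, which MirroredStrategy splits across the replicas. A common pattern, sketched below, is to scale a fixed per-replica batch size by ‘num_replicas_in_sync’ (the variable names here are illustrative):

# Keep the per-GPU batch size fixed by scaling the global batch size
# by the number of replicas participating in the strategy.
batch_size_per_replica = 64
global_batch_size = batch_size_per_replica * mirrored_strategy.num_replicas_in_sync

model.fit(x_train, y_train, epochs=5, batch_size=global_batch_size)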

Caveats

As of TF 2.2, MirroredStrategy throws an error when the model contains a BatchNormalization layer with ‘renorm’ set to ‘True’. With the default ‘renorm=False’, as in the example above, the layer works.
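
A minimal sketch that reproduces the failure is shown below; the surrounding model is illustrative, and only the ‘renorm=True’ argument matters:

with mirrored_strategy.scope():
    # Enabling batch renormalization under MirroredStrategy raises an error
    # on TF 2.0 through TF 2.2.
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.BatchNormalization(renorm=True),  # triggers the error
        tf.keras.layers.Dense(10)
    ])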