The build method: allowing flexible inputs for custom layers


Flexible input shapes for custom layers

In this reading you will learn how to use the build method to allow custom layers to work with flexibly sized inputs.

In [1]:
import tensorflow as tf
print(tf.__version__)
2.0.0

Fix the input shape in the custom layer

Previously, you have created custom layers by initialising all variables in the __init__ method. For instance, you defined a dense layer called MyLayer as follows:

In [2]:
# Create a custom layer

from tensorflow.keras.layers import Layer

class MyLayer(Layer):

    def __init__(self, units, input_dim, **kwargs):
        super(MyLayer, self).__init__(**kwargs)
        # The weights and bias are created here, so their shapes are fixed
        # as soon as the layer is instantiated
        self.w = self.add_weight(shape=(input_dim, units),
                                 initializer='random_normal')
        self.b = self.add_weight(shape=(units,),
                                 initializer='zeros')

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

Notice that the required arguments for the __init__ method are the number of units in the dense layer (units) and the input size (input_dim). This means that you need to fix these two arguments when you instantiate the layer.

In [3]:
#  Create a custom layer with 3 units and input dimension of 5

dense_layer = MyLayer(3, 5) 

Since the input size has been fixed to be 5, this custom layer can only take inputs of that size. For example, we can call the layer as follows:

In [4]:
# Call the custom layer on a Tensor input of ones

x = tf.ones((1,5))
print(dense_layer(x))
tf.Tensor([[ 0.01100426 -0.30199945 -0.01802837]], shape=(1, 3), dtype=float32)
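
Since the weights were created with shape (5, 3), calling this layer on an input with a different final dimension fails. As a quick illustrative check (the error type shown is what eager TensorFlow typically raises for mismatched matrix shapes):

# Calling the layer on a size-4 input fails, since the weights
# were already created for size-5 inputs
try:
    dense_layer(tf.ones((1, 4)))
except tf.errors.InvalidArgumentError:
    print("The layer only accepts inputs with final dimension 5")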

However, forcing the input shape (and therefore the shape of the weights) to be fixed when the layer is instantiated is unnecessarily restrictive, and it may be more convenient to fix the shape only later on, after the model has been defined.

For example, in some cases you may not know the input shape at model build time. We have come across this concept before when building models with the Sequential API: if the input_shape argument is omitted, the weights will only be created when an input is passed into the model.
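
As a reminder, here is a minimal Sequential sketch of this deferred behaviour (the sizes and the name sequential_model are just for illustration):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# No input_shape is given, so no weights exist yet
sequential_model = Sequential([Dense(3)])

# The first call on an input builds the weights
sequential_model(tf.ones((1, 5)))
print(len(sequential_model.weights))  # kernel and bias: 2 variables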

Allow a flexible input shape in the custom layer

You can delay the weight creation by using the build method to define the weights. The build method is executed the first time the layer's __call__ method runs, which means the weights are only created when the layer is called on a specific input.

The build method has a required argument input_shape, which can be used to define the shapes of the layer weights.
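
As an illustrative sketch of what build receives, the following hypothetical layer simply prints the shape it is given and passes its input through:

# Hypothetical layer that reports the shape passed to build
class ShapePrintingLayer(Layer):

    def build(self, input_shape):
        print('build received input_shape =', input_shape)

    def call(self, inputs):
        return inputs

_ = ShapePrintingLayer()(tf.ones((2, 7)))  # build runs on this first call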

In [6]:
# Rewrite the custom layer with lazy weight creation

class MyLayer(Layer):

    def __init__(self, units, **kwargs):
        super(MyLayer, self).__init__(**kwargs)
        self.units = units
    
    def build(self, input_shape):
        # The last dimension of the input determines the shape of the weights
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal')
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='zeros')

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

Now, when you instantiate the layer, you only need to specify the number of units in the dense layer (units), and not the input size (input_dim).

Create a custom layer with flexible input size

In [7]:
#  Create a custom layer with 3 units

dense_layer = MyLayer(3) 

This layer can now be called on an input of any size, at which point the layer weights will be created and the input size will be fixed.
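
Before that first call, the layer has no weights at all, which you can verify through its built flag:

# The layer has not been built yet, so it has no weights
print(dense_layer.built)    # False
print(dense_layer.weights)  # []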

In [8]:
# Call the custom layer on a Tensor input of ones of size 5

x = tf.ones((1,5))
print(dense_layer(x))
tf.Tensor([[-0.16368304 -0.02228257 -0.08700767]], shape=(1, 3), dtype=float32)
In [9]:
# Print the layer weights

dense_layer.weights
Out[9]:
[<tf.Variable 'my_layer_1/Variable:0' shape=(5, 3) dtype=float32, numpy=
 array([[-4.9105797e-02, -1.6169475e-02,  6.2181827e-02],
        [-3.5072900e-02, -2.6903901e-02, -3.6265276e-04],
        [-9.5036004e-05,  2.5834545e-02, -5.3623449e-02],
        [-2.7640924e-02, -1.1497076e-02, -2.6422501e-02],
        [-5.1768374e-02,  6.4533399e-03, -6.8780899e-02]], dtype=float32)>,
 <tf.Variable 'my_layer_1/Variable:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]
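
Now that the layer has been built, its weight shapes are fixed to size-5 inputs, and the built flag is set:

# After the first call, the layer is built and its weight shapes are fixed
print(dense_layer.built)           # True
print(dense_layer.count_params())  # 5 * 3 + 3 = 18 parameters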

Create a new custom layer and pass in a different sized input

In [10]:
#  Create a new custom layer with 3 units

dense_layer = MyLayer(3) 
In [11]:
# Call the custom layer on a Tensor input of ones of size 4

x = tf.ones((1,4))
print(dense_layer(x))
tf.Tensor([[ 0.01080223 -0.01008084  0.05707707]], shape=(1, 3), dtype=float32)
In [12]:
# Print the layer weights

dense_layer.weights
Out[12]:
[<tf.Variable 'my_layer_2/Variable:0' shape=(4, 3) dtype=float32, numpy=
 array([[ 0.02848851, -0.00484542,  0.00279722],
        [ 0.00491298,  0.04867994,  0.04755805],
        [-0.01268567,  0.01012804, -0.04380822],
        [-0.0099136 , -0.0640434 ,  0.05053002]], dtype=float32)>,
 <tf.Variable 'my_layer_2/Variable:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]

Note that the code for creating a custom layer object is identical, but the shape of the weights in the layer depends on the size of the input passed to the layer.

Flexible input shapes in models

Deferring the weight creation until the layer is called is also useful when using the custom layer as an intermediate layer inside a larger model. In this case you may want to create several custom layer objects in the model, and it is tedious to keep track of the input shape that each of the custom layers needs.

By deferring the weight creation as above, the input shape can be inferred from the output of the previous layer.

In [13]:
# Create a model using the custom layer

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Softmax

class MyModel(Model):

    def __init__(self, units_1, units_2, **kwargs):
        super(MyModel, self).__init__(**kwargs)
        self.layer_1 = MyLayer(units_1)
        self.layer_2 = MyLayer(units_2)

    def call(self, inputs):
        x = self.layer_1(inputs)
        x = tf.nn.relu(x)
        x = self.layer_2(x)
        return Softmax()(x)

In the above model definition, the custom layer MyLayer is used twice. Notice that each instance of the custom layer object can have a different input size, depending on the arguments used to create the model and the inputs passed into the model.

In [14]:
# Create a custom model object

model = MyModel(units_1=32, units_2=10)

We can create and initialise all of the weights of the model by passing in an example Tensor input.

In [15]:
# Create and initialize all of the model weights

_ = model(tf.ones((1, 100)))
In [16]:
# Print the model summary

model.summary()
Model: "my_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
my_layer_3 (MyLayer)         multiple                  3232      
_________________________________________________________________
my_layer_4 (MyLayer)         multiple                  330       
=================================================================
Total params: 3,562
Trainable params: 3,562
Non-trainable params: 0
_________________________________________________________________
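
The Output Shape column reads multiple because a subclassed model does not record a single static shape for its layers. If you want concrete output shapes in the summary, one option is to use the same custom layer with the Functional API, sketched below (the name functional_model is just for illustration):

from tensorflow.keras.layers import Input, ReLU

inputs = Input(shape=(100,))
h = ReLU()(MyLayer(32)(inputs))
outputs = Softmax()(MyLayer(10)(h))

functional_model = Model(inputs=inputs, outputs=outputs)
functional_model.summary()  # output shapes are now fully specified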