PyTorch and TensorFlow in Natural Language Processing Pipeline: Model Building

Slinae
Apr 15, 2022

After data preprocessing, we need to build models for the NLP pipeline.

Check how to do data preprocessing in PyTorch or TensorFlow for NLP. ( https://ruolanlin.medium.com/comparing-pytorch-and-tensorflow-in-natural-language-processing-pipeline-part-1-9af01f012ff )

1. Use PyTorch to build a model for the NLP pipeline

Building a model class in PyTorch requires implementing two parts: the initialize function (__init__, with the class inheriting from nn.Module) and the forward function. These two parts are a must; we can also add other functions to handle more complex situations and build a more powerful class.

A simple example of building an NLP model in PyTorch:

Initialize function:

import torch
import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self, vocab_size, embedding_dim,
                 filter_size, num_filter, num_class):
        super(CNN, self).__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.conv1d = nn.Conv1d(embedding_dim, num_filter,
                                filter_size, padding=1)
        self.activate = F.relu
        self.linear = nn.Linear(num_filter, num_class)

Implement the forward function:

    def forward(self, inputs):
        # (batch, seq_len) -> (batch, seq_len, embedding_dim)
        embedding = self.embedding(inputs)
        # Conv1d expects (batch, channels, seq_len), hence the permute
        convolution = self.activate(self.conv1d(embedding.permute(0, 2, 1)))
        # Max-pool over the whole sequence dimension
        pooling = F.max_pool1d(convolution,
                               kernel_size=convolution.shape[2])
        outputs = self.linear(pooling.squeeze(dim=2))
        log_probs = F.log_softmax(outputs, dim=1)
        return log_probs
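
To sanity-check the class, we can instantiate it and run a forward pass on a batch of token IDs. The hyperparameters and batch shape below are made-up values, chosen only for illustration:

# Hypothetical hyperparameters, for illustration only
model = CNN(vocab_size=10000, embedding_dim=128,
            filter_size=3, num_filter=100, num_class=2)
inputs = torch.randint(0, 10000, (32, 50))  # batch of 32 sequences, length 50
log_probs = model(inputs)                   # shape: (32, 2)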

More code details can be found here. Thanks very much to the code authors.

If we want to customize layers to implement our own algorithms, we may need not only the PyTorch off-the-shelf layers (such as nn.LSTM and nn.Conv1d) but also the PyTorch operators (such as torch.cat/torch.transpose/torch.expand_as/torch.clamp). An example of implementing an algorithm in PyTorch can be found in another article:

https://ruolanlin.medium.com/text-summarization-with-pointer-generator-networks-4c9ae228c49f
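
To give a flavor of what such customization looks like, here is a minimal sketch (not code from the linked article) that mixes an off-the-shelf nn.LSTM with the torch.cat operator to join the two directions of a bidirectional encoder:

import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, num_class):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.linear = nn.Linear(2 * hidden_dim, num_class)

    def forward(self, inputs):
        embedded = self.embedding(inputs)
        _, (hidden, _) = self.lstm(embedded)
        # hidden has shape (2, batch, hidden_dim); torch.cat joins the
        # forward and backward final states into one feature vector
        features = torch.cat([hidden[0], hidden[1]], dim=1)
        return self.linear(features)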

2. Use TensorFlow to build a model for the NLP pipeline

1. Customize a model with the Model class (tf.keras.Model): pass different inputs to the model and get the different outputs we want via the Model class's arguments.

2. Customize a layer with tf.keras.layers.Lambda.

(Consult the documentation for the Model class: https://www.tensorflow.org/api_docs/python/tf/keras/Model

and for the Lambda layer: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Lambda )

The example below shows a Siamese network's architecture, which uses the TensorFlow Model class to pass two inputs and get their outputs, together with a Lambda layer. The code of the Siamese network example comes from the Coursera course TensorFlow: Advanced Techniques Specialization by DeepLearning.AI.

In NLP, we would use the Embedding layer, the LSTM layer, or the Conv1D layer, which are more suitable for text data modeling than the Flatten layer used here.

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.layers import Lambda, Dropout
from tensorflow.keras import backend as K

def initialize_base_network():
    input = Input(shape=(28,28,))
    x = Flatten()(input)
    x = Dense(128, activation='relu')(x)
    x = Dropout(0.1)(x)
    x = Dense(128, activation='relu')(x)
    x = Dropout(0.1)(x)
    x = Dense(128, activation='relu')(x)
    return Model(inputs=input, outputs=x)

base_network = initialize_base_network()

# Both inputs go through the same base network, so the weights are shared
input_a = Input(shape=(28,28,))
input_b = Input(shape=(28,28,))
vect_output_a = base_network(input_a)
vect_output_b = base_network(input_b)

def euclidean_distance(vects):
    x, y = vects
    sum_square = K.sum(K.square(x - y), axis=1, keepdims=True)
    return K.sqrt(K.maximum(sum_square, K.epsilon()))

def eucl_dist_output_shape(shapes):
    shape1, shape2 = shapes
    return (shape1[0], 1)

# The Lambda layer wraps the distance function as a Keras layer
output = Lambda(euclidean_distance,
                output_shape=eucl_dist_output_shape)([vect_output_a, vect_output_b])

model = Model([input_a, input_b], output)
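
Once assembled, the model maps a pair of inputs to one distance per pair. A quick smoke test on random data (the batch size of 8 is arbitrary):

import numpy as np

pairs_a = np.random.rand(8, 28, 28).astype('float32')
pairs_b = np.random.rand(8, 28, 28).astype('float32')
distances = model.predict([pairs_a, pairs_b])  # shape: (8, 1)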

Besides using the Lambda layer, the customization can also be done by inheriting from tf.keras.layers.Layer.

Consult the documentation; here is the code example from it. ( https://www.tensorflow.org/tutorials/customization/custom_layers )

import tensorflow as tf

class MyDenseLayer(tf.keras.layers.Layer):
    def __init__(self, num_outputs):
        super(MyDenseLayer, self).__init__()
        self.num_outputs = num_outputs

    def build(self, input_shape):
        # The kernel is created lazily, once the input shape is known
        self.kernel = self.add_weight("kernel",
                                      shape=[int(input_shape[-1]),
                                             self.num_outputs])

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)

layer = MyDenseLayer(10)
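
Calling the layer on a tensor triggers build with the actual input shape and then call; the input shape below follows the one used in the TensorFlow tutorial:

_ = layer(tf.zeros([10, 5]))  # builds a (5, 10) kernel, output shape (10, 10)
print([var.name for var in layer.trainable_variables])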

It is very similar to the PyTorch module class construction: instead of inheriting from nn.Module, the class inherits from tf.keras.layers.Layer.

It has three parts to implement: the initialize function (__init__), the build function, and the call function. As in PyTorch, if we want to customize layers to implement algorithms, we may need not only the TensorFlow off-the-shelf layers (such as Dense and Flatten) but also the TensorFlow operators (such as tf.matmul/tf.where/tf.abs/tf.square).

A loss can be customized as a function or as a loss class. The code below comes from the Coursera course TensorFlow: Advanced Techniques Specialization by DeepLearning.AI.

def my_huber_loss(y_true, y_pred):
    threshold = 1
    error = y_true - y_pred
    # Quadratic penalty for small errors, linear penalty for large ones
    is_small_error = tf.abs(error) <= threshold
    small_error_loss = tf.square(error) / 2
    big_error_loss = threshold * (tf.abs(error) - (0.5 * threshold))
    return tf.where(is_small_error, small_error_loss, big_error_loss)
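
The function form is simply passed to compile; the model variable here stands for any Keras model and is shown only to illustrate the wiring:

model.compile(optimizer='sgd', loss=my_huber_loss)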

If the loss function needs to be customized as a class, the steps are as follows: inherit from tensorflow.keras.losses.Loss and implement the initialize function and the call function.

from tensorflow.keras.losses import Loss

class MyHuberLoss(Loss):
    threshold = 1

    def __init__(self, threshold):
        super().__init__()
        self.threshold = threshold

    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= self.threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = self.threshold * (tf.abs(error) -
                                           (0.5 * self.threshold))
        return tf.where(is_small_error,
                        small_error_loss, big_error_loss)
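
The class form makes the threshold configurable when the model is compiled (the value 1.2 is just an example):

model.compile(optimizer='sgd', loss=MyHuberLoss(threshold=1.2))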

Tips:

  1. Construct the model class by inheriting from the base class of the library, such as nn.Module or tf.keras.layers.Layer.
  2. Make use of the operators from the library, such as torch.cat/torch.transpose and tf.matmul/tf.where.
