PyTorch and TensorFlow in the Natural Language Processing Pipeline: Model Training

Slinae

3 min read · Apr 15, 2022

After data preprocessing and model building, the next step is to train the model.

Model training includes the following steps.

First, initialize the model.

Second, for each epoch:

Calculate the loss for the input samples (between the model output and the true values)
Calculate the gradients of the loss
Update the model's trainable weights using the gradients and the learning rate
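The three per-epoch steps can be sketched without any framework. This is a minimal illustration, not code from the article: it assumes a one-weight model y = w * x, a mean squared error loss, and a gradient derived by hand. The names `train`, `w`, and `learning_rate` are made up for the example.

```python
def train(xs, ys, learning_rate=0.05, epochs=100):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # initialize the model: a single trainable weight
    n = len(xs)
    loss = None
    for _ in range(epochs):
        # 1. Loss between the model output (w * x) and the true value (y)
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
        # 2. Gradient of the loss with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # 3. Update the weight using the gradient and the learning rate
        w -= learning_rate * grad
    # loss is the value measured in the last epoch, before its update
    return w, loss


# Toy data generated from y = 2 * x, so w should approach 2
w, final_loss = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

PyTorch and TensorFlow automate step 2 (automatic differentiation) and step 3 (optimizers), but the structure of the loop stays the same.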

Use PyTorch to train model

Initialize a model object by passing the necessary arguments to the model class we want to train.

Device setting:

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model.to(DEVICE)

Training loop: We pass data to the model through a DataLoader prepared for it. The DataLoader specifies the batch size.
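For text data, the sentences in a batch usually differ in length, so a collate function is often used to pad them to a common length. Here is a minimal sketch, assuming the text has already been converted to lists of token ids. `ToyTextDataset`, the toy data, and the padding id 0 are hypothetical choices for illustration.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset


class ToyTextDataset(Dataset):
    """Hypothetical dataset of (token_ids, label) pairs."""

    def __init__(self, sequences, labels):
        self.sequences = sequences
        self.labels = labels

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        return self.sequences[idx], self.labels[idx]


def collate_fn(batch):
    """Pad the sequences of one batch to the same length (pad id 0 is assumed)."""
    seqs, labels = zip(*batch)
    x = pad_sequence([torch.tensor(s) for s in seqs],
                     batch_first=True, padding_value=0)
    y = torch.tensor(labels)
    return x, y


train_data = ToyTextDataset([[5, 2, 9], [4, 1], [7], [3, 8, 6, 2]], [1, 0, 0, 1])
# shuffle=False only to keep this illustration deterministic
train_dataloader = DataLoader(train_data, batch_size=2,
                              collate_fn=collate_fn, shuffle=False)

batches = list(train_dataloader)  # each element is an (x, y) pair of tensors
```

Each batch is padded only to the length of its own longest sequence, so tensor shapes can differ from batch to batch.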

A simple example follows. The training loop can include more details, such as saving checkpoints of the model parameters or selecting the best model by comparing losses across epochs, but the steps shown here are required for training.

import numpy as np
from torch import optim
from torch.utils.data import DataLoader

# model, config, train_data, batch_size, collate_fn, loss_function, and epochs
# are assumed to be defined earlier in the pipeline
optimizer = optim.Adam(model.parameters(), lr=config.learning_rate)
train_dataloader = DataLoader(dataset=train_data, batch_size=batch_size,
                              collate_fn=collate_fn,
                              shuffle=True)

for epoch in range(epochs):
    batch_losses = []
    for batch, data in enumerate(train_dataloader):
        x, y = data
        x = x.to(DEVICE)
        y = y.to(DEVICE)
        model.train()
        optimizer.zero_grad()
        # Calculate the loss for the input samples
        loss = loss_function(model(x), y)
        batch_losses.append(loss.item())
        # Calculate the gradients of the loss
        loss.backward()
        # Update the model's trainable weights using the gradients and the learning rate
        optimizer.step()
    epoch_loss = np.mean(batch_losses)

Consult the PyTorch documentation for more about the training loop in PyTorch.

(https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html)
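Saving checkpoints and keeping the best model can be added to the loop with `torch.save` and `load_state_dict`. This is a minimal sketch under stated assumptions: the tiny linear model, the synthetic data, and the temporary file path are illustrative only. For brevity it compares training loss, whereas a validation loss is the more common choice in practice.

```python
import os
import tempfile

import torch
from torch import nn, optim

torch.manual_seed(0)

# Tiny synthetic regression problem (illustrative only): y = 3x
x = torch.linspace(-1, 1, 32).unsqueeze(1)
y = 3 * x

model = nn.Linear(1, 1)
loss_function = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Hypothetical checkpoint location in a temporary directory
checkpoint_path = os.path.join(tempfile.mkdtemp(), "best_model.pt")
best_loss = float("inf")

for epoch in range(50):
    model.train()
    optimizer.zero_grad()
    loss = loss_function(model(x), y)
    loss.backward()
    optimizer.step()

    # Save a checkpoint whenever this epoch's loss improves on the best so far
    if loss.item() < best_loss:
        best_loss = loss.item()
        torch.save(model.state_dict(), checkpoint_path)

# Restore the saved parameters into a fresh model of the same architecture
best_model = nn.Linear(1, 1)
best_model.load_state_dict(torch.load(checkpoint_path))
```

Saving the `state_dict` rather than the whole model object keeps the checkpoint independent of how the model class is defined in code.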

Use TensorFlow to train model

Initialize a model object by passing the necessary arguments to the model class we want to train.

Then, we train the model in every epoch.

One way to train the model in TensorFlow is to use the compile function model.compile() and the fit function model.fit().

Pass arguments such as the loss function, the optimizer, and the metrics to model.compile().

Pass arguments such as the training data and the number of epochs to model.fit(). More code details are in the following notebook, with many thanks to its authors.

https://github.com/https-deeplearning-ai/tensorflow-1-public/blob/main/C3/W2/ungraded_labs/C3_W2_Lab_1_imdb.ipynb

import tensorflow as tf

# Instantiate the model; model() stands for the model class or builder being trained
model = model()

# Set up the training parameters
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# Print the model summary
model.summary()

num_epochs = 10

# Train the model
model.fit(padded, training_labels_final, epochs=num_epochs,
          validation_data=(testing_padded, testing_labels_final))
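The snippet above relies on a model and data defined elsewhere. The sketch below is self-contained so it can be run as is. It assumes random synthetic token ids and labels and a small embedding classifier. `vocab_size`, `max_length`, and the layer sizes are illustrative choices, while `padded` and `training_labels_final` mirror the names used above.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-ins for padded token ids and binary labels (illustrative only)
rng = np.random.default_rng(0)
vocab_size, max_length = 100, 20
padded = rng.integers(1, vocab_size, size=(64, max_length))
training_labels_final = rng.integers(0, 2, size=(64,)).astype("float32")

# A small binary text classifier: embedding -> average pooling -> sigmoid output
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 8),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# fit() runs the loss / gradient / update steps internally for each batch
history = model.fit(padded, training_labels_final,
                    epochs=2, batch_size=16, verbose=0)
```

The returned `history.history` dictionary holds one value per epoch for the loss and for each metric, which is convenient for comparing epochs.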

Another way to train the model is to write a custom training loop with TensorFlow operations.

To customize the training loop, we use tf.GradientTape, a context manager that records operations so that gradients can be computed from them.

Consult the documentation for more details. https://www.tensorflow.org/api_docs/python/tf/GradientTape
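To see what the tape does in isolation, consider a single variable. This minimal sketch assumes the function y = x² and a plain SGD optimizer with learning rate 0.1, both chosen only for illustration.

```python
import tensorflow as tf

x = tf.Variable(3.0)

with tf.GradientTape() as tape:
    # Operations on trainable variables inside the block are recorded
    y = x * x

# Gradient of y with respect to x, evaluated at x = 3
dy_dx = tape.gradient(y, x)

# One update step: x <- x - learning_rate * gradient
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
optimizer.apply_gradients([(dy_dx, x)])
```

A custom training loop follows the same pattern, with the model's trainable weights in place of `x` and the loss in place of `y`.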

The code below comes from the Coursera course TensorFlow: Advanced Techniques Specialization by DeepLearning.AI.

import numpy as np
import tensorflow as tf

# model, optimizer, loss_object, train_dataset, test, and epochs
# are assumed to be defined earlier


def apply_gradient(optimizer, model, x, y):
    with tf.GradientTape() as tape:
        logits = model(x)
        # Calculate the loss for the input samples
        loss_value = loss_object(y_true=y, y_pred=logits)
    # Calculate the gradients of the loss
    gradients = tape.gradient(loss_value, model.trainable_weights)
    # Update the model's trainable weights using the gradients and the learning rate
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
    return logits, loss_value


def train_data_for_one_epoch():
    losses = []
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        logits, loss_value = apply_gradient(optimizer, model,
                                            x_batch_train, y_batch_train)
        losses.append(loss_value)
    return losses


def perform_validation():
    losses = []
    for x_val, y_val in test:
        val_logits = model(x_val)
        val_loss = loss_object(y_true=y_val, y_pred=val_logits)
        losses.append(val_loss)
    return losses


for epoch in range(epochs):
    losses_train = train_data_for_one_epoch()
    losses_val = perform_validation()
    losses_train_mean = np.mean(losses_train)
    losses_val_mean = np.mean(losses_val)

Tips here:

After initializing the model, for each epoch: calculate the loss for the input samples, calculate the gradients of the loss, and update the model's trainable weights using the gradients and the learning rate, following the conventions of the PyTorch or TensorFlow library.

See how to do data preprocessing for NLP in PyTorch or TensorFlow.

( https://ruolanlin.medium.com/comparing-pytorch-and-tensorflow-in-natural-language-processing-pipeline-part-1-9af01f012ff )

See how to build a model for NLP in PyTorch or TensorFlow.

( https://ruolanlin.medium.com/pytorch-and-tensorflow-in-natural-language-processing-pipeline-model-building-62a1f73f543b )
