PyTorch and TensorFlow in the Natural Language Processing Pipeline: Model Training
After data preprocessing and model building, the next step is to train the model.
Model training includes the following steps:
First, initialize the model.
Second, for each epoch:
- Calculate the loss for the input samples (between the model output and the true value)
- Calculate the gradients of the loss
- Update the model's trainable weights using the gradients and the learning rate
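The steps above can be sketched without any framework. The example below is only an illustration and not part of the original pipeline: it fits a tiny linear model y = w * x + b with mean squared error, and the names `train`, `data`, and `learning_rate` are hypothetical.

```python
# Minimal, framework-free sketch of the training steps (illustrative only).
# Model: y_hat = w * x + b, loss: mean squared error.

def train(data, epochs=500, learning_rate=0.05):
    # Step 1: initialize the model (here, just two trainable weights).
    w, b = 0.0, 0.0
    n = len(data)
    loss = None
    for _ in range(epochs):
        # Step 2a: calculate the loss between the model output and the true values.
        loss = sum((w * x + b - y) ** 2 for x, y in data) / n
        # Step 2b: calculate the gradients of the loss with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        # Step 2c: update the trainable weights using the gradients and learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, loss

# Toy data generated from y = 2x + 1 (an assumption made for this demo).
data = [(x, 2 * x + 1) for x in (0.0, 1.0, 2.0, 3.0)]
w, b, final_loss = train(data)
```

PyTorch and TensorFlow automate step 2b (automatic differentiation) and step 2c (optimizers), which is what the sections below rely on.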
Use PyTorch to train the model
Initialize a model object by passing the necessary arguments to the model class we want to train.
Device setting:
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(DEVICE)
Training loop: We pass data to the model through a DataLoader prepared for it. The DataLoader specifies the batch size.
A simple example follows. The training loop can include more details, such as saving checkpoints of the model parameters or choosing the best model by comparing losses across epochs, but the steps shown here are required for training.
import numpy as np
from torch import optim
from torch.utils.data import DataLoader

optimizer = optim.Adam(model.parameters(), lr=config.learning_rate)
train_dataloader = DataLoader(dataset=train_data, batch_size=batch_size,
                              collate_fn=collate_fn, shuffle=True)

for epoch in range(epochs):
    batch_losses = []
    for batch, data in enumerate(train_dataloader):
        x, y = data
        x = x.to(DEVICE)
        y = y.to(DEVICE)
        model.train()
        # Reset the gradients left over from the previous step
        optimizer.zero_grad()
        # Calculate loss for input samples
        loss = loss_function(model(x), y)
        batch_losses.append(loss.item())
        # Calculate gradients of loss
        loss.backward()
        # Update model trainable weights using gradients and learning rate
        optimizer.step()
    epoch_loss = np.mean(batch_losses)
Consult the PyTorch documentation for more about the training loop in PyTorch (https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html).
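The training-loop description mentions saving checkpoints and choosing the best model by comparing losses. One possible way to do that is sketched here, under stated assumptions: the linear model, the random data, and the `checkpoint_path` location are hypothetical stand-ins, and the comparison uses the mean training loss per epoch for brevity (a validation loss is usually the better criterion).

```python
import os
import tempfile

import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Toy stand-ins for the real model and data (assumptions for this sketch).
model = nn.Linear(4, 1)
loss_function = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-2)
features = torch.randn(32, 4)
targets = features.sum(dim=1, keepdim=True)
train_dataloader = DataLoader(TensorDataset(features, targets),
                              batch_size=8, shuffle=True)

# Hypothetical checkpoint location inside a temporary directory.
checkpoint_path = os.path.join(tempfile.mkdtemp(), "best_model.pt")
best_loss = float("inf")

for epoch in range(5):
    model.train()
    batch_losses = []
    for x, y in train_dataloader:
        optimizer.zero_grad()
        loss = loss_function(model(x), y)
        loss.backward()
        optimizer.step()
        batch_losses.append(loss.item())
    epoch_loss = sum(batch_losses) / len(batch_losses)

    # Save the parameters whenever the mean epoch loss improves.
    if epoch_loss < best_loss:
        best_loss = epoch_loss
        torch.save(model.state_dict(), checkpoint_path)

# Restore the saved parameters into a fresh model of the same architecture.
best_model = nn.Linear(4, 1)
best_model.load_state_dict(torch.load(checkpoint_path))
```

Saving the `state_dict` rather than the whole model object keeps the checkpoint independent of how the model class is defined in code.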
Use TensorFlow to train the model
Initialize a model object by passing the necessary arguments to the model class we want to train.
Then, we train the model for a number of epochs.
One way to train the model in TensorFlow is to use the compile function model.compile() and the fit function model.fit().
Pass arguments such as the loss function, the optimizer, and the metrics to model.compile().
Pass arguments such as the number of epochs to model.fit().
More details are available in the original code; many thanks to its authors.
import tensorflow as tf

# Instantiate the model class built in the previous step
model = model()

# Set up the training parameters
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# Print the model summary
model.summary()

num_epochs = 10

# Train the model
model.fit(padded, training_labels_final, epochs=num_epochs,
          validation_data=(testing_padded, testing_labels_final))
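For readers who want to run the compile-and-fit pattern end to end, here is a self-contained sketch. It is not the original authors' model: the small embedding classifier and the random arrays (reusing the names `padded`, `training_labels_final`, `testing_padded`, and `testing_labels_final`) are assumptions chosen only to make the example executable.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-ins for padded token ids and binary labels (assumptions).
rng = np.random.default_rng(0)
padded = rng.integers(0, 100, size=(64, 20))
training_labels_final = rng.integers(0, 2, size=(64,)).astype("float32")
testing_padded = rng.integers(0, 100, size=(16, 20))
testing_labels_final = rng.integers(0, 2, size=(16,)).astype("float32")

# A small text classifier: embedding -> average pooling -> sigmoid output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Embedding(input_dim=100, output_dim=8),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Loss, optimizer, and metrics go to compile().
model.compile(loss="binary_crossentropy", optimizer="adam",
              metrics=["accuracy"])

# The number of epochs and the validation data go to fit().
history = model.fit(
    padded, training_labels_final,
    epochs=2,
    validation_data=(testing_padded, testing_labels_final),
    verbose=0,
)
# history.history maps each metric name to a list with one value per epoch.
```

The returned `history` object is a convenient place to read the per-epoch training and validation losses when comparing runs.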
Another way to train the model is to customize the training loop with TensorFlow operations.
To customize the training loop, we use tf.GradientTape, a context manager that records operations for automatic differentiation.
Consult the documentation for more details: https://www.tensorflow.org/api_docs/python/tf/GradientTape
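To make the role of the tape concrete, here is a minimal sketch that is separate from the training code: it differentiates y = x * x at x = 3.0. The variable names are illustrative only.

```python
import tensorflow as tf

x = tf.Variable(3.0)

# Operations on trainable variables inside the context are recorded on the tape.
with tf.GradientTape() as tape:
    y = x * x

# d(x^2)/dx = 2x, so the gradient at x = 3.0 is 6.0.
dy_dx = tape.gradient(y, x)
```

In a training loop the same pattern applies, with the loss in place of `y` and the model's trainable weights in place of `x`.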
The code below comes from the Coursera course TensorFlow: Advanced Techniques Specialization by DeepLearning.AI.
import numpy as np
import tensorflow as tf

# optimizer, model, loss_object, train_dataset, test, and epochs
# are assumed to be defined earlier.

def apply_gradient(optimizer, model, x, y):
    with tf.GradientTape() as tape:
        logits = model(x)
        # Calculate loss for input samples
        loss_value = loss_object(y_true=y, y_pred=logits)
    # Calculate gradients of loss
    gradients = tape.gradient(loss_value, model.trainable_weights)
    # Update model trainable weights using gradients and learning rate
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
    return logits, loss_value

def train_data_for_one_epoch():
    losses = []
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        logits, loss_value = apply_gradient(optimizer, model,
                                            x_batch_train, y_batch_train)
        losses.append(loss_value)
    return losses

def perform_validation():
    losses = []
    for x_val, y_val in test:
        val_logits = model(x_val)
        val_loss = loss_object(y_true=y_val, y_pred=val_logits)
        losses.append(val_loss)
    return losses

for epoch in range(epochs):
    losses_train = train_data_for_one_epoch()
    losses_val = perform_validation()
    losses_train_mean = np.mean(losses_train)
    losses_val_mean = np.mean(losses_val)
Tips:
After initializing the model, for each epoch: calculate the loss for the input samples, calculate the gradients of the loss, and update the model's trainable weights using the gradients and the learning rate, following the conventions of the PyTorch or TensorFlow library.
Check how to do data preprocessing in PyTorch or TensorFlow for NLP.
Check how to build a model in PyTorch or TensorFlow for NLP.