Before going through this blog, I highly recommend reading the previous blog introducing Recurrent Neural Networks: https://ainewgeneration.com/recurrent-neural-network/
In this application, we will be using the PyTorch library, a deep learning framework that is easy to use and widely adopted by researchers. We will create a model that completes a sentence based on a word or a few letters passed to it.
Table of Contents
- Workflow for Implementation of RNN model
- Importing Basic libraries
- Creating dictionaries
- Padding Input Sequence
- Creating Input and Target Sequence
- Creating one-hot-vectors for Input sequence
- Using PyTorch to Check for a GPU
- Creating RNN model
- Initializing Hyperparameters
- Training RNN model
- Testing RNN model
Workflow for Implementation of RNN model
The model will take a word (or the first few characters) as input and predict what the next character in the sentence should be. This process is repeated until a sentence of the desired length is produced.
To keep this short and simple, we will not use any large or external datasets. Instead, we will define just a few sentences to see how the model learns from these phrases. The workflow is as follows:
Importing Basic Libraries
We will start by importing the main PyTorch package and the nn package that we will use when building the model. In addition, we will use NumPy to pre-process our data, since PyTorch works well with NumPy arrays.
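A minimal sketch of the imports used in the rest of this tutorial:

```python
import torch
from torch import nn

import numpy as np
```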
First, we will define the sentences we would like our model to output when it is fed the first word or the first few letters.
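The exact sentences from the original post are not reproduced here, so the set below is an assumed toy example; it does include "good i am fine", the phrase referenced in the testing section at the end.

```python
# A few toy sentences the model will learn to complete
# (assumed example data; any short sentences will do)
text = ['hey how are you', 'good i am fine', 'have a nice day']
```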
Then we will build a dictionary of all the unique characters in the sentences and map them to integers. This allows us to convert our input characters to their corresponding integers (char2int) and vice versa (int2char).
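A sketch of how these dictionaries can be built:

```python
# Join all sentences together and extract the set of unique characters
chars = set(''.join(text))

# Map each integer index to a character, and each character to its index
int2char = dict(enumerate(chars))
char2int = {char: ind for ind, char in int2char.items()}
```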
Padding Input Sequence
We will be padding our input sentences to ensure that all of them have the same length. While RNNs can handle variable-length inputs, we usually want to feed training data in batches to speed up the training process. To train on batches, we need each input sequence to have the same size.
In most cases, padding is done by filling sequences that are too short with 0 values and trimming sequences that are too long. In our case, we will find the length of the longest sentence and pad all other sentences with blank spaces to match that length.
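A simple way to do this, given the text list defined above:

```python
# The length of the longest sentence determines the padded length
maxlen = len(max(text, key=len))

# Pad each shorter sentence with trailing spaces until it matches maxlen
for i in range(len(text)):
    while len(text[i]) < maxlen:
        text[i] += ' '
```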
Creating Input And Target Sequence
As we will be predicting the next character at each time step, we have to split each sentence into:
- Input data: the sentence excluding its last character, since the model never needs it as input
- Target data: the sentence shifted one time step ahead of the input data, as this is the "correct answer" for the model at each time step
The target sequence is always one time step ahead of the input sequence, as shown in the sketch below.
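```python
input_seq = []
target_seq = []

for i in range(len(text)):
    # Input: everything except the last character
    input_seq.append(text[i][:-1])
    # Target: everything except the first character (one step ahead)
    target_seq.append(text[i][1:])
```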
We can now convert our input and target sequences into sequences of integers instead of characters by using the dictionaries we created above. This will allow us to one-hot encode our input sequences later on.
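Using the char2int dictionary from earlier:

```python
# Replace every character with its integer index
for i in range(len(text)):
    input_seq[i] = [char2int[character] for character in input_seq[i]]
    target_seq[i] = [char2int[character] for character in target_seq[i]]
```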
Creating one-hot-vectors for Input sequence
Before encoding our input sequences as one-hot vectors, we will define 3 key variables (computed in the sketch after this list):
- dict_size: Dictionary size – the number of unique characters we have in our text
This determines the size of each one-hot vector, since each character corresponds to one index in that vector
- seq_len: The length of the sequences we feed into the model
Since we padded all of our sentences to the length of the longest sentence, this number will be the maximum length minus 1, as we removed the last character from each input
- batch_size: The number of sentences we have defined, fed to the model as one batch
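These can be computed directly from the data we prepared:

```python
dict_size = len(char2int)   # number of unique characters
seq_len = maxlen - 1        # longest length minus the removed last character
batch_size = len(text)      # number of sentences fed in at once
```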
Now we can convert our input sequence into one-hot vectors. A sketch of a helper that builds a (batch_size, seq_len, dict_size) array of one-hot vectors:
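```python
def one_hot_encode(sequence, dict_size, seq_len, batch_size):
    # All-zeros array of shape (batch_size, seq_len, dict_size)
    features = np.zeros((batch_size, seq_len, dict_size), dtype=np.float32)

    # Flip the relevant index to 1 for each character in each sequence
    for i in range(batch_size):
        for u in range(seq_len):
            features[i, u, sequence[i][u]] = 1
    return features

input_seq = one_hot_encode(input_seq, dict_size, seq_len, batch_size)
```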
Finally, we will convert our data from NumPy arrays to PyTorch tensors, since the model expects its inputs and targets as tensors.
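```python
input_seq = torch.from_numpy(input_seq)
target_seq = torch.Tensor(target_seq)
```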
Using PyTorch to Check for a GPU
Before we start building the model, let's use PyTorch's built-in feature to check the device we're running on (CPU or GPU). This implementation does not require a GPU, as the training workload is very small. However, as you move on to large datasets and models with millions of trainable parameters, using a GPU becomes very important to speed up training.
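```python
# torch.cuda.is_available() checks whether a GPU is available
is_cuda = torch.cuda.is_available()
device = torch.device('cuda' if is_cuda else 'cpu')
print(device)
```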
Creating RNN Model
To begin building our neural network model, we subclass nn.Module, the PyTorch base class for all neural network modules. After doing so, we can define some variables and the layers of our model inside the constructor. In this model, we will use just 1 RNN layer followed by a fully connected layer. The fully connected layer converts the RNN output into our desired output shape.
We also have to define the forward() method of the class. The forward pass runs sequentially, so we first pass the input and the zero-initialized hidden state through the RNN layer, before passing the RNN output to the fully connected layer. Note that we use the layers defined in the constructor.
The last method we have to define is the one we call to create the hidden state: init_hidden(). This simply creates a tensor of zeros in the shape of our hidden state.
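A minimal sketch of such a model (device is the variable from the GPU check above; the hidden dimension and layer count are passed in when we instantiate it later):

```python
class Model(nn.Module):
    def __init__(self, input_size, output_size, hidden_dim, n_layers):
        super(Model, self).__init__()
        self.hidden_dim = hidden_dim
        self.n_layers = n_layers

        # RNN layer; batch_first=True expects inputs of shape (batch, seq, feature)
        self.rnn = nn.RNN(input_size, hidden_dim, n_layers, batch_first=True)
        # Fully connected layer mapping hidden states to character scores
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, x):
        batch_size = x.size(0)
        # Zero-initialized hidden state for the first time step
        hidden = self.init_hidden(batch_size)
        out, hidden = self.rnn(x, hidden)
        # Reshape so the FC layer sees one hidden vector per time step
        out = out.contiguous().view(-1, self.hidden_dim)
        out = self.fc(out)
        return out, hidden

    def init_hidden(self, batch_size):
        # Tensor of zeros with shape (n_layers, batch_size, hidden_dim)
        return torch.zeros(self.n_layers, batch_size, self.hidden_dim).to(device)
```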
Initializing Hyperparameters
After defining the model above, we have to instantiate it with the appropriate parameters and define our hyperparameters. The hyperparameters we define below are:
- n_epochs: Number of epochs -> the number of times our model will go through the entire training dataset
- lr: Learning rate -> the rate at which our model updates its weights each time back-propagation is performed
For an in-depth guide to hyperparameters, you can refer to this comprehensive article.
As with other neural networks, we also have to define the loss function and optimizer. We will use CrossEntropyLoss, since the final output is effectively a classification over characters, together with the standard Adam optimizer.
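A sketch of the setup; hidden_dim=12 is an assumed illustrative value, and n_layers=1 matches the single RNN layer described above:

```python
# Instantiate the model (hidden_dim is an assumed illustrative value)
model = Model(input_size=dict_size, output_size=dict_size, hidden_dim=12, n_layers=1)
model = model.to(device)

# Hyperparameters
n_epochs = 100
lr = 0.01

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
```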
Training RNN model
Now we will start training our RNN model, using 100 epochs and CrossEntropyLoss as the loss function.
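A sketch of the training loop, using the tensors prepared earlier:

```python
input_seq = input_seq.to(device)

for epoch in range(1, n_epochs + 1):
    optimizer.zero_grad()  # clear gradients from the previous step
    output, hidden = model(input_seq)
    loss = criterion(output, target_seq.view(-1).long().to(device))
    loss.backward()        # back-propagation
    optimizer.step()       # update the weights

    if epoch % 10 == 0:
        print('Epoch: {}/{} ......... Loss: {:.4f}'.format(epoch, n_epochs, loss.item()))
```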
Testing RNN model
We will now test our RNN model to see whether it can complete a sentence well from a short prompt.
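A sketch of two helper functions (both names are illustrative): predict() returns the most likely next character given the characters seen so far, and sample() repeatedly calls it to grow a sentence to the desired length.

```python
def predict(model, characters):
    # One-hot encode the characters seen so far and run them through the model
    characters = np.array([[char2int[c] for c in characters]])
    characters = one_hot_encode(characters, dict_size, characters.shape[1], 1)
    characters = torch.from_numpy(characters).to(device)

    out, hidden = model(characters)

    # Pick the most probable next character from the last time step
    prob = nn.functional.softmax(out[-1], dim=0).data
    char_ind = torch.max(prob, dim=0)[1].item()
    return int2char[char_ind], hidden

def sample(model, out_len, start='hey'):
    model.eval()
    chars = [ch for ch in start]
    # Keep predicting the next character until the desired length is reached
    for _ in range(out_len - len(chars)):
        char, hidden = predict(model, chars)
        chars.append(char)
    return ''.join(chars)
```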
Now we can run the function above to see the prediction made by our model.
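For example:

```python
print(sample(model, 15, 'good'))
```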
We can see that the model has learned well: when we feed it "good", it completes the sentence as "good i am fine".
I hope this article has given you a head start with Recurrent Neural Networks in PyTorch. We focused mostly on the coding parts here; in the next blog, we will talk about LSTMs and the problems related to long-term dependencies.