

Practical 8

Design and implement an RNN for classification of temporal data, sequence-to-sequence data
modelling, etc.

AIM:

Design and implement an RNN for classification of temporal data, sequence-to-sequence data
modelling, etc.

THEORY:

- RNNs are specifically designed to process sequential data, such as time series, text, or audio.
They can naturally capture dependencies and patterns present in sequences, which is crucial for
tasks like machine translation, text summarization, and speech recognition.

- They are commonly used in an encoder-decoder architecture, which is well-suited for
sequence-to-sequence tasks. The encoder processes the input sequence and creates a fixed-length
context vector, which is then used by the decoder to generate the output sequence.

- This architecture is highly effective in tasks like machine translation, where the input and
output have different lengths. While RNNs have limitations, such as the vanishing gradient
problem and difficulty in capturing long-range dependencies, they have been foundational in
sequence-to-sequence modelling and have laid the groundwork for more advanced architectures
like Transformers.

- Sequence-to-sequence modelling is a powerful machine learning technique that has
revolutionized the way we do natural language processing (NLP). It allows us to process input
sequences of varying lengths and produce output sequences of varying lengths, making it
particularly useful for tasks such as language translation, speech recognition, and chatbot
development. Sequence-to-sequence modelling also provides a strong foundation for building text
summarizers, question answering systems, sentiment analysis systems, and more. With its wide
range of applications, understanding sequence-to-sequence modelling concepts is essential for
anyone who wants to work in the field of natural language processing. This section discusses the
types of sequence models, gives examples of each, and shows how they can be used for the
understanding and analysis of sequences.
[Figure: example sequence data indexed by time steps 1 to 7]

- There are different types of sequence models, depending on whether the input and the output of
the model are sequence data or non-sequence data. They are as follows:

1. One-to-sequence (One-to-many):

In a one-to-sequence model, the input is non-sequence data and the output is sequence data.
A classic example is image captioning, where the input is a single image and the output is a
sequence of words.

2. Sequence-to-one (Many-to-one):

In a sequence-to-one model, the input is sequence data and the output is non-sequence data.
For example, a sequence of words (a sentence) is fed into the network and the output is a
positive or negative sentiment; this is also called sentiment analysis. A minimal sketch of such a
temporal classifier is given after this list.

3. Sequence-to-sequence (Many-to-many):

In a sequence-to-sequence model, both the input data and the output data are sequences. A
machine translation system is a typical example: the input is a sequence of words (a sentence) in
one language and the output is a sequence of words in another language.
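
For the temporal-data classification part of the aim, the sketch below shows a minimal
sequence-to-one (many-to-one) RNN in Keras. It is only illustrative: the layer sizes, the random
toy data, and the binary labels are assumptions made here, not part of the translation example in
the Code section.

# Minimal many-to-one RNN sketch: classify a whole sequence into one of two
# classes. All shapes and the random data are illustrative assumptions.
import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense

timesteps = 20      # length of each input sequence (assumed)
num_features = 8    # features observed at each time step (assumed)
num_samples = 500   # size of the toy dataset (assumed)

# Toy temporal data: random sequences with random binary labels.
x = np.random.random((num_samples, timesteps, num_features))
y = np.random.randint(0, 2, size=(num_samples, 1))

# The LSTM reads the whole sequence and returns only its final hidden state,
# which a Dense layer maps to a single probability (e.g. positive/negative).
inputs = Input(shape=(timesteps, num_features))
hidden = LSTM(32)(inputs)
outputs = Dense(1, activation='sigmoid')(hidden)

clf = Model(inputs, outputs)
clf.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
clf.fit(x, y, batch_size=32, epochs=5, validation_split=0.2)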

Below are some popular machine learning applications that are based on sequential data:

1. Time Series: predicting time series, such as stock market projections.

2. Text mining and sentiment analysis.

3. Machine Translation: given input in a single language, sequence models are used to translate
it into several languages.

4. Image Captioning: assessing the current action and creating a caption for the image.

5. Speech Recognition: deep recurrent neural networks for speech recognition.

6. Music Generation: recurrent neural networks are being used to create classical music.

7. DNA Sequence Analysis: recurrent neural networks for predicting transcription factor binding
sites.

Code:
The task is to translate short English sentences into French sentences, character by character,
using a sequence-to-sequence model.
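
The snippets below follow the standard character-level, teacher-forcing setup and assume that the
training arrays and vocabulary lookups already exist. A hedged preprocessing sketch is given first
so that names such as num_encoder_tokens, encoder_input_data and target_token_index are defined;
the data file name fra.txt and the hyperparameter values are assumptions, and any tab-separated
file of English/French sentence pairs would do.

# Hedged preprocessing sketch: builds the arrays and lookups used by the
# seq2seq code below. File name and hyperparameters are assumptions.
import numpy as np

batch_size = 64
epochs = 100
latent_dim = 256     # size of the LSTM hidden state
num_samples = 10000  # number of sentence pairs to use

input_texts, target_texts = [], []
input_characters, target_characters = set(), set()
with open('fra.txt', 'r', encoding='utf-8') as f:
    lines = f.read().split('\n')
for line in lines[: min(num_samples, len(lines) - 1)]:
    input_text, target_text = line.split('\t')[:2]
    # '\t' marks the start of a target sentence and '\n' marks its end.
    target_text = '\t' + target_text + '\n'
    input_texts.append(input_text)
    target_texts.append(target_text)
    input_characters.update(input_text)
    target_characters.update(target_text)

input_characters = sorted(input_characters)
target_characters = sorted(target_characters)
num_encoder_tokens = len(input_characters)
num_decoder_tokens = len(target_characters)
max_encoder_seq_length = max(len(t) for t in input_texts)
max_decoder_seq_length = max(len(t) for t in target_texts)

input_token_index = {c: i for i, c in enumerate(input_characters)}
target_token_index = {c: i for i, c in enumerate(target_characters)}
reverse_target_char_index = {i: c for c, i in target_token_index.items()}

# One-hot encode the pairs; decoder_target_data is decoder_input_data
# shifted by one time step (teacher forcing).
encoder_input_data = np.zeros(
    (len(input_texts), max_encoder_seq_length, num_encoder_tokens), dtype='float32')
decoder_input_data = np.zeros(
    (len(input_texts), max_decoder_seq_length, num_decoder_tokens), dtype='float32')
decoder_target_data = np.zeros(
    (len(input_texts), max_decoder_seq_length, num_decoder_tokens), dtype='float32')
for i, (input_text, target_text) in enumerate(zip(input_texts, target_texts)):
    for t, char in enumerate(input_text):
        encoder_input_data[i, t, input_token_index[char]] = 1.0
    for t, char in enumerate(target_text):
        decoder_input_data[i, t, target_token_index[char]] = 1.0
        if t > 0:
            decoder_target_data[i, t - 1, target_token_index[char]] = 1.0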

# Import modules

from keras.models import Model

from keras.layers import Input, LSTM, Dense

import numpy as np

# Define the input sequence and process it

encoder_inputs = Input(shape=(None, num_encoder_tokens))

encoder = LSTM(latent_dim, return_state=True)

encoder_outputs, state_h, state_c = encoder(encoder_inputs)

# We discard encoder_outputs and only keep the states.

encoder_states = [state_h, state_c]

# Set up the decoder, using encoder_states as initial state.

decoder_inputs = Input(shape=(None, num_decoder_tokens))


# We set up our decoder to return full output sequences,

# and to return internal states as well. We don't use the

# return states in the training model, but we will use them in inference.

decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)

decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)

decoder_dense = Dense(num_decoder_tokens, activation='softmax')

decoder_outputs = decoder_dense(decoder_outputs)

# Define the model that will turn encoder_input_data and decoder_input_data
# into decoder_target_data

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

# Compile the model

model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# Training the model

model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=batch_size,
          epochs=epochs,
          validation_split=0.2)

# Define the encoder model

encoder_model = Model(encoder_inputs, encoder_states)


# Define the decoder model

decoder_state_input_h = Input(shape=(latent_dim,))

decoder_state_input_c = Input(shape=(latent_dim,))

decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)

decoder_states = [state_h, state_c]

decoder_outputs = decoder_dense(decoder_outputs)

decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)
def decode_sequence(input_seq):
    # Encode the input as state vectors
    states_value = encoder_model.predict(input_seq)

    # Generate an empty target sequence of length 1
    target_seq = np.zeros((1, 1, num_decoder_tokens))

    # Populate the first character of the target sequence with the start character
    target_seq[0, 0, target_token_index['\t']] = 1.0

    # Sampling loop for a batch of sequences (assuming batch size = 1)
    stop_condition = False
    decoded_sentence = ''

    while not stop_condition:
        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)

        # Sample a token
        sampled_token_index = np.argmax(output_tokens[0, -1])
        sampled_char = reverse_target_char_index[sampled_token_index]
        decoded_sentence += sampled_char

        # Exit condition: either hit max length or find stop character
        if (sampled_char == '\n'
                or len(decoded_sentence) > max_decoder_seq_length):
            stop_condition = True

        # Update the target sequence (of length 1)
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_token_index] = 1.0

        # Update states
        states_value = [h, c]

    return decoded_sentence
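
A small usage example, assuming encoder_input_data and input_texts come from the preprocessing
sketch above:

# Decode a few training sequences to inspect the learned translations.
for seq_index in range(5):
    input_seq = encoder_input_data[seq_index: seq_index + 1]
    decoded_sentence = decode_sequence(input_seq)
    print('Input sentence:', input_texts[seq_index])
    print('Decoded sentence:', decoded_sentence)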

Conclusion:
Sequence models are modelling techniques used to analyse sequential data. There are three main
types: one-to-sequence, sequence-to-one, and sequence-to-sequence. Sequence models are used in
applications such as image captioning, smart replies in chat tools, and predicting movie ratings
from user feedback.
