
I'm trying to implement both character-level and word-level LSTMs, but I keep getting this error:

InvalidArgumentError:  indices[310,0] = 119 is not in [0, 119)
     [[node model_3/time_distributed_12/embedding_7/embedding_lookup (defined at <ipython-input-64-51f6ad92087d>:3) ]] [Op:__inference_train_function_28785]

Errors may have originated from an input operation.
Input Source operations connected to node model_3/time_distributed_12/embedding_7/embedding_lookup:
 model_3/time_distributed_12/embedding_7/embedding_lookup/24179

This is my model:

# imports assumed for the snippet below
from tensorflow.keras.layers import (Input, Embedding, TimeDistributed, LSTM,
                                     Bidirectional, SpatialDropout1D, Dense,
                                     concatenate)
from tensorflow.keras.models import Model

# input and embedding for words
word_in = Input(shape=(max_len,))
emb_word = embedding_layer(word_in)

# input and embeddings for characters
char_in = Input(shape=(max_len, max_len_char,))
emb_char = Embedding(input_dim=n_chars + 1, output_dim=20,
                     input_length=max_len_char, mask_zero=True)
print(emb_char)
char_dist = TimeDistributed(emb_char)(char_in)
# character LSTM to get word encodings by characters
char_enc = TimeDistributed(LSTM(units=20, return_sequences=False,
                                recurrent_dropout=0.5))(char_dist)

# main LSTM

x = concatenate([emb_word, char_enc])

x = SpatialDropout1D(0.3)(x)
main_lstm = Bidirectional(LSTM(units=50, return_sequences=False, recurrent_dropout=0.6))(x)
out = Dense(num_of_classes, activation="sigmoid")(main_lstm)


model = Model([word_in, char_in], out)

I've read that it has to do with the input. My X_char_tr shape is (2770, 10, 30), X_word_tr.shape is (2770, 10), and y_tr is (2770, 135).

history = model.fit([X_word_tr,
                    (np.array(X_char_tr)).astype('float32').reshape((len(X_char_tr), max_len, max_len_char))],
                    np.array(to_categorical(y_tr)), epochs=10, verbose=1)

This is my word embedding layer:

embedding_layer = Embedding(
    vocab_size + 1,
    config['W2V_DIM'],
    weights=[w2v_weights],
    input_length=max_sequence_len,
    trainable=False
)

The word vector shape is (n_words, 128).

1 Answer


Check X_char_tr. As far as I understand, your vocabulary size (n_chars) is 118, so the maximum permitted value in this tensor is 118, but the failing lookup found the value 119.

Try this:

print(tf.where(X_char_tr>118))
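The rule behind the error: an Embedding layer with input_dim = N only accepts indices in [0, N), i.e. 0 through N - 1. An embedding lookup is essentially a row lookup in a weight table, so plain NumPy row indexing follows the same rule; a minimal sketch with toy values (not the question's data):

```python
import numpy as np

# An embedding table with input_dim rows only accepts indices in [0, input_dim).
n_chars = 118
table = np.zeros((n_chars + 1, 20))   # input_dim = n_chars + 1 = 119 rows

good = np.array([38, 79, 115, 61])    # every index < 119: fine
print(table[good].shape)              # (4, 20)

bad = np.array([38, 119])             # 119 is not in [0, 119)
print(np.where(bad > n_chars)[0])     # [1] -- position of the offender
try:
    table[bad]                        # out-of-range row raises IndexError,
except IndexError as e:               # analogous to TF's InvalidArgumentError
    print("lookup failed:", e)
```

Note that the TF behavior can differ by device: on CPU an out-of-range index raises InvalidArgumentError as in the question, while on GPU the lookup may silently return zeros instead.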
  • Yes, that's the number of unique characters (n_chars) I have. Should the input_dim for the char embedding be something else?
    – mojbius
    Commented Dec 9, 2020 at 13:01
  • What is X_char_tr[0][0][0] ? Is it 119 ?
    – Andrey
    Commented Dec 9, 2020 at 13:19
  • No, 38. This is X_char_tr[0][0]: array([ 38, 79, 115, 61, 46, 55, 61, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
    – mojbius
    Commented Dec 9, 2020 at 13:23
  • I get this: tf.Tensor([], shape=(0, 3), dtype=int64), but I'm not sure what that means. The first time I ran it, it complained about NoneTypes inside the array, so I had to convert them to floats.
    – mojbius
    Commented Dec 10, 2020 at 10:25
  • It means that your X_char_tr has no elements exceeding 118, so it is correct. Please add the code of embedding_layer() to the question.
    – Andrey
    Commented Dec 10, 2020 at 10:38
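Since the character indices check out, the same range check can be applied to the word input, which the comment thread points at next: the word Embedding has input_dim = vocab_size + 1, so any word id above vocab_size triggers the same lookup error. A hedged sketch reusing the question's variable names with toy stand-in values:

```python
import numpy as np

# Toy stand-ins: the real X_word_tr has shape (2770, 10) and the real
# vocab_size comes from the word tokenizer.
X_word_tr = np.array([[3, 17, 119, 0]])
vocab_size = 118

# Legal word ids are 0 .. vocab_size; locate any id above that.
offenders = np.argwhere(X_word_tr > vocab_size)
print(offenders)   # [[0 2]] -- row and column of each out-of-range id
```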
