
I'm trying to implement both character-level and word-level LSTMs, but I keep getting this error:

InvalidArgumentError:  indices[310,0] = 119 is not in [0, 119)
     [[node model_3/time_distributed_12/embedding_7/embedding_lookup (defined at <ipython-input-64-51f6ad92087d>:3) ]] [Op:__inference_train_function_28785]

Errors may have originated from an input operation.
Input Source operations connected to node model_3/time_distributed_12/embedding_7/embedding_lookup:
 model_3/time_distributed_12/embedding_7/embedding_lookup/24179

This is my model:

# imports assumed for the snippet below
from tensorflow.keras.layers import (Input, Embedding, TimeDistributed, LSTM,
                                     Bidirectional, SpatialDropout1D, Dense,
                                     concatenate)
from tensorflow.keras.models import Model

# input and embedding for words
word_in = Input(shape=(max_len,))
emb_word = embedding_layer(word_in)

# input and embeddings for characters
char_in = Input(shape=(max_len, max_len_char,))
emb_char = Embedding(input_dim=n_chars + 1, output_dim=20,
                     input_length=max_len_char, mask_zero=True)
print(emb_char)
char_dist = TimeDistributed(emb_char)(char_in)
# character LSTM to get word encodings by characters
char_enc = TimeDistributed(LSTM(units=20, return_sequences=False,
                                recurrent_dropout=0.5))(char_dist)

# main LSTM

x = concatenate([emb_word, char_enc])

x = SpatialDropout1D(0.3)(x)
main_lstm = Bidirectional(LSTM(units=50, return_sequences=False, recurrent_dropout=0.6))(x)
out = Dense(num_of_classes, activation="sigmoid")(main_lstm)


model = Model([word_in, char_in], out)

I've read that it has to do with the input. My X_char_tr shape is (2770, 10, 30), X_word_tr.shape is (2770, 10), and y_tr is (2770, 135).

history = model.fit([X_word_tr,
                    (np.array(X_char_tr)).astype('float32').reshape((len(X_char_tr), max_len, max_len_char))],
                    np.array(to_categorical(y_tr)), epochs=10, verbose=1)

This is my word embedding layer:

embedding_layer = Embedding(
    vocab_size + 1,
    config['W2V_DIM'],
    weights=[w2v_weights],
    input_length=max_sequence_len,
    trainable=False
)

The word vector shape is (n_words, 128).

1 Answer


Check X_char_tr. As far as I understand, your vocabulary size (n_chars) is 118, so the maximum permitted value in this tensor is 118, but the failing lookup found the value 119.

Try this:

print(tf.where(X_char_tr>118))
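The rule behind the error: an Embedding layer with input_dim = N only accepts indices in [0, N), i.e. 0 through N - 1. An embedding lookup is essentially a row lookup in a weight table, so plain NumPy row indexing follows the same rule; a minimal sketch with toy values (not the question's data):

```python
import numpy as np

# An embedding table with input_dim rows only accepts indices in [0, input_dim).
n_chars = 118
table = np.zeros((n_chars + 1, 20))   # input_dim = n_chars + 1 = 119 rows

good = np.array([38, 79, 115, 61])    # every index < 119: fine
print(table[good].shape)              # (4, 20)

bad = np.array([38, 119])             # 119 is not in [0, 119)
print(np.where(bad > n_chars)[0])     # [1] -- position of the offender
try:
    table[bad]                        # out-of-range row raises IndexError,
except IndexError as e:               # analogous to TF's InvalidArgumentError
    print("lookup failed:", e)
```

Note that the TF behavior can differ by device: on CPU an out-of-range index raises InvalidArgumentError as in the question, while on GPU the lookup may silently return zeros instead.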
  • Yes, that's the number of unique characters (n_chars) I have. Should the input_dim for the char embedding be something else?
    – mojbius
    Commented Dec 9, 2020 at 13:01
  • What is X_char_tr[0][0][0] ? Is it 119 ?
    – Andrey
    Commented Dec 9, 2020 at 13:19
  • No, 38. This is X_char_tr[0][0]: array([ 38, 79, 115, 61, 46, 55, 61, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
    – mojbius
    Commented Dec 9, 2020 at 13:23
  • I get this: tf.Tensor([], shape=(0, 3), dtype=int64), but I'm not sure what that means. The first time I ran it, it complained about NoneTypes inside the array, so I had to convert them to floats.
    – mojbius
    Commented Dec 10, 2020 at 10:25
  • It means that your X_char_tr has no elements exceeding 118, so it is correct. Please add the code of embedding_layer() to the question.
    – Andrey
    Commented Dec 10, 2020 at 10:38
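Since the character indices check out, the same range check can be applied to the word input, which the comment thread points at next: the word Embedding has input_dim = vocab_size + 1, so any word id above vocab_size triggers the same lookup error. A hedged sketch reusing the question's variable names with toy stand-in values:

```python
import numpy as np

# Toy stand-ins: the real X_word_tr has shape (2770, 10) and the real
# vocab_size comes from the word tokenizer.
X_word_tr = np.array([[3, 17, 119, 0]])
vocab_size = 118

# Legal word ids are 0 .. vocab_size; locate any id above that.
offenders = np.argwhere(X_word_tr > vocab_size)
print(offenders)   # [[0 2]] -- row and column of each out-of-range id
```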
