I'm trying to implement combined character- and word-level LSTMs, but I keep getting this error:
InvalidArgumentError: indices[310,0] = 119 is not in [0, 119)
[[node model_3/time_distributed_12/embedding_7/embedding_lookup (defined at <ipython-input-64-51f6ad92087d>:3) ]] [Op:__inference_train_function_28785]
Errors may have originated from an input operation.
Input Source operations connected to node model_3/time_distributed_12/embedding_7/embedding_lookup:
model_3/time_distributed_12/embedding_7/embedding_lookup/24179
This is my model:
# input and embedding for words
word_in = Input(shape=(max_len,))
emb_word = embedding_layer(word_in)
# input and embeddings for characters
char_in = Input(shape=(max_len, max_len_char,))
emb_char = Embedding(input_dim=n_chars + 1, output_dim=20,
                     input_length=max_len_char, mask_zero=True)
print(emb_char)
char_dist = TimeDistributed(emb_char)(char_in)
# character LSTM to get word encodings by characters
char_enc = TimeDistributed(LSTM(units=20, return_sequences=False,
                                recurrent_dropout=0.5))(char_dist)
# main LSTM
x = concatenate([emb_word, char_enc])
x = SpatialDropout1D(0.3)(x)
main_lstm = Bidirectional(LSTM(units=50, return_sequences=False,
                               recurrent_dropout=0.6))(x)
out = Dense(num_of_classes, activation="sigmoid")(main_lstm)
model = Model([word_in, char_in], out)
I've read that this has to do with the input. X_char_tr has shape (2770, 10, 30), X_word_tr has shape (2770, 10), and y_tr has shape (2770, 135).
history = model.fit(
    [X_word_tr,
     np.array(X_char_tr).astype('float32').reshape((len(X_char_tr), max_len, max_len_char))],
    np.array(to_categorical(y_tr)), epochs=10, verbose=1)
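For what it's worth, the `[0, 119)` bound in the message suggests some embedding was built with input_dim=119 but is receiving index 119. A minimal check of the index ranges against the vocabulary sizes (a sketch with made-up stand-in values; in the real code the actual arrays and `n_chars` would be used):

```python
import numpy as np

# Hypothetical stand-ins for illustration only.
n_chars = 118                                                # assumed character vocab size
X_char_tr = np.random.randint(0, 120, size=(2770, 10, 30))   # deliberately contains index 119

# Embedding(input_dim=n_chars + 1) only accepts indices in [0, n_chars + 1).
max_idx = int(X_char_tr.max())
print("max char index:", max_idx, "allowed range: [0, %d)" % (n_chars + 1))
if max_idx >= n_chars + 1:
    print("out of range: input_dim must be at least", max_idx + 1)
```

The same check applied to X_word_tr against vocab_size + 1 would rule the word embedding in or out.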
This is my word embedding layer:
embedding_layer = Embedding(
vocab_size + 1,
config['W2V_DIM'],
weights=[w2v_weights],
input_length=max_sequence_len,
trainable=False
)
The word-vector matrix (w2v_weights) has shape (n_words, 128).
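To illustrate the bound in the error, an embedding table behaves like a plain array lookup: a table with 119 rows only has valid indices 0 through 118. A pure-NumPy sketch of that indexing rule (not Keras, just the principle):

```python
import numpy as np

input_dim, output_dim = 119, 20
table = np.zeros((input_dim, output_dim))  # rows play the role of embedding vectors

print(table[118].shape)  # last valid index -> (20,)
try:
    table[119]           # mimics embedding_lookup with an out-of-range index
except IndexError:
    print("index 119 is not in [0, 119)")
```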