
I have a very simple Keras model that looks like:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(hidden_size, input_dim=n_inputs, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))

The embedding that I am using is Bag of Words.

I want to include the embedding step as part of the model. I thought of doing it as an embedding layer, but I don't know whether it is possible to implement a Bag of Words model as a Keras Embedding layer. I know you can pass pre-trained embeddings such as GloVe to Embedding layers, so I was wondering if something like that could be done with BoW?

Any ideas will be much appreciated! :D

1 Answer


The Embedding layer in Keras (and in basically all deep learning frameworks) does a lookup: given a token index, it returns a dense embedding vector.
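For illustration, a minimal sketch of that lookup (the vocabulary size and embedding dimension below are made-up numbers):

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding

vocab_size = 10       # made-up vocabulary size
embedding_dim = 4     # made-up embedding dimension

lookup = Sequential()
lookup.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim))

# a batch of 2 sequences, each consisting of 3 token indices
token_ids = np.array([[1, 5, 2],
                      [4, 4, 0]])
print(lookup.predict(token_ids).shape)   # (2, 3, 4): one dense vector per token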

The question is how you want to embed a bag-of-words representation. I think a reasonable option would be to:

  1. Do the embedding lookup for every word,
  2. Average the token embeddings, which gives you a single vector representing the BoW. In Keras, you can use a GlobalAveragePooling1D layer for that.

Averaging is probably a better option than summing because the output will be of the same scale for sequences of different lengths.

Note that for the embedding lookup, the input needs to have a shape of batch × sequence length, with integers corresponding to token indices in a vocabulary.
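Putting that together, a minimal sketch of the full model might look like the following (vocab_size, embedding_dim, hidden_size and n_outputs are placeholder values you would set for your own data):

from keras.models import Sequential
from keras.layers import Embedding, GlobalAveragePooling1D, Dense

vocab_size = 5000      # placeholder: number of tokens in the vocabulary
embedding_dim = 100    # placeholder: size of each token embedding
hidden_size = 64       # placeholder, as in your original model
n_outputs = 10         # placeholder, as in your original model

model = Sequential()
# look up a dense vector for every token index in the input
model.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim))
# average the token vectors into one fixed-size "bag" vector
model.add(GlobalAveragePooling1D())
model.add(Dense(hidden_size, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam')

Keep in mind that the sequences in a batch have to be padded to the same length (e.g. with keras.preprocessing.sequence.pad_sequences), and that a plain GlobalAveragePooling1D will include the padding positions in the average unless you use masking.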
