Allow me to rephrase to see if I understand. The embeddings refer to an internal state (something like a vector of neuron activation levels) that the model reaches after the words in the context window have been fed in.
So you're saying I could compare the state that produces a prediction with the state reached by inputting only the predicted word?
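
To make the question concrete, here is a rough sketch of the comparison I have in mind. Everything in it is an assumption for illustration: `TOY_VECTORS` is a made-up table, and `toy_state` is a hypothetical stand-in that just averages token vectors, whereas a real model would produce its hidden state very differently. Cosine similarity is one common way to compare two such vectors, though I don't know whether it is the measure you meant.

```python
import math

# Hypothetical 3-dimensional token vectors, invented purely for illustration.
TOY_VECTORS = {
    "the": [0.1, 0.0, 0.2],
    "cat": [0.9, 0.1, 0.0],
    "sat": [0.2, 0.8, 0.1],
    "mat": [0.7, 0.2, 0.3],
}


def toy_state(tokens):
    """Stand-in for a model's internal state: the mean of the token vectors.

    This is an assumption, not how a real language model computes its state.
    """
    dims = len(next(iter(TOY_VECTORS.values())))
    total = [0.0] * dims
    for tok in tokens:
        for i, x in enumerate(TOY_VECTORS[tok]):
            total[i] += x
    return [x / len(tokens) for x in total]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# State reached after feeding in the context window.
context_state = toy_state(["the", "cat", "sat"])
# State reached after feeding in only the (supposedly) predicted word.
word_state = toy_state(["mat"])

similarity = cosine_similarity(context_state, word_state)
print(f"cosine similarity: {similarity:.3f}")
```

With a real model, the two calls to `toy_state` would be replaced by whatever gives you the internal state for a given input; the comparison step would stay the same.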