Conceptual issue in character embeddings #40
Comments
Hi @mraduldubey,
Thanks @guillaumegenthial for the reply. This way the ground truth will be a vector representing the whole word. So, what is the ground truth here?
You train the network to predict the tags. It turns out that some parameters of the network correspond to the character embeddings, so these are trained to help the network predict the tags. So the ground truth is the tag, and the learned embeddings help predict this tag.
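To make this concrete, here is a minimal sketch in plain Python of how the tag loss alone can update randomly initialized character embeddings. It is not the repo's code: the toy words and tags are made up, and a simple mean over character embeddings stands in for the char-level BiLSTM so that the gradients can be written by hand. All names (`char_emb`, `train_step`, and so on) are hypothetical.

```python
import math
import random

rng = random.Random(0)

# Hypothetical toy data: capitalized words are "LOC", everything else is "O".
DATA = [("Paris", "LOC"), ("London", "LOC"), ("Berlin", "LOC"),
        ("walks", "O"), ("green", "O"), ("table", "O")]
TAGS = ["O", "LOC"]
CHARS = sorted({c for word, _ in DATA for c in word})
DIM = 8

def xavier(rows, cols):
    # Xavier/Glorot uniform init: U(-limit, limit), limit = sqrt(6 / (rows + cols)).
    limit = math.sqrt(6.0 / (rows + cols))
    return [[rng.uniform(-limit, limit) for _ in range(cols)] for _ in range(rows)]

char_emb = dict(zip(CHARS, xavier(len(CHARS), DIM)))  # random, no meaning yet
W = xavier(len(TAGS), DIM)                            # tag classifier weights
b = [0.0] * len(TAGS)

def forward(word):
    # Assumption: mean of char embeddings replaces the BiLSTM word encoder.
    h = [sum(char_emb[c][j] for c in word) / len(word) for j in range(DIM)]
    logits = [sum(W[k][j] * h[j] for j in range(DIM)) + b[k] for k in range(len(TAGS))]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return h, [e / total for e in exps]

def mean_loss():
    # Average cross-entropy against the tags: the only supervision signal.
    return sum(-math.log(forward(w)[1][TAGS.index(t)]) for w, t in DATA) / len(DATA)

def train_step(word, tag, lr=0.2):
    h, probs = forward(word)
    y = TAGS.index(tag)
    # Gradient of softmax cross-entropy w.r.t. the logits.
    dlogits = [p - (1.0 if k == y else 0.0) for k, p in enumerate(probs)]
    # Backpropagate into the word representation before touching W.
    dh = [sum(W[k][j] * dlogits[k] for k in range(len(TAGS))) for j in range(DIM)]
    for k in range(len(TAGS)):
        b[k] -= lr * dlogits[k]
        for j in range(DIM):
            W[k][j] -= lr * dlogits[k] * h[j]
    # The same error signal flows down into every character of the word.
    for c in word:
        for j in range(DIM):
            char_emb[c][j] -= lr * dh[j] / len(word)

initial_P = list(char_emb["P"])
initial_loss = mean_loss()
for _ in range(200):
    for word, tag in DATA:
        train_step(word, tag)
final_loss = mean_loss()
print("loss before:", initial_loss, "after:", final_loss)
```

The embedding for `P` never has a target of its own. It moves only because changing it lowers the tag loss, which is the same mechanism that trains the character embeddings when the encoder is a BiLSTM.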
So, you mean that the word representation network, the contextual word representation network and the decoder, though described separately in the blog, are trained simultaneously, with the tags as the ground truth, and that backpropagation runs from the final layer all the way back to the word representation network?
I have a conceptual doubt about the part where word-level representations are obtained from characters using the final output of the BiLSTM network. The character embeddings are initialized with xavier_initialization, which only ensures that the cells do not saturate. So how do these random embeddings capture any meaningful information? And how is this network trained, or is it unsupervised?