
Clarification on training inputs and outputs #12

Open
DocDriven opened this issue Oct 23, 2021 · 0 comments


The HLD is helpful, but what I think is still hard to understand is how you actually train the model, i.e., what the inputs and outputs of the model that are used to calculate the loss look like.

Following the diagram and assuming one input of each type of feature (binary, numeric, categorical), you have 3 inputs looking like this:

[ 1 
  44
  9 ]

These are the inputs of your model, which are initially transformed and concatenated into a vector with 8 features, like:

[ 1        // binary
  0.22     // rescaled
  0.2      // rest is random embedding of emb size 6
 -0.12
  0.65
  0.11
 -0.96
 -1.01 ]
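If I understand the diagram correctly, the transform above could be sketched roughly like this (the numeric range for rescaling, the number of categories, and the random embedding table are all assumptions on my part):

```python
import numpy as np

rng = np.random.default_rng(0)

# Raw inputs: one binary, one numeric, one categorical feature
x_binary = 1
x_numeric = 44
x_categorical = 9  # category index

# Hypothetical numeric range for min-max rescaling
num_min, num_max = 0, 200
x_rescaled = (x_numeric - num_min) / (num_max - num_min)  # 0.22

# Hypothetical embedding table: 20 categories, embedding size 6
embedding_table = rng.normal(size=(20, 6))
x_embedded = embedding_table[x_categorical]

# Concatenate into the 8-feature vector fed to the autoencoder
x_concat = np.concatenate([[x_binary], [x_rescaled], x_embedded])
print(x_concat.shape)  # (8,)
```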

As I understand it, you use an autoencoder to reconstruct the concatenated layer, i.e., you minimize the reconstruction error between the concatenated layer and the output layer, both with 8 features. But doesn't this completely ignore the training of the embedding?
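To make my question concrete, here is a minimal sketch of the reconstruction error I mean, with placeholder values for the decoder output (all numbers here are made up). My understanding is that if the embedding lookup is part of the computation graph, the gradient of this loss would also flow back into the embedding table, so the embedding would be trained jointly rather than ignored — is that what happens here?

```python
import numpy as np

# 8-feature concatenated input (from the example above) and the
# autoencoder's reconstruction of it (placeholder values)
x_concat = np.array([1, 0.22, 0.2, -0.12, 0.65, 0.11, -0.96, -1.01])
x_reconstructed = np.array([0.9, 0.25, 0.15, -0.1, 0.6, 0.1, -0.9, -1.0])

# Mean squared reconstruction error over all 8 features,
# including the 6 embedding dimensions
loss = np.mean((x_concat - x_reconstructed) ** 2)
print(loss)
```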

Would you be so kind as to give a simple example of which errors you minimize between which inputs and outputs (concrete values are also fine)?

Thanks a lot!
