Hello, I have a small question from reading the code. Line 510 of model.py reads:
new_mems.append(_cache_mem(output, mems[i], mem_len))
As I understand it, this drops the oldest memory and inserts the most recent one. But when i=0, output contains only the position embedding and has not yet gone through multi-head attention. Why not move the line new_mems.append(_cache_mem(output, mems[i], mem_len)) so that it comes after the positionwise_FF output inside the for loop, i.e. to line 534? Thanks.
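To make the two placements in the question concrete, here is a minimal sketch in plain Python. It is not the repository's code: `cache_mem`, `run_segment`, the `cache_before_layer` flag, and the toy layers are hypothetical stand-ins, and lists of numbers stand in for tensors. It only shows what ends up in each memory slot under either ordering.

```python
def cache_mem(curr_out, prev_mem, mem_len):
    """Toy stand-in for _cache_mem: append the new steps, keep the last mem_len."""
    if mem_len == 0:
        return []
    return (list(prev_mem) + list(curr_out))[-mem_len:]


def run_segment(embeddings, mems, layers, mem_len, cache_before_layer=True):
    """Run one segment through the layers and build the next segment's memories.

    cache_before_layer=True mirrors the placement described at line 510
    (cache the input of layer i); False mirrors the proposed placement
    at line 534 (cache the output of layer i).
    """
    output = embeddings
    new_mems = []
    for i, layer in enumerate(layers):
        if cache_before_layer:
            new_mems.append(cache_mem(output, mems[i], mem_len))
        output = layer(output, mems[i])  # toy layer; a real one attends over mems[i]
        if not cache_before_layer:
            new_mems.append(cache_mem(output, mems[i], mem_len))
    return output, new_mems


# Hypothetical toy layers that ignore the memory and just transform the input.
def add_one(x, mem):
    return [v + 1 for v in x]


def times_ten(x, mem):
    return [v * 10 for v in x]


layers = [add_one, times_ten]
embeddings = [1, 2]
mems = [[0, 0, 0], [0, 0, 0]]

out_before, before_mems = run_segment(embeddings, mems, layers, mem_len=3)
out_after, after_mems = run_segment(
    embeddings, mems, layers, mem_len=3, cache_before_layer=False
)
print(before_mems)
print(after_mems)
```

In this toy, caching before the layer leaves slot 0 holding the raw layer-0 inputs (the embeddings), while caching after the layer leaves slot 0 holding layer-0 outputs; the segment output itself is the same in both runs.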