
ANCE encoders #11

Open

kaishxu opened this issue Jan 7, 2021 · 3 comments


kaishxu commented Jan 7, 2021

Hello, I have a question about the BERT encoders. The paper says: "ANCE can be used to train any dense retrieval model. For simplicity, we use a simple set up in recent research (Luan et al., 2020) with BERT Siamese/Dual Encoder (shared between q and d), dot product similarity, and negative log likelihood (NLL) loss." So only one shared encoder should be used to encode both queries and documents. However, in model.py, BiEncoder is defined as follows:

class BiEncoder(nn.Module):
    """ Bi-Encoder model component. Encapsulates query/question and context/passage encoders.
    """
    def __init__(self, args):
        super(BiEncoder, self).__init__()
        self.question_model = HFBertEncoder.init_encoder(args)
        self.ctx_model = HFBertEncoder.init_encoder(args)

So two separate encoders are defined.
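
For reference, here is a minimal sketch (my own illustration, not the repository's code) of the training objective the quoted passage describes: dot-product similarity between query and passage embeddings, and an NLL loss over one positive and several negative passages per query.

import torch
import torch.nn.functional as F

def dot_product_nll(q_emb, pos_emb, neg_embs):
    """NLL loss with dot-product similarity for a single query.

    q_emb:    (dim,)        query embedding
    pos_emb:  (dim,)        embedding of the relevant passage
    neg_embs: (n_neg, dim)  embeddings of the negative passages
    """
    # Dot-product scores: positive passage first, then the negatives.
    scores = torch.cat([(q_emb * pos_emb).sum().unsqueeze(0), neg_embs @ q_emb])
    # Negative log likelihood of the positive under a softmax over all candidates.
    return -F.log_softmax(scores, dim=0)[0]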

@zhiqihuang

Kudos! You asked the exact question I have. The paper keeps using "BERT-Siamese"; to my understanding, Siamese here means a shared encoder between query and document.

In fact, if two encoders are used, the dense retriever doubles the parameter count compared to models like a BERT reranker or ColBERT.
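
If a single shared encoder is intended, a minimal sketch (my own illustration, assuming the HFBertEncoder interface from model.py) is to point both attributes at one module, so the Siamese behaviour is kept without doubling the parameters:

import torch.nn as nn
# HFBertEncoder is assumed to come from the repository's model.py.

class SharedBiEncoder(nn.Module):
    """Siamese variant: one BERT encoder serves both queries and passages."""
    def __init__(self, args):
        super(SharedBiEncoder, self).__init__()
        self.question_model = HFBertEncoder.init_encoder(args)
        # Reuse the same module (and therefore the same weights) for passages.
        self.ctx_model = self.question_model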


kaishxu commented Sep 23, 2021

Haha, bingo!
Besides, the hyperparameters are too sensitive. See the table in the Appendix: if you change the learning rate from 1e-6 to 2e-6, the accuracy decreases significantly!

kaishxu closed this as completed Sep 23, 2021
kaishxu reopened this Sep 23, 2021

kaishxu commented Sep 23, 2021 via email
