This implementation is the same as Transformers.Bert with a few tiny embedding tweaks.
RoBERTa has the same architecture as BERT, but uses a byte-level BPE (implemented in BPE.jl) as its tokenizer (the same as GPT-2) and a different pre-training scheme.
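To illustrate what the byte-level part does (a conceptual sketch of the GPT-2 scheme, not the BPE.jl or Transformers.jl API), every raw byte of the UTF-8 input is first mapped to a printable character, so the BPE merges never need an unknown token:

```julia
# Sketch of the GPT-2 style byte-to-unicode mapping applied before BPE merges.
# Printable bytes map to themselves; the remaining bytes are shifted past 255
# so every byte has a visible stand-in character.
function bytes_to_unicode()
    bs = vcat(Int('!'):Int('~'), Int('¡'):Int('¬'), Int('®'):Int('ÿ'))
    cs = copy(bs)
    n = 0
    for b in 0:255
        if b ∉ bs
            push!(bs, b)
            push!(cs, 256 + n)
            n += 1
        end
    end
    return Dict(UInt8(b) => Char(c) for (b, c) in zip(bs, cs))
end

byte2char = bytes_to_unicode()
# every byte of the input gets a visible character before the merge step
pretokenized = join(byte2char[b] for b in codeunits("RoBERTa works!"))
```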
RoBERTa doesn't have token_type_ids, so you don't need to indicate which token belongs to which segment; just separate your segments with the separation token (</s>).
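As a minimal sketch of that input format (the helper name is illustrative, not part of Transformers.jl), a sentence pair for RoBERTa is just the two token sequences joined by separator tokens, with no segment-id vector to build:

```julia
# Illustrative helper (not the Transformers.jl API): RoBERTa's pair format is
# <s> A </s></s> B </s>, and there is no token_type_ids input.
roberta_join(a::Vector{String}, b::Vector{String}) =
    vcat(["<s>"], a, ["</s>", "</s>"], b, ["</s>"])

roberta_join(["Hello", "world"], ["How", "are", "you"])
# ["<s>", "Hello", "world", "</s>", "</s>", "How", "are", "you", "</s>"]
```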
We could also wrap CamemBERT (the French version of BERT) around the RoBERTa implementation.
aviks transferred this issue from JuliaText/TextAnalysis.jl on Nov 2, 2020