In the paper you state that BERT-large is trained with a batch_size of 2048 for 464K steps, but the compute_flops.py script uses the same train args for it as for BERT-base. Is this a mistake?