```python
class BERTTrainer:
    def __init__(self, ...):
        ...
        # Using Negative Log Likelihood Loss function for predicting the masked_token
        self.criterion = nn.NLLLoss(ignore_index=0)
        ...
```
I cannot understand why `ignore_index=0` is specified when constructing `NLLLoss`. In the NSP task, if the ground truth of `is_next` is False (label = 0) but BERT predicts True, that sample is ignored, so its `NLLLoss` contribution will be 0 (or nan)... so what is the aim of `ignore_index=0`?
====================
Well, I've found that `ignore_index=0` is useful for the MLM task, but I still don't agree that the NSP task should share the same `NLLLoss` instance with MLM.
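To make the concern concrete, here is a minimal sketch in plain Python. `nll_loss_mean` is a hypothetical helper, not part of this repo or of PyTorch. It assumes the mean reduction averages only over targets that are not equal to `ignore_index`, which is how I understand `nn.NLLLoss` to behave. The probabilities are made up for illustration.

```python
import math


def nll_loss_mean(log_probs, targets, ignore_index=None):
    """Mean negative log-likelihood over targets that differ from ignore_index.

    Hypothetical pure-Python stand-in for a mean-reduced NLL loss.
    """
    kept = [-row[t] for row, t in zip(log_probs, targets) if t != ignore_index]
    # If every target is ignored, the mean is 0/0, reported here as nan.
    return sum(kept) / len(kept) if kept else float("nan")


# NSP log-probabilities over [not_next, is_next] for two sentence pairs.
# The model is confident that both pairs are is_next (class 1).
nsp_log_probs = [
    [math.log(0.1), math.log(0.9)],
    [math.log(0.2), math.log(0.8)],
]
nsp_targets = [0, 1]  # the first pair is actually not_next (label 0)

# Shared criterion: the wrong prediction on the label-0 pair is skipped.
shared = nll_loss_mean(nsp_log_probs, nsp_targets, ignore_index=0)

# Separate criterion for NSP: the label-0 pair is penalized.
separate = nll_loss_mean(nsp_log_probs, nsp_targets)

# A batch containing only label-0 pairs leaves nothing to average.
all_negative = nll_loss_mean(nsp_log_probs, [0, 0], ignore_index=0)

print(shared, separate, all_negative)
```

Under these assumptions, the shared criterion never penalizes a wrong prediction on a `not_next` pair, while a second criterion without `ignore_index` would. That is the reason for suggesting separate loss instances for MLM and NSP.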
The snippet quoted at the top of this issue is from `trainer/pretrain.py`.