Hi,
Can you shed some light on why you use
loss, loss_dict = self.loss(att, clf_logits, data.y, epoch)
rather than
loss, loss_dict = self.loss(att_log_logits, clf_logits, data.y, epoch)?
Although the injected noise is what makes the attention stochastic, it can shift the attention scores significantly, so computing the KL divergence against the Q distribution on the noisy scores seems like a poor signal for tuning the MLP output.
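To make the question concrete, here is a minimal sketch of what I understand the two quantities to be. The function names and the prior `r` are my own placeholders, not the repo's actual API: the attention is sampled by adding logistic noise to the raw logits before the sigmoid (a concrete/Gumbel-style relaxation), and the information loss is a KL divergence between the (noisy) edge attention and a Bernoulli prior:

```python
import torch

def sample_attention(att_log_logits, temp=1.0, training=True):
    # Concrete relaxation (sketch): logistic noise is added to the raw
    # logits before the sigmoid, which is what makes `att` stochastic.
    if training:
        u = torch.rand_like(att_log_logits)
        noise = torch.log(u) - torch.log(1 - u)  # Logistic(0, 1) noise
        return torch.sigmoid((att_log_logits + noise) / temp)
    # At eval time, no noise: deterministic sigmoid of the logits.
    return torch.sigmoid(att_log_logits)

def info_loss(att, r=0.5, eps=1e-6):
    # KL( Bernoulli(att) || Bernoulli(r) ), averaged over edges.
    return (att * torch.log(att / r + eps)
            + (1 - att) * torch.log((1 - att) / (1 - r) + eps)).mean()
```

Under this reading, calling `info_loss(att)` penalizes the post-noise samples, whereas `info_loss(torch.sigmoid(att_log_logits))` would penalize the deterministic scores directly, which is the distinction the question is about.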
Thanks and Regards