Hidden State mapping to two value nodes instead of 1 #20

Open
samuelzxu opened this issue May 24, 2024 · 0 comments

Comments

@samuelzxu
Contributor

Hi,

I'm confused about why the value head in models.py is defined the way it is. As written, it outputs two numbers instead of one, since it maps the (2,4096) final hidden state to a (2,1) tensor for the final value. It looks like half of the hidden state is being dropped. I would expect it to map a flattened version of the final hidden state to a single node.

As a sanity check, I looked for where this is used, and on line 1114 of trainers.py I noticed that you only take the first value of this (2,1) tensor.
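To make the shapes concrete, here is a minimal sketch of the two readings. It is not the code from models.py or trainers.py: plain Python lists stand in for the tensors, `linear` and `random_matrix` are hypothetical helpers, and the weights are random. The only things taken from the repo are the (2,4096) and (2,1) shapes and the fact that just the first value is kept.

```python
import random

ROWS, HIDDEN = 2, 4096  # shape quoted above; what the leading 2 indexes is an open question


def linear(x, weight, bias):
    """Compute x @ weight.T + bias row by row, i.e. a linear map over the last axis only."""
    return [
        [sum(a * w for a, w in zip(row, w_row)) + b for w_row, b in zip(weight, bias)]
        for row in x
    ]


def random_matrix(n_rows, n_cols, rng):
    """Hypothetical helper: small random values standing in for real activations/weights."""
    return [[rng.uniform(-0.01, 0.01) for _ in range(n_cols)] for _ in range(n_rows)]


rng = random.Random(0)
hidden = random_matrix(ROWS, HIDDEN, rng)  # stand-in for the (2, 4096) final hidden state

# Reading 1 (what I think the current head does): 4096 -> 1, applied to each row separately.
per_row_w = random_matrix(1, HIDDEN, rng)
per_row_values = linear(hidden, per_row_w, [0.0])  # shape (2, 1)
first_value = per_row_values[0][0]  # only this entry is kept; row 1 never contributes

# Reading 2 (what I expected): flatten to (1, 8192), then 8192 -> 1.
flat = [[v for row in hidden for v in row]]
flat_w = random_matrix(1, ROWS * HIDDEN, rng)
flat_value = linear(flat, flat_w, [0.0])  # shape (1, 1)

print(len(per_row_values), len(per_row_values[0]))  # 2 1
print(len(flat_value), len(flat_value[0]))          # 1 1
```

In the first reading, the value that is kept depends only on the first row of the hidden state, which is why it looks to me like the second row is being thrown away.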

Can you tell me why you've made this design choice? I feel like I'm misinterpreting something here.
