Nice repo!!!
It seems that the default parameters for the policy freeze all layers of the language model being used and update only the lm_head.
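For context, here is a minimal sketch of that freezing behavior as I understand it (this is my own hypothetical illustration in plain PyTorch, not the repo's actual code; TinyLM and its layer names are made up stand-ins):

```python
import torch.nn as nn

# Hypothetical stand-in for a language model: an "encoder" body
# plus an lm_head output projection.
class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 8)   # stand-in for the transformer layers
        self.lm_head = nn.Linear(8, 4)   # output head

model = TinyLM()

# Freeze every parameter of the model...
for p in model.parameters():
    p.requires_grad = False

# ...then unfreeze only the lm_head, so the optimizer updates just the head.
for p in model.lm_head.parameters():
    p.requires_grad = True

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

With this setup only the head's weight and bias remain trainable, which matches the default behavior described above.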
I tried the provided example of flan-T5 here: https://colab.research.google.com/drive/1DYHt0mi6cyl8ZTMJEkMNpsSZCCvR4jM1?usp=sharing
When I changed the value of unfreeze_layer_from_past to 1, to update the weights of the final layer of flan-t5, like this:

the behavior changed and the actor started generating empty text:

It also gave me empty text after training:

What is the reason for this behavior?
NOTE: I did not change anything else in the flan-t5 code example.