Is the last transformer layer of the model always kept by default? #3

@Jayce0625

Thank you for your work; it is much appreciated. I am not a native English speaker, so please bear with any mistakes. I have a question: does your work only consider the similarity among the first n-1 layers, and not the similarity between layer n-1 and layer n? Is this because the last layer is connected to the classification head, so the last transformer layer is always kept by default?
My understanding is that, for a given threshold, whenever the similarity exceeds it, the later layer of the pair is always the one removed. For example, if the similarity between layer i and layer i+1 exceeds the threshold and a layer must be deleted or replaced, the target is always layer i+1.
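
The rule described in this question can be sketched in a few lines of Python. This is only an illustration of the behaviour the question assumes, not the repository's actual code: the function name `layers_to_remove`, the `adjacent_sims` input, and the `keep_last` flag are all hypothetical, and the similarity scores are made-up numbers.

```python
from typing import List, Sequence


def layers_to_remove(
    adjacent_sims: Sequence[float],
    threshold: float,
    keep_last: bool = True,
) -> List[int]:
    """Return 0-based indices of layers to delete or replace.

    adjacent_sims[i] is assumed to be the similarity between layer i
    and layer i+1, so a model with n layers has n-1 scores.
    Simplification: scores are not recomputed after a layer is dropped.
    """
    n_layers = len(adjacent_sims) + 1
    removed = []
    for i, sim in enumerate(adjacent_sims):
        target = i + 1  # always the later layer of the pair (i, i+1)
        if keep_last and target == n_layers - 1:
            # Assumed behaviour: the final layer feeds the
            # classification head, so it is never removed.
            continue
        if sim > threshold:
            removed.append(target)
    return removed


# Five layers, four adjacent-pair scores (illustrative values only).
sims = [0.95, 0.40, 0.97, 0.99]
print(layers_to_remove(sims, threshold=0.9))                   # [1, 3]
print(layers_to_remove(sims, threshold=0.9, keep_last=False))  # [1, 3, 4]
```

With `keep_last=True` the pair (3, 4) is skipped even though its score is the highest, which is exactly the case the question asks about.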