
Quantize pretrained models #8

@arasharchor

Hi,

The model is too large for GPUs with limited memory. Is it possible to compress it with a binarization or quantization method, not only to reduce model size but also to speed up inference?
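For reference, a minimal sketch of what post-training quantization could look like with PyTorch's built-in dynamic quantization API (the toy model below is hypothetical, not this repo's model; assuming the real model is a standard `torch.nn.Module`, and noting that dynamic quantization targets CPU inference):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pretrained model; substitute the real one.
model = nn.Sequential(
    nn.Linear(1024, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)
model.eval()

# Post-training dynamic quantization: weights of the listed module types
# are stored as int8 and dequantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 1024)
with torch.no_grad():
    print(quantized(x).shape)  # torch.Size([1, 10])
```

Since int8 weights take a quarter of the space of float32, this roughly quarters the size of the quantized layers; whether it also speeds things up depends on the hardware and on which layers dominate the runtime.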
