Question about the draft/target memory ratio #5

@zhuzhui-2000

Thanks for your interesting work. I believe the project provides new theoretical analysis and insights into speculative decoding.

I would like to ask a question about the draft/target memory ratio. The paper states that "the draft models can occupy up to 38∼140% memory footprint of target models", but I couldn't find any equation related to this. How did you analyze this theoretically? Could you provide a specific equation?
[Attached image: WeChat screenshot, 微信截图_20241120164006]