CBOW: sigmoid instead of softmax loss with sparse_categorical_crossentropy? #279

@jchwenger

Description

Hi there,

In your Text Classification chapter, in the "Pretraining a word embedding" section, you use sigmoid as the last activation before... sparse_categorical_crossentropy? The results look the same if you use softmax instead (some all-too-forgiving reduction pipeline behind the scenes, I'm assuming), but this looks like a typo, no? All the other models in the chapter end in a sigmoid because of the binary sentiment task, whereas CBOW is a multi-class prediction over the vocabulary. This part of the book has no comment about the choice, even though Chapter 6 has a neat reminder about matching output activations to loss functions, so it's not ideal from a pedagogical point of view?
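For what it's worth, here's a minimal NumPy sketch (illustrative logits, not taken from the book) of why the choice matters: sparse categorical cross-entropy is just the negative log of the probability assigned to the true class, and only softmax guarantees the outputs form a proper distribution. With sigmoid the per-class outputs are unnormalized, so the loss still decreases as the true-class output is pushed toward 1, which may be why training appears to work either way.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, label):
    # -log of the "probability" assigned to the true class
    return -np.log(probs[label])

logits = np.array([2.0, 1.0, 0.1])  # hypothetical output-layer pre-activations
label = 0                           # index of the true class

# softmax: outputs form a proper distribution (they sum to 1)
softmax = np.exp(logits) / np.exp(logits).sum()

# sigmoid: each output squashed independently; the sum is not 1
sigmoid = 1.0 / (1.0 + np.exp(-logits))

print(softmax.sum())  # sums to 1 (up to float error)
print(sigmoid.sum())  # greater than 1 here: not a distribution
print(sparse_categorical_crossentropy(softmax, label))
print(sparse_categorical_crossentropy(sigmoid, label))
```

Both losses are finite and shrink as the true-class logit grows, so gradient descent makes progress in either case, but only the softmax version is the cross-entropy between the label and an actual probability distribution.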

I hope I haven't overlooked something major; let me know your thoughts!
