Bernoulli cross-entropy loss is a special case of categorical cross-entropy loss for $C = 2$:
$$-\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log(\hat{y}_i) + (1-y_i)\log(1-\hat{y}_i)\right] \;=\; -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{ic}\log(\hat{y}_{ic}) \quad\text{for } C = 2,$$
where $i$ indexes samples/observations and $c$ indexes classes, $y$ is the sample label (a binary scalar on the LHS, a one-hot vector on the RHS) and $\hat{y}$ is the prediction for a sample.
I write "Bernoulli cross-entropy" because this loss arises from a Bernoulli probability model. There is not a "binary distribution." A "binary cross-entropy" doesn't tell us if the thing that is binary is the one-hot vector of labels, or if the author is using binary encoding for each trial (success or failure). This isn't a general convention, but it makes clear that these formulae arise from particular probability models. Conventional jargon is not clear in that way.
There are three kinds of classification tasks:
- Binary classification: two exclusive classes
- Multi-class classification: more than two exclusive classes
- Multi-label classification: non-exclusive classes (a sample can carry several labels at once)
Here, we can say
- In the case of (1), you need to use binary cross entropy.
- In the case of (2), you need to use categorical cross entropy.
- In the case of (3), you need to use binary cross entropy.
You can think of the multi-label classifier as a combination of multiple independent binary classifiers. If you have 10 classes, you have 10 separate binary classifiers, each trained independently. Thus we can produce multiple labels for each sample. If you want to make sure at least one label is assigned, you can select the one with the lowest classification loss, or use another metric.
I want to emphasize that multi-class classification is not the same as multi-label classification! Rather, the multi-label classifier borrows its idea from the binary classifier!
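As a rough Keras sketch of that idea (the layer sizes and feature/label counts below are made up for illustration), a multi-label model is just a stack of sigmoid outputs trained with binary cross-entropy, while a multi-class model uses one softmax output trained with categorical cross-entropy:
from tensorflow import keras
from tensorflow.keras import layers
n_features, n_labels = 20, 10  # hypothetical sizes
# multi-label: 10 independent binary classifiers sharing the same trunk
multi_label = keras.Sequential([
    keras.Input(shape=(n_features,)),
    layers.Dense(64, activation='relu'),
    layers.Dense(n_labels, activation='sigmoid'),  # one Bernoulli output per label
])
multi_label.compile(loss='binary_crossentropy', optimizer='adam')
# multi-class: exactly one class per sample
multi_class = keras.Sequential([
    keras.Input(shape=(n_features,)),
    layers.Dense(64, activation='relu'),
    layers.Dense(n_labels, activation='softmax'),
])
multi_class.compile(loss='categorical_crossentropy', optimizer='adam')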
- python - difference between categorical and binary cross entropy - Stack Overflow
- python - What is the difference between sparse_categorical_crossentropy and categorical_crossentropy? - Stack Overflow
- machine learning - Why binary_crossentropy and categorical_crossentropy give different performances for the same problem? - Stack Overflow
- Difference between binary cross entropy and categorical cross entropy?
They are mathematically identical for 2 classes, hence "binary." In other words, 2-class categorical cross entropy is the same as single-output binary cross entropy. To give a more tangible example, these are identical:
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', ...)
# is the same as
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', ...)
Which one to use? To avoid one-hot encoding categorical outputs, if you only have 2 classes it is easier - from a coding perspective - to use binary cross entropy. The binary case might be computationally more efficient depending on the implementation.
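A quick numeric check of that identity with the built-in Keras losses (the probabilities are arbitrary):
import numpy as np
import tensorflow as tf
# single sigmoid output: scalar 0/1 targets
bce = tf.keras.losses.BinaryCrossentropy()(
    np.array([[1.], [0.]]), np.array([[0.9], [0.2]])).numpy()
# two softmax outputs: one-hot targets carrying the same probabilities
cce = tf.keras.losses.CategoricalCrossentropy()(
    np.array([[0., 1.], [1., 0.]]), np.array([[0.1, 0.9], [0.8, 0.2]])).numpy()
print(bce, cce)  # both come out around 0.164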
It seems binary cross entropy is just a special case of categorical cross entropy. So, when you have only two classes, you can use binary cross entropy; you don't need to do one-hot encoding, and your code will be a couple of lines shorter.
Simply:
categorical_crossentropy (cce) produces a one-hot array containing the probable match for each category; sparse_categorical_crossentropy (scce) produces a category index of the most likely matching category.
Consider a classification problem with 5 categories (or classes).
- In the case of cce, the one-hot target may be [0, 1, 0, 0, 0] and the model may predict [.2, .5, .1, .1, .1] (probably right).
- In the case of scce, the target index may be [1] and the model may predict: [.5].
Consider now a classification problem with 3 classes.
- In the case of cce, the one-hot target might be [0, 0, 1] and the model may predict [.5, .1, .4] (probably inaccurate, given that it gives more probability to the first class).
- In the case of scce, the target index might be [0], and the model may predict [.5].
Many categorical models produce scce output because you save space, but lose A LOT of information (for example, in the 2nd example, index 2 was also very close.) I generally prefer cce output for model reliability.
There are a number of situations to use scce, including:
- when your classes are mutually exclusive, i.e. you don't care at all about other close-enough predictions,
- the number of categories is so large that the prediction output becomes overwhelming.
220405: response to "one-hot encoding" comments:
one-hot encoding is used for a category feature INPUT to select a specific category (e.g. male versus female). This encoding allows the model to train more efficiently: training weight is a product of category, which is 0 for all categories except for the given one.
cce and scce are a model OUTPUT. cce is a probability array of each category, totally 1.0. scce shows the MOST LIKELY category, totally 1.0.
scce is technically a one-hot array, just like a hammer used as a door stop is still a hammer, but its purpose is different. cce is NOT one-hot.
I was also confused by this one. Fortunately, the excellent keras documentation came to the rescue. Both have the same loss function and are ultimately doing the same thing; the only difference is in the representation of the true labels.
- Categorical Cross Entropy [Doc]:
Use this crossentropy loss function when there are two or more label classes. We expect labels to be provided in a one_hot representation.
>>> y_true = [[0, 1, 0], [0, 0, 1]]
>>> y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
>>> # Using 'auto'/'sum_over_batch_size' reduction type.
>>> cce = tf.keras.losses.CategoricalCrossentropy()
>>> cce(y_true, y_pred).numpy()
1.177
- Sparse Categorical Cross Entropy [Doc]:
Use this crossentropy loss function when there are two or more label classes. We expect labels to be provided as integers.
>>> y_true = [1, 2]
>>> y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
>>> # Using 'auto'/'sum_over_batch_size' reduction type.
>>> scce = tf.keras.losses.SparseCategoricalCrossentropy()
>>> scce(y_true, y_pred).numpy()
1.177
One good example of sparse categorical cross-entropy is the fashion-mnist dataset.
import tensorflow as tf
from tensorflow import keras
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
print(y_train_full.shape) # (60000,)
print(y_train_full.dtype) # uint8
y_train_full[:10]
# array([9, 0, 0, 3, 0, 2, 7, 2, 5, 5], dtype=uint8)
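Because y_train_full already holds integer class indices, a model for this dataset can be compiled with sparse_categorical_crossentropy directly, with no one-hot step. A minimal sketch (the layer sizes are illustrative, not prescriptive):
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax'),
])
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
# model.fit(X_train_full / 255.0, y_train_full, epochs=5)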
The reason for this apparent performance discrepancy between categorical & binary cross entropy is what user xtof54 has already reported in his answer below, i.e.:
the accuracy computed with the Keras method evaluate is just plain wrong when using binary_crossentropy with more than 2 labels
I would like to elaborate more on this, demonstrate the actual underlying issue, explain it, and offer a remedy.
This behavior is not a bug; the underlying reason is a rather subtle and undocumented issue in how Keras actually guesses which accuracy to use, depending on the loss function you have selected, when you simply include metrics=['accuracy'] in your model compilation. In other words, while your first compilation option
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
is valid, your second one:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
will not produce what you expect, but the reason is not the use of binary cross entropy (which, at least in principle, is an absolutely valid loss function).
Why is that? If you check the metrics source code, Keras does not define a single accuracy metric, but several different ones, among them binary_accuracy and categorical_accuracy. What happens under the hood is that, since you have selected binary cross entropy as your loss function and have not specified a particular accuracy metric, Keras (wrongly...) infers that you are interested in the binary_accuracy, and this is what it returns - while in fact you are interested in the categorical_accuracy.
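A small numpy sketch of why the two metrics diverge (the prediction values are made up): binary_accuracy thresholds every entry of the output vector at 0.5 and compares element-wise, so a prediction whose argmax is wrong can still score high:
import numpy as np
y_true = np.array([[1, 0, 0, 0]])          # one-hot target: class 0
y_pred = np.array([[0.3, 0.4, 0.2, 0.1]])  # argmax is class 1, i.e. wrong
cat_acc = np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1))  # 0.0
bin_acc = np.mean((y_pred > 0.5).astype(int) == y_true)                    # 0.75
print(cat_acc, bin_acc)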
Let's verify that this is the case, using the MNIST CNN example in Keras, with the following modification:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy']) # WRONG way
model.fit(x_train, y_train,
batch_size=batch_size,
epochs=2, # only 2 epochs, for demonstration purposes
verbose=1,
validation_data=(x_test, y_test))
# Keras reported accuracy:
score = model.evaluate(x_test, y_test, verbose=0)
score[1]
# 0.9975801164627075
# Actual accuracy calculated manually:
import numpy as np
y_pred = model.predict(x_test)
acc = sum([np.argmax(y_test[i])==np.argmax(y_pred[i]) for i in range(10000)])/10000
acc
# 0.98780000000000001
score[1]==acc
# False
To remedy this, i.e. to use indeed binary cross entropy as your loss function (as I said, nothing wrong with this, at least in principle) while still getting the categorical accuracy required by the problem at hand, you should ask explicitly for categorical_accuracy in the model compilation as follows:
from keras.metrics import categorical_accuracy
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=[categorical_accuracy])
In the MNIST example, after training, scoring, and predicting the test set as I show above, the two metrics now are the same, as they should be:
# Keras reported accuracy:
score = model.evaluate(x_test, y_test, verbose=0)
score[1]
# 0.98580000000000001
# Actual accuracy calculated manually:
y_pred = model.predict(x_test)
acc = sum([np.argmax(y_test[i])==np.argmax(y_pred[i]) for i in range(10000)])/10000
acc
# 0.98580000000000001
score[1]==acc
# True
System setup:
Python version 3.5.3
Tensorflow version 1.2.1
Keras version 2.0.4
UPDATE: After my post, I discovered that this issue had already been identified in this answer.
It all depends on the type of classification problem you are dealing with. There are three main categories
- binary classification (two target classes),
- multi-class classification (more than two exclusive targets),
- multi-label classification (more than two non-exclusive targets), in which multiple target classes can be on at the same time.
In the first case, binary cross-entropy should be used and targets are typically encoded as single 0/1 values for one sigmoid output (an equivalent two-unit softmax with one-hot targets also works).
In the second case, categorical cross-entropy should be used and targets should be encoded as one-hot vectors.
In the last case, binary cross-entropy should be used and targets should be encoded as binary (multi-hot) vectors. Each output neuron (or unit) is considered as a separate random binary variable, and the likelihood of the entire vector of outputs is the product of the likelihoods of the single binary variables. Therefore the loss is the sum of the binary cross-entropies for each single output unit.
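A minimal sketch of that multi-label loss for a single sample (the values are arbitrary): treating each output as an independent Bernoulli variable, the joint likelihood is a product, so the total loss is the sum of per-unit binary cross-entropies:
import numpy as np
y_true = np.array([1, 0, 1])        # three non-exclusive labels (multi-hot)
y_pred = np.array([0.9, 0.2, 0.6])  # sigmoid outputs
per_unit = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
total_loss = per_unit.sum()  # sum of binary cross-entropies over output units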
The binary cross-entropy is defined as
$$-\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log(\hat{y}_i) + (1-y_i)\log(1-\hat{y}_i)\right]$$
and categorical cross-entropy is defined as
$$-\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{ic}\log(\hat{y}_{ic})$$
where c is the index running over the number of classes C.
I think I have some understanding of binary cross entropy. What is categorical cross entropy (as implemented in Keras), and how does it differ? Resources I've found on the internet for this have been way above my head.
In tf keras, is binary cross entropy with 2 exclusive classes the same as categorical cross entropy with 2 exclusive classes?
From the documentation it says categorical is for 2 or more classes. So is categorical with 2 classes the same as binary cross entropy?
Both categorical cross entropy and sparse categorical cross entropy have the same loss function, which you have mentioned above. The only difference is the format in which you provide $Y_i$ (i.e. the true labels).
If your $Y_i$'s are one-hot encoded, use categorical_crossentropy. Examples (for a 3-class classification): [1,0,0] , [0,1,0], [0,0,1]
But if your $Y_i$'s are integers, use sparse_categorical_crossentropy. Examples for the above 3-class classification problem: [0], [1], [2] (class indices start at 0).
The usage entirely depends on how you load your dataset. One advantage of using sparse categorical cross entropy is that it saves memory as well as computation time, because it simply uses a single integer for a class rather than a whole vector.
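A short sketch of switching between the two label formats (three arbitrary labels), using keras.utils.to_categorical one way and argmax the other:
import numpy as np
from tensorflow.keras.utils import to_categorical
y_int = np.array([0, 2, 1])          # integer labels -> sparse_categorical_crossentropy
y_onehot = to_categorical(y_int, 3)  # one-hot labels -> categorical_crossentropy
# [[1., 0., 0.],
#  [0., 0., 1.],
#  [0., 1., 0.]]
y_back = np.argmax(y_onehot, axis=1)  # back to integer labels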
The formula which you posted in your question refers to binary_crossentropy, not categorical_crossentropy. The former is used when you have only two classes (a single output unit). The latter refers to a situation when you have multiple classes, and its formula looks like below:
$$J(\textbf{w}) = -\sum_{i=1}^{N} y_i \text{log}(\hat{y}_i).$$
This loss works as skadaver mentioned on one-hot encoded values e.g [1,0,0], [0,1,0], [0,0,1]
The sparse_categorical_crossentropy is a little bit different: it works on integers, that's true, but these integers must be the class indices, not actual values. This loss computes the logarithm only for the output index that the ground truth indicates. So when the model output is, for example, [0.1, 0.3, 0.7] and the ground truth is 3 (if indexed from 1), the loss computes only the logarithm of 0.7. This doesn't change the final value, because in the regular version of categorical crossentropy the other values are immediately multiplied by zero (because of the one-hot encoding characteristic). Thanks to that, it computes the logarithm once per instance and omits the summation, which leads to better performance. The formula might look like this:
$$J(\textbf{w}) = -\text{log}(\hat{y}_y).$$
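A tiny numeric check of that shortcut (probabilities invented, class indexed from 0 here): the sparse loss takes the log of the true-class probability only, and it matches the full one-hot sum:
import numpy as np
y_pred = np.array([0.1, 0.3, 0.7])  # model output over 3 classes
true_idx = 2                        # ground-truth class index (0-based)
sparse_loss = -np.log(y_pred[true_idx])                    # only the true class
full_loss = -np.sum(np.array([0, 0, 1]) * np.log(y_pred))  # other terms are zero anyway
assert np.isclose(sparse_loss, full_loss)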
Use sparse categorical crossentropy when your classes are mutually exclusive (e.g. when each sample belongs exactly to one class) and categorical crossentropy when one sample can have multiple classes or labels are soft probabilities (like [0.5, 0.3, 0.2]).
Formula for categorical crossentropy (S - samples, C - classes, $s \in c$ - sample belongs to class c) is:
$$ -\frac{1}{N} \sum_{s\in S} \sum_{c \in C} 1_{s\in c} log {p(s \in c)} $$
For the case when classes are exclusive, you don't need to sum over them: for each sample the only non-zero term is just $-\log p(s \in c)$ for the true class c.
This allows you to conserve time and memory. Consider the case of 10000 mutually exclusive classes: just 1 logarithm instead of a sum over 10000 terms for each sample, and just one integer instead of 10000 floats.
Formula is the same in both cases, so no impact on accuracy should be there.
The answer, in a nutshell
If your targets are one-hot encoded, use categorical_crossentropy.
Examples of one-hot encodings:
[1,0,0]
[0,1,0]
[0,0,1]
But if your targets are integers, use sparse_categorical_crossentropy.
Examples of integer encodings (for the sake of completeness):
1
2
3