Binary Cross Entropy with logits does not work as expected
Info about binary cross entropy with logits
pytorch - binary_cross_entropy_with_logits produces negative output - Stack Overflow
deep learning - How is cross entropy loss work in pytorch? - Stack Overflow
I've run into this multiple times; each time the reason was that the labels were not between 0 and 1.
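A minimal sketch of the symptom (made-up numbers): binary_cross_entropy_with_logits does not validate its targets, so targets outside [0, 1] (for example raw class indices passed by mistake) can yield a negative loss:

import torch
import torch.nn.functional as F

logits = torch.tensor([0.5, -1.0, 2.0])

# Targets in [0, 1]: the loss is always non-negative
good = torch.tensor([1.0, 0.0, 1.0])
print(F.binary_cross_entropy_with_logits(logits, good))   # ~0.30

# Targets outside [0, 1] (e.g. class ids): the loss can go negative
bad = torch.tensor([2.0, 0.0, 2.0])
print(F.binary_cross_entropy_with_logits(logits, bad))    # ~-0.53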
The optimizer used in the following code block, RMSprop, was proposed by G. Hinton in his course: http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
optimizer = optim.RMSprop(net.parameters(), lr=0.005, weight_decay=1e-8)
if net.n_classes > 1:
    criterion = nn.CrossEntropyLoss()    # multi-class: expects raw logits and integer class targets
else:
    criterion = nn.BCEWithLogitsLoss()   # binary: expects raw logits and float targets in [0, 1]
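With this choice of criterion, the target dtype and shape have to match what each loss expects: CrossEntropyLoss takes integer class indices, while BCEWithLogitsLoss takes float targets in [0, 1] with the same shape as the logits. A minimal sketch (the shapes are illustrative, not taken from the original code):

import torch
import torch.nn as nn

# Multi-class case: logits of shape (N, C), targets are int64 class indices of shape (N,)
ce = nn.CrossEntropyLoss()
ce_loss = ce(torch.randn(8, 5), torch.randint(0, 5, (8,)))

# Binary case: logits and targets have the same shape, targets are floats in [0, 1]
bce = nn.BCEWithLogitsLoss()
bce_loss = bce(torch.randn(8, 1), torch.randint(0, 2, (8, 1)).float())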
Then you will need to apply a sigmoid to the network output at inference time (torch.nn.functional: F.sigmoid; torch.sigmoid is the preferred spelling nowadays), in a similar manner to the code example below:
for isample, sample in enumerate(ds):
    # Forward pass: the network outputs raw logits
    mask_torch = net2(sample['image'][None, :, :, :].type(torch.cuda.FloatTensor))
    # Convert logits to probabilities with a sigmoid, then threshold to get a binary mask
    mask = (F.sigmoid(mask_torch.type(torch.cuda.FloatTensor)) > 0.4925099).type(torch.FloatTensor)
    print(mask)
    # Plot the input channels, the ground-truth mask and the predicted mask
    for ichan in range(3):
        ax[isample, ichan].imshow(sample['image'][ichan].cpu())
    ax[isample, 3].imshow(sample['mask'][0].cpu())
    ax[isample, 4].imshow(mask[0, 0].cpu().detach().numpy())
Place the sigmoid at the end, after all the layers. The forward pass will look something like this:
def forward(self, x):
    x = self.layer_1(x)
    x = self.layer_2(x)
    x = self.layer_3(x)
    # sigmoid turns the raw outputs of the last layer into probabilities in [0, 1]
    probs = F.sigmoid(self.outc(x))
    return probs
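Note that the loss has to match what forward returns. If the sigmoid is applied inside forward as above, the model outputs probabilities and should be paired with nn.BCELoss; nn.BCEWithLogitsLoss instead expects the raw pre-sigmoid outputs, because it applies the sigmoid itself. A minimal sketch with random tensors standing in for network outputs:

import torch
import torch.nn as nn

raw = torch.randn(2, 1, 4, 4)    # stand-in for pre-sigmoid network outputs
target = torch.rand(2, 1, 4, 4)  # float targets in [0, 1]

# forward() returns probabilities (sigmoid applied inside) -> pair with BCELoss
loss_prob = nn.BCELoss()(torch.sigmoid(raw), target)

# forward() returns raw logits (no sigmoid inside) -> pair with BCEWithLogitsLoss
loss_logit = nn.BCEWithLogitsLoss()(raw, target)

# Both give the same value; applying sigmoid twice would not
assert torch.allclose(loss_prob, loss_logit, atol=1e-5)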
Is there an equivalent PyTorch loss function for TensorFlow's softmax_cross_entropy_with_logits?
torch.nn.functional.cross_entropy
This takes logits as input (it applies log_softmax internally). Here "logits" are just raw, unnormalized scores rather than probabilities (i.e. not necessarily in the interval [0, 1]); they are the values that will subsequently be converted into probabilities. If you consider the name of the TensorFlow function, you will see it is a pleonasm: the with_logits part already implies that a softmax will be applied internally.
In PyTorch the implementation looks like this:
loss = F.cross_entropy(x, target)
which is equivalent to:
lp = F.log_softmax(x, dim=-1)
loss = F.nll_loss(lp, target)
It is not F.binary_cross_entropy_with_logits, because that function assumes multi-label classification (sketched below):
F.sigmoid + F.binary_cross_entropy = F.binary_cross_entropy_with_logits
It is not torch.nn.functional.nll_loss either, because that function takes log-probabilities (after log_softmax()), not logits.
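To make the distinction concrete, a minimal sketch (arbitrary shapes): cross_entropy takes one integer class index per sample, while binary_cross_entropy_with_logits takes an independent 0/1 target per class, which is the multi-label setting:

import torch
import torch.nn.functional as F

logits = torch.randn(4, 3)                 # 4 samples, 3 classes

# Multi-class (exactly one class per sample): integer class indices
class_targets = torch.tensor([0, 2, 1, 2])
loss_mc = F.cross_entropy(logits, class_targets)

# Multi-label (each class independently on/off): float 0/1 targets per class
multi_targets = torch.tensor([[1., 0., 1.],
                              [0., 1., 0.],
                              [1., 1., 0.],
                              [0., 0., 1.]])
loss_ml = F.binary_cross_entropy_with_logits(logits, multi_targets)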
A solution
from thexp.calculate.tensor import onehot  # the onehot helper is reproduced below
from torch.nn import functional as F
import torch

logits = torch.rand([3, 10])
ys = torch.tensor([1, 2, 3])
targets = onehot(ys, 10)
# cross_entropy on logits and class indices equals the mean negative
# log-likelihood of the one-hot targets under log_softmax
assert torch.allclose(F.cross_entropy(logits, ys),
                      -torch.mean(torch.sum(F.log_softmax(logits, dim=1) * targets, dim=1)))
The onehot function:
def onehot(labels: torch.Tensor, label_num):
    # Scatter a 1 into each row at the column given by the corresponding label
    return torch.zeros(labels.shape[0], label_num, device=labels.device).scatter_(1, labels.view(-1, 1), 1)
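For reference, calling onehot on the labels from the example above produces one-hot rows:

ys = torch.tensor([1, 2, 3])
print(onehot(ys, 10))
# tensor([[0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
#         [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
#         [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.]])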