nn.BCELoss() expects your model's output to be probabilities, i.e. after a sigmoid activation.
nn.BCEWithLogitsLoss() expects your model's output to be logits, i.e. before any sigmoid activation.
I think you may have calculated something wrong (such as the accuracy). Here is a simple example based on your code:
With probabilities:
import torch
import torch.nn as nn

dummy_x = torch.randn(1000, 1)
dummy_y = (dummy_x > 0).type(torch.float)

model1 = nn.Sequential(
    nn.Linear(1, 1),
    nn.Sigmoid()
)
criterion1 = nn.BCELoss()
optimizer = torch.optim.Adam(model1.parameters(), 0.001)
def binary_accuracy(preds, y, logits=False):
    if logits:
        rounded_preds = torch.round(torch.sigmoid(preds))
    else:
        rounded_preds = torch.round(preds)
    correct = (rounded_preds == y).float()
    accuracy = correct.sum() / len(y)
    return accuracy
for e in range(2000):
    y_hat = model1(dummy_x)
    loss = criterion1(y_hat, dummy_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if e != 0 and e % 100 == 0:
        print(f"Epoch: {e}, Loss: {loss:.4f}")
        print(f"Epoch: {e}, Acc: {binary_accuracy(y_hat, dummy_y)}")
# Results:
Epoch: 100, Loss: 0.5840
Epoch: 100, Acc: 0.5839999914169312
Epoch: 200, Loss: 0.5423
Epoch: 200, Acc: 0.6499999761581421
...
Epoch: 1800, Loss: 0.2862
Epoch: 1800, Acc: 0.9950000047683716
Epoch: 1900, Loss: 0.2793
Epoch: 1900, Acc: 0.9929999709129333
Now with logits:
model2 = nn.Linear(1, 1)
criterion2 = nn.BCEWithLogitsLoss()
optimizer2 = torch.optim.Adam(model2.parameters(), 0.001)
for e in range(2000):
    y_hat = model2(dummy_x)
    loss = criterion2(y_hat, dummy_y)
    optimizer2.zero_grad()
    loss.backward()
    optimizer2.step()
    if e != 0 and e % 100 == 0:
        print(f"Epoch: {e}, Loss: {loss:.4f}")
        print(f"Epoch: {e}, Acc: {binary_accuracy(y_hat, dummy_y, logits=True)}")
# Results:
Epoch: 100, Loss: 1.1042
Epoch: 100, Acc: 0.007000000216066837
Epoch: 200, Loss: 1.0484
Epoch: 200, Acc: 0.01899999938905239
...
Epoch: 1800, Loss: 0.5019
Epoch: 1800, Acc: 0.9879999756813049
Epoch: 1900, Loss: 0.4844
Epoch: 1900, Acc: 0.9879999756813049
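The two criteria are mathematically equivalent once the sigmoid is accounted for: BCEWithLogitsLoss on raw logits gives the same value as BCELoss on sigmoid(logits). A quick numerical check (my addition, not part of the original answer):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(5, 1)                      # raw model outputs (pre-sigmoid)
targets = torch.randint(0, 2, (5, 1)).float()   # binary targets

# BCEWithLogitsLoss applied to the raw logits ...
loss_logits = nn.BCEWithLogitsLoss()(logits, targets)
# ... equals BCELoss applied to sigmoid(logits)
loss_probs = nn.BCELoss()(torch.sigmoid(logits), targets)

print(torch.allclose(loss_logits, loss_probs, atol=1e-6))  # True
```

So any difference in training curves comes from the model/metric code around the loss, not from the losses themselves.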
Answer from TheEngineerProgrammer on Stack Overflow.
You would need to modify the code according to the loss function (a.k.a. criterion) you are using.
For BCELoss: since your model ends with a sigmoid layer, its outputs are already probabilities between 0 and 1, so you can threshold them directly.
For BCEWithLogitsLoss: the output is a logit. A logit can be negative or positive; it is the raw linear score z, where
z = w1*x1 + w2*x2 + ... + wn*xn
So, to get predictions while using BCEWithLogitsLoss, you need to pass this output through a sigmoid, i.e.
1 / (1 + np.exp(-z))
and only then calculate the accuracy.
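A minimal sketch of that conversion, using torch.sigmoid rather than a hand-rolled NumPy function (the example logits and targets are illustrative):

```python
import torch

# Raw outputs of a model trained with nn.BCEWithLogitsLoss
logits = torch.tensor([[-2.0], [0.5], [3.0]])

probs = torch.sigmoid(logits)        # map logits to probabilities in (0, 1)
preds = (probs > 0.5).float()        # threshold at 0.5 to get class labels

targets = torch.tensor([[0.0], [1.0], [1.0]])
accuracy = (preds == targets).float().mean()
print(accuracy.item())  # 1.0
```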
Hope this helps!
Update
BCELoss was not numerically stable in early PyTorch versions. See this issue: https://github.com/pytorch/pytorch/issues/751. However, it was resolved by pull request #1792, so BCELoss is numerically stable now.
Old answer
If you build PyTorch from source, you can use the numerically stable function BCEWithLogitsLoss (contributed in https://github.com/pytorch/pytorch/pull/1792), which takes logits as input.
Otherwise, you can use the following function (contributed by yzgao in the issue above):
class StableBCELoss(nn.Module):
    def __init__(self):
        super(StableBCELoss, self).__init__()

    def forward(self, input, target):
        # Stable BCE on logits: max(x, 0) - x*t + log(1 + exp(-|x|))
        neg_abs = -input.abs()
        loss = input.clamp(min=0) - input * target + (1 + neg_abs.exp()).log()
        return loss.mean()
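As a sanity check (my addition, not from the original answer), this formulation should agree with today's built-in nn.BCEWithLogitsLoss, even for large-magnitude logits where a naive log(sigmoid(x)) would overflow:

```python
import torch
import torch.nn as nn

class StableBCELoss(nn.Module):
    def forward(self, input, target):
        # max(x, 0) - x*t + log(1 + exp(-|x|)), averaged over all elements
        neg_abs = -input.abs()
        return (input.clamp(min=0) - input * target + (1 + neg_abs.exp()).log()).mean()

torch.manual_seed(0)
logits = torch.randn(8, 1) * 10                 # include large-magnitude logits
targets = torch.randint(0, 2, (8, 1)).float()

same = torch.allclose(StableBCELoss()(logits, targets),
                      nn.BCEWithLogitsLoss()(logits, targets), atol=1e-6)
print(same)  # True
```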
You might want to add a sigmoid layer at the end of the network; that way the outputs represent probabilities. Also make sure that the targets are binary numbers. If you post your complete code, we can help more.