nn.BCELoss() expects your outputs to be probabilities, i.e. after a sigmoid activation.
nn.BCEWithLogitsLoss() expects your outputs to be raw logits, i.e. without the sigmoid activation.
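As a quick sanity check (my sketch, not part of the original answer): the two criteria agree exactly when BCEWithLogitsLoss gets the raw logits and BCELoss gets the sigmoid of those same logits:

```python
import torch
import torch.nn as nn

z = torch.tensor([[0.5], [-1.2], [2.0]])   # raw logits
y = torch.tensor([[1.0], [0.0], [1.0]])    # binary targets

# BCEWithLogitsLoss applies the sigmoid internally (in a
# numerically stable way), so these two losses are identical:
loss_logits = nn.BCEWithLogitsLoss()(z, y)
loss_probs = nn.BCELoss()(torch.sigmoid(z), y)
assert torch.allclose(loss_logits, loss_probs)
```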

I think you may have calculated something incorrectly (most likely the accuracy). Here is a simple example based on your code:

With probabilities:

import torch
import torch.nn as nn

dummy_x = torch.randn(1000, 1)
dummy_y = (dummy_x > 0).type(torch.float)

model1 = nn.Sequential(
    nn.Linear(1, 1),
    nn.Sigmoid()
)
criterion1 = nn.BCELoss()
optimizer = torch.optim.Adam(model1.parameters(), 0.001)

def binary_accuracy(preds, y, logits=False):
    if logits:
        rounded_preds = torch.round(torch.sigmoid(preds))
    else:
        rounded_preds = torch.round(preds)
    correct = (rounded_preds == y).float()
    accuracy = correct.sum() / len(y)
    return accuracy

for e in range(2000):
    y_hat = model1(dummy_x)
    loss = criterion1(y_hat, dummy_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if e != 0 and e % 100 == 0:
        print(f"Epoch: {e}, Loss: {loss:.4f}")
        print(f"Epoch: {e}, Acc: {binary_accuracy(y_hat, dummy_y)}")

#Result:
Epoch: 100, Loss: 0.5840
Epoch: 100, Acc: 0.5839999914169312
Epoch: 200, Loss: 0.5423
Epoch: 200, Acc: 0.6499999761581421
...
Epoch: 1800, Loss: 0.2862
Epoch: 1800, Acc: 0.9950000047683716
Epoch: 1900, Loss: 0.2793
Epoch: 1900, Acc: 0.9929999709129333

Now with logits:

model2 = nn.Linear(1, 1)
criterion2 = nn.BCEWithLogitsLoss()
optimizer2 = torch.optim.Adam(model2.parameters(), 0.001)
for e in range(2000):
    y_hat = model2(dummy_x)
    loss = criterion2(y_hat, dummy_y)
    optimizer2.zero_grad()
    loss.backward()
    optimizer2.step()

    if e != 0 and e % 100 == 0:
        print(f"Epoch: {e}, Loss: {loss:.4f}")
        print(f"Epoch: {e}, Acc: {binary_accuracy(y_hat, dummy_y, logits=True)}")

#Results: 
Epoch: 100, Loss: 1.1042
Epoch: 100, Acc: 0.007000000216066837
Epoch: 200, Loss: 1.0484
Epoch: 200, Acc: 0.01899999938905239
...
Epoch: 1800, Loss: 0.5019
Epoch: 1800, Acc: 0.9879999756813049
Epoch: 1900, Loss: 0.4844
Epoch: 1900, Acc: 0.9879999756813049
Answer from TheEngineerProgrammer on Stack Overflow
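One side note (my addition, not from the answer above): because sigmoid(z) > 0.5 exactly when z > 0, rounding the sigmoid output in the logits=True branch of binary_accuracy is equivalent to simply thresholding the raw logits at zero:

```python
import torch

logits = torch.randn(1000, 1)

via_sigmoid = torch.round(torch.sigmoid(logits))  # round probabilities
via_sign = (logits > 0).float()                   # threshold logits at 0
assert torch.equal(via_sigmoid, via_sign)
```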
Another answer from Stack Overflow:

You would need to modify the code according to the loss function (a.k.a. criterion) you are using. For BCELoss: since you are using a sigmoid layer in your model, the outputs are already between 0 and 1.

For BCEWithLogitsLoss, the output is the logit. A logit can be negative or positive; it is the raw pre-activation z, where

z = w1*x1 + w2*x2 + ... + wn*xn + b

So, for your predictions while using BCEWithLogitsLoss, you need to pass this output through a sigmoid first. For this you can create a small function which returns

1/(1 + np.exp(-z))

and then you should calculate the accuracy.

Hope this helps!!!
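As a PyTorch-native alternative to the NumPy helper suggested above (my sketch, not part of the answer): torch.sigmoid does the same job directly on the model's output tensor, so no hand-rolled function is needed:

```python
import torch

logits = torch.tensor([[-2.0], [0.3], [1.5]])  # example raw model outputs
probs = torch.sigmoid(logits)                  # map logits to (0, 1)
preds = torch.round(probs)                     # hard 0/1 predictions
# preds is [[0.], [1.], [1.]]
```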
