Hinge loss: in machine learning, a loss function used for maximum-margin classification (summary via Wikipedia).
Wikipedia
en.wikipedia.org › wiki › Hinge_loss
Hinge loss - Wikipedia
January 26, 2026 - In machine learning, the hinge loss is a loss function used for training classifiers. The hinge loss is used for "maximum-margin" classification, most notably for support vector machines (SVMs). For an intended output t = ±1 and a classifier score y, the hinge loss of the prediction y is defined ...
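For an intended output t = ±1 and a classifier score y, the definition this snippet trails off into is max(0, 1 − t·y). A minimal Python sketch of that definition (the function name is illustrative, not from any library):

```python
def hinge_loss(t, y):
    """Hinge loss for a true label t in {-1, +1} and a raw classifier score y."""
    return max(0.0, 1.0 - t * y)

# Correctly classified, outside the margin: zero loss.
print(hinge_loss(+1, 2.0))   # 0.0
# Correct side of the boundary but inside the margin: small positive loss.
print(hinge_loss(+1, 0.5))   # 0.5
# Misclassified: loss grows linearly with the score.
print(hinge_loss(+1, -2.0))  # 3.0
```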
Medium
medium.com › analytics-vidhya › understanding-loss-functions-hinge-loss-a0ff112b40a1
Understanding loss functions : Hinge loss | by Kunal Chowdhury | Analytics Vidhya | Medium
January 18, 2024 - Looking at the graph for SVM in Fig 4, we can see that for yf(x) ≥ 1, the hinge loss is 0. However, when yf(x) < 1, the hinge loss increases linearly as yf(x) decreases.
Discussions

neural networks - What is the definition of the hinge loss function? - Artificial Intelligence Stack Exchange
I came across the hinge loss function for training a neural network model, but I do not know its analytical form. I can write the mean squared error loss function (which is more often used for regression) as … More on ai.stackexchange.com
ai.stackexchange.com
February 11, 2021
machine learning - hinge loss vs logistic loss advantages and disadvantages/limitations - Cross Validated
+1. Minimizing logistic loss corresponds to maximizing binomial likelihood. Minimizing squared-error loss corresponds to maximizing Gaussian likelihood (it's just OLS regression; for 2-class classification it's actually equivalent to LDA). Do you know if minimizing hinge loss ... More on stats.stackexchange.com
stats.stackexchange.com
April 14, 2015
Why do we use log-loss in logistic regression instead of just taking the absolute difference between expected probability and actual value for each instance?
You can try it and see if it works 🤷‍♂️ Absolute error is usually avoided because it makes a "V"-shaped loss whose gradient has a sharp corner at the minimum. Sharp corners are bad in general for gradient-based optimization. Same reason we use MSE or RMSE instead of absolute error for regression tasks. More on reddit.com
r/learnmachinelearning
April 26, 2023
Is support vector machine just about simplifying logistic regression formula? If so, why this name?

No. The main difference between the cost functions is that cross-entropy loss (CEL) penalizes based on the prediction's distance from the answer. So if something is predicted using CEL as class 1 with probability 0.51, and it is actually class 1, it is penalized more strongly than if it had been predicted with probability 0.99. With the hinge loss used by SVMs, by contrast, a correct prediction is counted the same (zero loss) once it clears the margin, whether you barely predict the answer or predict it with high confidence. Both methods, however, are penalized by 'distance' when they predict the wrong answer.

More on reddit.com
r/learnmachinelearning
July 12, 2020
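The confidence contrast described in this answer is easy to check numerically. A small sketch, assuming the usual definitions (cross-entropy as the negative log of the probability assigned to the true class, hinge loss on a raw score); the probabilities and scores are made-up toy values:

```python
import math

def cross_entropy(p_true):
    # Negative log-probability assigned to the true class.
    return -math.log(p_true)

def hinge(t, y):
    # t is the true label in {-1, +1}, y is a raw classifier score.
    return max(0.0, 1.0 - t * y)

# Two correct class-1 predictions with different confidence:
# cross-entropy penalizes the barely-correct one much more heavily ...
print(cross_entropy(0.51))  # ~0.673
print(cross_entropy(0.99))  # ~0.010
# ... while hinge loss counts any score past the margin exactly the same.
print(hinge(+1, 1.1))       # 0.0
print(hinge(+1, 5.0))       # 0.0
```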
ScienceDirect
sciencedirect.com › topics › engineering › hinge-loss-function
Hinge Loss Function - an overview | ScienceDirect Topics
The loss is computed based on some predefined loss function (e.g., mean squared error [MSE] for regression or cross-entropy for classification tasks, discussed in Section 16.2.3), which measures the difference between the network's predictions and actual targets.
Analytics Vidhya
analyticsvidhya.com › home › what is hinge loss in machine learning?
What is Hinge loss in Machine Learning?
December 23, 2024 - Hinge loss in machine learning, a key loss function in SVMs, enhances model robustness by penalizing incorrect or marginal predictions.
arXiv
arxiv.org › pdf › 2103.00233 pdf
Learning with Smooth Hinge Losses Junru Luo ∗, Hong Qiao †, and Bo Zhang ‡
Replacing the hinge loss with these two smooth hinge losses, we obtain two smooth support vector machines (SSVMs) which can be solved with second-order methods. In particular, they can be solved by the inexact Newton method with a quadratic convergence rate, as conducted in [1, 20] for logistic regression.
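The exact smooth hinge constructions are defined in the paper itself; as a generic illustration of why smoothing matters for second-order solvers (this is the common squared hinge, not necessarily the paper's loss):

```python
def hinge(t, y):
    # Standard hinge: kinked (non-differentiable) at the margin t*y = 1.
    return max(0.0, 1.0 - t * y)

def squared_hinge(t, y):
    # Squaring smooths the kink: the derivative is continuous at t*y = 1,
    # which Newton-type (second-order) solvers rely on.
    return max(0.0, 1.0 - t * y) ** 2

# Both vanish once the margin is cleared, but they approach zero differently.
print(hinge(+1, 0.5), squared_hinge(+1, 0.5))  # 0.5 0.25
print(hinge(+1, 2.0), squared_hinge(+1, 2.0))  # 0.0 0.0
```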
GeeksforGeeks
geeksforgeeks.org › machine learning › hinge-loss-relationship-with-support-vector-machines
Hinge-loss & Relationship with Support Vector Machines - GeeksforGeeks
August 21, 2025 - Its purpose is to penalize predictions that are incorrect or insufficiently confident in the context of binary classification. It is used in binary classification problems where the objective is to separate the data points into two classes, typically ...
Medium
koshurai.medium.com › understanding-hinge-loss-in-machine-learning-a-comprehensive-guide-0a1c82478de4
Understanding Hinge Loss in Machine Learning: A Comprehensive Guide | by KoshurAI | Medium
January 12, 2024 - The key idea behind hinge loss is to penalize the model more when it misclassifies a sample that is closer to the decision boundary.
Baeldung
baeldung.com › home › artificial intelligence › machine learning › differences between hinge loss and logistic loss
Differences Between Hinge Loss and Logistic Loss | Baeldung on Computer Science
February 28, 2025 - Between the margins, however, even if a sample’s prediction is correct, there’s still a small loss. This is to penalize the model for making less certain predictions. ... One of the main characteristics of hinge loss is that it’s a convex function. This makes it different from other losses such as the 0-1 loss.
scikit-learn
scikit-learn.org › stable › modules › generated › sklearn.metrics.hinge_loss.html
hinge_loss — scikit-learn 1.8.0 documentation
In binary class case, assuming labels in y_true are encoded with +1 and -1, when a prediction mistake is made, margin = y_true * pred_decision is always negative (since the signs disagree), implying 1 - margin is always greater than 1. The cumulated hinge loss is therefore an upper bound of the number of mistakes made by the classifier.
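The upper-bound property described here can be verified directly. A plain-Python sketch of the same quantity (not a call to sklearn.metrics.hinge_loss, and with made-up toy labels and scores):

```python
def total_hinge(y_true, scores):
    # Sum of per-sample hinge losses, labels encoded as +1 / -1.
    return sum(max(0.0, 1.0 - t * y) for t, y in zip(y_true, scores))

y_true = [+1, +1, -1, -1]
scores = [2.0, -0.5, -3.0, 0.5]  # signs disagree at indices 1 and 3

# Every mistake has margin t*y < 0, hence per-sample loss 1 - margin > 1,
# so the cumulated hinge loss upper-bounds the mistake count.
mistakes = sum(1 for t, y in zip(y_true, scores) if t * y < 0)
print(mistakes)                     # 2
print(total_hinge(y_true, scores))  # 3.0
```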
Cornell Computer Science
cs.cornell.edu › courses › cs4780 › 2018sp › lectures › lecturenote10.html
10: Empirical Risk Minimization
Remember the unconstrained SVM formulation \[ \min_{\mathbf{w},b}\; C\sum_{i}\underset{\text{Hinge-Loss}}{\underbrace{\max\left[1-y_{i}\left(\mathbf{w}^{\top}\mathbf{x}_{i}+b\right),0\right]}}+\underset{l_{2}\text{-Regularizer}}{\underbrace{\left\Vert \mathbf{w}\right\Vert _{2}^{2}}} \] The hinge loss is the SVM's error function of choice, whereas the \(l_{2}\)-regularizer reflects the complexity of the solution and penalizes complex ...
David Rosenberg
davidrosenberg.github.io › ml2015 › docs › 3a.loss-functions.pdf pdf
Loss Functions for Regression and Classification David Rosenberg
Most classification losses depend only on the margin. ... Optimization is NP-Hard. ... Hinge is a convex, upper bound on 0−1 loss.
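The "convex upper bound on 0−1 loss" claim in this snippet can be checked pointwise; a quick sketch treating both losses as functions of the margin m = y·f(x):

```python
def hinge(m):
    # Hinge loss as a function of the margin m = y * f(x).
    return max(0.0, 1.0 - m)

def zero_one(m):
    # 0-1 loss: 1 for a misclassification (negative margin), else 0.
    return 1.0 if m < 0 else 0.0

# Hinge dominates the 0-1 loss at every margin, so minimizing it
# minimizes a convex upper bound on the training error.
margins = [x / 10 for x in range(-30, 31)]
assert all(hinge(m) >= zero_one(m) for m in margins)
```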
NISER
niser.ac.in › ~smishra › teach › cs460 › 23cs460 › lectures › lec11.pdf pdf
HINGE LOSS IN SUPPORT VECTOR MACHINES Chandan Kumar Sahu and Maitrey Sharma
February 7, 2023 - Figure: the support vector loss function (hinge loss), compared to the negative log-likelihood loss (binomial deviance) for logistic regression, squared-error loss, and a “Huberized” version of the squared hinge loss.
NeurIPS
papers.neurips.cc › paper › 1610-linear-hinge-loss-and-average-margin.pdf
Linear Hinge Loss and Average Margin
Towards Data Science
towardsdatascience.com › home › latest › a definitive explanation to hinge loss for support vector machines.
A definitive explanation to Hinge Loss for Support Vector Machines. | Towards Data Science
January 23, 2025 - We see that correctly classified points will have a small (or zero) loss, while incorrectly classified instances will have a high loss. A negative distance from the boundary incurs a high hinge loss.
Top answer (1 of 3)

Logarithmic loss minimization leads to well-behaved probabilistic outputs.

Hinge loss leads to some (not guaranteed) sparsity in the dual, but it doesn't help at probability estimation. Instead, it punishes misclassifications (that's why it's so useful for determining margins): diminishing hinge loss comes with diminishing margin violations and misclassifications.

So, summarizing:

  • Logarithmic loss ideally leads to better probability estimation at the cost of not actually optimizing for accuracy

  • Hinge loss ideally leads to better accuracy and some sparsity at the cost of not actually estimating probabilities

In ideal scenarios, each respective method would excel in their domain (accuracy vs probability estimation). However, due to the No-Free-Lunch Theorem, it is not possible to know, a priori, if the model choice is optimal.

Answer 2 of 3

@Firebug had a good answer (+1). In fact, I had a similar question here.

What are the impacts of choosing different loss functions in classification to approximate 0-1 loss

I just want to add more on another big advantage of the logistic loss: its probabilistic interpretation. An example can be found in UCLA - Advanced Research - Statistical Methods and Data Analysis - Computing Logit Regression | R Data Analysis Examples

Specifically, logistic regression is a classical model in the statistics literature. (See What does the name "Logistic Regression" mean? for the naming.) There are many important concepts related to the logistic loss, such as maximum likelihood estimation, likelihood ratio tests, and the assumptions on the binomial distribution. Here are some related discussions.

Likelihood ratio test in R

Why isn't Logistic Regression called Logistic Classification?

Is there i.i.d. assumption on logistic regression?

Difference between logit and probit models

Techkluster
techkluster.com › technology › hinge-loss-vs-logistic-loss
Differences Between Hinge Loss and Logistic Loss – TechKluster
Hinge loss increases linearly with the margin between the prediction and the true label, and it is zero when the prediction is on the correct side of the margin. Logistic loss, also known as cross-entropy loss or log loss, is commonly used in logistic regression and other probabilistic ...
TTIC
home.ttic.edu › ~nati › Publications › RennieSrebroIJCAI05.pdf pdf
Loss Functions for Preference Levels: Regression with Discrete Ordered Labels
The hinge loss, as well as the smoothed hinge, introduce a linear dependence on the magnitude of the error, but such a linear (at least) dependence is unavoidable in a convex loss function. The modified least squares goes beyond this necessary dependence on the magnitude of the error, and introduces an unnecessary (from the point of view of convexity) quadratic dependence, further deviating from the zero/one margin error. Logistic regression ...
ScienceDirect
sciencedirect.com › science › article › abs › pii › S0031320320301989
Robust twin support vector regression based on rescaled Hinge loss - ScienceDirect
April 28, 2020 - In this work, with the help of the rescaled Hinge loss, we propose a twin support vector regression (TSVR) model that is robust to noise. The corresponding optimization problem turns out to be non-convex with smooth l2 regularizer. To solve the problem efficiently, we convert it to its dual form, thereby transforming it into a convex optimization problem.