why is hinge loss used for svm

in machine learning, a loss function used for maximum‐margin classification

In machine learning, the hinge loss is a loss function used for training classifiers. The hinge loss is used for "maximum-margin" classification, most notably for support vector machines (SVMs). For an intended … Wikipedia

Wikipedia

en.wikipedia.org › wiki › Hinge_loss

Hinge loss - Wikipedia

January 26, 2026 - In machine learning, the hinge loss is a loss function used for training classifiers. The hinge loss is used for "maximum-margin" classification, most notably for support vector machines (SVMs). For an intended output t = ±1 and a classifier score y, the hinge loss of the prediction y is defined ...

Extensions Optimization

Stack Overflow

stackoverflow.com › questions › 34325759 › whats-the-relationship-between-an-svm-and-hinge-loss

machine learning - What's the relationship between an SVM and hinge loss? - Stack Overflow

Videos

22:50

YouTube

Hinge Loss, SVMs, and the Loss of Users - YouTube

August 10, 2018

5.79K

m.youtube.com

Gradient Descent for Support Vector Machines and ...

youtube.com

4. Hinge Loss/Multi-class SVM Loss

12:29

YouTube

Soft Margin SVM : Data Science Concepts

November 30, 2020

View all

Programmathically

programmathically.com › home › machine learning › classical machine learning › understanding hinge loss and the svm cost function

Understanding Hinge Loss and the SVM Cost Function - Programmathically

June 26, 2022 - The hinge loss function is most commonly employed to regularize soft margin support vector machines. The degree of regularization determines how aggressively the classifier tries to prevent misclassifications and can be controlled with an additional ...

Towards Data Science

towardsdatascience.com › home › latest › a definitive explanation to hinge loss for support vector machines.

A definitive explanation to Hinge Loss for Support Vector Machines. | Towards Data Science

January 23, 2025 - ... actual predicted hinge loss ... key characteristic of the SVM and the Hinge loss is that the boundary separates negative and positive instances as +1 and -1, with -1 being on the left side of the boundary and +1 being on ...

Medium

medium.com › analytics-vidhya › understanding-loss-functions-hinge-loss-a0ff112b40a1

Understanding loss functions : Hinge loss | by Kunal Chowdhury | Analytics Vidhya | Medium

January 18, 2024 - Looking at the graph for SVM in Fig 4, we can see that for yf(x) ≥ 1, hinge loss is ‘0’. However, when yf(x) < 1, then hinge loss increases massively.

HackerNoon

hackernoon.com › hinge-loss-a-steadfast-loss-evaluation-function-for-the-svm-classification-models-in-ai-and-ml

Hinge Loss - A Steadfast Loss Evaluation Function for the SVM Classification Models in AI & ML | HackerNoon

January 4, 2023 - Researchers use an algebraic acme called “Losses” in order to optimise the machine learning space defined by a specific use case.

GeeksforGeeks

geeksforgeeks.org › machine learning › hinge-loss-relationship-with-support-vector-machines

Hinge-loss & Relationship with Support Vector Machines - GeeksforGeeks

August 21, 2025 - Its purpose is to penalize predictions that are incorrect or insufficiently confident in the context of binary classification. It is used in binary classification problems where the objective is to separate the data points in two classes typically ...

Stack Exchange

datascience.stackexchange.com › questions › 9420 › whats-the-relationship-between-an-svm-and-hinge-loss

logistic regression - What's the relationship between an SVM and hinge loss? - Data Science Stack Exchange

Top answer

1 of 2

Searching for the quoted text, it seems the book is Data Science for Business (Provost and Fawcett), and they're describing the soft-margin SVM. Their description of the hinge loss is wrong. The problem is that it doesn't penalize misclassified points that lie within the margin, as you mentioned.

In SVMs, smaller weights correspond to larger margins. So, using this "version" of the hinge loss would have pathological consequences: We could achieve the minimum possible loss (zero) simply by choosing weights small enough such that all points lie within the margin. Even if every single point is misclassified. Because the SVM optimization problem contains a regularization term that encourages small weights (i.e. large margins), the solution will always be the zero vector. This means the solution is completely independent of the data, and nothing is learned. Needless to say, this wouldn't make for a very good classifier.

The correct expression for the hinge loss for a soft-margin SVM is:

$$\max \Big( 0, 1 - y f(x) \Big)$$

where $f(x)$ is the output of the SVM given input $x$, and $y$ is the true class (-1 or 1). When the true class is -1 (as in your example), the hinge loss looks like this:

Note that the loss is nonzero for misclassified points, as well as correctly classified points that fall within the margin.

For a proper description of soft-margin SVMs using the hinge loss formulation, see The Elements of Statistical Learning (section 12.3.2) or the Wikipedia article.

2 of 2

The (A) hinge function can be expressed as

$$y_{i} = \gamma \max{\left(x_{i}-\theta, 0\right)} + \varepsilon_{i},$$

where:

$\gamma$ is the change in slope after the hinge. In your example, this amounts to the slope following the hinge, since your hinge-only model (see below) assumes zero effect of $x$ on $y$ until the hinge.
$\theta$ is the point (in $\boldsymbol{x}$) at which the hinge is located, and is a parameter estimated for the model. I believe your question is answered by considering that the location of the hinge is informed by the loss function.
$\varepsilon_{i}$ is some error term with some distribution.

Hinge functions can also be useful in changing any line:

$$y_{i} = \alpha_{0} + \beta x_{i} + \gamma \max{\left(x_{i}-\theta, 0\right)} + \varepsilon_{i},$$

where:

$\alpha$ is the model constant, and the intercept of the curve before the hinge (i.e. for $x < \theta$). Of course, if $\theta < 0$, then the curve intersects the $y$-axis after the hinge so $\alpha$ will not necessarily be the $y$-intercept of the bent line.
$\beta$ is the slope of the line relating $y$ to $x$
$\gamma$ is the change in slope after the hinge.

In addition, the hinge can be used to model how a functional relationship between $y$ and $x$ changes form, as in this model where the relationship becomes quadra

$$y_{i} = \alpha_{0} + \beta x_{i} + \gamma \max{\left(x_{i}-\theta, 0\right)^{2}} + \varepsilon_{i},$$

OpenGenus

iq.opengenus.org › hinge-loss-for-svm

Hinge Loss for SVM

April 21, 2023 - However, in many real-world scenarios, the data is not linearly separable, meaning that a hyperplane cannot perfectly separate the classes. In such cases, SVMs use a technique called soft margin classification, which allows for some misclassification of data points. To achieve this, SVMs use ...

Medium

koshurai.medium.com › understanding-hinge-loss-in-machine-learning-a-comprehensive-guide-0a1c82478de4

Understanding Hinge Loss in Machine Learning: A Comprehensive Guide | by KoshurAI | Medium

January 12, 2024 - It is designed to maximize the margin between classes, making it especially effective for support vector machines. The key idea behind hinge loss is to penalize the model more when it misclassifies a sample that is closer to the decision boundary.

Taylor & Francis

taylorandfrancis.com › knowledge › Engineering_and_technology › Engineering_support_and_special_topics › Hinge_loss

Hinge loss – Knowledge and References - Taylor & Francis

Hinge loss is a loss function used in training classifiers with large margins, such as support vector machines (SVM). It is designed to penalize negative margins that represent incorrect classifications.

Quora

quora.com › Can-you-explain-why-SVMs-use-hinge-loss-function-in-a-simple-manner

Can you explain why SVMs use hinge loss function in a simple manner? - Quora

Answer: I will explain u what I understand: From the diagram, We want to maximize the distance between positive and negative points. (let's say the distance between the optimal hyperplane to both positive and negative is 1) So maximize \frac 2 {\left\| w \right\|} In other words, we can write...

YouTube

youtube.com › watch

Introduction to Hinge Loss | Loss function SVM | Machine Learning - YouTube

11:04

Watch this video to understand the meaning of hinge loss and it is used for maximum - margin classifications for support vector machines.#hingelossfunction #...

Published February 16, 2023

YouTube

youtube.com › rohan-paul-ai

What is the Hinge Loss in SVM in Machine Learning | Data Science Interview Questions - YouTube

05:30

What is the Hinge Loss in SVM in Machine LearningThe Hinge Loss is a loss function used in Support Vector Machine (SVM) algorithms for binary classification ...

Published April 9, 2023

Views 1K

Soulpageit

soulpageit.com › home

Hinge Loss

June 30, 2023 - Hinge loss is commonly used in SVMs, where the goal is to find the hyperplane that separates the classes with the maximum margin. SVMs aim to minimize this loss while also incorporating a regularization term to control the complexity of the model.

Rice University

cs.rice.edu › ~as143 › COMP642_Spring22 › Scribes › Lect4 pdf

COMP 642 — Machine Learning Jan 20, 2021 Lecture 4

January 20, 2021 - point residing on the margin will have a hinge loss of 0 (Figure 2). ... The parameter C is the regularization parameter. The regularization parameter becomes · relevant when data sets are overlapping, i.e. not definitively separable by a classification bound- ary. This is known as a soft margin SVM...

Medium

medium.com › @vantakulasatyakiran › what-is-hinge-loss-that-is-used-in-svm-6b292fbbb48c

What is Hinge Loss that is used in SVM? | by Vantakula Satya kiran | Medium

January 28, 2025 - Hinge loss is a widely used loss function in machine learning, particularly for training classifiers like Support Vector Machines(SVMs).It plays a critical role in enforcing the margin-based optimization framework that defines SVMs.