Answer from Marc Claesen on Stack Exchange
Top answer (1 of 3) · score 35

The hinge loss term $\sum_i \max(0,\, 1 - y_i(w^T x_i + b))$ in soft margin SVM penalizes misclassifications. In hard margin SVM there are, by definition, no misclassifications.

This indeed means that hard margin SVM tries to minimize $\Vert w \Vert^2$. Due to the formulation of the SVM problem, the margin is $\frac{2}{\Vert w \Vert}$. As such, minimizing the norm of $w$ is geometrically equivalent to maximizing the margin. Exactly what we want!
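As a quick numerical sanity check of the claim that the margin is $\frac{2}{\Vert w \Vert}$, one can place a point on each of the two margin hyperplanes and measure the gap between them. The vector $w$ and offset $b$ below are arbitrary made-up values:

```python
import math

# For the two margin hyperplanes w.x + b = +1 and w.x + b = -1,
# the gap between them along the normal direction should be 2/||w||.
# w and b here are arbitrary illustrative values.
w = [3.0, 4.0]
b = 1.0
norm = math.sqrt(sum(wi * wi for wi in w))  # ||w|| = 5

# A point x on the plane w.x + b = c, reached by moving along the
# normal direction, is x = ((c - b) / ||w||^2) * w.
x_plus = [(1.0 - b) / norm**2 * wi for wi in w]
x_minus = [(-1.0 - b) / norm**2 * wi for wi in w]

gap = math.sqrt(sum((p - m) ** 2 for p, m in zip(x_plus, x_minus)))
print(gap, 2 / norm)  # both print 0.4
```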

Regularization is a technique to avoid overfitting by penalizing large coefficients in the solution vector. In hard margin SVM, $\Vert w \Vert^2$ is both the loss function and an $L_2$ regularizer.

In soft-margin SVM, the hinge loss term also acts like a regularizer, but on the slack variables instead of $w$, and in $L_1$ rather than $L_2$. $L_1$ regularization induces sparsity, which is why standard SVM is sparse in terms of support vectors (in contrast to least-squares SVM).
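The loss-plus-regularizer reading of the soft-margin objective can be sketched in a few lines of Python. The toy data and the value of $C$ below are illustrative choices, not anything from the question:

```python
def hinge_loss(w, b, x, y):
    """Hinge loss max(0, 1 - y * (w.x + b)) for a single example."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)

def soft_margin_objective(w, b, data, C=1.0):
    """L2 regularizer 0.5 * ||w||^2 plus C times the summed hinge losses."""
    l2 = 0.5 * sum(wi * wi for wi in w)
    return l2 + C * sum(hinge_loss(w, b, x, y) for x, y in data)

# Points well outside the margin contribute exactly zero loss, which is
# the source of the sparsity in support vectors mentioned above:
data = [([2.0, 0.0], 1), ([-2.0, 0.0], -1), ([0.5, 0.0], 1)]
w, b = [1.0, 0.0], 0.0
print(hinge_loss(w, b, [2.0, 0.0], 1))    # 0.0 -> not a support vector
print(hinge_loss(w, b, [0.5, 0.0], 1))    # 0.5 -> inside the margin
print(soft_margin_objective(w, b, data))  # 1.0 = 0.5 (L2) + 0.5 (hinge)
```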

Answer 2 of 3 · score 2

There's no analytic "loss function" for hard-margin SVMs, but when we move to soft-margin SVMs, it turns out that a loss function does exist.

Here is the detailed explanation:

When we talk about a loss function, what we really mean is a training objective that we want to minimize.

In the hard-margin SVM setting, the "objective" is to maximize the geometric margin $\frac{2}{\Vert w \Vert_2}$ subject to every training example lying on the correct side of the margin, i.e. $$\begin{aligned} & \min_{w,b}\frac{1}{2}\Vert w \Vert_2^2 \\ s.t\quad & y_i(w^Tx_i+b) \ge 1 \end{aligned}$$ Note that this is a constrained quadratic programming problem, so we cannot solve it with a plain gradient descent approach; that is, there is no unconstrained analytic "loss function" for hard-margin SVMs.

However, in the soft-margin SVM setting, we add a slack variable $\xi_i$ to allow our SVM to make mistakes. We now try to solve $$\begin{aligned} & \min_{w,b,\boldsymbol{\xi}}\frac{1}{2}\Vert w \Vert_2^2 + C\sum \xi_i \\ s.t\quad &y_i(w^Tx_i+b) \ge 1-\xi_i \\ & \boldsymbol{\xi} \succeq \mathbf{0} \end{aligned} $$ This is the same as penalizing a misclassified training example by adding $C\xi_i$ to the objective to be minimized. Recall the hinge loss $\max(0,\, 1-y_i(w^Tx_i+b))$: since $\xi_i$ will be zero if the training example lies outside the margin and will only be nonzero when the training example falls into the margin region, and since the hinge loss is always nonnegative, at the optimum $\xi_i = \max(0,\, 1-y_i(w^Tx_i+b))$, so we can rephrase our problem as $$\min_{w,b}\frac{1}{2}\Vert w \Vert_2^2 + C\sum_i \max(0,\, 1-y_i(w^Tx_i+b))$$ We know that the hinge loss is convex and its subgradient is known, thus we can solve soft-margin SVM directly by gradient descent.

So the slack variable is just the hinge loss in disguise, and the properties of the hinge loss happen to absorb our optimization constraints (i.e. nonnegativity, and it activates only when its input $y_i(w^Tx_i+b)$ is less than 1).
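To make the gradient-descent claim concrete, here is a minimal (sub)gradient descent on the unconstrained hinge-loss formulation. The 1-D toy data, learning rate, and iteration count are all made-up illustrative choices:

```python
def subgradient_step(w, b, data, C, lr):
    """One subgradient step on 0.5*||w||^2 + C * sum_i max(0, 1 - y_i(w.x_i + b))."""
    gw = list(w)  # gradient of the 0.5*||w||^2 term is w itself
    gb = 0.0
    for x, y in data:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if y * score < 1:  # hinge active: subgradient is -y*x (and -y for b)
            gw = [gwi - C * y * xi for gwi, xi in zip(gw, x)]
            gb -= C * y
    w = [wi - lr * gwi for wi, gwi in zip(w, gw)]
    b = b - lr * gb
    return w, b

# Toy 1-D data: positives on the right, negatives on the left.
data = [([1.5], 1), ([2.0], 1), ([-1.5], -1), ([-2.0], -1)]
w, b = [0.0], 0.0
for _ in range(200):
    w, b = subgradient_step(w, b, data, C=1.0, lr=0.05)

preds = [1 if w[0] * x[0] + b >= 0 else -1 for x, _ in data]
print(preds)  # [1, 1, -1, -1]: all four points classified correctly
```

With a constant learning rate the iterates oscillate around the optimum rather than converging exactly, which is why practical implementations typically decay the step size.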
