The gradient here is computed with calculus (analytically, NOT numerically!). Differentiating the loss function with respect to W(yi) gives:

∇_W(yi) L_i = -( Σ_{j≠yi} 1(W(j)·x_i - W(yi)·x_i + Δ > 0) ) x_i

and with respect to W(j), for j != yi:

∇_W(j) L_i = 1(W(j)·x_i - W(yi)·x_i + Δ > 0) x_i

The 1(·) is just the indicator function: a term contributes only when the margin condition inside it is true, and can be ignored otherwise. When you write this in code, the example you provided is the answer.
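Concretely, the indicator-function gradient translates into the usual naive double loop. This is a sketch in the cs231n style, not your exact assignment code; the (D, C) weight layout and the default margin delta=1.0 are assumptions:

```python
import numpy as np

def svm_loss_naive(W, X, y, delta=1.0):
    """Multiclass SVM loss and gradient, naive loops.
    W: (D, C) weights, X: (N, D) samples, y: (N,) integer labels."""
    num_train = X.shape[0]
    num_classes = W.shape[1]
    loss = 0.0
    dW = np.zeros_like(W)
    for i in range(num_train):
        scores = X[i].dot(W)              # (C,) class scores for sample i
        correct_score = scores[y[i]]
        for j in range(num_classes):
            if j == y[i]:
                continue
            margin = scores[j] - correct_score + delta
            if margin > 0:                # indicator condition is true
                loss += margin
                dW[:, j] += X[i]          # gradient w.r.t. W(j)
                dW[:, y[i]] -= X[i]       # gradient w.r.t. W(yi)
    loss /= num_train
    dW /= num_train
    return loss, dW
```

Each time a margin is violated, the sample's features are added to the wrong class's column and subtracted from the correct class's column, which is exactly what the two analytic expressions say.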

Since you are using the cs231n example, you should definitely check the course notes and lecture videos if needed.

Hope this helps!

Answer from dexhunter on Stack Overflow
Stack Overflow
stackoverflow.com › questions › 66740435 › svm-loss-function
python - SVM Loss Function - Stack Overflow
def svm_loss_naive(W, X, y):
    """
    SVM loss function, naive implementation calculating loss for each sample using loops.
    Inputs:
    - X: A numpy array of shape (n, m) containing data(samples).
CS231n
cs231n.github.io › linear-classify
CS231n Deep Learning for Computer Vision
Code. Here is the loss function (without regularization) implemented in Python, in both unvectorized and half-vectorized form:

def L_i(x, y, W):
    """
    unvectorized version. Compute the multiclass svm loss for a single example (x, y)
    - x is a column vector representing an image (e.g.
PyImageSearch
pyimagesearch.com › home › blog › multi-class svm loss
Multi-class SVM Loss - PyImageSearch
April 17, 2021 - There are only two possible class labels in this dataset and is therefore a 2-class problem which can be solved using a standard, binary SVM loss function. That said, let’s still apply Multi-class SVM loss so we can have a worked example on how to apply it.
GitHub
github.com › huyouare › CS231n › blob › master › assignment1 › cs231n › classifiers › linear_svm.py
CS231n/assignment1/cs231n/classifiers/linear_svm.py at master · huyouare/CS231n
Structured SVM loss function, vectorized implementation.
Inputs and outputs are the same as svm_loss_naive.
"""
loss = 0.0
dW = np.zeros(W.shape)  # initialize the gradient as zero
Author: huyouare
freeCodeCamp
freecodecamp.org › news › support-vector-machines
SVM Machine Learning Algorithm Explained
January 24, 2020 - The following is code written for training, predicting and finding accuracy for SVM in Python:

import numpy as np

class Svm(object):
    """ Svm classifier """

    def __init__(self, inputDim, outputDim):
        self.W = None
        # Generate a random svm weight matrix to compute loss
        # with standard normal distribution and standard deviation = 0.01.
        sigma = 0.01
        self.W = sigma * np.random.randn(inputDim, outputDim)

    def calLoss(self, x, y, reg):
        """ Svm loss function D: Input dimension.
MaviccPRP@web.studio
maviccprp.github.io › a-support-vector-machine-in-just-a-few-lines-of-python-code
A Support Vector Machine in just a few Lines of Python Code
April 3, 2017 - We will use hinge loss for our SVM: $c$ is the loss function, $x$ the sample, $y$ is the true label, $f(x)$ the predicted label.
Stack Exchange
stats.stackexchange.com › questions › 529550 › adjusting-the-loss-function-for-support-vector-machines-for-svc-in-sklearn
python - Adjusting the loss function for Support Vector Machines for SVC in sklearn - Cross Validated
I'm not sure if it's possible to make this change in sklearn, but this seems like a perfect job for CVXPY if you have a convex program (that depends on whether $\nu_i$ is an affine function). I've used it previously (actually, to implement the dual SVM algorithm) and it's quite well-documented -- it's plug-and-play with numpy mostly.
Top answer

To answer your question, unless you have a very good idea of why you want to define a custom kernel, I'd stick with the built-ins. They are very fast, flexible, and powerful, and are well-suited to most applications.

That being said, let's go into a bit more detail:

A kernel function is a special kind of measure of similarity between two points: a larger value means the two points are more similar. The scikit-learn SVM is designed to work with any kernel function. Several kernels are built-in (e.g. linear, radial basis function, polynomial, sigmoid), but you can also define your own.

Your custom kernel function should look something like this:

def my_kernel(x, y):
    """Compute My Kernel

    Parameters
    ----------
    x : array, shape=(N, D)
    y : array, shape=(M, D)
        input vectors for kernel similarity

    Returns
    -------
    K : array, shape=(N, M)
        matrix of similarities between x and y
    """
    # ... compute something here ...
    return similarity_matrix

The most basic kernel, a linear kernel, would look like this:

import numpy as np

def linear_kernel(x, y):
    # matrix of pairwise dot products between rows of x and rows of y
    return np.dot(x, y.T)

Equivalently, you can write

def linear_kernel_2(x, y):
    # with M as the 2x2 identity (for 2-D inputs), this reduces to linear_kernel
    M = np.array([[1, 0],
                  [0, 1]])
    return np.dot(x, np.dot(M, y.T))

The matrix M here defines the so-called inner product space in which the kernel acts. This matrix can be modified to define a new inner product space; the custom function from the example you linked to just modifies M to effectively double the importance of the first dimension in determining the similarity.
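As a sketch of that "double the importance of the first dimension" idea (assuming 2-D inputs; the particular M is mine, chosen for illustration):

```python
import numpy as np

def weighted_linear_kernel(x, y):
    # M doubles the weight of the first feature dimension in the similarity
    M = np.array([[2, 0],
                  [0, 1]])
    return np.dot(x, np.dot(M, y.T))

# a point along the first axis now matches itself with similarity 2,
# while a point along the second axis still gets similarity 1
print(weighted_linear_kernel(np.array([[1.0, 0.0]]), np.array([[1.0, 0.0]])))
print(weighted_linear_kernel(np.array([[0.0, 1.0]]), np.array([[0.0, 1.0]])))
```

Because M is symmetric positive definite, this still defines a valid inner product, so the SVM machinery continues to apply.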

More complicated non-linear modifications are possible as well, but you have to be careful: kernel functions must meet certain requirements (they must satisfy the properties of an inner-product space) or the SVM algorithm will not work correctly.
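For completeness, a hedged usage sketch: scikit-learn's SVC accepts a callable directly through its kernel parameter, and calls it with two sample arrays expecting the (N, M) similarity matrix back. The toy dataset here is made up for illustration:

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_classification

def linear_kernel(x, y):
    # Gram matrix of pairwise dot products, shape (N, M)
    return np.dot(x, y.T)

# illustrative 2-D toy dataset
X, labels = make_classification(n_samples=100, n_features=2,
                                n_informative=2, n_redundant=0,
                                random_state=0)

clf = svm.SVC(kernel=linear_kernel)  # any callable kernel works here
clf.fit(X, labels)
print(clf.score(X, labels))
```

This should behave essentially like SVC(kernel="linear"), just slower, since the built-in kernels are computed in optimized code.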

Wikipedia
en.wikipedia.org › wiki › Hinge_loss
Hinge loss - Wikipedia
January 26, 2026 - In machine learning, the hinge loss is a loss function used for training classifiers. The hinge loss is used for "maximum-margin" classification, most notably for support vector machines (SVMs).
GeeksforGeeks
geeksforgeeks.org › hinge-loss-relationship-with-support-vector-machines
Hinge-loss & relationship with Support Vector Machines - GeeksforGeeks
June 7, 2024 - We will study hard margin and soft margin SVM in detail later. Let us first understand hinge loss. ... Hinge loss is used in binary classification problems where the objective is to separate the data points in two classes, typically labeled as +1 and -1. Mathematically, hinge loss for a data point can be represented as: ... In this case the product t·y will always be positive and its value greater than 1, and therefore the value of 1 - t·y will be negative. So the loss function value max(0, 1 - t·y) will always be zero.
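The max(0, 1 - t·y) formula quoted above can be sketched in a few lines (variable names are my own):

```python
def hinge_loss(t, y):
    """Binary hinge loss.
    t: true label in {-1, +1}; y: raw classifier score."""
    return max(0.0, 1.0 - t * y)

# correctly classified with margin >= 1 -> zero loss
print(hinge_loss(+1, 2.5))   # 0.0
# inside the margin -> small positive loss
print(hinge_loss(+1, 0.3))   # 0.7
# confidently misclassified -> large loss
print(hinge_loss(-1, 1.0))   # 2.0
```

Note the loss is zero only once the score clears the margin, not merely when the sign is right; that is what pushes the SVM toward maximum-margin solutions.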
Medium
medium.com › analytics-vidhya › loss-functions-multiclass-svm-loss-and-cross-entropy-loss-9190c68f13e0
Loss Functions — Multiclass SVM Loss and Cross Entropy Loss | by Ramji Balasubramanian | Analytics Vidhya | Medium
December 24, 2020 -

image_2 = max(0, 3.76 - (-1.20) + 1) + max(0, -3.81 - (-1.20) + 1)
image_3 = max(0, -2.37 - (-2.27) + 1) + max(0, 1.03 - (-2.27) + 1)
loss = (image_1 + image_2 + image_3) / 3.0

... Our goal here is to classify our input image (Panda) as Dog, Cat or Panda. This involves three steps. Step 1: We will get the scoring value for each of the three classes as we got in Multiclass SVM based on the used function.
University of Oxford
robots.ox.ac.uk › ~az › lectures › ml › lect2.pdf pdf
Lecture 2: The SVM classifier
• Support Vector Machine (SVM) classifier
• Wide margin
• Cost function
• Slack variables
• Loss functions revisited
• Optimization

Binary Classification: Given training data (xi, yi) for i = 1 . . . N, with xi ∈ R^d and yi ∈ {−1, 1}, learn a classifier f(x) such that
scikit-learn
scikit-learn.org › stable › modules › generated › sklearn.metrics.hinge_loss.html
hinge_loss — scikit-learn 1.8.0 documentation
>>> import numpy as np
>>> X = np.array([[0], [1], [2], [3]])
>>> Y = np.array([0, 1, 2, 3])
>>> labels = np.array([0, 1, 2, 3])
>>> est = svm.LinearSVC()
>>> est.fit(X, Y)
LinearSVC()
>>> pred_decision = est.decision_function([[-1], [2], [3]])
>>> y_true = [0, 2, 3]
>>> hinge_loss(y_true, pred_decision, labels=labels)
0.56
Medium
medium.com › swlh › support-vector-machine-machine-learning-in-python-5befb92ba3d0
Support Vector Machine: Machine Learning in Python | by Divyansh Chaudhary | The Startup | Medium
January 25, 2021 - For an intended output t = ±1 and a classifier score y, the hinge loss of the prediction is defined as max(0, 1 - t·y), where t is the true class label in {-1, +1} and y is the output of the SVM given input x. Note: the sklearn library is used only to create a dataset using the make_classification() function. This blog is a summary of Support Vector Machine and the math involved, in Python.
EITCA
eitca.org › home › what is the role of the loss function in svm training?
What is the role of the loss function in SVM training? - EITCA Academy
August 7, 2023 - In SVM training, the loss function is used to quantify the error or discrepancy between the predicted outputs of the SVM model and the true labels of the training data. The goal of training an SVM is to find the optimal hyperplane that maximally separates the different classes in the data.