This type of SVM is often implemented with the SMO algorithm. You may want to check the original published version (Platt, John. Fast Training of Support Vector Machines using Sequential Minimal Optimization, in Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. Burges, A. Smola, eds., MIT Press (1998)), but I find it quite complicated.

A somewhat simplified version is presented in the Stanford lecture notes, but the derivations of all the formulas have to be found elsewhere (e.g. in these notes I found on the Internet).

As an alternative, I can offer my own variation of the SMO algorithm. It is highly simplified; the implementation is a little over 30 lines of code:

import numpy as np

class SVM:
  def __init__(self, kernel='linear', C=10000.0, max_iter=100000, degree=3, gamma=1):
    self.kernel = {'poly':lambda x,y: np.dot(x, y.T)**degree,
                   'rbf':lambda x,y:np.exp(-gamma*np.sum((y-x[:,np.newaxis])**2,axis=-1)),
                   'linear':lambda x,y: np.dot(x, y.T)}[kernel]
    self.C = C
    self.max_iter = max_iter

  def restrict_to_square(self, t, v0, u):
    # clip the step t so that v0 + t*u stays inside the box [0, C]^2;
    # the components of u are +-1, so the divisions are safe
    t = (np.clip(v0 + t*u, 0, self.C) - v0)[1]/u[1]
    return (np.clip(v0 + t*u, 0, self.C) - v0)[0]/u[0]

  def fit(self, X, y):
    self.X = X.copy()
    self.y = y * 2 - 1  # map {0, 1} labels to {-1, +1}
    self.lambdas = np.zeros_like(self.y, dtype=float)
    # K[i, j] = y_i * y_j * kernel(x_i, x_j)
    self.K = self.kernel(self.X, self.X) * self.y[:,np.newaxis] * self.y
    
    for _ in range(self.max_iter):
      for idxM in range(len(self.lambdas)):
        # the second index of the working pair is chosen at random
        idxL = np.random.randint(0, len(self.lambdas))
        # 2x2 block of K for the pair, current multipliers, and gradient
        Q = self.K[[[idxM, idxM], [idxL, idxL]], [[idxM, idxL], [idxM, idxL]]]
        v0 = self.lambdas[[idxM, idxL]]
        k0 = 1 - np.sum(self.lambdas * self.K[[idxM, idxL]], axis=1)
        # direction that preserves the constraint sum(lambda_i * y_i) = 0
        u = np.array([-self.y[idxL], self.y[idxM]])
        # unconstrained optimal step along u, then clipped to the box
        t_max = np.dot(k0, u) / (np.dot(np.dot(Q, u), u) + 1E-15)
        self.lambdas[[idxM, idxL]] = v0 + u * self.restrict_to_square(t_max, v0, u)
    
    # bias computed from the support vectors (points with nonzero multipliers)
    idx, = np.nonzero(self.lambdas > 1E-15)
    self.b = np.sum((1.0-np.sum(self.K[idx]*self.lambdas, axis=1))*self.y[idx])/len(idx)
  
  def decision_function(self, X):
    return np.sum(self.kernel(X, self.X) * self.y * self.lambdas, axis=1) + self.b
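To show how the class is used end to end, here is a small smoke test on two well-separated Gaussian blobs (the class is repeated so the snippet runs on its own; the toy data, hyperparameters, and zero-threshold prediction rule are illustrative choices, not part of the original answer):

```python
import numpy as np

class SVM:
  def __init__(self, kernel='linear', C=10000.0, max_iter=100000, degree=3, gamma=1):
    self.kernel = {'poly': lambda x, y: np.dot(x, y.T)**degree,
                   'rbf': lambda x, y: np.exp(-gamma*np.sum((y - x[:, np.newaxis])**2, axis=-1)),
                   'linear': lambda x, y: np.dot(x, y.T)}[kernel]
    self.C = C
    self.max_iter = max_iter

  def restrict_to_square(self, t, v0, u):
    t = (np.clip(v0 + t*u, 0, self.C) - v0)[1]/u[1]
    return (np.clip(v0 + t*u, 0, self.C) - v0)[0]/u[0]

  def fit(self, X, y):
    self.X = X.copy()
    self.y = y * 2 - 1
    self.lambdas = np.zeros_like(self.y, dtype=float)
    self.K = self.kernel(self.X, self.X) * self.y[:, np.newaxis] * self.y
    for _ in range(self.max_iter):
      for idxM in range(len(self.lambdas)):
        idxL = np.random.randint(0, len(self.lambdas))
        Q = self.K[[[idxM, idxM], [idxL, idxL]], [[idxM, idxL], [idxM, idxL]]]
        v0 = self.lambdas[[idxM, idxL]]
        k0 = 1 - np.sum(self.lambdas * self.K[[idxM, idxL]], axis=1)
        u = np.array([-self.y[idxL], self.y[idxM]])
        t_max = np.dot(k0, u) / (np.dot(np.dot(Q, u), u) + 1E-15)
        self.lambdas[[idxM, idxL]] = v0 + u * self.restrict_to_square(t_max, v0, u)
    idx, = np.nonzero(self.lambdas > 1E-15)
    self.b = np.sum((1.0 - np.sum(self.K[idx]*self.lambdas, axis=1))*self.y[idx])/len(idx)

  def decision_function(self, X):
    return np.sum(self.kernel(X, self.X) * self.y * self.lambdas, axis=1) + self.b

# two well-separated Gaussian blobs with 0/1 labels
np.random.seed(0)
X = np.vstack([np.random.randn(10, 2) - 3, np.random.randn(10, 2) + 3])
y = np.array([0]*10 + [1]*10)

model = SVM(kernel='linear', C=1.0, max_iter=200)
model.fit(X, y)
pred = (model.decision_function(X) > 0).astype(int)  # threshold at zero
accuracy = np.mean(pred == y)
```

Note that fit expects 0/1 labels (it maps them to -1/+1 internally), and max_iter is reduced here because the toy problem converges quickly.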

In simple cases it works not much worse than sklearn.svm.SVC; a comparison is shown below.

I have posted this code, along with some more code that produces comparison images, on GitHub. For a more elaborate explanation with formulas, you may want to refer to my preprint on ResearchGate.

UPDATE: a live version is now available; see GitHub Pages.

Answer from guest on Stack Overflow

Say that mat1 is $n \times d$ and mat2 is $m \times d$.

Recall that the Gaussian RBF kernel is defined as $k(x, y) = \exp\left( - \frac{1}{2 \sigma^2} \lVert x - y \rVert^2 \right)$. But we can write $\lVert x - y \rVert^2$ as $(x - y)^T (x - y) = x^T x + y^T y - 2 x^T y$. The code uses this decomposition.

First, the trnorms1 vector stores $x^T x$ for each input $x$ in mat1, and trnorms2 stores $y^T y$ for each $y$ in mat2.

Then, the k1 matrix is obtained by multiplying the $n \times 1$ matrix of $x^T x$ entries by a $1 \times m$ matrix of ones, getting an $n \times m$ matrix with $x^T x$ entries repeated across the rows, so that k1[i, j] is $x_i^T x_i$.

The next line does basically the same thing for the $y$ norms repeated across columns, getting an $n \times m$ matrix in which k2[i, j] is $y_j^T y_j$.

k is then their sum, so that k[i, j] is $x_i^T x_i + y_j^T y_j$. The next line then subtracts twice the product of the data matrices, so that k[i, j] becomes $x_i^T x_i + y_j^T y_j - 2 x_i^T y_j = \lVert x_i - y_j \rVert^2$.

Then, the code multiplies by $\frac{-1}{2 \sigma^2}$ and finally takes the elementwise $\exp$, getting out the Gaussian kernel.
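The steps described above can be reconstructed in a few lines of NumPy (the original code is not shown here, so this is a sketch; mat1, mat2, sigma, and the intermediate names trnorms1, k1, k2, k follow the description):

```python
import numpy as np

def gaussian_kernel(mat1, mat2, sigma):
    n, m = mat1.shape[0], mat2.shape[0]
    trnorms1 = np.sum(mat1**2, axis=1)               # x_i^T x_i for each row of mat1
    trnorms2 = np.sum(mat2**2, axis=1)               # y_j^T y_j for each row of mat2
    k1 = trnorms1[:, np.newaxis] @ np.ones((1, m))   # k1[i, j] = x_i^T x_i
    k2 = np.ones((n, 1)) @ trnorms2[np.newaxis, :]   # k2[i, j] = y_j^T y_j
    k = k1 + k2 - 2 * np.dot(mat1, mat2.T)           # k[i, j] = ||x_i - y_j||^2
    return np.exp(-k / (2 * sigma**2))

# sanity check against the naive pairwise computation
rng = np.random.default_rng(0)
mat1, mat2 = rng.standard_normal((4, 3)), rng.standard_normal((5, 3))
naive = np.exp(-np.sum((mat1[:, None, :] - mat2[None, :, :])**2, axis=-1) / (2 * 1.5**2))
print(np.allclose(gaussian_kernel(mat1, mat2, 1.5), naive))  # → True
```

The ones-matrix products mirror the description; in practice plain broadcasting (`trnorms1[:, None] + trnorms2[None, :]`) does the same thing with less ceremony.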

If you dig into the scikit-learn implementation, it's exactly the same, except:

  • It's parameterized instead with $\gamma = \frac{1}{2 \sigma^2}$.
  • It's written in much better Python, without wasting memory all over the place or doing computations in a needlessly slow way.
  • It's broken up into helper functions.

But, algorithmically, it's doing the same basic operations.
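The parameterization equivalence is easy to check numerically; here is a quick sketch comparing a direct $\sigma$-parameterized computation against sklearn.metrics.pairwise.rbf_kernel, which takes $\gamma$ (the data is arbitrary):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X, Y = rng.standard_normal((4, 3)), rng.standard_normal((5, 3))

sigma = 2.0
# direct computation with the sigma parameterization
sq_dists = np.sum((X[:, None, :] - Y[None, :, :])**2, axis=-1)
manual = np.exp(-sq_dists / (2 * sigma**2))

# scikit-learn's parameterization: gamma = 1 / (2 * sigma^2)
sk = rbf_kernel(X, Y, gamma=1 / (2 * sigma**2))
print(np.allclose(manual, sk))  # → True
```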
