Let us start from the beginning. A support vector machine is a linear model: it always looks for a hyperplane separating one class from another. I will focus on the two-dimensional case because it is easier to comprehend and possible to visualize, to give some intuition; however, bear in mind that the same holds in higher dimensions (lines simply become planes, parabolas become paraboloids, etc.).

Kernel in very short words

What kernels do is change the definition of the dot product in the linear formulation. What does that mean? SVM works entirely with dot products, which in finite dimension are defined as <x,y> = x^T y = SUM_{i=1}^d x_i y_i. The dot product more or less captures similarity between two vectors (it is also the geometric operation of projection, and it is closely related to the angle between the vectors). What the kernel trick does is replace every occurrence of <x,y> in the SVM math with K(x,y), declaring "K is a dot product in SOME space": for each valid kernel there exists a mapping f_K such that K(x,y) = <f_K(x), f_K(y)>. The trick is that you never use f_K directly; you only compute dot products through K, which saves you tons of time (sometimes an infinite amount, as f_K(x) might have an infinite number of dimensions). Ok, so what does it mean for us? We still "live" in the space of x, not f_K(x). The result is quite nice: if you build a hyperplane in the space of f_K, separate your data there, and then look back at the space of x (you might say you project the hyperplane back through f_K^{-1}), you get non-linear decision boundaries! The type of boundary depends on f_K, f_K depends on K, and thus the choice of K will (among other things) affect the shape of your boundary.
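The "K is a dot product in SOME space" claim can be checked by hand for a small case. Below is a sketch (assuming NumPy is available; the names K and f_K mirror the notation above and are not from any library) for the homogeneous degree-2 polynomial kernel in 2D, whose explicit feature map f_K is known in closed form:

```python
import numpy as np

# Homogeneous polynomial kernel of degree 2 in 2D: K(x, y) = <x, y>^2.
def K(x, y):
    return np.dot(x, y) ** 2

# Its explicit feature map: f_K(x) = (x1^2, sqrt(2)*x1*x2, x2^2),
# so that K(x, y) = <f_K(x), f_K(y)> in a 3-dimensional space.
def f_K(x):
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

kernel_value = K(x, y)                   # computed in the original 2D space
explicit_value = np.dot(f_K(x), f_K(y))  # computed in the induced 3D space

# Both routes give the same number: the kernel *is* a dot product elsewhere,
# but K never materializes the 3D vectors. For RBF the induced space is
# infinite-dimensional, so only the K route is even possible.
```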

Linear kernel

Here we in fact do not have any kernel; you just have the "normal" dot product, so in 2D your decision boundary is always a straight line.

As you can see, we can separate most of the points correctly, but due to the "stiffness" of our assumption we will never capture all of them.
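In code, the same experiment might look like this (a sketch assuming scikit-learn; the blob data is a made-up stand-in for the applet's points). The point is that the fitted model really is just a line w·x + b = 0 in the input space:

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated blobs in 2D -- toy data, not the applet's points.
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(20, 2) + [5, 5], rng.randn(20, 2) - [5, 5]])
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# With a linear kernel the model exposes its hyperplane directly:
# one weight vector w of length 2 and an intercept b.
w, b = clf.coef_[0], clf.intercept_[0]
train_acc = clf.score(X, y)  # 1.0 here, since the blobs are separable
```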

Polynomial kernel

Here, our kernel induces a space of polynomial combinations of our features, up to a certain degree. Consequently, we can work with slightly "bent" decision boundaries, such as parabolas with degree=2.

As you can see, we separated even more points! Ok, so can we get all of them by using a higher-order polynomial? Let's try degree 4!

Unfortunately not. Why? Because polynomial combinations are still not flexible enough. They will not "bend" our space hard enough to capture what we want (and maybe that is not so bad? I mean, look at this point: it looks like an outlier!).
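To make the "bent boundary" intuition concrete, here is a sketch (assuming scikit-learn; the ring-shaped data is invented for illustration). A degree-2 polynomial kernel with coef0=0 induces exactly the features x1², x1·x2, x2², so a circle x1² + x2² = r² in the input space is a plain hyperplane in the induced space, while no straight line can separate a surrounded class:

```python
import numpy as np
from sklearn.svm import SVC

# Toy "ring" data: one class on a small circle, the other on a large one.
rng = np.random.RandomState(0)
angles = rng.uniform(0, 2 * np.pi, 40)
inner = np.c_[0.5 * np.cos(angles[:20]), 0.5 * np.sin(angles[:20])]
outer = np.c_[2.0 * np.cos(angles[20:]), 2.0 * np.sin(angles[20:])]
X = np.vstack([inner, outer])
y = np.array([0] * 20 + [1] * 20)

# Degree-2 polynomial kernel: the circular boundary becomes linear
# in the induced space of squared/cross features.
poly = SVC(kernel="poly", degree=2, gamma=1.0, coef0=0.0, C=10.0).fit(X, y)

# A linear kernel has no such luck: no single line separates a ring.
lin = SVC(kernel="linear", C=10.0).fit(X, y)

poly_acc = poly.score(X, y)  # perfect on this toy set
lin_acc = lin.score(X, y)    # necessarily imperfect
```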

RBF kernel

Here, our induced space is a space of Gaussian distributions: each point becomes (up to scaling) the probability density function of a normal distribution. In such a space, dot products are integrals (as we have an infinite number of dimensions!), and consequently we have extreme flexibility. In fact, using this kernel you can separate everything (but is that a good thing?).
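The "you can separate everything" claim can be demonstrated with a sketch (assuming scikit-learn; the gridded data and the specific gamma/C values are illustrative choices, not recommendations). With a large gamma each point's Gaussian bump is very narrow, the kernel matrix is nearly the identity, and even pure noise becomes separable:

```python
import numpy as np
from sklearn.svm import SVC

# A 6x6 grid of points with completely random labels -- no structure at all.
xx, yy = np.meshgrid(np.arange(6.0), np.arange(6.0))
X = np.c_[xx.ravel(), yy.ravel()]
y = np.random.RandomState(0).randint(0, 2, len(X))

# Large gamma -> very narrow Gaussians -> every labelling is separable.
clf = SVC(kernel="rbf", gamma=10.0, C=1e6).fit(X, y)
train_acc = clf.score(X, y)  # 1.0: the model memorised noise (overfitting)
```

Perfect training accuracy on random labels is exactly the flexibility (and the overfitting risk) the comparison below ranks highest for RBF.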

Rough comparison

Ok, so what are the main differences? I will now rank these three kernels along a few measures:

  • time of SVM learning: linear < poly < rbf
  • ability to fit any data: linear < poly < rbf
  • risk of overfitting: linear < poly < rbf
  • risk of underfitting: rbf < poly < linear
  • number of hyperparameters: linear (0) < rbf (2) < poly (3)
  • how "local" is particular kernel: linear < poly < rbf

So which one should you choose? It depends. Vapnik and Cortes (inventors of the SVM) supported quite strongly the idea that you should always try to fit the simplest possible model first, and only go for more complex ones if it underfits. So you should generally start with the linear model (linear kernel, in the case of SVM), and if it gets really bad scores, switch to poly/rbf (but remember that those are much harder to work with due to the number of hyperparameters).
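That "simplest first" workflow can be sketched as follows (assuming scikit-learn; the dataset, the 0.8 threshold, and the use of default hyperparameters are all arbitrary choices for illustration, not part of the original advice):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Stand-in dataset with a non-linear target, so the sketch actually runs.
rng = np.random.RandomState(0)
X = rng.randn(60, 2)
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.5).astype(int)  # circular boundary

# Step 1: try the simplest model first.
linear_score = cross_val_score(SVC(kernel="linear"), X, y, cv=5).mean()

# Step 2: only if it clearly underfits, reach for a more flexible kernel
# (in real use you would also tune C and gamma, e.g. with a grid search).
rbf_score = None
if linear_score < 0.8:  # 0.8 is an arbitrary threshold for this sketch
    rbf_score = cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean()
```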

All images were made using a nice applet on the libSVM site - give it a try, nothing gives you more intuition than lots of images and interaction :-) https://www.csie.ntu.edu.tw/~cjlin/libsvm/

Answer from lejlot on Stack Overflow