Here I will try to provide some intuition for SVCs (support vector classifiers). The code for the figures is at the end.


Linear SVCs

Suppose you have a batch of 2-pixel "images", and each image is labelled either A or B. Here's a sample of 5 images:

Each pixel is considered a feature. In this example, the two pixels define a 2D feature space that we can visualise along an x and y axis. When you have an 8x8 image, for example, you have a 64-dimensional feature space.
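
As a quick aside, here's a minimal sketch of the "each pixel is a feature" idea using scikit-learn's built-in 8x8 digits dataset (used purely for illustration; it's not the 2-pixel data in the figures): each image is just flattened into a 64-dimensional feature vector.

from sklearn.datasets import load_digits

digits = load_digits()
print(digits.images.shape)  #(1797, 8, 8) - the raw 8x8 images
print(digits.data.shape)    #(1797, 64)  - the same images, one row of 64 pixel features each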

For each image (there are 500), you can consider the pixel values (intensities) as features. Plotting the samples in feature space (pixel1-pixel2 space), and colouring them by their label, yields this representation:

Linear SVCs work by drawing a line in feature space that separates the two classes. The best line is the one with the widest margin. The entire area above the decision line is considered to belong to class A, and the entire area below the decision line is considered to be class B. When you have a new sample, the SVC assesses where it falls in relation to the decision line: above the line means it will predict 'A', and below the line means it'll predict 'B'.
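
To make the "which side of the line" rule concrete, here's a small sketch (separate from the figure code at the end, and using a toy make_blobs dataset purely for illustration) showing that a fitted linear SVC predicts from the sign of w·x + b:

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X_demo, y_demo = make_blobs(n_samples=100, centers=2, random_state=0)
svc_demo = SVC(kernel='linear').fit(X_demo, y_demo)

w, b = svc_demo.coef_.ravel(), svc_demo.intercept_[0]
side = X_demo @ w + b  #positive on one side of the decision line, negative on the other
manual_pred = np.where(side > 0, svc_demo.classes_[1], svc_demo.classes_[0])
print((manual_pred == svc_demo.predict(X_demo)).all())  #True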

The margin extends out from the decision line until it first hits a sample on either side. The samples that limit or define the margin are called support vectors:

In this case the best margin is defined by 3 samples. In practice, you usually want the margin (and therefore the SVC's decision line, or solution) to be defined by more samples, i.e. to allow more margin violations. This is controlled by the regularisation parameter C. Smaller values of C allow more violations, so more samples become support vectors and influence the margin (more regularised, potentially underfitting). Larger values of C only let a few samples become support vectors, so the margin might end up depending on just a few noisy points (less regularised, potentially overfitting). The optimal C for your dataset is a matter of hyperparameter tuning.
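
Here's a rough illustration of that effect (again a sketch on toy make_blobs data with deliberate class overlap, not the figure data): decreasing C typically increases the number of support vectors.

from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X_demo, y_demo = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)
for C in (0.01, 1, 100):
    svc_demo = SVC(kernel='linear', C=C).fit(X_demo, y_demo)
    print(f'C={C}: {len(svc_demo.support_)} support vectors')
#smaller C -> margin defined by more samples (more regularised)
#larger C  -> margin defined by fewer samples (less regularised)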


Non-linear SVCs and the kernel trick

The kernel trick is used when you can't linearly separate the classes using the features you have, such as in this example:

In such cases, you can try creating more complex features, such as these quadratic ones: pixel1 x pixel2, (pixel1)^2, (pixel2)^2 (similar to kernel='poly', degree=2). In this richer feature space, you have a better chance of finding a linear separation of the classes. The kernel trick is an efficient way of converging on a linear separation in the higher dimensional space - it saves you explicitly computing the new features, while still allowing you to navigate the space for a solution. When you bring that linear solution back into the original pixel1-pixel2 space, it'll be nonlinear because it was only linear in the higher dimensional space where it was found. This results in complex decision boundaries:
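
As a side check that the trick itself is valid: for the degree-2 case with gamma=1 and coef0=0 (my assumptions here, not sklearn's defaults), the kernel value (x·z)^2 equals the dot product of the explicitly computed quadratic features, which is what lets the optimiser behave as if those features existed without ever building them:

import numpy as np

def phi(p):
    #explicit quadratic feature map: [p1^2, sqrt(2)*p1*p2, p2^2]
    return np.array([p[0]**2, np.sqrt(2) * p[0] * p[1], p[1]**2])

x = np.array([1.5, -0.5])
z = np.array([0.3, 2.0])

kernel_value = (x @ z) ** 2       #computed directly in the original 2D space
explicit_value = phi(x) @ phi(z)  #computed in the expanded feature space
print(np.isclose(kernel_value, explicit_value))  #True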

I tried to replicate this result manually - i.e. without using the kernel trick. I combined the features to create the product features described above (pixel 1 times pixel 2, etc), and then ran it through a linear SVC such that it would find a linear separation of my quadratic feature set. I get a similar result (green) to the above (red):

SVCs are binary classifiers, like the one shown above. When you apply an SVC to more than 2 classes, sklearn creates several binary SVCs (one for each pair of classes), and the final result is determined by one-vs-one voting.
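
As a sketch of that behaviour (on hypothetical 4-class make_blobs data), you can see the pairwise classifiers through the 'ovo' decision function shape: 4 classes gives 6 class pairs, hence 6 columns.

from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X_mc, y_mc = make_blobs(n_samples=200, centers=4, random_state=0)
svc_mc = SVC(kernel='linear', decision_function_shape='ovo').fit(X_mc, y_mc)
print(svc_mc.decision_function(X_mc).shape)  #(200, 6): one column per pair of classes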


Code used to generate the plots

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

#
#Synthesise 2-pixel images. Each image belongs to A or B.
#
rand_seed = 0  #any fixed seed; needed so the synthetic data is reproducible
np.random.seed(rand_seed)

synth = np.random.randn(4, 250)
X_df = pd.DataFrame({'pixel1': np.c_[ synth[[0]] + 2, synth[[1]] - 2 ].ravel(),
                     'pixel2': np.c_[ synth[[2]] + 2, synth[[3]] - 5 ].ravel(),
                     'label': ['A'] * 250 + ['B'] * 250})
X_df.index.name = 'sample'
X, y = X_df.drop(columns='label').values, X_df.label.values

#
# View some samples
#
f, axs = plt.subplots(1, 5)
for ax in axs.flatten():
    i = np.random.randint(0, 500)
    ax.imshow(X_df.loc[i, ['pixel1', 'pixel2']].infer_objects().values[None, :],
              cmap='binary', vmin=-5, vmax=5)
    #ax.axis('off')
    ax.set_title(X_df.label[i])
    ax.set_xticks([])
    ax.set_yticks([])

#
# View samples in feature space (pixel1-pixel2 space)
#
f, ax = plt.subplots(figsize=(6, 3))
for label in X_df.label.unique():
    X_df.loc[X_df.label == label].plot(
        kind='scatter', x='pixel1', y='pixel2', c='red' if label=='A' else 'blue', label=label, ax=ax
    )
ax_lims = ax.axis()

#
# Fit linear SVC
#
from sklearn.svm import SVC
svc = SVC(kernel='linear').fit(X, y)

#Plot decision boundary
(w0, w1), b = svc.coef_.ravel().tolist(), svc.intercept_
x0 = np.array([-10, 10])
x1 = (-w0*x0 - b) / w1
ax.plot(x0, x1, 'k', label='decision boundary', linewidth=3)
ax.plot(x0, x1 + 1/w1, 'k:', label='margin', linewidth=1)
ax.plot(x0, x1 - 1/w1, 'k:', linewidth=1)
ax.axis(ax_lims)
ax.legend()

#Highlight support vectors
ax.scatter(
    svc.support_vectors_[:, 0], svc.support_vectors_[:, 1],
    marker='s', edgecolor='k', facecolor='none', s=70, label='support vectors'
)
ax.legend()

Code example for non-linear SVMs/kernel trick:

#
# Kernel trick demo
#
np.random.seed(1)

#Synthesise a dataset that's not linearly separable
synth = np.random.randn(4, 250)
X_df = pd.DataFrame({'pixel1': np.c_[ synth[[0]] + 2, synth[[1], :125] - 2, synth[[1], 125:] + 8 ].ravel(),
                     'pixel2': np.c_[ synth[[2]] + 2, synth[[3], :125] - 5, synth[[3], 125:] + 8 ].ravel(),
                     'label': ['A'] * 250 + ['B'] * 250})
X_df.index.name = 'sample'
X, y = X_df.drop(columns='label').values, X_df.label.values

#Plot the data
f, ax = plt.subplots(figsize=(6, 3))
for label in X_df.label.unique():
    X_df.loc[X_df.label == label].plot(
        kind='scatter', x='pixel1', y='pixel2', c='red' if label=='A' else 'blue', label=label, ax=ax
    )
ax_lims = ax.axis()

#
#SVC(kernel='poly') uses the kernel trick
#
svc = SVC(kernel='poly', degree=2, C=1e6).fit(X, y)

#Evaluate the fitted SVC on a grid covering the feature space
xx, yy = np.meshgrid(np.linspace(-10, 10),
                     np.linspace(-10, 10))
X_grid = np.c_[xx.ravel(), yy.ravel()]  #separate name so the training data X isn't overwritten
preds = svc.predict(X_grid).reshape(xx.shape)
preds = np.float32(preds == 'B')  #0 where the prediction is 'A', 1 where it's 'B'
decision = svc.decision_function(X_grid).reshape(xx.shape)

ax.contour(xx, yy, preds, cmap='Reds')

#
# Manual equivalent of the SVC above
# but without using the kernel trick
#

#Create a richer feature space from the existing features
#We'll opt for a quadratic polynomial feature space, like the SVC above
X2_df = X_df.copy()
X2_df['pixel1_squared'] = X2_df.pixel1 ** 2
X2_df['pixel2_squared'] = X2_df.pixel2 ** 2
X2_df['pixel1_pixel2'] = X2_df.pixel1 * X2_df.pixel2
X2 = X2_df.drop(columns='label').values

#Find a linear separation in this space
svc2 = SVC(kernel='linear', C=1e6).fit(X2, y)

#Plot the decision boundary: map the same grid into the quadratic feature space
X2_grid = np.c_[X_grid, X_grid[:, 0]**2, X_grid[:, 1]**2, X_grid[:, 0]*X_grid[:, 1]]
preds2 = svc2.predict(X2_grid).reshape(xx.shape)
preds2 = np.float32(preds2 == 'B')  #0 where the prediction is 'A', 1 where it's 'B'

decision2 = svc2.decision_function(X2_grid).reshape(xx.shape)
ax.contour(xx, yy, preds2, cmap='Greens')
January 30, 2024 - How to apply Support Vector Machines to the problems of image classification and detection. ... Ask your questions in the comments below, and I will do my best to answer. ...using OpenCV in advanced ways and work beyond pixels · Discover how in my new Ebook: Machine Learing in OpenCV · It provides self-study tutorials with all working code in Python to turn you from a novice to expert. It equips you with logistic regression, random forest, SVM, k-means clustering, neural networks, and much more...all using the machine learning module in OpenCV