Consider building an SVM over the (very little) data set shown in the figure. For an example like this, the maximum-margin weight vector will be parallel to the shortest line connecting points of the two classes, that is, the line between (1, 1) and (2, 3), giving a weight vector of (1, 2). The optimal decision surface is orthogonal to that line and intersects it at the halfway point. Therefore, it passes through (1.5, 2). So, the SVM decision boundary is:

x_1 + 2 x_2 = 5.5

Working algebraically, with the standard constraint that y_i (w^T x_i + b) >= 1, we seek to minimize ||w||. This happens when the constraint is satisfied with equality by the two support vectors. Further, we know that the solution is w = (a, 2a) for some a. So we have that:

a + 2a + b = -1
2a + 6a + b = 1

Therefore a = 2/5 and b = -11/5. So the optimal hyperplane is given by

w = (2/5, 4/5)

and b = -11/5. The margin is

2/||w|| = 2/sqrt(4/25 + 16/25) = 2/(2/sqrt(5)) = sqrt(5)

This answer can be confirmed geometrically by examining the figure.

Answer from Ehsan Keramat on Stack Overflow
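
As a quick numerical sanity check (not part of the answer quoted above), the result can be reproduced with scikit-learn by fitting a near-hard-margin linear SVM on the two points used in the example, (1, 1) labelled -1 and (2, 3) labelled +1; a very large C approximates the hard-margin case:

```python
# Sketch: verify the worked example with scikit-learn.
# Assumes the two training points are (1, 1) with label -1 and (2, 3) with label +1.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [2.0, 3.0]])
y = np.array([-1, 1])

clf = SVC(kernel="linear", C=1e6)   # very large C ~ hard margin
clf.fit(X, y)

w = clf.coef_[0]                    # expected ~ (2/5, 4/5)
b = clf.intercept_[0]               # expected ~ -11/5
margin = 2 / np.linalg.norm(w)      # expected ~ sqrt(5) ≈ 2.236
print("w =", w, "b =", b, "margin =", margin)
```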
🌐
MathWorks
mathworks.com › statistics and machine learning toolbox › classification › support vector machine classification
margin - Find classification margins for support vector machine (SVM) classifier - MATLAB
If the response variable is a character array, then each element must correspond to one row of the array. ... Class labels, specified as a categorical, character, or string array, a logical or numeric vector, or a cell array of character vectors. Y must have the same data type as SVMModel.ClassNames. (The software treats string arrays as cell arrays of character vectors.) The length of Y must equal the number of rows in Tbl or the number of rows in X. ... The edge is the weighted mean of the classification margins.
🌐
scikit-learn
scikit-learn.org › stable › auto_examples › svm › plot_svm_margin.html
SVM Margins Example — scikit-learn 1.8.0 documentation
This is sqrt(1+a^2) away vertically in 2-d. margin = 1 / np.sqrt(np.sum(clf.coef_**2)) yy_down = yy - np.sqrt(1 + a**2) * margin yy_up = yy + np.sqrt(1 + a**2) * margin # plot the line, the points, and the nearest vectors to the plane ...
🌐
Stanford NLP Group
nlp.stanford.edu › IR-book › html › htmledition › support-vector-machines-the-linearly-separable-case-1.html
Support vector machines: The linearly separable case
While some learning methods such ... be looking for a decision surface that is maximally far away from any data point. This distance from the decision surface to the closest data point determines the margin of the classifier....
🌐
Jeremy Jordan
jeremyjordan.me › support-vector-machines
Support vector machines. - Jeremy Jordan
October 18, 2017 - We can calculate the margin of our hyperplane by comparing the classifier's prediction (${{w^T}x + b}$) to the actual class ($y _i$) since our training dataset has a labeled class for each data point.
Top answer
1 of 2
1

The optimization objective of an SVM is to find w and b such that the hyperplane has the maximum possible margin.

Mathematically speaking, it is a nonlinear (convex quadratic) optimization task which is solved via the Karush-Kuhn-Tucker (KKT) conditions, using Lagrange multipliers.
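
For illustration only (not code from the answer or the linked material), here is a minimal sketch of that route: it solves the dual problem, maximize sum(alpha) - 1/2 * sum_ij alpha_i alpha_j y_i y_j <x_i, x_j> subject to alpha_i >= 0 and sum_i alpha_i y_i = 0, with a general-purpose scipy optimizer on a made-up toy data set, then recovers w and b from the multipliers:

```python
# Sketch: hard-margin SVM dual, solved with a generic constrained optimizer.
# The toy data set below is hypothetical, chosen only for illustration.
import numpy as np
from scipy.optimize import minimize

X = np.array([[1.0, 1.0], [2.0, 3.0], [0.0, 0.5]])
y = np.array([-1.0, 1.0, -1.0])

K = (X @ X.T) * np.outer(y, y)                 # y_i y_j <x_i, x_j>

def neg_dual(alpha):
    # negative of the dual objective (scipy minimizes)
    return 0.5 * alpha @ K @ alpha - alpha.sum()

cons = {"type": "eq", "fun": lambda a: a @ y}  # sum_i alpha_i y_i = 0
bounds = [(0.0, None)] * len(y)                # alpha_i >= 0

res = minimize(neg_dual, np.zeros(len(y)), method="SLSQP",
               bounds=bounds, constraints=cons)
alpha = res.x

w = (alpha * y) @ X                            # w = sum_i alpha_i y_i x_i
sv = alpha > 1e-6                              # support vectors have alpha > 0
b = np.mean(y[sv] - X[sv] @ w)                 # from y_i (w . x_i + b) = 1
print("alpha =", alpha, "w =", w, "b =", b)
```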

The following video explains this in simple terms for the linearly separable case:

https://www.youtube.com/watch?v=1NxnPkZM9bc

How this is calculated is also explained in more detail here, for both the primal and dual formulations:

https://www.csie.ntu.edu.tw/~cjlin/talks/rome.pdf

2 of 2
0

The margin between the separating hyperplane and the class boundaries of an SVM is an essential feature of this algorithm.

See, you have two hyperplanes: (1) w^tx+b >= 1, if y=1, and (2) w^tx+b <= -1, if y=-1. This says that any vector with label y=1 must lie either on or behind hyperplane (1). The same applies to the vectors with label y=-1 and hyperplane (2).

Note: If those requirements can be fulfilled, it implicitly means the dataset is linearly separable. This makes sense, because otherwise no such margin can be constructed.

So, what an SVM tries to find is a decision boundary which is half-way between (1) and (2). Let's define this boundary as (3) w^tx+b=0. What you see here is that (1), (2) and (3) are parallel hyperplanes, because they share the same parameters w and b. The parameter vector w holds the direction of those planes. Recall that a vector always has a direction and a magnitude/length.

The question is now: how can one calculate the hyperplane (3)? Equations (1) and (2) tell us that any vector with label y=1 which is closest to (3) lies exactly on hyperplane (1), hence (1) becomes w^tx+b=1 for such x. The same applies to the closest vectors with a negative label and (2). The vectors on these planes are called 'support vectors', and the decision boundary (3) depends only on them, because one can simply subtract (2) from (1) for the support vectors and get:

(w^t x_1 + b) - (w^t x_2 + b) = 1 - (-1)  =>  w^t x_1 - w^t x_2 = 2

Note: x_1 and x_2 are different support vectors, one lying on plane (1) and one on plane (2).

Now, we want to keep the direction of w but ignore its length, in order to get the shortest distance between (3) and the other planes. This distance is a perpendicular line segment from (3) to the others. To do so, one can divide by the length of w, which turns w into a unit vector perpendicular to (3); hence (w^t x_1 - w^t x_2)/||w|| = 2/||w||. The left-hand side is exactly the distance between the two planes, so that distance is 2/||w||. This distance must be maximized.

Edit: As others state here, use Lagrange multipliers or the SMO algorithm to minimize the term 1/2 ||w||^2 s.t. y_i(w^t x_i + b) >= 1 for all i. This is the convex form of the optimization problem for the primal SVM.
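
A minimal sketch of that primal problem, using scipy's SLSQP on the two points assumed in the worked example at the top; a real implementation would use a dedicated QP or SMO solver, so treat this as illustration only:

```python
# Sketch: primal hard-margin SVM, minimize 1/2 ||w||^2 s.t. y_i (w . x_i + b) >= 1.
# Data points are the ones assumed in the worked example above.
import numpy as np
from scipy.optimize import minimize

X = np.array([[1.0, 1.0], [2.0, 3.0]])
y = np.array([-1.0, 1.0])

def objective(params):
    w = params[:2]
    return 0.5 * np.dot(w, w)

def margin_constraints(params):
    w, b = params[:2], params[2]
    return y * (X @ w + b) - 1.0        # each entry must be >= 0

res = minimize(objective, x0=np.zeros(3), method="SLSQP",
               constraints={"type": "ineq", "fun": margin_constraints})
w, b = res.x[:2], res.x[2]
print("w =", w, "b =", b, "margin =", 2 / np.linalg.norm(w))
# Expected: w ~ (0.4, 0.8), b ~ -2.2, margin ~ sqrt(5)
```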

Find elsewhere
🌐
SVM Tutorial
svm-tutorial.com › home › svm - understanding the math - part 2
SVM - Understanding the math - What is a vector?
July 26, 2020 - We can see in Figure 23 that this distance is the same thing as . Let's compute this value. We start with two vectors, which is normal to the hyperplane, and which is the vector between the origin and . ... We did it ! We computed the margin of the hyperplane !
🌐
GeeksforGeeks
geeksforgeeks.org › machine learning › using-a-hard-margin-vs-soft-margin-in-svm
Using a Hard Margin vs Soft Margin in SVM - GeeksforGeeks
July 23, 2025 - The hyperplane equation plays a crucial role in hard margin SVMs because it defines the boundary that separates the classes. Ideally, we want this boundary to have a maximum margin from the nearest data points of each class. The objective function in hard margin SVM aims to find the weight vector and bias term that maximize this margin while ensuring that all data points are correctly classified.
🌐
Towards Data Science
towardsdatascience.com › home › latest › support vector machines – soft margin formulation and kernel trick
Support Vector Machines - Soft Margin Formulation and Kernel Trick | Towards Data Science
January 21, 2025 - Let us compare this with SVM’s objective which handles the linearly separable cases (as given below). ... We see that only ξ_i terms are extra in the modified objective and everything else is the same. Point to note: In the final solution, λ_is corresponding to points that are closest to the margin and on the wrong side of the margin (i.e.
🌐
Medium
medium.com › @skilltohire › support-vector-machines-4d28a427ebd
Support Vector Machines. Introduction to margins of separation… | by skilltohire | Medium
July 28, 2020 - Introduction to margins of separation: Margin of separation as the name itself suggests is some sort of margin or boundary which is used…
🌐
Medium
medium.com › @apurvjain37 › support-vector-machines-s-v-m-hyperplane-and-margins-ee2f083381b4
Support Vector Machines(S.V.M) — Hyperplane and Margins | by apurv jain | Medium
September 25, 2020 - An SVM model is basically a representation of different classes in a hyperplane in multidimensional space. The hyperplane will be generated in an iterative manner by SVM so that the error can be minimized. The goal of SVM is to divide the datasets into classes to find a maximum marginal hyperplane ...
Top answer
1 of 3
11

Solving the SVM problem by inspection

By inspection we can see that the decision boundary is the line $x_2 = x_1 - 3$. Using the formula $w^T x + b = 0$ we can obtain a first guess of the parameters as

$$ w = [1,-1] \ \ b = -3$$

Using these values we would obtain the following width between the support vectors: $\frac{2}{\sqrt{2}} = \sqrt{2}$. Again by inspection we see that the width between the support vectors is in fact of length $4 \sqrt{2}$ meaning that these values are incorrect.

Recall that scaling the boundary by a factor of $c$ does not change the boundary line, hence we can generalize the equation as

$$ cx_1 - cx_2 - 3c = 0$$ $$ w = [c,-c] \ \ b = -3c$$

Plugging back into the equation for the width we get

$$\begin{aligned} \frac{2}{||w||} & = 4 \sqrt{2} \\ \frac{2}{\sqrt{2}c} & = 4 \sqrt{2} \\ c & = \frac{1}{4} \end{aligned}$$

Hence the parameters are in fact $$ w = [\frac{1}{4},-\frac{1}{4}] \ \ b = -\frac{3}{4}$$

To find the values of $\alpha_i$ we can use the following two constraints which come from the dual problem:

$$ w = \sum_i^m \alpha_i y^{(i)} x^{(i)} $$ $$\sum_i^m \alpha_i y^{(i)} = 0 $$

And using the fact that $\alpha_i > 0$ only for the support vectors (i.e. 3 vectors in this case), we obtain the system of simultaneous linear equations: $$\begin{aligned} \begin{bmatrix} 6 \alpha_1 - 2 \alpha_2 - 3 \alpha_3 \\ -1 \alpha_1 - 3 \alpha_2 - 4 \alpha_3 \\ 1 \alpha_1 - 1 \alpha_2 - 1 \alpha_3 \end{bmatrix} & = \begin{bmatrix} 1/4 \\ -1/4 \\ 0 \end{bmatrix} \\ \alpha & = \begin{bmatrix} 1/16 \\ 1/16 \\ 0 \end{bmatrix} \end{aligned}$$
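
This system can be checked numerically. The support-vector coordinates below are read off from the coefficient matrix above (each row is a component of $y_i x^{(i)}$), i.e. the assumption here is $x^{(1)} = (6, -1)$ with $y^{(1)} = +1$, $x^{(2)} = (2, 3)$ with $y^{(2)} = -1$, and $x^{(3)} = (3, 4)$ with $y^{(3)} = -1$:

```python
# Sketch: solve the 3x3 system for alpha and recover w and b.
import numpy as np

X = np.array([[6.0, -1.0], [2.0, 3.0], [3.0, 4.0]])   # assumed support vectors
y = np.array([1.0, -1.0, -1.0])                        # assumed labels

A = np.vstack([(y[:, None] * X).T,   # rows 1-2: sum_i alpha_i y_i x_i = w
               y])                   # row 3:    sum_i alpha_i y_i = 0
rhs = np.array([1/4, -1/4, 0.0])

alpha = np.linalg.solve(A, rhs)
print("alpha =", alpha)              # expected [1/16, 1/16, 0]

w = (alpha * y) @ X                  # expected (1/4, -1/4)
b = y[0] - X[0] @ w                  # from y_1 (w . x_1 + b) = 1, expected -3/4
print("w =", w, "b =", b)
```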

Source

  • https://ai6034.mit.edu/wiki/images/SVM_and_Boosting.pdf
  • Full post here
2 of 3
1

Instead of computing the width between the support vectors (which in this case was easy because two of them happened to be directly across from each other over the decision line), it might be more convenient to use the fact that the support vectors should have value $\pm1$ under the decision function:

$$ cx_1 - cx_2 -3c =0 $$

represents the line, but using the point $B=(2,3)$ with target $-1$ in the diagram, we should have

$$ c(2) - c(3) -3c =-1$$

and hence (again) $c=1/4$.
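
A tiny check of that arithmetic (plugging $B=(2,3)$ into the scaled decision function):

```python
# c*x1 - c*x2 - 3c evaluated at B = (2, 3) should equal -1 when c = 1/4.
c = 1 / 4
value = c * 2 - c * 3 - 3 * c
print(value)   # -1.0, so B sits exactly on the negative margin boundary
```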

🌐
DEV Community
dev.to › harsimranjit_singh_0133dc › support-vector-machines-from-hard-margin-to-soft-margin-1bj1
Support Vector Machines: From Hard Margin to Soft Margin - DEV Community
August 12, 2024 - In SVMs, the margin is the distance between the hyperplane and the closest data points from each class(support vectors). To calculate the margin, we use the following formula:
🌐
Quora
quora.com › What-is-the-mathematical-definition-of-margin-in-support-vector-machine-SVM
What is the mathematical definition of margin in support vector machine (SVM)? - Quora
Answer (1 of 2): I’ve explained SVMs in detail here — In layman's terms, how does SVM work? — including what is the margin. In short, you want to find a line that separates the points in two classes, while being as far as possible from each class. So in the figure below, the bold line ...
🌐
Stack Exchange
math.stackexchange.com › questions › 2609510 › how-to-calculate-the-width-of-the-separating-margin-in-support-vector-machine-s
optimization - How to calculate the width of the separating margin in support vector machine (SVM) - Mathematics Stack Exchange
January 20, 2018 - There are 4 points, $\mathrm x_1 = (1,1,0,1), \;\mathrm x_2= (1,1,0,-1) , \; \mathrm x_3 = (-1,1,0,1) , \; \mathrm x_4 = (1,-2,0,1)$ in a 4-Dimensional Euclidean space. Let the labels of $x_1$ and $x_2$ be $+1$ and of $x_3$ and $x_4$ be $-1.$ Suppose there is a separating hyperplane between the two labels given by $\mathrm w^\top \mathrm x = 0,$ where $\mathrm w = (4,3,0,0).$ Calculate margin of separating hyperplane.