how to optimize svm python - Brave Search

Techniques to improve the accuracy of SVM classifier

stackoverflow.com › questions › 39001936 › techniques-to-improve-the-accuracy-of-svm-classifier

For SVM, it's important to have the same scaling for all features and normally it is done through scaling the values in each (column) feature such that the mean is 0 and variance is 1. Another way is to scale it such that the min and max are for example 0 and 1. However, there isn't any difference between [0, 1] and [0, 10]. Both will show the same performance.
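The scaling advice above can be tried out with a small comparison. This is only a sketch: the dataset (`load_breast_cancer`), the RBF kernel, and the 5-fold cross-validation are assumptions chosen for illustration, not part of the original answer.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

# Illustrative dataset (an assumption, not the asker's exact data)
X, y = load_breast_cancer(return_X_y=True)

models = {
    "no scaling": SVC(kernel="rbf"),
    # mean 0, variance 1 per feature
    "standardized": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    # min 0, max 1 per feature
    "min-max [0, 1]": make_pipeline(MinMaxScaler(), SVC(kernel="rbf")),
}

scores = {}
for name, model in models.items():
    # The scaler sits inside the pipeline, so it is refit on each training fold
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {scores[name]:.3f}")
```

Keeping the scaler inside the pipeline means the test folds never influence the scaling statistics.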

If you insist on using SVM for classification, another way that may result in improvement is ensembling multiple SVM. In case you are using Python, you can try BaggingClassifier from sklearn.ensemble.

Also note that you can't expect to reach arbitrarily high performance on a real dataset. I think 97% is a very good result. You may be overfitting the data if you go higher than this.

Answer from Mahsa.Ghasemi on Stack Overflow

linkedin.com › pulse › svm-parameter-optimization-python-step-by-step-guide-usama-zafar

SVM Parameter Optimization with Python: A Step-by-Step Guide

April 16, 2023 - Grid Search: Grid search is a ... to find the optimal combination that yields the best performance. It works by creating a grid of all possible hyperparameter values and evaluating each combination using cross-validation. Here's how you can perform grid search for an SVM model in Python...

scikit-learn.org › stable › modules › svm.html

1.4. Support Vector Machines — scikit-learn 1.8.0 documentation

Proper choice of C and gamma is critical to the SVM’s performance. One is advised to use GridSearchCV with C and gamma spaced exponentially far apart to choose good values. ... You can define your own kernels by either giving the kernel as a python function or by precomputing the Gram matrix.
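A minimal sketch of the GridSearchCV advice, assuming the iris dataset, an RBF kernel, and grid ranges picked purely for illustration. The `svc__` prefixes come from the hypothetical step name chosen in the pipeline.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC(kernel="rbf"))])

# Exponentially spaced candidates (ranges are an assumption for this example)
param_grid = {
    "svc__C": np.logspace(-2, 3, 6),      # 0.01, 0.1, ..., 1000
    "svc__gamma": np.logspace(-4, 1, 6),  # 0.0001, 0.001, ..., 10
}

# Every (C, gamma) pair is scored with 5-fold cross-validation
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(f"best CV accuracy: {search.best_score_:.3f}")
```

A coarse exponential grid like this is usually followed by a finer grid around the best pair.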

Discussions

machine learning - Techniques to improve the accuracy of SVM classifier - Stack Overflow

I am trying to build a classifier to predict breast cancer using the UCI dataset. I am using support vector machines. Despite my most sincere efforts to improve upon the accuracy of the classifier, I More on stackoverflow.com

stackoverflow.com

scikit learn - Making SVM run faster in python - Stack Overflow

Try setting the n_jobs parameter according to the docs here. ... Try MKL Optimizations from Continuum, see store.continuum.io/cshop/mkl-optimizations. They offer a 30 day free trial and cost is $99. I am not a sales rep, but I use their Anaconda Python distribution and like it - it was recommended at Spark Summit training. Incidentally Spark supports SVM ... More on stackoverflow.com

stackoverflow.com

How to implement Incremental Learning for Support Vector Classifier in ML?

Look into online learning. If you are using a linear SVM, you can save the learned weights and continue training from them. Basically, your goal combines transfer learning and online learning.
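One way this could look in scikit-learn is `SGDClassifier` with hinge loss, which fits a linear SVM objective and supports `partial_fit`. This is a sketch under assumptions: the synthetic data and batch size are made up, and it swaps in stochastic gradient descent rather than the libsvm solver behind `SVC`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Synthetic stream of data (hypothetical, for illustration only)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
classes = np.unique(y)

# Hinge loss gives a linear SVM objective trained by SGD
clf = SGDClassifier(loss="hinge", alpha=1e-4, random_state=0)

# Feed the first 1500 samples in batches of 250; the weights learned so far
# are kept between calls, so each call continues from the previous state
for start in range(0, 1500, 250):
    batch = slice(start, start + 250)
    clf.partial_fit(X[batch], y[batch], classes=classes)

print("held-out accuracy:", clf.score(X[1500:], y[1500:]))
print("weights shape:", clf.coef_.shape)
```

The list of classes has to be known on the first `partial_fit` call, because later batches may not contain every class.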

More on reddit.com

r/learnmachinelearning

October 19, 2020

How to draw the decision boundary of SVM

I found it. Using libsvm for 2 features the decision line is produced like that:

    w = model_linear.SVs' * model_linear.sv_coef;
    b = -model_linear.rho;
    y_hat = sign(w' * X' + b);
    sv = full(model_linear.SVs);
    % plot support vectors
    plot(sv(:,1), sv(:,2), 'ko', 'MarkerSize', 10);
    % FOR 2D plot decision boundary
    plot_x = linspace(min(X(:,1)), max(X(:,1)), 30);
    plot_y = (-1/w(2)) * (w(1)*plot_x + b);
    plot(plot_x, plot_y, 'k-', 'LineWidth', 1)

( https://stackoverflow.com/questions/28556266/plot-svm-margins-using-matlab-and-libsvm )

And for decision plane for 3 features:

    xgrid = [0:200];
    ygrid = [0:200];
    [X, Y] = meshgrid(xgrid, ygrid);
    Z = (-b - w(1)*X - w(2)*Y) / w(3);
    surf(X, Y, Z)

( https://www.mathworks.com/matlabcentral/answers/73087-how-to-plot-a-hyper-plane-in-3d-for-the-svm-results ) More on reddit.com

r/learnmachinelearning

July 3, 2020

Videos

Solving Optimization Problem Support Vector Machine SVM || Lesson ...

Optimization Problem Support Vector Machine SVM || Lesson 80 || ...

Support Vector Machine Optimization - Practical Machine Learning ...

Support Vector Machine (SVM) Hyperparameter Tuning In Python | ...

February 7, 2022

Machine Learning Tutorial : SVM Classification Hyperparameter ...

November 25, 2020

SVM and Parameter Optimization with GridSearchCV - YouTube

Python Programming

pythonprogramming.net › svm-optimization-python-machine-learning-tutorial

Support Vector Machine Optimization in Python

The dictionary will be { ||w|| : [w,b] }. When we're all done optimizing, we'll choose the values of w and b for whichever one in the dictionary has the lowest key value (which is ||w||). Finally, we set our transforms. We've explained that our intention there is to make sure we check every ...
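The bookkeeping described in this snippet can be sketched as follows. The candidate `(w, b)` pairs are hypothetical placeholders; in the tutorial they would come from the search over values that satisfy the SVM constraints.

```python
import numpy as np

# Hypothetical (w, b) candidates; assumed to already satisfy the constraints
candidates = [
    (np.array([2.0, 2.0]), 0.5),
    (np.array([1.0, -1.0]), 0.0),
    (np.array([3.0, 0.5]), -1.0),
]

# Dictionary of the form { ||w|| : [w, b] }
opt_dict = {}
for w, b in candidates:
    opt_dict[np.linalg.norm(w)] = [w, b]

# The smallest key is the smallest ||w||, i.e. the widest margin
best_norm = min(opt_dict)
w_opt, b_opt = opt_dict[best_norm]
print(best_norm, w_opt, b_opt)
```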

dnmtechs.com › home › blog › optimizing svm performance in python 3

Optimizing SVM Performance in Python 3 - DNMTechs - Sharing and Storing Technology Knowledge

October 12, 2024 - The GridSearchCV class performs ... SVM performance in Python 3 involves various techniques such as scaling the data and tuning the hyperparameters....

kopaljain95.medium.com › how-to-improve-support-vector-machine-9561ab96ed18

How to Improve Support Vector Machine? | by Kopal Jain | Medium

February 25, 2021 - Why this step: To find an optimal combination of hyperparameters that minimizes a predefined loss function to give better results. ... y_pred = svmModel_grid.predict(X_test)print(y_pred)...[1 1 0 0 0 0 1 1 1 1 0 0 1 0 0 0 1 0 0 0 0 1 1 0 1 0 1 1 1 1 0 1 0 0 0 0 1 0 1 1 0 1 0 1 1 0 1 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 0 0 1 0 1 0 1 0 1 1 0 0 1 0 0 1 1 1 1 1 0 0 0 1 1 1 0 0 1 1 1 0 0 1 1 1 1 0 1 1 1 0 0 1 0 1 1 0 1 1 1 0 1 0 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 1 0 1 0 1 0 0 0 1 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 1 1 0 1 0 1 0 1 1 1 1 1 0 1 1 1 1 1 0 0 1 0 1 1 0 1 1 1 1 1 1 1 0 1 0 1 0]

stackoverflow.com › questions › 39001936 › techniques-to-improve-the-accuracy-of-svm-classifier

machine learning - Techniques to improve the accuracy of SVM classifier - Stack Overflow

Some thoughts came to mind when reading your question and the arguments you put forward about this author claiming to have achieved acc = 99.51%. My first thought was OVERFITTING. I may be wrong, because it might depend on the dataset, but overfitting is the first thing I would suspect. Now my questions:

1- Has the author stated in his article whether the dataset was split into training and testing sets?
2- Was this acc = 99.51% achieved on the training set or on the testing set?

On the training set you can reach acc = 99.51% when your model is overfitting. In that case, the performance of the SVM classifier on unseen data is generally poor.
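A quick way to check for this is to compare training accuracy against held-out and cross-validated accuracy. The sketch below assumes scikit-learn's bundled breast cancer dataset and a deliberately over-flexible RBF SVM (large `C`, large `gamma`); both are illustrative choices, not the setup from the question.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Deliberately flexible model so the train/test gap is easy to see
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1000, gamma=1.0))
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
cv_acc = cross_val_score(model, X_train, y_train, cv=5).mean()

print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
print(f"5-fold CV accuracy on the training split: {cv_acc:.3f}")
```

A large gap between the first number and the other two is the usual sign of overfitting.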

MachineLearningMastery

machinelearningmastery.com › home › blog › method of lagrange multipliers: the theory behind support vector machines (part 3: implementing an svm from scratch in python)

Method of Lagrange Multipliers: The Theory Behind Support Vector Machines (Part 3: Implementing An SVM From Scratch In Python) - MachineLearningMastery.com

March 15, 2022 - Tutorial on the implementation of SVM classifier from scratch using scipy.optimize library and defining linear constraints.

medium.com › @prayushshrestha89 › tuning-svm-hyperparameters-making-your-classifier-shine-like-a-pro-8673639ddb16

Tuning SVM Hyperparameters: Making Your Classifier Shine Like a Pro! | by Prayush Shrestha | Medium

September 9, 2024 - First, here’s the code we’ll use to train our SVM and plot the decision boundaries. The code gives you visuals for how each hyperparameter affects the SVM’s performance on a simple dataset: the Iris flower dataset.

csie.ntu.edu.tw › ~cjlin › talks › rome.pdf

Optimization, Support Vector Machines, and Machine Learning Chih-Jen Lin

Chih-Jen Lin · Department of Computer Science · National Taiwan University · Talk at DIS, University of Rome and IASI, CNR, September, 20 · Outline: Introduction to machine learning and support vector machines (SVM) · SVM and optimization theory

eitca.org › home › what is the objective of the svm optimization problem and how is it mathematically formulated?

What is the objective of the SVM optimization problem and how is it mathematically formulated? - EITCA Academy

June 15, 2024 - For example, to train an SVM with a linear kernel using `scikit-learn`, the following code can be used:

    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC
    from sklearn.metrics import accuracy_score

    # Load a sample dataset
    iris = datasets.load_iris()
    X = iris.data
    y = iris.target

    # Use only two classes for binary classification
    X = X[y != 2]
    y = y[y != 2]

    # Split the dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Create an SVM classifier with a linear kernel
    svm = SVC(kernel='linear', C=1.0)

    # Train the SVM classifier
    svm.fit(X_train, y_train)

    # Make predictions on the test set
    y_pred = svm.predict(X_test)

    # Evaluate the accuracy of the classifier
    accuracy = accuracy_score(y_test, y_pred)
    print(f'Accuracy: {accuracy:.2f}')

stackoverflow.com › questions › 31681373 › making-svm-run-faster-in-python

scikit learn - Making SVM run faster in python - Stack Overflow

If you want to stick with SVC as much as possible and train on the full dataset, you can use ensembles of SVCs that are trained on subsets of the data to reduce the number of records per classifier (which apparently has quadratic influence on complexity). Scikit supports that with the BaggingClassifier wrapper. That should give you similar (if not better) accuracy compared to a single classifier, with much less training time. The training of the individual classifiers can also be set to run in parallel using the n_jobs parameter.

Alternatively, I would also consider using a Random Forest classifier - it supports multi-class classification natively, it is fast and gives pretty good probability estimates when min_samples_leaf is set appropriately.

I ran a quick test on the iris dataset blown up 100 times with an ensemble of 10 SVCs, each one trained on 10% of the data. It is more than 10 times faster than a single classifier. These are the numbers I got on my laptop:

Single SVC: 45s

Ensemble SVC: 3s

Random Forest Classifier: 0.5s

See below the code that I used to produce the numbers:

import time
import numpy as np
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn import datasets
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Blow the dataset up 100 times by repeating every row
X = np.repeat(X, 100, axis=0)
y = np.repeat(y, 100, axis=0)

# Baseline: one linear SVC per class (one-vs-rest) on the full data.
# class_weight='balanced' replaces the removed 'auto' option.
start = time.time()
clf = OneVsRestClassifier(SVC(kernel='linear', probability=True, class_weight='balanced'))
clf.fit(X, y)
end = time.time()
print("Single SVC", end - start, clf.score(X, y))
proba = clf.predict_proba(X)

# Ensemble: 10 SVCs per class, each trained on 10% of the data
n_estimators = 10
start = time.time()
clf = OneVsRestClassifier(BaggingClassifier(SVC(kernel='linear', probability=True, class_weight='balanced'), max_samples=1.0 / n_estimators, n_estimators=n_estimators))
clf.fit(X, y)
end = time.time()
print("Bagging SVC", end - start, clf.score(X, y))
proba = clf.predict_proba(X)

# Alternative: a random forest, which handles multi-class natively
start = time.time()
clf = RandomForestClassifier(min_samples_leaf=20)
clf.fit(X, y)
end = time.time()
print("Random Forest", end - start, clf.score(X, y))
proba = clf.predict_proba(X)

If you want to make sure that each record is used only once for training in the BaggingClassifier, you can set the bootstrap parameter to False.
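A small sketch of the `bootstrap` option, assuming the plain iris dataset and a fixed `random_state` for repeatability. As far as I understand, `bootstrap=False` draws each estimator's subset without replacement, so a record appears at most once within a given subset. Different subsets can still overlap.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

n_estimators = 10
bag = BaggingClassifier(
    SVC(kernel="linear"),
    n_estimators=n_estimators,
    max_samples=1.0 / n_estimators,  # each SVC sees 10% of the rows
    bootstrap=False,                 # sample rows without replacement
    n_jobs=1,                        # raise (e.g. -1) to train estimators in parallel
    random_state=0,
)
bag.fit(X, y)

# Row indices drawn for each estimator
subset_sizes = [len(idx) for idx in bag.estimators_samples_]
no_repeats = all(len(set(idx)) == len(idx) for idx in bag.estimators_samples_)
print(subset_sizes, no_repeats, bag.score(X, y))
```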

SVM classifiers don't scale so easily. From the docs, about the complexity of sklearn.svm.SVC:

The fit time complexity is more than quadratic with the number of samples which makes it hard to scale to dataset with more than a couple of 10000 samples.

In scikit-learn you have svm.LinearSVC, which scales better. Apparently it can handle your data.

Alternatively you could just go with another classifier. If you want probability estimates I'd suggest logistic regression. Logistic regression also has the advantage of not needing probability calibration to output 'proper' probabilities.

Edit:

I did not know about the complexity of LinearSVC; I finally found this information in the user guide:

Also note that for the linear case, the algorithm used in LinearSVC by the liblinear implementation is much more efficient than its libsvm-based SVC counterpart and can scale almost linearly to millions of samples and/or features.

To get probabilities out of a LinearSVC, check out this link. It is just a couple of links away from the probability calibration guide I linked above and contains a way to estimate probabilities. Namely:

    prob_pos = clf.decision_function(X_test)
    prob_pos = (prob_pos - prob_pos.min()) / (prob_pos.max() - prob_pos.min())

Note the estimates will probably be poor without calibration, as illustrated in the link.
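For calibrated probabilities, one option is to wrap `LinearSVC` in scikit-learn's `CalibratedClassifierCV`. This is a sketch, not the approach from the linked page: the dataset, the sigmoid (Platt) method, and the 5-fold setting are assumptions for illustration.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# LinearSVC has no predict_proba; the wrapper learns a mapping from
# decision_function scores to probabilities using internal cross-validation
base = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=10000))
calibrated = CalibratedClassifierCV(base, method="sigmoid", cv=5)
calibrated.fit(X_train, y_train)

proba = calibrated.predict_proba(X_test)
print(proba[:5])
print("accuracy:", calibrated.score(X_test, y_test))
```

Unlike min-max scaling of the decision scores, the calibrated outputs are fitted against the true labels.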

Analytics Vidhya

analyticsvidhya.com › home › support vector machine (svm)

Support Vector Machine (SVM)

April 21, 2025 - These are the data points closest to the hyperplane. They are critical in determining the hyperplane’s position and orientation. Support vectors directly influence the optimal hyperplane. In this article, we looked at a very powerful machine learning algorithm, Support Vector Machine in detail. I discussed its concept of working, math intuition behind SVM, implementation in python, the tricks to classify non-linear datasets, Pros and cons, and finally, we solved a problem with the help of SVM.

hackerearth.com › home › blog › simple tutorial on svm and parameter tuning in python and r

Simple Tutorial on SVM and Parameter Tuning in Python and R

March 2, 2023 - They work by finding an optimal separating hyperplane that maximizes the margin between classes for better generalization. Slack variables and the regularization parameter C allow SVMs to handle non-separable data by balancing misclassification ...

datacamp.com › tutorial › svm-classification-scikit-learn-python

Scikit-learn SVM Tutorial with Python (Support Vector Machines) | DataCamp

December 27, 2019 - Regularization: The C parameter in Python's scikit-learn controls regularization. Here C is the penalty parameter for the misclassification or error term, which tells the SVM optimization how much error is bearable.
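The role of C can be illustrated by fitting the same data with several values and counting support vectors. The synthetic dataset and the three C values are assumptions for this sketch. A smaller C typically tolerates more margin violations and so tends to keep more support vectors.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic data (hypothetical, for illustration only)
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

n_support = {}
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # n_support_ holds the number of support vectors per class
    n_support[C] = int(clf.n_support_.sum())
    print(f"C={C}: {n_support[C]} support vectors, "
          f"train accuracy {clf.score(X, y):.3f}")
```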

geeksforgeeks.org › machine learning › svm-hyperparameter-tuning-using-gridsearchcv-ml

SVM Hyperparameter Tuning using GridSearchCV - ML - GeeksforGeeks

Support Vector Machines (SVM) are used for classification tasks but their performance depends on the right choice of hyperparameters like C and gamma. Finding the optimal combination of these hyperparameters can be an issue.

Published 3 weeks ago

github.com › xbeat › Machine-Learning › blob › main › Building a Support Vector Machine (SVM) Algorithm from Scratch in Python.md

Machine-Learning/Building a Support Vector Machine (SVM) Algorithm from Scratch in Python.md at main · xbeat/Machine-Learning

To find the best hyperparameters for our SVM model, let's implement a simple grid search algorithm. This will help us optimize the model's performance.

Author xbeat

Quark Machine Learning

quarkml.com › home › data science › machine learning

Implementing SVM from Scratch Using Python - Quark Machine Learning

April 6, 2025 - What happens in this fit method is really simple, we are trying to reduce the loss in consecutive iterations and find the best parameters w and b. Note that here we are using Batch Gradient Descent. The weights(w) and bias(b) are updated in every iteration using the gradients and the learning rate resulting in the minimization of the loss. When the optimal parameters are found the method simply returns it along with the losses. ... Alright, we have created an SVM class only with the help of NumPy.
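A minimal NumPy sketch of such a fit method, assuming a linear soft-margin SVM with hinge loss trained by batch (sub)gradient descent. The function name, hyperparameters, and toy data are hypothetical and not taken from the article.

```python
import numpy as np

def fit_linear_svm(X, y, C=1.0, lr=0.001, n_iters=1000):
    """Minimize 0.5*||w||^2 + C*sum(max(0, 1 - y*(Xw + b))); y must be -1/+1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    losses = []
    for _ in range(n_iters):
        margins = y * (X @ w + b)
        violated = margins < 1  # points inside the margin or misclassified
        losses.append(0.5 * w @ w + C * np.sum(np.maximum(0.0, 1.0 - margins)))
        # Subgradient over the whole batch
        grad_w = w - C * (y[violated] @ X[violated])
        grad_b = -C * np.sum(y[violated])
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses

# Toy data: two Gaussian blobs with labels -1 and +1
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

w, b, losses = fit_linear_svm(X, y)
accuracy = np.mean(np.sign(X @ w + b) == y)
print("w:", w, "b:", b)
print("first loss:", losses[0], "last loss:", losses[-1], "accuracy:", accuracy)
```

With a fixed learning rate the loss can wobble near the optimum, so a decaying rate is a common refinement.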

sciencedirect.com › science › article › pii › S0377042705005856

Efficient optimization of support vector machine learning parameters for unbalanced datasets - ScienceDirect

November 8, 2005 - In this paper, we propose an automated approach to adjusting the learning parameters using a derivative-free numerical optimizer. To make the optimization process more efficient, a new sensitive quality measure is introduced.

quora.com › How-do-you-optimize-the-parameters-of-a-support-vector-machine-SVM-classifier

How to optimize the parameters of a support vector machine (SVM) classifier - Quora

Answer: You don’t. We don’t use SVMs in the real-world because they aren’t the best model. Now you know.

github.com › yiboyang › PRMLPY › blob › master › ch7 › svm.py

PRMLPY/ch7/svm.py at master · yiboyang/PRMLPY

c1 = np.where(np.prod(X, axis=1) > 0) # indices of data points belonging to the other class ... # set up the convex optimization (QP) problem; the most important settings are the regularizer and the kernel; should be

Author yiboyang