Radial Support Vector Machines(rbf-SVM)

1 of 9

You have to encode your data before calling fit(). As already noted, fit() does not accept strings, but you can work around this.

There are several classes that can be used:

LabelEncoder: maps each distinct string to an integer code
OneHotEncoder: uses the one-of-K scheme to turn each category into its own binary (0/1) column
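A minimal sketch of the two encoders listed above, assuming a single made-up column of color names (the sample values are illustrative, not taken from the original question):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical sample data: one categorical column.
colors = np.array(["RED", "YELLOW", "GREEN", "YELLOW"])

# LabelEncoder maps each distinct string to an integer (classes are sorted first).
label_codes = LabelEncoder().fit_transform(colors)

# OneHotEncoder expects a 2-D array and produces one binary column per category.
# Its default output is a sparse matrix, so convert it to a dense array for display.
one_hot = OneHotEncoder().fit_transform(colors.reshape(-1, 1)).toarray()

print(label_codes)
print(one_hot)
```

Note that scikit-learn documents LabelEncoder as a tool for target labels; for feature columns, OrdinalEncoder or OneHotEncoder is usually the better fit.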

2 of 9

LabelEncoder worked for me. Basically, you have to encode your data feature by feature (myData is a 2D array of strings):

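import numpy as np
from sklearn import preprocessing

# Read every column as a string; filecsv is the path to your CSV file.
myData = np.genfromtxt(filecsv, delimiter=",", dtype="U20", skip_header=1)

# Encode each feature (column) separately and keep the integer codes in a numeric array,
# because writing them back into the string array would store them as strings again.
encoded = np.empty(myData.shape, dtype=int)
for i in range(myData.shape[1]):
    encoded[:, i] = preprocessing.LabelEncoder().fit_transform(myData[:, i])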

datascience.stackexchange.com › questions › 110669 › could-not-convert-string-to-float-yellow

machine learning - could not convert string to float: 'YELLOW' - Data Science Stack Exchange

ML models only work with numeric vectors. Because the data has categorical features, you need to encode them before applying the model. That is why you encounter ValueError: could not convert string to float: 'YELLOW'

Stack Overflow

stackoverflow.com › questions › 8420143 › valueerror-could-not-convert-string-to-float-id

python - ValueError: could not convert string to float: id - Stack Overflow

reddit.com › r/learnpython › can not convert a string to a float when using naive bayes from sklearn.

1 of 12

Some of your lines evidently don't contain valid float data; specifically, at least one line contains the text id, which can't be converted to a float.

When you try it in the interactive prompt you are only testing the first line, so the best approach is to print the line on which the error occurs. That tells you which line is wrong, e.g.

#!/usr/bin/python3

from scipy import stats

with open('data2.txt', 'r') as fh:
    f = fh.readlines()
N = len(f) - 1
for i in range(0, N):
    w = f[i].split()
    l1 = w[1:8]
    l2 = w[8:15]
    try:
        list1 = [float(x) for x in l1]
        list2 = [float(x) for x in l2]
    except ValueError as e:
        # Report the offending line and skip it instead of reusing stale values.
        print("error", e, "on line", i)
        continue
    result = stats.ttest_ind(list1, list2)
    print(result[1])

2 of 12

My error was very simple: the text file containing the data had a space character (so not visible) on the last line.

In the output of grep, I had 45 followed by a trailing space instead of just 45.
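One way to guard against both problems described in these answers (a non-numeric field such as id, and an invisible whitespace-only line) is sketched below. The helper name parse_rows and the sample lines are hypothetical, chosen only for illustration:

```python
def parse_rows(lines):
    """Return (rows, bad_line_numbers) for whitespace-separated numeric lines."""
    rows, bad = [], []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            # Blank or whitespace-only line, e.g. a stray space at the end of the file.
            continue
        try:
            rows.append([float(x) for x in fields])
        except ValueError:
            # Remember the 1-based line number so the bad line can be inspected.
            bad.append(number)
    return rows, bad


# Hypothetical sample input: a good line, a header-like line, a value with a
# trailing space, and a whitespace-only last line.
sample = ["1.5 2.0\n", "id 3.0\n", "45 \n", " \n"]
rows, bad = parse_rows(sample)
print(rows, bad)
```

Collecting the bad line numbers instead of stopping at the first failure makes it easier to see how dirty the file is before deciding how to clean it.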

r/learnpython on Reddit: Can not convert a string to a float when using Naive Bayes from Sklearn.

February 27, 2022 -

So I am trying to use the Naive Bayes model from Sklearn, a machine learning model, to run some data. Every time I run it, though, it says I can't convert a string to a float. Here's my code:

# Import necessary libraries.
from sklearn.naive_bayes import MultinomialNB # Import the Naive Bayes model.
from sklearn.model_selection import train_test_split # Import the train_test_split function.
from sklearn.metrics import accuracy_score # For testing the accuracy of the model
import pandas as pd # Import pandas

print('NAIVE_BAYES.PY IS RUNNING')

df = pd.read_csv('C:\\Users\\gjohn\\Documents\\code\\machineLearning\\trading_bot\\train_test.csv') # Reads in the filtered posts.
classes = df['class'] # Gets the labels from the dataframe.
df.drop('class', axis=1) # Drops the class column from the dataframe.

# Split the data into training and testing data.
train_x, test_x, train_y, test_y = train_test_split(df, classes, test_size=0.2)

# Create the Naive Bayes model.
model = MultinomialNB()
# Train the model.
model.fit(train_x, train_y)
# Test the model.
y_predict = model.predict(test_x)
# Calculate the accuracy of the model.
accuracy = accuracy_score(test_y, y_predict)
print(f'Accuracy: {accuracy}')

And here is the error:

Traceback (most recent call last):
  File "c:\Users\gjohn\Documents\code\machineLearning\trading_bot\naive_bayes.py", line 22, in <module>
    model.fit(train_x, train_y)
  File "C:\Users\gjohn\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\naive_bayes.py", line 663, in fit
    X, y = self._check_X_y(X, y)
  File "C:\Users\gjohn\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\naive_bayes.py", line 523, in _check_X_y
    return self._validate_data(X, y, accept_sparse="csr", reset=reset)
  File "C:\Users\gjohn\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\base.py", line 572, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "C:\Users\gjohn\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\utils\validation.py", line 956, in check_X_y
    X = check_array(
  File "C:\Users\gjohn\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\utils\validation.py", line 738, in check_array
    array = np.asarray(array, order=order, dtype=dtype)
  File "C:\Users\gjohn\AppData\Local\Programs\Python\Python39\lib\site-packages\numpy\core\_asarray.py", line 83, in asarray
    return array(a, dtype, copy=False, order=order)
  File "C:\Users\gjohn\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\generic.py", line 1993, in __array__
    return np.asarray(self._values, dtype=dtype)
  File "C:\Users\gjohn\AppData\Local\Programs\Python\Python39\lib\site-packages\numpy\core\_asarray.py", line 83, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: could not convert string to float: 'Tranzylvania'

And every time I run it, it is a different word. For example this time it was "Tranzylvania", but the next time I run it, there will be something else. Can anyone please help me? I'm stumped.

The word is different every time because train_test_split shuffles your samples when you use it. You're getting that error because the values you pass to your model need to be floating-point numbers, not strings. You'll need to remove the strings from your DataFrame if they aren't supposed to be there, or you will need to convert them to either a floating-point or integer-based representation if you need to keep them.
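A minimal sketch of the "convert them" route, assuming a made-up DataFrame with one text column (the column names country and mentions and all values are hypothetical). It also assigns the result of drop(), since DataFrame.drop returns a new DataFrame rather than modifying the original:

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data standing in for train_test.csv.
df = pd.DataFrame({
    "country": ["Tranzylvania", "Freedonia", "Tranzylvania", "Freedonia"],
    "mentions": [3, 0, 5, 1],
    "class": [1, 0, 1, 0],
})

y = df["class"]
X = df.drop("class", axis=1)  # drop() returns a new frame, so keep the result

# One-hot encode the text column and pass the numeric column through unchanged.
encode = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["country"]),
    remainder="passthrough",
)
model = make_pipeline(encode, MultinomialNB())
model.fit(X, y)
predictions = model.predict(X)
print(predictions)
```

Putting the encoder inside the pipeline means the same transformation is applied at fit and predict time, which avoids the error reappearing on the test split.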

1 of 1

datascience.stackexchange.com › questions › 124913 › python-sk-learn-knn-imputer-valueerror-could-not-convert-string-to-float

pandas - Python SK-Learn KNN Imputer ( "ValueError: could not convert string to float: ) - Data Science Stack Exchange

codedamn.com › news › programming

1 of 1

After replacing the ? with np.nan, you will be able to convert the BareNuc column from object to float (np.nan is a special floating-point value, so a column that contains it cannot be stored as a plain integer type). With that, you wouldn’t need to convert to a NumPy array.

df = df.astype({"BareNuc": float})

In the end, all columns will be used as float anyway, so this wouldn’t affect resource allocation.
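The replacement step described in this answer can be sketched as follows. The BareNuc column name comes from the answer; the sample values and the second column are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data: "?" marks a missing value in an otherwise numeric column.
df = pd.DataFrame({"BareNuc": ["1", "10", "?", "4"], "Clump": [5, 3, 8, 1]})

# Replace the placeholder with NaN, then convert the column to float.
df["BareNuc"] = df["BareNuc"].replace("?", np.nan)
df = df.astype({"BareNuc": float})

print(df.dtypes)
```

After this, an imputer such as KNNImputer sees a float column with a NaN in it rather than a string it cannot parse.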

Codedamn

Fix ValueError: could not convert string to float

November 7, 2022 -

input_string = ''
try:
    converted = float(input_string)
except ValueError:
    converted = 0
print(converted) # 0 (as except block is executed)

This method has a huge advantage in that all three previous approaches could be used in the except block and can handle any case with ease.

GeeksforGeeks

geeksforgeeks.org › machine learning › valueerror-could-not-convert-string-to-float-decision-tree

ValueError: could not convert string to float decision tree - GeeksforGeeks

July 23, 2025 - When you encounter the ValueError: could not convert string to float error in decision trees, it means the model is trying to process text data, which it can't. Decision trees work with numerical data, so you need to convert categorical data ...
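As a rough illustration of converting categorical data before fitting a decision tree, the sketch below uses OrdinalEncoder inside a pipeline. The weather-style feature values and labels are invented for the example:

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical categorical features and binary labels.
X = [["sunny", "hot"], ["rainy", "mild"], ["sunny", "mild"], ["rainy", "hot"]]
y = [0, 1, 0, 1]

# The encoder turns each category into a number before the tree sees the data.
model = make_pipeline(OrdinalEncoder(), DecisionTreeClassifier(random_state=0))
model.fit(X, y)

predictions = model.predict([["sunny", "hot"], ["rainy", "mild"]])
print(predictions)
```

Ordinal codes impose an arbitrary order on the categories; tree models tolerate this reasonably well, but OneHotEncoder is the safer choice for linear models.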

Statology

statology.org › home › how to fix in pandas: could not convert string to float

How to Fix in Pandas: could not convert string to float

July 16, 2022 -

#attempt to convert 'revenue' from string to float
df['revenue'] = df['revenue'].astype(float)

ValueError: could not convert string to float: '$400.42'
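A value such as '$400.42' fails because of the currency symbol. One possible fix, sketched here with made-up revenue values, is to strip the non-numeric characters before converting:

```python
import pandas as pd

# Hypothetical revenue column stored as formatted strings.
df = pd.DataFrame({"revenue": ["$400.42", "$1,250.00", "$38.10"]})

# Remove the dollar sign and thousands separators as literal text, then convert.
cleaned = (
    df["revenue"]
    .str.replace("$", "", regex=False)
    .str.replace(",", "", regex=False)
)
df["revenue"] = cleaned.astype(float)

print(df["revenue"].tolist())
```

Passing regex=False matters here because "$" is a special character in regular expressions.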

github.com › scikit-learn › scikit-learn › discussions › 19624

ValueError: could not convert string to float: 'OLIFE' · scikit-learn/scikit-learn · Discussion #19624

Hi, I'm trying to calibrate a logistic regression classifier and I get the error ValueError: could not convert string to float: 'OLIFE'. I did one-hot encode my categorical values using a pipeline. It works fine when I test my model, but when I calibrate it, it doesn't work even though I'm passing the pipeline model into CalibratedClassifierCV. Kindly assist please, as I'm new to machine learning.

import pandas as pd
import numpy as np
from pandas_profiling import ProfileReport
from sklearn.preprocessing import OneHotEncoder,LabelEncoder
from sklearn.impute import SimpleImputer
from sklearn.compose import make

Author scikit-learn

Medium

louwersj.medium.com › solved-valueerror-could-not-convert-string-to-float-afbc5c3828e7

Solved: ValueError: could not convert string to float: ‘’ | by Johan Louwers | Medium

September 23, 2022 - This commonly is the case if your dataset is not clean and “avg_use” contains values that are not able to be converted to a float. Even though a dataset should be clean, obviously, reality shows that datasets are not always as clean as we ...

datascience.stackexchange.com › questions › 102368 › calibrated-classifier-valueerror-could-not-convert-string-to-float

scikit learn - calibrated classifier ValueError: could not convert string to float - Data Science Stack Exchange

edureka.co › home › community › categories › machine learning › valueerror could not convert string to float in...

1 of 2

I assume you are using text data as your input matrix X. First, you have to include your preprocessing step just as you would without a calibrated classifier, so, as you already know, you can use a Pipeline like so:

calibrated_svc = CalibratedClassifierCV(linear_svc,
                                        method='sigmoid',
                                        cv=3) 


model = Pipeline([('tfidf', TfidfVectorizer()), ('clf', calibrated_svc)]).fit(X, y)

Another option, if you are interested in getting probabilities from your SVM, is to set the parameter probability = True. This requires the SVC class; with a linear kernel it is similar, though not identical, to LinearSVC:

model = Pipeline([('tfidf', TfidfVectorizer()), ('clf',SVC(probability = True, kernel = 'linear') )]).fit(X, y)

This fits a logistic regression on top of the SVM's scores (Platt scaling).

Both options are feasible if you are only interested in using probabilities per se, but if you are also interested in the calibration of your probabilities, the first option is better.
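A self-contained sketch of the first option, with the vectorizer placed in front of the calibrated classifier. The tiny corpus and labels are invented for illustration; cv=3 needs at least three samples per class, which is why each class has six:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical corpus: 1 = positive, 0 = negative.
X = [
    "great product", "love it", "really great",
    "works great", "love this product", "great value",
    "terrible product", "hate it", "really terrible",
    "broke quickly", "hate this product", "terrible value",
]
y = [1] * 6 + [0] * 6

# The raw text never reaches the classifier: TfidfVectorizer converts it to
# numbers first, and the calibrated SVM is trained on those numbers.
calibrated_svc = CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=3)
model = Pipeline([("tfidf", TfidfVectorizer()), ("clf", calibrated_svc)]).fit(X, y)

proba = model.predict_proba(["great", "terrible"])
print(proba)
```

The order matters: wrapping only the classifier in CalibratedClassifierCV and keeping the vectorizer as an earlier pipeline step ensures every cross-validation fold receives numeric input.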

2 of 2

For any kind of machine learning task or NLP task (which is what you are doing), you need to convert string/text values to numeric values. The model cannot understand or work with string values. It only understands numeric values.

So, for example, if you are doing a machine learning task with categorical columns, you would use classes like OneHotEncoder, LabelEncoder etc. to convert string values to numeric.

In your case, you are working on an NLP task, which uses free text rather than categorical string values. So you need to convert the text into numeric values first and then fit the preferred algorithm. There are many ways to encode text as numbers, such as Bag of Words, Tfidf, word2vec etc. You can read about them by searching on Google.
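To make the Bag of Words and Tfidf options concrete, here is a small sketch using scikit-learn's CountVectorizer and TfidfVectorizer on two made-up sentences:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical documents.
docs = ["the cat sat", "the cat sat on the mat"]

# Bag of Words: one column per vocabulary word, values are raw counts.
bow = CountVectorizer()
counts = bow.fit_transform(docs).toarray()

# Tfidf: same columns, but counts are reweighted by how rare each word is.
tfidf = TfidfVectorizer().fit_transform(docs).toarray()

print(bow.vocabulary_)
print(counts)
print(tfidf.round(2))
```

Either matrix can be passed to fit() directly, because every entry is a number rather than a string.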

Edureka Community

ValueError could not convert string to float in Machine learning | Edureka Community

April 14, 2020 - Hi Guys, I am trying to filter my dataset using the constant variable method, but it shows me the below ... NE 37010-5101' How can I solve this error?

github.com › scikit-learn › scikit-learn › issues › 19625

ValueError: could not convert string to float: 'OLIFE' · Issue #19625 · scikit-learn/scikit-learn

November 20, 2020 - Hi, I'm trying to calibrate a logistic regression classifier and I get the error ValueError: could not convert string to float: 'OLIFE'. I did one-hot encode my categorical values using a pipeline, it wor...

Author Solly7

reddit.com › r/learnpython › sklearn problem - valueerror: could not convert string to float: normal

r/learnpython on Reddit: Sklearn Problem - ValueError: could not convert string to float: Normal

July 20, 2017 -

I get the following error when I run my script - "ValueError: could not convert string to float: Normal"

Any help would be greatly appreciated

from sklearn.linear_model import LogisticRegression  # logistic regression
from sklearn import svm  # support vector machine
from sklearn.ensemble import RandomForestClassifier  # Random Forest
from sklearn.neighbors import KNeighborsClassifier  # KNN
from sklearn.naive_bayes import GaussianNB  # Naive Bayes
from sklearn.tree import DecisionTreeClassifier  # Decision Tree
from sklearn.model_selection import train_test_split  # training and testing data split
from sklearn import metrics  # accuracy measure
from sklearn.metrics import confusion_matrix  # for confusion matrix

train, test = train_test_split(train_csv, test_size=0.3, random_state=0)
train_X = train[train.columns[1:]]
train_Y = train[train.columns[:1]]
test_X = test[test.columns[1:]]
test_Y = test[test.columns[:1]]
X = train_csv[train_csv.columns[1:]]
Y = train_csv['SalePrice']

# Radial Support Vector Machines (rbf-SVM)
model = svm.SVC(kernel='rbf', C=1, gamma=0.1)
model.fit(train_X, train_Y)
prediction1 = model.predict(test_X)
print('Accuracy for rbf SVM is ', metrics.accuracy_score(prediction1, test_Y))

Traceback (most recent call last):
  File "house_prices.py", line 318, in <module>
    model.fit(train_X,train_Y)

1 of 3

2 of 3

It would be helpful to have the full error traceback. The error means that some part of the code attempts to convert values to floating point numbers, but encountered the text "Normal," which is not a valid floating point number.
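When the traceback does not say which column holds the text, one way to find out is to ask pandas for the non-numeric columns. The sketch below assumes a small stand-in for train_X; the column names LotArea and SaleCondition and their values are hypothetical:

```python
import pandas as pd

# Hypothetical stand-in for train_X with one numeric and one text column.
train_X = pd.DataFrame({
    "LotArea": [8450, 9600],
    "SaleCondition": ["Normal", "Abnorml"],
})

# Columns that are not numeric are the ones fit() will fail to convert.
text_columns = train_X.select_dtypes(exclude="number").columns.tolist()
print(text_columns)

# Either encode those columns or, as a quick check, drop them.
numeric_only = train_X.drop(columns=text_columns)
print(numeric_only.columns.tolist())
```

Dropping the columns is only a diagnostic shortcut; encoding them keeps the information available to the model.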

github.com › DanilZherebtsov › verstack › issues › 30

could not convert string to float: 'x' - using FeatureSelector · Issue #30 · DanilZherebtsov/verstack

November 7, 2022 -

    842 """
    843 first_call = not hasattr(self, "n_samples_seen_")
--> 844 X = self._validate_data(
    845 X,
    846 accept_sparse=("csr", "csc"),
    847 dtype=FLOAT_DTYPES,
    848 force_all_finite="allow-nan",
    849 reset=first_call,
    850 )
    851 n_features = X.shape[1]
    853 if sample_weight is not None:

File C:\Anaconda3\envs\python_310\lib\site-packages\sklearn\base.py:577, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, **check_params)
    575 raise ValueError("Validation should be done on X, y or both.")
    576 elif not no_val_X and no_val_y:
--> 577 X = check_array(X, input_name="X", **check_p

Author balgad

datascience.stackexchange.com › questions › 114774 › when-i-run-random-forest-classification-model-then-at-every-rows-of-my-train-dat

python - when I run Random Forest classification model then at every rows of my train data set show this error (ValueError: could not convert string to float:) - Data Science Stack Exchange

'''
from sklearn.ensemble import RandomForestClassifier
forest = RandomForestClassifier(n_estimators = 500, max_depth = None, min_samples_split=2, min_samples_leaf =1, bootstrap = True, random_state=0)
forest = forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))
'''
...

The error message is not lying to you :) It cannot convert the string "one favourite christmas gifts year love" to a float.

Saturn Cloud

saturncloud.io › blog › how-to-handle-the-pandas-valueerror-could-not-convert-string-to-float

How to Handle the pandas ValueError could not convert string to float | Saturn Cloud Blog

October 19, 2023 - The pandas ValueError occurs when you use the float() function to convert a string to a float, but the string contains characters that cannot be interpreted as a float.

GeeksforGeeks

geeksforgeeks.org › pandas › how-to-handle-pandas-value-error-could-not-convert-string-to-float

How To Handle Pandas Value Error : Could Not Convert String To Float - GeeksforGeeks

July 23, 2025 - When the string in the dataframe contains inappropriate characters that cause problems in converting the string to a float type, the replace() method is a good and easy way to remove those characters from the string.

github.com › scikit-learn › scikit-learn › issues › 14297

Value error could not convert string to float: in clf.score() for LogisticRegression · Issue #14297 · scikit-learn/scikit-learn

July 9, 2019 -

~/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    565 # make sure we actually converted to numeric:
    566 if dtype_numeric and array.dtype.kind == "O":
--> 567 array = array.astype(np.float64)
    568 if not allow_nd and array.ndim >= 3:
    569 raise ValueError("Found array with dim %d.

Author asis-shukla