If a column contains strings or is treated as strings, it will have a dtype of object (the converse is not necessarily true, more on that below). Here is a simple example:
import pandas as pd
df = pd.DataFrame({'SpT': ['string1', 'string2', 'string3'],
                   'num': ['0.1', '0.2', '0.3'],
                   'strange': ['0.1', '0.2', 0.3]})
print(df.dtypes)
#SpT object
#num object
#strange object
#dtype: object
If a column contains only strings, applying len to it the way you did should work fine:
print(df['num'].apply(lambda x: len(x)))
#0 3
#1 3
#2 3
However, a dtype of object does not mean the column contains only strings. For example, the column strange contains objects of mixed types: some str and a float. Applying the function len will raise an error similar to the one you have seen:
print(df['strange'].apply(lambda x: len(x)))
# TypeError: object of type 'float' has no len()
Thus, the problem could be that you have not properly converted the column to string, and the column still contains mixed object types.
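One way to confirm that a column really holds mixed types is to look at the Python type of each element. The following is a minimal sketch on the same toy column; the names type_counts and not_str are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'strange': ['0.1', '0.2', 0.3]})

# Count how many elements of each Python type the column holds.
type_counts = df['strange'].map(type).value_counts()
print(type_counts)

# Boolean mask marking the rows whose value is not a str.
not_str = ~df['strange'].map(lambda x: isinstance(x, str))
print(df[not_str])
```

If more than one type shows up in the counts, len (or any other string-only function) will fail on the non-string rows.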
Continuing the above example, let us convert strange to strings and check if apply works:
df['strange'] = df['strange'].astype(str)
print(df['strange'].apply(lambda x: len(x)))
#0 3
#1 3
#2 3
(There is a suspicious discrepancy between df_cleaned and df_clean in your question. Is it a typo, or a mistake in the code that causes the problem?)
"Hidden" nulls
If the column dtype is object, TypeError: object of type 'float' has no len() often occurs if the column contains NaN. Check if that's the case by calling
df['Col2'].isna().any()
If it returns True, then there's NaN and you probably need to handle that.
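How to handle the NaN depends on what you need. As a rough sketch, assuming a made-up Col2 with one missing value, you could either skip the missing rows or replace them with an empty string before applying len:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two strings and one missing value.
df = pd.DataFrame({'Col2': ['ab', 'c', np.nan]})

has_nan = df['Col2'].isna().any()
print(has_nan)

# Option 1: drop the missing rows, then apply len to what is left.
lengths_dropped = df['Col2'].dropna().apply(len)

# Option 2: treat missing values as empty strings (length 0).
lengths_filled = df['Col2'].fillna('').apply(len)

print(lengths_dropped.tolist())
print(lengths_filled.tolist())
```

Option 1 returns a shorter Series (the index of the dropped row is missing), while option 2 keeps the original shape.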
Vectorized str. methods
If null handling is not important, you can also call vectorized str.len(), str.isdigit() etc. methods. For example, the code in the OP can be written as:
df['Col3'] = df['Col2'].str.len().ge(2) & df['Col2'].str[0].str.isalpha()
to get the desired output without errors.
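Here is a small self-contained sketch of the same idea. The sample values are made up, and the explicit fillna(False) is my own addition so that missing values count as "does not match" rather than leaking NaN into the result.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, including one missing value.
df = pd.DataFrame({'Col2': ['ab', 'c', '1x', np.nan]})

# .str.len() does not raise on NaN; the missing row simply stays missing.
long_enough = df['Col2'].str.len().ge(2)

# First character is a letter; missing rows are treated as False.
starts_alpha = df['Col2'].str[0].str.isalpha().fillna(False).astype(bool)

df['Col3'] = long_enough & starts_alpha
print(df)
```

Only 'ab' satisfies both conditions: 'c' is too short, '1x' starts with a digit, and the missing row evaluates to False.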
'string' dtype
Since pandas 1.0, there's a new 'string' dtype that keeps missing values as nulls (displayed as <NA>) after you cast a column to it. For example, if you want to convert floats to strings without decimals, yet the column contains NaN values that you want to keep as null, you can use the 'string' dtype.
df = pd.DataFrame({
'Col1': [1.2, 3.4, 5.5, float('nan')]
})
df['Col1'] = df['Col1'].astype('string').str.split('.').str[0]
returns
0 1
1 3
2 5
3 <NA>
Name: Col1, dtype: object
where <NA> is the missing-value marker (pd.NA), which you can drop with dropna(), whereas df['Col1'].astype(str) casts NaN into the literal string 'nan' (at least in pandas versions before 3.0).
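To see the difference side by side, here is a minimal sketch on a made-up two-element Series. What astype(str) does with NaN depends on the pandas version, so the code only prints that result instead of relying on it.

```python
import pandas as pd

s = pd.Series([1.2, float('nan')])

# Plain str cast: older pandas versions turn NaN into the text 'nan',
# which isna()/dropna() no longer recognise as missing.
as_str = s.astype(str)
print(as_str.tolist())

# 'string' dtype: the missing value stays missing.
as_string = s.astype('string')
print(as_string.isna().tolist())
print(as_string.dropna().tolist())
```

With the 'string' dtype, dropna() removes the missing row, which is usually what you want before further string processing.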
I tried
n1=input('First number')
n2=input('Second number')
sum = float(n1) + float(n2)
str(sum)
print('The sum of the values is: ' + sum)
My error is:
TypeError: can only concatenate str (not "float") to str
I tried googling this error and got some answers like print(f' which I didn't really understand, and some others that looked a little complicated. I am very new.
I am trying to improve my googling skills.
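For what it's worth, the likely cause is that str(sum) returns a new string but does not change sum itself, so sum is still a float when it reaches the + in print. A minimal sketch of one possible fix follows; the helper name describe_sum and the variable total are made up, and the function takes the two inputs as arguments instead of calling input() so it can be run as is.

```python
def describe_sum(n1: str, n2: str) -> str:
    """Add two numbers given as text and return a printable message."""
    total = float(n1) + float(n2)
    # An f-string converts total to text inside the braces,
    # so no manual str() call or + concatenation is needed.
    return f'The sum of the values is: {total}'


print(describe_sum('1.5', '2'))
```

Keeping the original structure also works if the converted value is actually used, for example 'The sum of the values is: ' + str(total).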
NOTE:
pd.convert_objects has now been deprecated (and was removed in pandas 1.0). You should use pd.Series.astype(float) or pd.to_numeric as described in other answers.
This is available in 0.11. It forces conversion (or sets the value to NaN).
This will work even when astype fails. It also works series by series,
so it won't convert, say, a complete string column.
In [10]: df = DataFrame(dict(A = Series(['1.0','1']), B = Series(['1.0','foo'])))
In [11]: df
Out[11]:
A B
0 1.0 1.0
1 1 foo
In [12]: df.dtypes
Out[12]:
A object
B object
dtype: object
In [13]: df.convert_objects(convert_numeric=True)
Out[13]:
A B
0 1 1
1 1 NaN
In [14]: df.convert_objects(convert_numeric=True).dtypes
Out[14]:
A float64
B float64
dtype: object
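Since convert_objects is no longer available in current pandas, here is a rough sketch of the same conversion using pd.to_numeric, applied column by column. The variable name converted is made up; errors='coerce' is the option that turns unparseable values into NaN.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1.0', '1'], 'B': ['1.0', 'foo']})

# Apply pd.to_numeric to each column; values that cannot be parsed
# (such as 'foo') become NaN instead of raising an error.
converted = df.apply(pd.to_numeric, errors='coerce')

print(converted)
print(converted.dtypes)
```

Unlike astype(float), this does not raise on 'foo', so it is worth checking for unexpected NaN values afterwards.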
You can try df.column_name = df.column_name.astype(float). As for the NaN values, you need to specify how they should be converted, but you can use the .fillna method to do it.
Example:
In [12]: df
Out[12]:
a b
0 0.1 0.2
1 NaN 0.3
2 0.4 0.5
In [13]: df.a.values
Out[13]: array(['0.1', nan, '0.4'], dtype=object)
In [14]: df.a = df.a.astype(float).fillna(0.0)
In [15]: df
Out[15]:
a b
0 0.1 0.2
1 0.0 0.3
2 0.4 0.5
In [16]: df.a.values
Out[16]: array([ 0.1, 0. , 0.4])
Hello Guys,
I have a question regarding DataFrames. I have a line of code, which looks similar to this:
import os
import pandas as pd
import numpy as np
file_paths = ('C:/Users/DR/Documents/Polymer Science/Mitarbeiterpraktika/Forschungsmodul I Elektrochemie/Wasserspaltung/Co15-FTO_calc_WS_LS.txt', 'C:/Users/DR/Documents/Polymer Science/Mitarbeiterpraktika/Forschungsmodul I Elektrochemie/Wasserspaltung/Co16-FTO_calc_WS_LS.txt')
files_infos = pd.DataFrame()
for n, file_path in enumerate(file_paths):
    file_name = os.path.basename(file_path)
    file_name = file_name.split(".txt")[0]
    files_infos[file_name] = [np.nan] * len(files_infos)
    files_infos.at["file_path", file_name] = file_path
If I run this script, I get this error: ValueError: could not convert string to float: 'C:/Users/DR/Documents/Polymer Science/Mitarbeiterpraktika/Forschungsmodul I Elektrochemie/Wasserspaltung/Co16-FTO_calc_WS_LS.txt'
I just don't understand why pandas tries to convert my string into a float. I thought maybe it has something to do with the dtype of the DataFrame, but I couldn't really find an answer (the dtype is object). What I find really confusing about this error is that I used the same approach in different projects and it didn't occur before.
Can someone maybe explain to me why this error occurs and what I have to look up to find a solution? Please don't give me a solution to my problem, since I would like to solve it myself in order to learn.
Thank you for your help in advance.