In pandas, the object dtype is used when the values stored in a column do not share a single clear type.
So I would guess that some values in your column are floats and some are strings. You may also be dealing with NaN values, and NaN is itself a float.
a) Convert the column to string. Are you reading your DataFrame from a CSV or XLS file? If so, you can declare the column as str when you read the file, or convert the column afterwards.
b) After that, you can apply the string changes and/or deal with the NaN values.
c) Finally, convert the column to float.
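The three steps above can be sketched as follows. This is only a minimal example: the column name price, the sample values, and the cleaning rule (stripping spaces and thousands separators) are assumptions made for illustration, not details from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: mixed strings, floats and NaN give dtype object
df = pd.DataFrame({"price": ["1,200.5", 3.0, np.nan, " 42 "]})

# a) Convert to string, but keep missing values as NaN
is_missing = df["price"].isna()
as_text = df["price"].astype(str).where(~is_missing)

# b) Apply the string fixes (strip spaces, drop thousands separators)
cleaned = as_text.str.strip().str.replace(",", "", regex=False)

# c) Convert the cleaned column to float
df["price"] = cleaned.astype(float)
print(df["price"])
```

When the data comes from a file, passing dtype={"price": str} to pd.read_csv should cover step a) at load time.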
Maybe it's a very rudimentary method, but I would just do:
listt = []
for i in data['column_name']:
    listt.append(float(i))
data['FloatData'] = listt
To pick up from the comment: "I was doing this:"
df = [df.hc== 2]
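What you create there is a list holding a "mask": a boolean array that says which rows of the index fulfilled your condition. Because of the surrounding square brackets, df is now a plain Python list rather than a DataFrame.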
To filter your dataframe on your condition you want to do this:
df = df[df.hc == 2]
A bit more explicit is this:
mask = df.hc == 2
df = df[mask]
If you want to keep the entire dataframe and only want to replace specific values, there are methods such as replace (see "Python pandas equivalent for replace"). Another method, which performs very well, is to create a separate DataFrame with the from/to values as columns and use pd.merge to combine it into the existing DataFrame. Using your mask with .loc to set values is also possible:
df.loc[mask, 'fname'] = 'Johnson'
But for a larger set of replacements you would want to use one of the two other methods, or use "apply" with a lambda function (for value transformations). Last but not least: you can use .fillna('bla') to quickly fill in NA values.
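A minimal sketch of three of those options (replace, a mask with .loc, and fillna). The data is made up, and the column names only loosely follow the ones mentioned in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "ID": [101, 102, 103],
    "hc": [1, 2, 2],
    "fname": ["Ann", np.nan, "Mike"],
})

# Replace specific values in a column
df["hc"] = df["hc"].replace({2: 20})

# Set a value only on the rows selected by a boolean mask
mask = df["ID"] == 103
df.loc[mask, "fname"] = "Michael"

# Fill the remaining missing values
df["fname"] = df["fname"].fillna("unknown")
print(df)
```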
The traceback tells you that df is a list and not a DataFrame, as your line of code expects.
It means that between df = pd.read_csv("test.csv") and df.loc[df.ID == 103, ['fname', 'lname']] = 'Michael', 'Johnson' you have other lines of code that assign a list object to df. Review that piece of code to find your bug.
I am learning pandas right now and I am working on a little project for fun with Netflix movies. I am trying to filter the dataset, so it contains only movies with a genre specified by the user; however, I am getting the following error for this chunk of code after an input has been made:
genre = input("Genre: ")
for index in df.index:
    if df.loc[index, genre] == 0:
        df = df.drop(index, inplace = True)
print(df.head())
if df.loc[index, genre] == 0:
^^^^^^
AttributeError: 'NoneType' object has no attribute 'loc'
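A likely cause, going only by the code shown: drop(..., inplace=True) changes the DataFrame in place and returns None, so the assignment rebinds df to None and the next loop iteration fails on df.loc. One way around it is to skip the loop and filter with a boolean condition. The sketch below assumes each genre is a 0/1 column; the sample data is invented.

```python
import pandas as pd

# Hypothetical stand-in for the Netflix data: one 0/1 column per genre
df = pd.DataFrame({
    "title": ["A", "B", "C"],
    "Comedy": [1, 0, 1],
    "Drama": [0, 1, 1],
})

genre = "Comedy"  # in the original script this comes from input("Genre: ")

# Keep only the rows where the chosen genre flag is not 0
filtered = df[df[genre] != 0]
print(filtered.head())
```

If the loop is kept, either drop the assignment (df.drop(index, inplace=True)) or drop inplace=True (df = df.drop(index)), but not both at once.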
The error points to this line:
df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() \
if x not in stop_words))
split is being used here as a method of Python's built-in str class. Your error indicates one or more values in df['content'] is of type float. This could be because there is a null value, i.e. NaN, or a non-null float value.
One workaround, which will stringify floats, is to just apply str on x before using split:
df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in str(x).split() \
if x not in stop_words))
Alternatively, and possibly a better solution, be explicit and use a named function with a try / except clause:
def converter(x):
    try:
        # x.split() raises AttributeError for non-strings such as NaN
        return ' '.join([w.lower() for w in x.split() if w not in stop_words])
    except AttributeError:
        return None  # or some other value
df['content'] = df['content'].apply(converter)
Since pd.Series.apply is just a loop with overhead, you may find a list comprehension or map more efficient:
df['content'] = [converter(x) for x in df['content']]
df['content'] = list(map(converter, df['content']))
split() is a Python method that only applies to strings. It seems that your column "content" contains not only strings but also other values, such as floats, to which you cannot apply the .split() method.
Try converting the values to strings by using str(x).split(), or by converting the entire column to strings first, which would be more efficient. Note that astype returns a new column, so assign the result back:
df['column_name'] = df['column_name'].astype(str)
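One caveat when converting the whole column: astype(str) turns missing values into the literal text "nan". A small sketch that fills them first, assuming a made-up stop-word set and sample data (unlike the snippet in the question, it lowercases each word before the stop-word check):

```python
import numpy as np
import pandas as pd

stop_words = {"the", "a"}  # hypothetical stop-word set

df = pd.DataFrame({"content": ["The quick fox", np.nan, 3.5]})

# Replace missing values first so they do not become the text "nan"
text = df["content"].fillna("").astype(str)

# Lowercase, split, and drop stop words for every row
df["content"] = [
    " ".join(w for w in s.lower().split() if w not in stop_words)
    for s in text
]
print(df["content"].tolist())
```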
Try this instead,
print(
    "{:.3f}% {} ({} sentences)".format(pcent, gender, nsents)
)
Refer to the latest docs for more examples, and check your Python version!
You could also use {:.3%} instead of {:.3f}%.
It will transform the value into percentages automatically.
That means "{:.3%}".format(0.3) will print "30.000%", while you have to write "{:.3f}%".format(0.3 * 100) to get the same output.
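A short comparison of the two format specs. The values of pcent, gender and nsents are placeholders, and pcent is assumed to be a fraction between 0 and 1.

```python
pcent = 0.3        # hypothetical fraction, i.e. 30 percent
gender = "female"  # placeholder value
nsents = 12        # placeholder value

# "%" presentation type: multiplies by 100 and appends the percent sign
as_percent = "{:.3%} {} ({} sentences)".format(pcent, gender, nsents)

# "f" presentation type: the caller scales the value and adds the sign
as_fixed = "{:.3f}% {} ({} sentences)".format(pcent * 100, gender, nsents)

print(as_percent)  # 30.000% female (12 sentences)
print(as_fixed)    # 30.000% female (12 sentences)
```

The number after the dot sets the decimal places, so "{:.0%}" is the spec to use when "30%" is the output you want.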
As pointed out by warren-weckesser, this can also happen if you use dtype object (and in fact this is more likely the issue you are facing):
>>> s = pd.Series([1.0], dtype='object')
>>> s
0 1
dtype: object
>>> np.log(s)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'float' object has no attribute 'log'
You can address this by setting the dtype to float explicitly:
>>> np.log(s.astype('float64'))
0 0.0
dtype: float64
In your case:
np.log(df['price'].astype('float'))
Note: You can have more control using to_numeric.
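A sketch of the to_numeric route, assuming a price column stored as object that contains one entry that cannot be parsed (the data is invented for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical price column stored as object, with one bad entry
df = pd.DataFrame({"price": ["10.0", 2.5, "n/a"]}, dtype=object)

# errors="coerce" turns values that cannot be parsed into NaN
price = pd.to_numeric(df["price"], errors="coerce")

# np.log now receives a float64 Series; NaN stays NaN
df["ln_price"] = np.log(price)
print(df)
```

With errors="raise", the default, the bad entry would raise instead, which can be the better choice when unparseable values should not pass silently.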
First/alternative answer:
You have a float variable np in scope.
The problem is that:
import numpy as np
np = 1
np.log
is perfectly valid python.
>>> import numpy as np
>>> np = 1.
>>> np.log
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'float' object has no attribute 'log'
The solution is not to use np as a variable name, nor other popular import abbreviations such as pd or dt.
You can pick this kind of error up using a linter.
The problem is outside of the code that you posted. Your code works. At least if I assume that df is a dict. But I cannot assume anything else, because your question does not specify it.
import numpy as np
df = {'price': 10.0}
df['ln_price'] = np.log(df['price'])
print(df)
{'price': 10.0, 'ln_price': 2.3025850929940459}
I'm doing minimization using barrier method based on scipy.optimize.minimize.
My objective function has one term, which is supposed to be inside a logarithm. I've tried running the program without the log and it works fine.
However when I add the sympy.log, then I get:
AttributeError: 'Float' object has no attribute 'gradient'
What is wrong?
More traceback:
Traceback (most recent call last):
File "C:\Users\matti\AppData\Roaming\Python\Python36\site-packages\scipy\optimize\_minimize.py", line 484, in minimize
**options)
File "C:\Users\matti\AppData\Roaming\Python\Python36\site-packages\scipy\optimize\optimize.py", line 1551, in _minimize_newtoncg
b = -fprime(xk)
File "C:\Users\matti\AppData\Roaming\Python\Python36\site-packages\scipy\optimize\optimize.py", line 292, in function_wrapper
return function(*(wrapper_args + args))
File "C:\Users\matti\AppData\Local\Programs\Python\Python36\lib\site-packages\ad\__init__.py", line 1090, in grad
return numpy.array(ans.gradient(list(xa)))
AttributeError: 'Float' object has no attribute 'gradient'
Also, replacing the sympy.log with Python's standard math.log seems to work.
Something buggy with sympy?
Probably there's something wrong with the input values for X and/or T. The function from the question works ok:
import numpy as np
from math import e
def sigmoid(X, T):
    return 1.0 / (1.0 + np.exp(-1.0 * np.dot(X, T)))
X = np.array([[1, 2, 3], [5, 0, 0]])
T = np.array([[1, 2], [1, 1], [4, 4]])
print(X.dot(T))
# Just to see if values are ok
print([1. / (1. + e ** el) for el in [-5, -10, -15, -16]])
print()
print(sigmoid(X, T))
Result:
[[15 16]
[ 5 10]]
[0.9933071490757153, 0.9999546021312976, 0.999999694097773, 0.9999998874648379]
[[ 0.99999969 0.99999989]
[ 0.99330715 0.9999546 ]]
Probably it's the dtype of your input arrays. Changing X to:
X = np.array([[1, 2, 3], [5, 0, 0]], dtype=object)
Gives:
Traceback (most recent call last):
File "/[...]/stackoverflow_sigmoid.py", line 24, in <module>
print sigmoid(X, T)
File "/[...]/stackoverflow_sigmoid.py", line 14, in sigmoid
return 1.0 / (1.0 + np.exp(-1.0 * np.dot(X, T)))
AttributeError: exp
You can convert the result of np.dot(X, T) to float32 inside the function, like this:
def sigmoid(X, T):
    z = np.array(np.dot(X, T), dtype=np.float32)
    return 1.0 / (1.0 + np.exp(-z))
Hopefully it will finally work!
[Solved - thanks to DisasterArt]
https://codeshare.io/246gXj
I keep getting this error:
AttributeError: 'float' object has no attribute 'time'
I don't see anything wrong? Thanks!
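The linked code is not reproduced here, so the actual cause is unknown. One common way to get exactly this message, offered purely as an assumption, is rebinding the name time to a float so that it shadows the time module. The function names below are hypothetical.

```python
import time

def elapsed_broken():
    time = 0.5  # local float now shadows the time module
    try:
        return time.time()
    except AttributeError as exc:
        return str(exc)

def elapsed_fixed():
    delay = 0.5  # a different name keeps the module reachable
    start = time.time()
    return time.time() - start + delay

print(elapsed_broken())  # 'float' object has no attribute 'time'
print(elapsed_fixed())
```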