Try df.values instead. This will have the same effect for versions of pandas prior to 0.24.0
The to_numpy() method was only added in pandas 0.24.0, which was released a couple of days ago. If you haven't updated yet, the attribute doesn't exist! Once you update pandas, the problem should resolve itself.
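If the same code has to run on both old and new pandas, one option is to check for the method at runtime. The following is only a minimal sketch: to_array is a hypothetical helper name, not part of pandas.

```python
import numpy as np
import pandas as pd


def to_array(obj):
    """Hypothetical helper: return the data of a Series/DataFrame as a NumPy array."""
    if hasattr(obj, "to_numpy"):
        # Available in pandas 0.24.0 and later
        return obj.to_numpy()
    # Fallback for older pandas, as suggested above
    return obj.values


df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
arr = to_array(df)
print(type(arr), arr.shape)
```

Upgrading pandas is still the simpler fix when that is an option.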
BUG: AttributeError: 'bool' object has no attribute 'to_numpy' in "mask_missing" method of core/missing.py
python - AttributeError: 'DataFrame' object has no attribute 'to_numpy' - Stack Overflow
python - Pandas function to_numpy - Stack Overflow
python - How to fix AttributeError: 'Series' object has no attribute 'to_numpy' - Stack Overflow
Check the version of your pandas library:
import pandas
print(pandas.__version__)
If your version is older than 0.24.0 (the release that introduced to_numpy), upgrade:
pip install --upgrade pandas
If you need your code to work with all versions of pandas, here's a simple way to convert a Series into a NumPy array:
import pandas as pd
import numpy as np
s = pd.Series([1.1, 2.3])
a = np.array(s)
print(a) # [1.1 2.3]
On an advanced note, if your Series has missing values (as NaN values), these can be converted to a masked array:
s = pd.Series([1.1, np.nan])
a = np.ma.masked_invalid(s)
print(a) # [1.1 --]
"sklearn.datasets" is a scikit-learn module that provides the load_iris() function.
By default, load_iris() returns an object that holds data, target and other members. To get the actual values, you have to read its data and target attributes.
'iris.csv', on the other hand, holds the features and target together.
FYI: If you set return_X_y to True in load_iris(), you get the features and target directly.
from sklearn import datasets
data,target = datasets.load_iris(return_X_y=True)
The Iris Dataset from Sklearn is in Sklearn's Bunch format:
from sklearn.datasets import load_iris
iris = load_iris()
print(type(iris))
print(iris.keys())
output:
<class 'sklearn.utils.Bunch'>
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])
So, that's why you can access it as:
x=iris.data
y=iris.target
But when you read the CSV file as DataFrame as mentioned by you:
iris = pd.read_csv('iris.csv',header=None).iloc[:,2:4]
iris.head()
output is:
2 3
0 petal_length petal_width
1 1.4 0.2
2 1.4 0.2
3 1.3 0.2
4 1.5 0.2
Here the column names are the integers 2 and 3, and the real header (petal_length, petal_width) has ended up as the first data row.
First of all you should read the CSV file as:
df = pd.read_csv('iris.csv')
You should not pass header=None, because your CSV file already includes the column names (i.e. the headers) in its first line.
So, now what you can do is something like this:
X = df.iloc[:, [2, 3]] # Will give you columns 2 and 3 i.e 'petal_length' and 'petal_width'
y = df.iloc[:, 4] # Label column i.e 'species'
or if you want to use the column names then:
X = df[['petal_length', 'petal_width']]
y = df['species']
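To see the difference without the actual file, here is a small sketch that reads the same text both ways. The three rows and the sepal_* column names are made up for illustration; only petal_length, petal_width and species come from the question.

```python
import io
import pandas as pd

# Hypothetical three-row stand-in for iris.csv (not the real data)
csv_text = (
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "4.9,3.0,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
)

# header=None: the header line is read as data, so columns are numbered 0..4
raw = pd.read_csv(io.StringIO(csv_text), header=None)
print(raw.columns.tolist())

# Default: the first line supplies the column names
df = pd.read_csv(io.StringIO(csv_text))
X = df[["petal_length", "petal_width"]]  # select features by name
y = df["species"]                        # select the label column by name
print(X.dtypes)
```

With header=None the numeric columns also end up as strings (object dtype), because the header text is mixed in with the values.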
Also, if you want to convert the labels from strings to numbers, use sklearn's LabelEncoder:
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
y = le.fit_transform(y)
https://imgur.com/gallery/yAdAjdx
Hello,
Linked is the screenshot of the two error messages I keep receiving, as well as the code leading up to them. I'm trying to run an uplift on a classification tree. Any help is appreciated; I am still fairly new to Python. Thank you!!
(M - 3) is getting interpreted as a numpy.ndarray. This implies that M is defined somewhere as a numpy.ndarray. Test it out by running:
print(type(M))
You are calling the .fillna() method on a numpy array. And numpy arrays don't have that method defined.
You can probably convert the numpy array to a pandas.DataFrame and then apply the .fillna() method.
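A rough sketch of that conversion, assuming a small float array with NaNs (the array contents and the column names "a" and "b" are made up for illustration):

```python
import numpy as np
import pandas as pd

arr = np.array([[1.0, np.nan], [np.nan, 4.0]])

# Wrap the array in a DataFrame so that .fillna() is available
df = pd.DataFrame(arr, columns=["a", "b"])  # hypothetical column names
filled = df.fillna(0)
print(filled)

# NumPy-only alternative: replace NaN entries without going through pandas
filled_arr = np.where(np.isnan(arr), 0.0, arr)
print(filled_arr)
```

Which one fits better depends on whether the rest of your code expects a DataFrame or an array.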
If you need to flatten, you can call flatten() directly on the numpy.ndarray object:
X = X.flatten()
Link to doc: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.flatten.html
DO NOT VOTE UP RESPONSE MAIN WORK DONE BY @wayne above:
Did you run print(type(X)) on X prior to the line that gave the error? Given that you got AttributeError: 'numpy.ndarray' object has no attribute 'to_numpy', X appears to already be a NumPy object. Also, I don't think to_numpy comes from NumPy. Pandas has one. Polars has one. This drops us in at the end of the problem, so we cannot provide much help without more of the code upstream.
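Without the upstream code this is only a guess, but a defensive pattern is to check what X is before converting it. as_ndarray below is a hypothetical helper name, and the sample data is made up:

```python
import numpy as np
import pandas as pd


def as_ndarray(X):
    """Hypothetical helper: accept a pandas object or something already array-like."""
    if isinstance(X, (pd.DataFrame, pd.Series)):
        return X.to_numpy()  # pandas objects have to_numpy()
    return np.asarray(X)     # ndarrays pass through; lists are converted


X_df = pd.DataFrame({"f1": [1.0, 2.0], "f2": [3.0, 4.0]})  # made-up data
X_np = np.array([[1.0, 3.0], [2.0, 4.0]])

print(type(as_ndarray(X_df)))
print(type(as_ndarray(X_np)))
```

This hides the symptom rather than explaining it, so it is still worth finding out where X became an array.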
Solution:
Reload the data set.
df = pd.read_csv(file_name, header=0)
[str(i) for i in ([["CPU_frequency"]])]
The code above resolved the error.