df['a'] returns a Series object, which has astype as a vectorized way to convert all elements in the series to another dtype.
df['a'][1] returns the content of one cell of the dataframe, in this case the string '0.123'. That is a plain str object, which has no astype method. To convert it, use the regular Python built-in:
type(df['a'][1])
Out[25]: str
float(df['a'][1])
Out[26]: 0.123
type(float(df['a'][1]))
Out[27]: float
As for your second question, the in operator ends up calling __contains__ on the series with '' as its argument. Here is the docstring of that method:
help(pd.Series.__contains__)
Help on function __contains__ in module pandas.core.generic:
__contains__(self, key)
True if the key is in the info axis
It means that the in operator searches for your empty string in the info axis (the index for a Series, the column labels for a DataFrame), not in the contents.
The way to search for your empty strings is to use the equality operator:
df
Out[54]:
a
0 42
1
'' in df
Out[55]: False
df==''
Out[56]:
a
0 False
1 True
df[df['a']=='']
Out[57]:
a
1
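The label-versus-value behaviour described above can be sketched as follows. This is a minimal illustration on a made-up two-row frame (the data and variable names are assumptions, not the asker's real frame): in tests labels, while an element-wise comparison or a lookup in .values tests the contents.

```python
import pandas as pd

df = pd.DataFrame({'a': ['42', '']})
s = df['a']

# `in` checks labels, not values: column labels for a DataFrame,
# index labels for a Series.
label_checks = {
    "'a' in df": 'a' in df,   # 'a' is a column label
    "'' in df": '' in df,     # no column is named ''
    "1 in s": 1 in s,         # 1 is an index label
    "'' in s": '' in s,       # no index label is ''
}

# To test the values, compare element-wise and reduce, or look in .values.
has_empty = (s == '').any()
has_empty_values = '' in s.values

# Boolean indexing keeps only the rows whose value is the empty string.
empty_rows = df[s == '']

print(label_checks)
print(has_empty, has_empty_values)
print(empty_rows)
```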
Answer from Zeugma on Stack Overflow
df['a'][1] will return the actual value inside the array, at the position 1, which is in fact a string. You can convert it by using float(df['a'][1]).
>>> df = pd.DataFrame({'a':['1.23', '0.123']})
>>> type(df['a'])
<class 'pandas.core.series.Series'>
>>> df['a'].astype(float)
0 1.230
1 0.123
Name: a, dtype: float64
>>> type(df['a'][1])
<type 'str'>
For the second question, you may have an empty value in your raw data. The correct test would be:
>>> df = pd.DataFrame({'a':['1', '']})
>>> '' in df['a'].values
True
Source for the second question: https://stackoverflow.com/a/21320011/5335508
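If the column does contain empty strings, a plain astype(float) fails on them. One possible way around that, sketched here on made-up data, is pd.to_numeric with errors='coerce', which turns unparseable entries into NaN. Whether NaN is the right replacement depends on your data, so treat this as an option rather than the fix.

```python
import pandas as pd

df = pd.DataFrame({'a': ['1.23', '']})

# astype(float) raises here because '' cannot be parsed as a float.
try:
    df['a'].astype(float)
    astype_failed = False
except ValueError:
    astype_failed = True

# errors='coerce' converts what it can and puts NaN where parsing fails.
converted = pd.to_numeric(df['a'], errors='coerce')

print(astype_failed)
print(converted)
```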
[FEA] as_type() for Dataframes
AnnData Operations - AttributeError: 'DataFrame' object has no attribute 'dtype'
BUG: `Dataframe.astype(...)` Fails With `'bool' object has no attribute 'all'`
python - I got the following error : 'DataFrame' object has no attribute 'data' - Data Science Stack Exchange
"sklearn.datasets" is a scikit package, where it contains a method load_iris().
load_iris() by default returns an object that holds data, target and other members. To get the actual values, you have to read its data and target attributes.
The file 'iris.csv', by contrast, holds features and target together.
FYI: If you set return_X_y to True in load_iris(), you get the features and target directly.
from sklearn import datasets
data,target = datasets.load_iris(return_X_y=True)
The Iris Dataset from Sklearn is in Sklearn's Bunch format:
print(type(iris))
print(iris.keys())
output:
<class 'sklearn.utils.Bunch'>
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])
So, that's why you can access it as:
x=iris.data
y=iris.target
But when you read the CSV file as DataFrame as mentioned by you:
iris = pd.read_csv('iris.csv',header=None).iloc[:,2:4]
iris.head()
output is:
2 3
0 petal_length petal_width
1 1.4 0.2
2 1.4 0.2
3 1.3 0.2
4 1.5 0.2
Here the column names are 2 and 3, and the real header (petal_length, petal_width) has been read as the first data row.
First of all you should read the CSV file as:
df = pd.read_csv('iris.csv')
You should not include header=None, because your CSV file already contains the column names, i.e. the header row.
So, now what you can do is something like this:
X = df.iloc[:, [2, 3]] # Will give you columns 2 and 3 i.e 'petal_length' and 'petal_width'
y = df.iloc[:, 4] # Label column i.e 'species'
or if you want to use the column names then:
X = df[['petal_length', 'petal_width']]
y = df['species']
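The effect of header=None can be seen in a small sketch. The CSV text below is a hypothetical two-row stand-in for iris.csv (same column layout, invented for illustration), read from memory so the example runs without the file.

```python
import io
import pandas as pd

# Hypothetical stand-in for iris.csv with the same column layout.
csv_text = (
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
)

# header=None: columns are numbered 0..4 and the header line becomes row 0,
# so the numeric columns end up holding strings.
raw = pd.read_csv(io.StringIO(csv_text), header=None)

# Default header handling: the first line supplies the column names.
df = pd.read_csv(io.StringIO(csv_text))
X = df[['petal_length', 'petal_width']]  # features, selected by name
y = df['species']                        # label column, selected by name

print(list(raw.columns), raw.iloc[0, 2])
print(X.dtypes)
print(list(y))
```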
Also, if you want to convert the labels from strings to numbers, use sklearn's LabelEncoder:
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
y = le.fit_transform(y)
why is this code throwing this error:
'tuple' object has no attribute 'dtype'
Code:
hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv_img)
clahe = cv2.createCLAHE(clipLimit=2.6, tileGridSize=(3,3))
clahe_v = clahe.apply(v)
dwts = pywt.dwt(s, 'bior1.3')
smin = np.amin(dwts)
smax = np.amax(dwts)
print("smin:", smin, "smax:", smax)
sleep(2)
smin_new = 2.589 * smin
smax_new = 0.9 * smax
print("smin_new:", smin_new, "smax_new:", smax_new)
sleep(2)
magic_s = skimage.exposure.rescale_intensity(dwts, in_range=(smin,smax), out_range=(smin_new,smax_new)).astype(np.uint8) # this is the part throwing the error
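A likely cause, offered as a reading of the code rather than a confirmed diagnosis: pywt.dwt returns a tuple of two coefficient arrays (cA, cD), not a single array, so dwts is still a tuple when it is passed to rescale_intensity, and a tuple has no dtype. The sketch below reproduces that situation with NumPy only. fake_dwt is a hypothetical stand-in that mimics the return shape of pywt.dwt with a simple Haar-style split; it is not the real transform.

```python
import numpy as np

def fake_dwt(signal):
    """Hypothetical stand-in for pywt.dwt: returns a (cA, cD) tuple."""
    signal = np.asarray(signal, dtype=float)
    approx = (signal[0::2] + signal[1::2]) / np.sqrt(2)
    detail = (signal[0::2] - signal[1::2]) / np.sqrt(2)
    return approx, detail

dwts = fake_dwt([1.0, 3.0, 5.0, 7.0])

# The tuple itself has no dtype; only the arrays inside it do.
tuple_has_dtype = hasattr(dwts, 'dtype')

# Unpack first, then work on one coefficient array at a time.
cA, cD = dwts
smin, smax = np.amin(cA), np.amax(cA)

print(type(dwts).__name__, tuple_has_dtype)
print(cA.dtype, smin, smax)
```

In the original snippet, the equivalent change would be to unpack the result of pywt.dwt and pass one of the arrays, not the tuple, to rescale_intensity.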
df[col] will return a DataFrame object (not a Series) if df contains more than one column with the same name. Make sure you do not have more than one column with the same name in df as that is the likely source of this error.
You seem to somehow get back a DataFrame and not a Series by calling X[col]. Not sure why, because you did not supply the full structure and data of your dataframe.
.dtype is for pandas Series https://pandas.pydata.org/docs/reference/api/pandas.Series.dtype.html
.dtypes is for pandas DataFrames (it also works on a pandas Series, where it returns the same value as .dtype) https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.dtypes.html
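The duplicate-column case and the .dtype / .dtypes difference can be reproduced in a few lines. The frame below is made up for illustration, and dropping the repeated column with columns.duplicated() is just one possible cleanup, assuming the first column of each name is the one you want.

```python
import pandas as pd

# Two columns share the name 'a' (hypothetical data for illustration).
df = pd.DataFrame([[1, 2.5, 'x']], columns=['a', 'a', 'b'])

dup = df['a']     # a DataFrame, because the label 'a' matches two columns
single = df['b']  # a Series, because the label 'b' is unique

print(type(dup).__name__, type(single).__name__)
print(hasattr(dup, 'dtype'))        # a DataFrame exposes .dtypes, not .dtype
print(dup.dtypes)                   # one dtype per column
print(single.dtype, single.dtypes)  # both work on a Series

# Keep only the first column for each repeated name.
deduped = df.loc[:, ~df.columns.duplicated()]
print(type(deduped['a']).__name__)
```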