You are using tolist incorrectly. You want: .values followed by tolist()
  type      tsneX      tsneY
0    A  53.828863  20.740931
1    B  57.816909  18.478468
2    A  55.913429  22.948167
3    C  56.603005  15.738954
For the above dataframe, to get your X and Y values as a list you can do:
tsneY_data = df['tsneY'].values.tolist()
>> [20.740931, 18.478468, 22.948167, 15.738954]
tsneX_data = df['tsneX'].values.tolist()
>> [53.828863, 57.816909, 55.913429, 56.603005]
As you have tried to set this to the column of a new dataframe, you can do:
new_data = pd.DataFrame()
new_data['tsneY'] = df['tsneY'].values.tolist()
> new_data
tsneY
0 20.740931
1 18.478468
2 22.948167
3 15.738954
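To make the steps above easy to try, here is a minimal self-contained sketch that rebuilds the sample dataframe by hand (the construction of df is my assumption, since the original data source isn't shown) and then extracts the columns as plain Python lists:

```python
import pandas as pd

# Sample data rebuilt by hand from the table above (assumed construction).
df = pd.DataFrame({
    "type": ["A", "B", "A", "C"],
    "tsneX": [53.828863, 57.816909, 55.913429, 56.603005],
    "tsneY": [20.740931, 18.478468, 22.948167, 15.738954],
})

# A single column is a Series; .values is its NumPy array, .tolist() makes a list.
tsneX_data = df["tsneX"].values.tolist()
tsneY_data = df["tsneY"].values.tolist()

# Calling tolist() directly on the Series gives the same list.
same_as_y = df["tsneY"].tolist()

# A whole DataFrame has no tolist(), which is where the AttributeError comes from.
frame_has_tolist = hasattr(df, "tolist")

# Put the list into a new dataframe as a column.
new_data = pd.DataFrame()
new_data["tsneY"] = tsneY_data
print(new_data)
```

The key point is that tolist() belongs to a single column (Series) or its underlying array, not to the dataframe as a whole.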
Answer from Chuck on Stack Overflow
I solved the problem by first extracting the relevant columns from the dataframe.
df = df.loc[:, ['type', 'tsneX', 'tsneY']]
scatter = Scatter(df, x='tsneX', y='tsneY',
color='type', marker='type',
title='t-sne',
legend=True)
Check the version of your pandas library:
import pandas
print(pandas.__version__)
If your version is older than 0.24.0 (the release that introduced Series.to_numpy() and Series.to_list()), upgrade:
pip install --upgrade pandas
If you need your code to work with all versions of pandas, here's a simple way to convert a Series into a NumPy array:
import pandas as pd
import numpy as np
s = pd.Series([1.1, 2.3])
a = np.array(s)
print(a) # [1.1 2.3]
On an advanced note, if your Series has missing values (as NaN values), these can be converted to a masked array:
s = pd.Series([1.1, np.nan])
a = np.ma.masked_invalid(s)
print(a) # [1.1 --]
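If you'd rather not upgrade, one option is to wrap the conversion in a small helper that uses to_numpy() when it exists and falls back to NumPy otherwise. This is only a sketch; series_to_array is a hypothetical name of mine, not part of pandas:

```python
import numpy as np
import pandas as pd


def series_to_array(s):
    """Hypothetical helper: return the data of a Series as a NumPy array."""
    # Newer pandas versions provide Series.to_numpy().
    if hasattr(s, "to_numpy"):
        return s.to_numpy()
    # Older versions: let NumPy do the conversion instead.
    return np.asarray(s)


s = pd.Series([1.1, 2.3])
a = series_to_array(s)
print(a)
```

Either branch hands back a regular NumPy array, so the calling code doesn't need to care which pandas version is installed.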
I'm not sure I understood what you need. Try filtering on Stroke == 1, then group by Residence_type and count:
df.query("Stroke==1").groupby('Residence_type')['Stroke'].agg('count').to_frame('Stroke_Count')
Stroke_Count
Residence_type
Rural 3
Urban 2
You could try the following if you need the differences between categories
df1 = df.query("Stroke==1").groupby('Residence_type')['Stroke'].agg('count').to_frame('Stroke_Count')
df1.loc['Diff'] = abs(df1.loc['Rural']-df1.loc['Urban'])
print(df1)
Stroke_Count
Residence_type
Rural 3
Urban 2
Diff 1
Assuming that Stroke only contains 1 or 0, you can do:
result_df = df.groupby('Residence_type').sum()
>>> result_df
Stroke
Residence_type
Rural 3
Urban 2
>>> result_df.Stroke['Rural'] - result_df.Stroke['Urban']
1
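For anyone who wants to run the two approaches side by side, here is a small sketch. The sample rows are made up by me so that the counts match the outputs shown above (3 rural, 2 urban); the real dataset isn't given in the thread:

```python
import pandas as pd

# Made-up sample data (an assumption): Stroke is 1 or 0 for each person.
df = pd.DataFrame({
    "Residence_type": ["Rural", "Urban", "Rural", "Urban", "Rural", "Urban", "Rural"],
    "Stroke": [1, 1, 1, 0, 1, 1, 0],
})

# Approach 1: keep only the stroke rows, then count them per residence type.
counts = df.loc[df["Stroke"] == 1].groupby("Residence_type")["Stroke"].count()

# Approach 2: since Stroke is only 1 or 0, summing it gives the same numbers.
sums = df.groupby("Residence_type")["Stroke"].sum()

# Absolute difference between the two categories.
diff = abs(counts["Rural"] - counts["Urban"])
print(counts)
print(diff)
```

Note that the sum shortcut only works while Stroke is strictly 0/1; the filter-and-count version doesn't depend on that.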
Hey guys, I am learning how to convert a dictionary to a data frame. I have a nested dictionary called user_dict like this:
File of dictionary in pickle format
[{'1000003': {'car': 0.0, 'car_passenger': 0.0, 'pt': 0.0, 'walk': 0.0, 'bike': 0.0}},
{'1000007': {'car': 0.0, 'car_passenger': 0.0, 'pt': 856.0786277323101, 'walk': 2546.869189662443, 'bike': 0.0}},
{'1000008': {'car': 0.0, 'car_passenger': 34189.569164682835, 'pt': 0.0, 'walk': 0.0, 'bike': 0.0}},
{'1000009': {'car': 0.0, 'car_passenger': 0.0, 'pt': 0.0, 'walk': 0.0, 'bike': 9847.472668350396}}]
I want to convert the dict to a data frame like this:
         car  car_passenger       pt      walk      bike
1000003  0.0          0.0      0.0       0.0       0.0
1000007  0.0          0.0      856.078   2546.869  0.0
1000008  0.0          34189.569  0.0     0.0       0.0
1000009  0.0          0.0      0.0       0.0       9847.472
I converted it through from_dict:
df = pd.DataFrame.from_dict(user_dict, orient='index')
df
But I got an error as this:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-44-2ef0fc236180> in <module>
----> 1 df =pd.DataFrame.from_dict(user_dict,orient='index')
2 df
/Library/Python/3.7/site-packages/pandas/core/frame.py in from_dict(cls, data, orient, dtype, columns)
1361 if len(data) > 0:
1362 # TODO speed up Series case
-> 1363 if isinstance(list(data.values())[0], (Series, dict)):
1364 data = _from_nested_dict(data)
1365 else:
AttributeError: 'list' object has no attribute 'values'
I do not know how to fix it. Can anyone help me or explain to me how to fix it?
Any help is appreciated.
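Going by the traceback, from_dict calls data.values(), and user_dict as posted is a list of one-key dictionaries rather than a single dictionary, which would explain the error. One possible fix, sketched here under the assumption that every user id appears only once, is to merge the list into one dictionary first (merged is just a name I picked):

```python
import pandas as pd

# The list of one-key dictionaries from the question.
user_dict = [
    {"1000003": {"car": 0.0, "car_passenger": 0.0, "pt": 0.0, "walk": 0.0, "bike": 0.0}},
    {"1000007": {"car": 0.0, "car_passenger": 0.0, "pt": 856.0786277323101,
                 "walk": 2546.869189662443, "bike": 0.0}},
    {"1000008": {"car": 0.0, "car_passenger": 34189.569164682835, "pt": 0.0,
                 "walk": 0.0, "bike": 0.0}},
    {"1000009": {"car": 0.0, "car_passenger": 0.0, "pt": 0.0, "walk": 0.0,
                 "bike": 9847.472668350396}},
]

# Merge the list entries into one {user_id: {mode: distance}} dictionary.
# Assumption: ids are unique, otherwise later entries overwrite earlier ones.
merged = {}
for entry in user_dict:
    merged.update(entry)

# Now from_dict receives a real dict; each outer key becomes a row label.
df = pd.DataFrame.from_dict(merged, orient="index")
print(df)
```

With orient='index' the user ids end up as the row index and the travel modes as columns, which is the layout asked for in the question.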