Following the documentation for the pandas.DataFrame.groupby method, there are a couple of ways you could fix this. I'd recommend explicitly specifying the grouping column with the by parameter (although you don't need to) and then providing an aggregation function (it looks like you want the mean). Any of these will work:
# Option 1
New.groupby(['P5PN']).mean()
# Option 2
New.groupby('P5PN').mean()
# Option 3
New.groupby(by=['P5PN']).mean()
The issue here is that while you are specifying a grouping column, you are not telling pandas how to aggregate the measures in the other columns of your data.
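To make the difference concrete, here is a minimal sketch. The frame is a made-up stand-in for your `New` data: only the `P5PN` column name comes from the question, and the `value` column and its numbers are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the question's `New` frame; only the P5PN
# column name comes from the question, the data itself is invented.
New = pd.DataFrame({
    "P5PN": ["a", "a", "b"],
    "value": [1.0, 3.0, 10.0],
})

# groupby alone only describes the grouping; nothing is aggregated yet.
grouped = New.groupby("P5PN")
print(type(grouped).__name__)  # DataFrameGroupBy

# Applying an aggregation returns an ordinary DataFrame, one row per group.
means = grouped.mean()
print(means.shape)  # (2, 1)
```

Attributes such as shape are only available on the aggregated result, not on the intermediate GroupBy object.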
Note: you might want to update the title of your question to "'SeriesGroupBy' object has no attribute 'shape'" to match the actual error you're getting.
Answer from matsuninja on Stack Overflow
Pandas AttributeError: 'DataFrame' object has no attribute 'group_by'
I mean, isn't it groupby(), not group_by()?
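A quick way to check this, as a small sketch with an invented frame (the `key` and `n` columns are assumptions, not from the post):

```python
import pandas as pd

# Made-up data purely for illustration.
df = pd.DataFrame({"key": ["x", "x", "y"], "n": [1, 2, 3]})

error_message = None
try:
    # DataFrame has no method spelled with an underscore.
    df.group_by("key")
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# The one-word spelling works; as_index=False keeps "key" as a column.
totals = df.groupby("key", as_index=False)["n"].sum()
print(totals)
```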
Hello,
Has anyone ever come across this before?
I'm trying to group some data in a dataframe and getting this error. The steps I've taken are:
1) in a for loop:
   - read in a csv from an api using pd.read_csv()
   - replaced some values in a column using a for loop and .loc[]
   - appended the resulting data frame to a list
2) concatenated the list of dataframes using pd.concat()
3) added a calculated column to the new DF by multiplying another column
4) added two empty columns
5) filtered the DF using .loc[] based on a value within a column
6) filtered the DF using .loc[] based on a value in a different column
7) tried to use this code:
new_DF = old_df.group_by(['col1', 'col_2', 'col_3', 'adgroup', 'col_4', 'col5', 'col6'], as_index=False)[['col7', 'col8', 'col9']].sum()
The DF seems to be behaving normally: for example, I can call dtypes and columns on it and add columns that are calculated from other columns. What is super frustrating is that I can write the DF out with .to_csv(), read it back with pd.read_csv(), and then do the grouping I want (however, this isn't ideal, which is why I'm posting).
Any advice would be appreciated.
Cheers
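For reference, the steps above can be sketched roughly as follows, assuming the method is spelled groupby. The in-memory frames stand in for the CSVs read from the API, and the values, the "old" to "g1" replacement, and the reduced set of columns are all hypothetical; only the column names col1, adgroup, col7 and col8 are borrowed from the post.

```python
import pandas as pd

# Hypothetical in-memory frames standing in for the CSVs read from the API.
raw_frames = [
    pd.DataFrame({"col1": ["a", "b"], "adgroup": ["old", "g2"], "col7": [1, 2]}),
    pd.DataFrame({"col1": ["a", "b"], "adgroup": ["g1", "old"], "col7": [3, 4]}),
]

frames = []
for frame in raw_frames:
    # Replace values in a column with .loc[] (the "old" -> "g1" rule is made up).
    frame.loc[frame["adgroup"] == "old", "adgroup"] = "g1"
    frames.append(frame)

# Concatenate the list, resetting the index so row labels stay unique.
old_df = pd.concat(frames, ignore_index=True)

# Calculated column derived from another column.
old_df["col8"] = old_df["col7"] * 2

# Filter rows with .loc[] based on a column value.
old_df = old_df.loc[old_df["adgroup"] == "g1"]

# groupby (one word), then select the measure columns and sum them.
new_DF = old_df.groupby(["col1", "adgroup"], as_index=False)[["col7", "col8"]].sum()
print(new_DF)
```

This runs without a CSV round trip, which suggests the grouping itself is fine once the method name is spelled groupby.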
