You are passing the string 'pct', but you need the variable pct (the lambda function) instead — remove the quotes:

aggs = {'B':pct}
print(df.groupby('A').agg(aggs))

          B
A          
1  0.333333
4  0.333333
7  0.333333
Answer from jezrael on Stack Overflow
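A self-contained reconstruction of the fix (the pct lambda itself is not shown in the thread, so the definition below — each group's share of the total row count — is a guess that happens to reproduce the printed output):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 7], 'B': [10, 20, 30]})

# Hypothetical stand-in for the answer's pct lambda: each group's
# share of the total row count.
pct = lambda x: x.count() / len(df)

aggs = {'B': pct}  # the variable pct, not the string 'pct'
res = df.groupby('A').agg(aggs)
print(res)
```

Passing the string 'pct' makes pandas look up a registered aggregation by that name, which fails; passing the callable runs it directly on each group.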
GitHub
github.com › dask › dask › issues › 4307
SeriesGroupBy Object has not Attribute Diff · Issue #4307 · dask/dask
December 17, 2018 - df.groupby('IndexName')['ColName'].diff() raises 'SeriesGroupBy' object has no attribute 'diff' · The Dask Series object has a diff method, as does the pandas series groupby object, and it seems logical that the dask SeriesGroupBy object would as well. I've tried the following workaround: MyDiff = df.groupby('IndexName')['ColName'].apply(lambda x: x.diff(1)); but that gives me a "wrong number of items passed 0..." error.
Author   bgoodman44

python - How to use .mode with groupby - Stack Overflow
stackoverflow.com
Using isin() on grouped data
You need to use boolean indexing because isin() is not compatible with a groupby object. I'm on mobile, but here's a link that will help you out: https://stackoverflow.com/questions/50611929/python-pandas-groupby-isin
r/learnpython
December 1, 2021
Pandas
pandas.pydata.org › docs › reference › api › pandas.Series.groupby.html
pandas.Series.groupby — pandas 3.0.2 documentation
Group Series using a mapper or by a Series of columns · A groupby operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups
GitHub
github.com › pandas-dev › pandas › issues › 11562
'mode' not recognized by df.groupby().agg(), but pd.Series.mode works · Issue #11562 · pandas-dev/pandas
November 9, 2015 - This works: df = pd.DataFrame({'A': [1, 2, 1, 2, 1, 2, 3], 'B': [1, 1, 1, 2, 2, 2, 2]}) df.groupby('B').agg(pd.Series.mode) but this doesn't: df.groupby('B').agg('mode') ... AttributeError: Cannot access callable attribute 'mode' of 'Dat...
Author   patricksurry
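A runnable variant of the working form from that issue (selecting column 'A' so each group yields a single scalar mode): pass the callable, not the string.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 1, 2, 1, 2, 3],
                   'B': [1, 1, 1, 2, 2, 2, 2]})

# Passing the function object works; the string alias 'mode' is not
# registered as a groupby aggregation, so agg('mode') raises instead.
res = df.groupby('B')['A'].agg(pd.Series.mode)
print(res)
```

Note that pd.Series.mode can return multiple values when a group is multimodal; here each group has a unique mode, so the result is one scalar per group.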
Aimee
aimee.codes › blog › 2020 › 11 › 11 › filling-NA-values
Filling Categorical NAs
November 11, 2020 - One issue is that there is not a .mode() we can use at the end of our statement - this groupby function only works with numerical summaries, like mean, variance, or median. So we need to find a work-around, since our data is categorical and we’re assigning it based on the Neighborhood's mode. grouped['SalePrice'].mode() AttributeError: 'SeriesGroupBy' object has no attribute 'mode'
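A workaround in the same spirit as that post (toy column names below are hypothetical): aggregate with a lambda that calls the Series-level mode() and takes the first result, which works for categorical data too.

```python
import pandas as pd

df = pd.DataFrame({'Neighborhood': ['A', 'A', 'A', 'B', 'B'],
                   'Garage': ['attached', 'attached', 'detached',
                              'carport', 'carport']})

# .mode() is not exposed on the SeriesGroupBy itself, but agg() will
# happily run the Series method inside a lambda, one group at a time.
per_group = df.groupby('Neighborhood')['Garage'].agg(lambda s: s.mode().iloc[0])
print(per_group)
```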
Reddit
reddit.com › r/learnpython › using isin() on grouped data
r/learnpython on Reddit: Using isin() on grouped data
December 1, 2021 -

Hi,

I want to filter based on whether a value is in another column. However, this data needs to be grouped before the isin filter is applied. When I do this I get the error

'SeriesGroupBy' object has no attribute 'isin'

Example explaining what I'm trying to do:

import pandas as pd

dict = {'AttributeName': {0: 'John', 1: 'John', 2: 'John', 3: 'John', 4: 'Sally', 5: 'Sally'}, 'Lineage Step': {0: 1, 1: 2, 2: 3, 3: 4, 4:1, 5:2}, 'From Country': {0: 'Spain', 1: 'Scotland', 2: 'England', 3: 'England', 4: 'Scotland', 5:'England'}, 'From Town': {0: 'Madrid', 1: 'Edinburgh', 2: 'London', 3: 'London', 4: 'Edinburgh', 5: 'Manchester'}, 'FromStreet': {0: 'Spanish St', 1: 'Main St', 2: 'Lower St', 3: 'Middle St', 4: 'London St', 5: 'Scotland St'}, 'ToCountry': {0: 'Scotland', 1: 'England', 2: 'England', 3: 'England', 4: 'England', 5: 'England'}, 'ToTown': {0: 'Edinburgh', 1: 'London', 2: 'London', 3: 'London', 4: 'Liverpool', 5: 'London'}, 'ToStreet': {0: 'Lower St', 1: 'Middle St', 2: 'Upper St', 3: 'Upper St', 4: 'new St', 5: 'Old St'}}
sample_data = pd.DataFrame.from_dict(dict)

#example data set. I want to find every unique 'From Country' for both John and Sally. So for John we would just have the first row, where he travels from Spain to Scotland. The second row would be filtered out because Scotland appears in the 'ToCountry' column. Sally would just have the row with 'From Town' Edinburgh. 

I have tried to do like this:

sample_grouped = sample_data.groupby('AttributeName')
sample_grouped[~sample_grouped['From Country'].isin(sample_grouped['ToCountry'])]

but I get the error 'SeriesGroupBy' object has no attribute 'isin'

Does anyone know how to use the isin (or comparable) function on grouped by data?

Thanks
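A possible approach (a sketch, not taken from the thread; data reduced to three rows of the example above): run the isin() filter inside groupby().apply(), where each group arrives as an ordinary DataFrame that does support boolean indexing.

```python
import pandas as pd

df = pd.DataFrame({
    'AttributeName': ['John', 'John', 'Sally'],
    'From Country': ['Spain', 'Scotland', 'Scotland'],
    'ToCountry': ['Scotland', 'England', 'England'],
})

# Inside apply(), each group `g` is a plain DataFrame, so isin() works:
# keep rows whose 'From Country' never appears in that person's
# 'ToCountry' column.
result = (
    df.groupby('AttributeName', group_keys=False)
      .apply(lambda g: g[~g['From Country'].isin(g['ToCountry'])])
)
print(result)
```

For John this keeps only the Spain row (Scotland appears in his ToCountry values); Sally's single row survives because Scotland is not among her destinations.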

GitHub
github.com › pandas-dev › pandas › issues › 40139
ENH:AttributeError: 'SeriesGroupBy' object has no attribute 'kurtosis' · Issue #40139 · pandas-dev/pandas
March 1, 2021 - Code gives error: AttributeError: 'SeriesGroupBy' object has no attribute 'kurtosis' instead of correct group-wise kurtosis.
Author   rsuhada
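A common workaround on pandas versions where the groupby method is missing (a sketch with made-up data): apply the Series-level kurtosis method to each group.

```python
import pandas as pd

df = pd.DataFrame({'g': ['a'] * 5 + ['b'] * 5,
                   'x': [1, 2, 3, 4, 10, 2, 2, 3, 3, 4]})

# SeriesGroupBy may not expose .kurtosis(), but the plain Series
# method can be applied group by group.
k = df.groupby('g')['x'].apply(pd.Series.kurt)
print(k)
```

Each group needs at least four observations for Series.kurt to return a finite value.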
Google Groups
groups.google.com › g › pydata › c › 4ZjOP0Lfjdc
Problem with groupby and nth in pandas 0.18.1
July 5, 2016 - I noticed that you can also have the original behaviour of 0.17 by passing as_index=False: In [13]: df.groupby('device', as_index=False)['timestamp'].nth(0) Out[13]: 0 0 3 1 Name: timestamp, dtype: int64 Are you sure the transform('idxmin') works? I get an error when I try that (both on 0.17.1 as 0.18.1): AttributeError: 'SeriesGroupBy' object has no attribute 'idxmim' Whoops, there was a typo in your code, which is the cause that it failed: idxmim of course does not work, but idxmin does :-)  ·
Narkive
pydata.narkive.com › tGpE0UQk › boxplot-and-groupby
boxplot and groupby
Hi, I'm new to pandas and maybe my question does not make much sense, but here it goes. I'm using groupby to aggregate my data by the column "VESSEL" (a string with 8 names) and I'm getting the "START" (start time of each boat haul) like this: d = df.groupby('VESSEL')['START']. I can use .describe(), .mean(), but when I try .boxplot() I get: AttributeError: 'SeriesGroupBy' object has no attribute 'boxplot'. Is there another way to get a boxplot of "START" for each "VESSEL"?
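One route (a sketch, reusing the 'VESSEL'/'START' names from the post with toy values): call boxplot at the DataFrame level with by=, rather than on the SeriesGroupBy object.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, just for this sketch
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'VESSEL': ['A', 'A', 'B', 'B'],
                   'START': [1.0, 2.5, 3.0, 4.5]})

# DataFrame.boxplot accepts a grouping column via `by=`, producing
# one box of START per VESSEL.
ax = df.boxplot(column='START', by='VESSEL')
```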
Reddit
reddit.com › r/dataanalysis › data analysis in python
r/dataanalysis on Reddit: Data Analysis in Python
December 19, 2022 -

Hello everyone!
I am a newbie at python and I looked up some problems associated with the Data Expo 2009: Airline on time data from the Harvard Dataverse (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/HG7NV7).
I am currently working on the following question:

  1. When is the best time of day, day of the week, and time of year to fly to minimize delays?

All libraries are imported and the data is cleaned up (empty columns and duplicate rows are dropped).
What I was intending to do is to plot a bar chart with "Months" on the x-axis and "ArrDelay" (arrival delays) on the y-axis.

My code looks the following way (I'm using jupyter notebook):

import pandas as pd 
dataair = pd.read_csv("/Users/issakovakamilla/Desktop/2000.csv.bz2")
dataair.dropna(how='all', axis=1, inplace=True)
dataair
import matplotlib.pyplot as plt
df = pd.DataFrame(dataair)
X = list(df.iloc[:, 0])
Y = list(df.iloc[:, 1])
plt.bar(X, Y, color='g')
plt.title("stats")
plt.xlabel("Month")
plt.ylabel("ArrDelay")
plt.show()

Somehow I don't get a plot - it's been executing for 10 minutes now (I get * near the input). Could anyone help me with this?
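A likely cause (an educated guess, with toy data standing in for the multi-million-row airline file): plt.bar is being asked to draw one bar per flight. Aggregating first gives one bar per month and returns instantly.

```python
import pandas as pd
import matplotlib
matplotlib.use('Agg')  # headless backend for the sketch
import matplotlib.pyplot as plt

# Toy stand-in for the airline data.
df = pd.DataFrame({'Month': [1, 1, 2, 2, 3],
                   'ArrDelay': [10.0, 20.0, 5.0, 15.0, 30.0]})

# One bar per month (mean arrival delay), not one bar per row.
monthly = df.groupby('Month')['ArrDelay'].mean()
plt.bar(monthly.index, monthly.values, color='g')
plt.xlabel('Month')
plt.ylabel('Mean ArrDelay')
```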

YouTube
youtube.com › watch
'SeriesGroupBy' object has no attribute 'pct'? - YouTube
Published   February 10, 2022
GitHub
github.com › pandas-dev › pandas › issues › 15082
Unclear ValueError on core.groupby · Issue #15082 · pandas-dev/pandas
def test_groupby_aggregate_item_by_item(self):
    def test_df():
        s = pd.DataFrame(np.array([[13, 14, 15, 16]]), index=[0],
                         columns=['b', 'c', 'd', 'e'])
        num = np.array([[s, s, s, datetime.strptime('2016-12-28', "%Y-%m-%d"), 'asdf', 24],
                        [s, s, s, datetime.strptime('2016-12-28', "%Y-%m-%d"), 'asdf', 6]])
        columns = ['a', 'b', 'c', 'd', 'e', 'f']
        idx = [x for x in xrange(0, len(num))]
        return pd.DataFrame(num, index=idx, columns=columns)

    c = [test_df().sort_values(['d', 'e', 'f']), test_df().sort_values(['d', 'e', 'f'])]
    df = pd.concat(c)
    df = df[["e", "a"]].copy().reset_index(drop=True)
    df["e_idx"] = df["e"]
    what = [0, 0.5, 0.5, 1]

    def x():
        df.groupby(["e_idx", "e"])["a"].quantile(what)

    self.assertRaisesRegexp(ValueError,
        "'SeriesGroupBy' object has no attribute '_aggregate_item_by_item'", x)
Author   ghost
GitHub
github.com › pandas-dev › pandas › issues › 49907
BUG: SeriesGroupBy.apply sets name attribute if result is DataFrame · Issue #49907 · pandas-dev/pandas
November 25, 2022 -

def f(piece):
    return DataFrame({"value": piece, "demeaned": piece - piece.mean()})

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 5]})
res = df.groupby('a')['b'].apply(f)
print(res.name)

This prints 'b'. But .name shouldn't be an attribute of DataFrame... AttributeError: 'DataFrame' object has no attribute 'name'
Author   MarcoGorelli
Stack Overflow
stackoverflow.com › questions › 46534653 › error-attributeerror-dataframegroupby-object-has-no-attribute-while-groupby
python - Error 'AttributeError: 'DataFrameGroupBy' object has no attribute' while groupby functionality on dataframe - Stack Overflow
Thanks for this...but I'm getting "AttributeError: 'SeriesGroupBy' object has no attribute 'sample'" at "df_sample = df.groupby("persons").sample(frac=percentage_to_flag, random_state=random_state)". If I can figure out why, maybe it'll work for me...
Top answer (1 of 2)

A quick clarification rather than an answer: the meta parameter is used in the .agg() method, to specify the column data types you expect, best expressed as a zero-length pandas dataframe. Dask will supply dummy data to your function otherwise, to try to guess those types, but this doesn't always work.
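As a sketch of that idea (the column name 'B' below is hypothetical): a zero-length frame carries no rows, only the dtypes you expect the aggregation to produce, which is exactly what dask's meta wants.

```python
import pandas as pd

# A zero-length DataFrame: no data, but the dtype of each expected
# output column is declared.
meta = pd.DataFrame({'B': pd.Series(dtype='int64')})
print(meta.dtypes)
```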

2 of 2

The issue you're running into is that the separate stages of the aggregation can't be the same function applied recursively, as in the custom_sum example that you're looking at.

I've modified code from this answer, keeping the comments from @user8570642, because they are very helpful. Note that this method will work for a list of groupby keys: https://stackoverflow.com/a/46082075/3968619

import pandas as pd
import dask.dataframe as dd

def chunk(s):
    # for the comments, assume only a single grouping column; the
    # implementation can handle multiple group columns.
    #
    # s is a grouped series. value_counts creates a multi-series like
    # (group, value): count
    return s.value_counts()


def agg(s):
    # s is a grouped multi-index series. In .apply the full sub-df will be
    # passed, multi-index and all. Group on the value level and sum the
    # counts. The result of the lambda is a series, so the result of the
    # apply is a multi-index series like (group, value): count
    # return s.apply(lambda s: s.groupby(level=-1).sum())

    # faster version using pandas internals
    s = s._selected_obj
    return s.groupby(level=list(range(s.index.nlevels))).sum()


def finalize(s):
    # s is a multi-index series of the form (group, value): count. First
    # manually group on the group part of the index. The lambda will receive a
    # sub-series with multi index. Next, drop the group part from the index.
    # Finally, determine the index with the maximum value, i.e., the mode.
    level = list(range(s.index.nlevels - 1))
    return (
        s.groupby(level=level)
        .apply(lambda s: s.reset_index(level=level, drop=True).idxmax())
    )

max_occurence = dd.Aggregation('mode', chunk, agg, finalize)

chunk will count the values for the groupby object in each partition. agg will take the results from chunk, group by the original groupby keys, and sum the value counts, so that we have the value counts for every group. finalize will take the multi-index series provided by agg and return the most frequently occurring value of B for each group of Z.

Here's a test case:

df = dd.from_pandas(
    pd.DataFrame({"A":[1,1,1,1,2,2,3]*10,"B":[5,5,5,5,1,1,1]*10,
                  'Z':['mike','amy','amy','amy','chris','chris','sandra']*10}), npartitions=10)
res = df.groupby(['Z']).agg({'B': max_occurence}).compute()
print(res)