value_counts is a Series method rather than a DataFrame method (DataFrame.value_counts was only added in pandas 1.1), and you are calling it on a DataFrame, clean. You need to call it on a specific column:

clean[column_name].value_counts()

It doesn't usually make sense to perform value_counts on a DataFrame, though I suppose you could apply it to every entry by flattening the underlying values array:

pd.Series(clean.values.flatten()).value_counts()
Answer from Andy Hayden on Stack Overflow
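A minimal sketch of both suggestions, assuming a small made-up DataFrame named clean with hypothetical columns color and size (neither the data nor the column names come from the question). The flattened version wraps the values in a Series, because the top-level pd.value_counts function is deprecated in recent pandas releases.

```python
import pandas as pd

# Hypothetical data standing in for the asker's `clean` DataFrame.
clean = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "size": ["small", "small", "large", "red"],
})

# Count the values in one column (a Series).
color_counts = clean["color"].value_counts()

# Count every entry in the frame by flattening the underlying array.
all_counts = pd.Series(clean.to_numpy().ravel()).value_counts()

print(color_counts)
print(all_counts)
```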
ENH: DataFrameGroupby.value_counts · Issue #43564 · pandas-dev/pandas
September 14, 2021 -

import pandas as pd

df = pd.DataFrame({'gender': ['female', 'male', 'female', 'male', 'female', 'male'],
                   'education': ['low', 'medium', 'high', 'low', 'high', 'low'],
                   'country': ['US', 'FR', 'US', 'FR', 'FR', 'US']})

# These all work fine:
# value_counts() on a DataFrame
df[['gender', 'education']].value_counts(normalize=True)
df['gender'].value_counts(normalize=True)  # on a Series
df.groupby('country')['gender'].value_counts(normalize=True)  # on a SeriesGroupby

# but fails on DataFrameGroupBy
df.groupby('country')[['gender', 'education']].value_counts(normalize=True)
-------------------------
Author   corriebar
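On a pandas version where the last call fails, one way to get the same per-group proportions is to count with size() and normalise by hand. This is a sketch of a workaround, not the change that the issue led to, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "gender": ["female", "male", "female", "male", "female", "male"],
    "education": ["low", "medium", "high", "low", "high", "low"],
    "country": ["US", "FR", "US", "FR", "FR", "US"],
})

# Count each (country, gender, education) combination.
counts = df.groupby(["country", "gender", "education"]).size()

# Divide by the per-country total so proportions sum to 1 within each country.
proportions = counts / counts.groupby(level="country").transform("sum")

print(proportions)
```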
Discussions

Errors after deploying the app
Hi, I created my first app using Streamlit. It works without any errors on my own PC (Python 3.9.7), but when I deploy it on Streamlit Cloud (Python 3.9) I get this error: AttributeError: 'DataFrameGroupBy' object has no attribute 'value_counts' The line which this error refers to is: df_grouped ...
March 2, 2022
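A likely cause of "works locally, fails when deployed" is a pandas version mismatch: DataFrameGroupBy.value_counts was added in pandas 1.4.0, so an older pandas on the server would raise exactly this AttributeError. The sketch below, using made-up data and column names, prints the installed version and falls back to size() when the method is missing. Pinning the pandas version in the app's requirements file is the other obvious option.

```python
import pandas as pd

print(pd.__version__)  # compare this locally and in the deployed app

# Hypothetical data; the original post does not show its DataFrame.
df = pd.DataFrame({
    "country": ["US", "FR", "US", "FR"],
    "gender": ["female", "male", "female", "male"],
})

grouped = df.groupby("country")[["gender"]]  # double brackets -> DataFrameGroupBy

if hasattr(grouped, "value_counts"):
    result = grouped.value_counts()
else:
    # Fallback for older pandas: count rows per (country, gender) pair.
    result = df.groupby(["country", "gender"]).size()

print(result)
```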
Pandas AttributeError: 'DataFrame' object has no attribute 'group_by'

I mean, isn't it groupby(), not group_by()?

python 3.x - How to count_values and then groupby - Stack Overflow
I'm trying to use count_values to count the occurrences of the same value under invoice_line_normal_price and group by shop_name. This here is my df: I'm trying to end up with: So far I know I can use:...
Change groupby value_counts (from fall through behaviour) · Issue #6540 · pandas-dev/pandas
March 4, 2014 -

In [132]: df = pd.DataFrame([['a_link', 'dofollow'], ['a_link', 'dofollow'], ['a_link', 'nofollow'], ['b_link', 'javascript']], columns=['link', 'type'])

In [133]: g = df.groupby(['link', 'type'])

In [134]: g.value_counts()
AttributeError: 'DataFrameGroupBy' object has no attribute 'value_counts'

In [135]: g.link.value_counts()  # redundant level
Out[135]:
link    type
a_link  dofollow    a_link    2
        nofollow    a_link    1
b_link  javascript  b_link    1
dtype: int64

Following would make sense for DataFrameGroupby to:

In [136]: pd.Series([len(g.groups[i]) for i in g.grouper.result_index], g.grouper.result_index)
Out[136]:
link    type
a_link  dofollow      2
        nofollow      1
b_link  javascript    1
dtype: int64
Author   hayd
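The counts that Out[136] builds by hand are what GroupBy.size() returns, so the same result can be obtained without value_counts. A short sketch on the data from the issue:

```python
import pandas as pd

df = pd.DataFrame(
    [["a_link", "dofollow"], ["a_link", "dofollow"],
     ["a_link", "nofollow"], ["b_link", "javascript"]],
    columns=["link", "type"],
)

# Number of rows in each (link, type) group, indexed by the group keys.
counts = df.groupby(["link", "type"]).size()
print(counts)
```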
GroupBy — pandas 1.1.5 documentation
GroupBy objects are returned by groupby calls: pandas.DataFrame.groupby(), pandas.Series.groupby(), etc. The following methods are available in both SeriesGroupBy and DataFrameGroupBy objects, but may differ slightly, usually in that the DataFrameGroupBy version usually permits the specification of an axis argument, and often an argument indicating whether to restrict application to columns of a specific data type.
r/learnpython on Reddit: Pandas AttributeError: 'DataFrame' object has no attribute 'group_by'
February 28, 2018 -

Hello,

Has anyone ever come across this before?

I'm trying to group some data in a dataframe and getting this error. The steps I've taken are:

1) in a for loop:

   - read in a csv from an api using pd.read_csv()
   - replaced some values in a column using a for loop and .loc[]
   - appended the resulting data frame to a list

2) concatenated the list of dataframes using pd.concat()

3) added a calculated column to the new DF by multiplying another column

4) added two empty columns

5) filtered the DF using .loc[] based on a value within a column

6) filtered the DF using .loc[] based on a value in a different column

7) tried to use this code:

new_DF = old_df.group_by(['col1', 'col_2', 'col_3', 'adgroup', 'col_4', 'col5', 'col6'], as_index=False)[['col7', 'col8', 'col9']].sum()

The DF seems to be behaving normally: for example, I can call dtypes and columns on it and add columns that are calculated from other columns. What is super frustrating is that I can write the DF out with to_csv(), read it back with pd.read_csv(), and then I'm able to do the grouping I want (however, this isn't ideal, which is why I'm posting).

Any advice would be appreciated.

Cheers
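
As the reply quoted earlier says, the DataFrame method is groupby, not group_by. A minimal sketch of the corrected call, with made-up data and a shortened, hypothetical set of columns:

```python
import pandas as pd

# Invented data; only the naming pattern follows the post.
old_df = pd.DataFrame({
    "col1": ["a", "a", "b"],
    "adgroup": ["x", "x", "y"],
    "col7": [1, 2, 3],
    "col8": [10.0, 20.0, 30.0],
})

# groupby (no underscore); as_index=False keeps the keys as ordinary columns.
new_df = old_df.groupby(["col1", "adgroup"], as_index=False)[["col7", "col8"]].sum()
print(new_df)
```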

python 3.x - How to count_values and then groupby - Stack Overflow
That's good for me @jezrael. I just need to know how to have the shop_name showing on every line. I tried: nr.groupby((['shop_name']),as_index=False)['invoice_line_normal_price'].value_counts() but I get AttributeError: 'DataFrameGroupBy' object has no attribute 'value_counts'
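One version-independent way to get one row per shop and price, with shop_name repeated on every line, is to group by both columns and count with size(). This is a sketch with invented data; only the two column names come from the comment above.

```python
import pandas as pd

# Invented rows; the real DataFrame is not shown in the thread.
nr = pd.DataFrame({
    "shop_name": ["A", "A", "A", "B", "B"],
    "invoice_line_normal_price": [9.99, 9.99, 4.50, 9.99, 9.99],
})

counts = (
    nr.groupby(["shop_name", "invoice_line_normal_price"])
      .size()
      .reset_index(name="count")  # turn the group keys back into columns
)
print(counts)
```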
r/learnpython on Reddit: Error when using value_counts on a series during a loop
December 16, 2023 -

Good evening,

I've been trying to work with this data.

data = {
    'ID': [1,1,1,2,2,3,3,3,3,4,4],
    'Value': [23,22,11,98,34,42,30,36,77,58,5]
}

df = pd.DataFrame(data)
df


        ID	Value
0	1	23
1	1	22
2	1	11
3	2	98
4	2	34
5	3	42
6	3	30
7	3	36
8	3	77
9	4	58
10	4	5

Now for this, I'm trying to keep only the last two rows for each ID. Before I share my code, here's my intended result:

1	1	22
2	1	11
3	2	98
4	2	34
7	3	36
8	3	77
9	4	58
10	4	5

The code I tried to use is this (along with resulting error):

for i, row in df.iterrows():
    while row['ID'].value_counts() > 2:
        df.drop(i)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-286-15a45b0e4fb7> in <module>
      1 for i, row in df.iterrows():
----> 2     while row['ID'].value_counts() > 2:
      3         df.drop(i)

AttributeError: 'numpy.int64' object has no attribute 'value_c

I know iterrows is really frowned upon, but speed isn't an issue and I'm still practicing my data modeling skills and I hope to drop it once I get more experienced. I feel like this code executes without the series, but as you can see, I want them grouped by ID.

Thanks!
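
The error arises because row['ID'] inside iterrows() is a single number, not a Series, so it has no value_counts method. Note also that df.drop(i) returns a new frame rather than modifying df. One way to get the intended result without a loop is GroupBy.tail, sketched here on the data from the post:

```python
import pandas as pd

data = {
    "ID": [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4],
    "Value": [23, 22, 11, 98, 34, 42, 30, 36, 77, 58, 5],
}
df = pd.DataFrame(data)

# Keep the last two rows of each ID group, in their original order.
result = df.groupby("ID").tail(2)
print(result)
```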

groupby categorical column fails with unstack · Issue #11558 · pandas-dev/pandas
November 9, 2015 - Replicating example

In [1]: df = pd.DataFrame([[1,2],[3,4]],columns=pd.CategoricalIndex(list('AB')))

In [2]: df.describe()
AttributeError: 'DataFrame' object has no attribute 'value_counts'

The behaviour in this notebook seems like a bug...
Author   mikepqr
A couple of questions about counts/value_counts and pandas?
July 10, 2020 - Why does this work:

df[df['TotalPayBenefits'] == df['TotalPayBenefits'].max()]

but this gives an error:

df[df['TotalPayBenefits'].max()]

Error:
---------------------------------------------------------------------------
During handling of the above exc...
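The first expression passes a boolean Series, which pandas treats as a row mask. The second passes a single number, which pandas interprets as a column label and fails to find. A sketch with invented pay figures (only the TotalPayBenefits column name is from the post):

```python
import pandas as pd

df = pd.DataFrame({
    "EmployeeName": ["Ann", "Bob", "Cy"],  # hypothetical column
    "TotalPayBenefits": [100.0, 250.0, 180.0],
})

top = df["TotalPayBenefits"].max()

# Boolean mask: keeps the rows where the comparison is True.
rows_with_max = df[df["TotalPayBenefits"] == top]

# Scalar key: treated as a column label, which does not exist here.
try:
    df[top]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(rows_with_max)
print(lookup_failed)
```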
BUG AttributeError: 'DataFrameGroupBy' object has no attribute '_obj_with_exclusions' · Issue #11640 · pandas-dev/pandas
November 18, 2015 -

In [5]: df.groupby('a').mean()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-29-a830c6135818> in <module>()
----> 1 df.groupby('a').mean()

/home/nicolas/Git/pandas/pandas/core/groupby.py in mean(self)
    764             self._set_selection_from_grouper()
    765             f = lambda x: x.mean(axis=self.axis)
--> 766             return self._python_agg_general(f)
    767
    768     def median(self):

/home/nicolas/Git/pandas/pandas/core/groupby.py in _python_agg_general(self, func, *args, **kwargs)
   1245                 output[name] = self._try_cast(values[mask], result)
Author   nbonnotte
pandas.Series.groupby — pandas 3.0.2 documentation
The implementation of groupby is hash-based, meaning in particular that objects that compare as equal will be considered to be in the same group. An exception to this is that pandas has special handling of NA values: any NA values will be collapsed to a single group, regardless of how they compare.
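A small sketch of the NA handling described above, with made-up values. Note that groupby drops NA keys by default; passing dropna=False (available since pandas 1.1) keeps them, and both NaN keys then land in one group.

```python
import numpy as np
import pandas as pd

values = pd.Series([10, 20, 30, 40])
keys = np.array([1.0, np.nan, np.nan, 1.0])

# Default behaviour: rows whose key is NaN are left out.
default_sums = values.groupby(keys).sum()

# dropna=False: the NaN keys are kept and collapsed into a single group.
with_na_sums = values.groupby(keys, dropna=False).sum()

print(default_sums)
print(with_na_sums)
```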
AttributeError: 'DataFrame' object has no attribut... - Databricks Community - 61132
February 19, 2024 - Hello, I have some trouble deduplicating rows on the "id" column, with the method "dropDuplicatesWithinWatermark" in a pipeline. When I run this pipeline, I get the error message: "AttributeError: 'DataFrame' object has no attribute 'dropDuplicatesWithinWatermark'" Here is part of the code: @dl...
Top answer
1 of 5
2

"sklearn.datasets" is a scikit-learn module that contains the function load_iris().

By default, load_iris() returns an object that holds data, target and other members. To get the actual values you have to read the data and target attributes yourself.

'iris.csv', on the other hand, holds the features and the target together.

FYI: If you set return_X_y to True in load_iris(), you get the features and target directly.

from sklearn import datasets
data,target = datasets.load_iris(return_X_y=True)
2 of 5
1

The Iris Dataset from Sklearn is in Sklearn's Bunch format:

print(type(iris))
print(iris.keys())

output:

<class 'sklearn.utils.Bunch'>
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])

So, that's why you can access it as:

x=iris.data
y=iris.target

But when you read the CSV file as DataFrame as mentioned by you:

iris = pd.read_csv('iris.csv',header=None).iloc[:,2:4]
iris.head()

output is:

    2   3
0   petal_length    petal_width
1   1.4 0.2
2   1.4 0.2
3   1.3 0.2
4   1.5 0.2

Here the column names are 2 and 3, and the real header (petal_length, petal_width) has been read as data row 0.

First of all you should read the CSV file as:

df = pd.read_csv('iris.csv')

You should not pass header=None, because your CSV file already includes the column names, i.e. the headers.

So, now what you can do is something like this:

X = df.iloc[:, [2, 3]] # Will give you columns 2 and 3 i.e 'petal_length' and 'petal_width'
y = df.iloc[:, 4] # Label column i.e 'species'

or if you want to use the column names then:

X = df[['petal_length', 'petal_width']]
y = df['species']

Also, if you want to convert labels from string to numerical format use sklearn LabelEncoder

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
y = le.fit_transform(y)
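
To make the header=None point concrete, here is a sketch that reads a two-row stand-in for iris.csv from a string. The real file is not shown in the answer, so the column layout used here is an assumption.

```python
import io

import pandas as pd

# Assumed layout of iris.csv: four feature columns followed by the label.
csv_text = (
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
)

# header=None: the header line becomes data row 0 and columns are numbered.
raw = pd.read_csv(io.StringIO(csv_text), header=None)

# Default header handling: the first line supplies the column names.
df = pd.read_csv(io.StringIO(csv_text))

X = df[["petal_length", "petal_width"]]
y = df["species"]

print(raw.columns.tolist())
print(X.shape, y.tolist())
```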