You should call pivot on grouped data, so first you need to group by date and then pivot by p_category:

>>> df.groupBy('date').pivot('p_category').agg(countDistinct('pid').alias('cnt')).show()
+----------+----+----------+
|      date|flat|ultra_thin|
+----------+----+----------+
|2016-09-30|   1|         1|
+----------+----+----------+
Answer from Mariusz on Stack Overflow
🌐
Pandas
pandas.pydata.org › docs › reference › api › pandas.DataFrame.pivot_table.html
pandas.DataFrame.pivot_table — pandas 3.0.2 documentation
DataFrame.pivot_table(values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All', observed=True, sort=True, **kwargs)
Create a spreadsheet-style pivot table as a DataFrame. The levels in the pivot table will be stored in MultiIndex objects (hierarchical indexes) on the index and columns of the result DataFrame.
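As a rough illustration of how the values, index, columns, aggfunc and fill_value parameters fit together, here is a minimal sketch. The sales frame and its amount column are made up for the example; only the p_category and date names echo the answer above.

```python
import pandas as pd

# Hypothetical long-format data: one row per sale.
sales = pd.DataFrame({
    "date": ["2016-09-30", "2016-09-30", "2016-09-30", "2016-10-01"],
    "p_category": ["flat", "ultra_thin", "flat", "flat"],
    "amount": [10.0, 20.0, 30.0, 40.0],
})

# index -> row labels, columns -> new column labels, values -> cells to aggregate.
table = sales.pivot_table(
    values="amount",
    index="date",
    columns="p_category",
    aggfunc="mean",  # the documented default, spelled out for clarity
    fill_value=0,    # used where a date/category combination has no rows
)
print(table)
```

Rows that share the same date and p_category are collapsed by aggfunc, which is the main difference from DataFrame.pivot, where duplicate index/column pairs raise an error.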
Discussions

Python Pandas Pivot Table Attribute Error
If anyone has any insight into why my code isn't working, please help! In my IDE, I am trying to make a pivot table with pandas with the following code: category_pivot = category_counts.pivot(columns='conservation_status', index='category', values='scientific_name').res...
🌐 discuss.codecademy.com
September 10, 2018
Top answer

You can call .pivot only on objects that have a pivot attribute (a method or property). You called df.pivot, which works only if df has such an attribute. df is an instance of the pyspark.sql.DataFrame class; the class documentation lists all of its attributes, and none of them is called pivot. That is why you get an AttributeError.

pivot is a method of the pyspark.sql.GroupedData class. To use it, you must first create a pyspark.sql.GroupedData object from your pyspark.sql.DataFrame object. In your case, you do that with .groupBy():

df.groupBy("user_id").pivot("item_id")

This returns another pyspark.sql.GroupedData object. To get a DataFrame out of it, call one of the methods of the GroupedData class; agg is the one you need. Pass it the Spark aggregation function to apply to each group of elements (e.g. sum, first).

df.groupBy("user_id").pivot("item_id").agg(F.sum("watched_pct"))

Full example:

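from pyspark.sql import SparkSession, functions as F

# Reuse the active session (the pyspark shell already provides one as `spark`)
spark = SparkSession.builder.getOrCreate()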

df = spark.createDataFrame(
    [(1, 1, '2021-05-11', 4250, 72),
     (1, 2, '2021-05-11', 80, 99),
     (2, 3, '2021-05-11', 1000, 80),
     (2, 4, '2021-05-11', 5000, 40)],
    ['user_id', 'item_id', 'last_watch_dt', 'total_dur', 'watched_pct'])

df = df.groupBy("user_id").pivot("item_id").agg(F.sum("watched_pct"))
df.show()
# +-------+----+----+----+----+
# |user_id|   1|   2|   3|   4|
# +-------+----+----+----+----+
# |      1|  72|  99|null|null|
# |      2|null|null|  80|  40|
# +-------+----+----+----+----+

If you want to replace nulls with 0, use the fillna method of the pyspark.sql.DataFrame class.

df = df.fillna(0)
df.show()
# +-------+---+---+---+---+
# |user_id|  1|  2|  3|  4|
# +-------+---+---+---+---+
# |      1| 72| 99|  0|  0|
# |      2|  0|  0| 80| 40|
# +-------+---+---+---+---+
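For comparison with the pandas results on this page, the same group, pivot and aggregate steps can be sketched in pandas. This is not part of the answer above; it reuses the answer's sample rows and assumes pivot_table with aggfunc="sum" is an acceptable stand-in for groupBy().pivot().agg(F.sum(...)).

```python
import pandas as pd

# Same sample rows as the PySpark example, loaded into a pandas DataFrame.
pdf = pd.DataFrame(
    [(1, 1, "2021-05-11", 4250, 72),
     (1, 2, "2021-05-11", 80, 99),
     (2, 3, "2021-05-11", 1000, 80),
     (2, 4, "2021-05-11", 5000, 40)],
    columns=["user_id", "item_id", "last_watch_dt", "total_dur", "watched_pct"],
)

# index plays the role of groupBy, columns the role of pivot,
# aggfunc the role of agg, and fill_value the role of fillna(0).
wide = pdf.pivot_table(
    index="user_id",
    columns="item_id",
    values="watched_pct",
    aggfunc="sum",
    fill_value=0,
)
print(wide)
```

In pandas the grouping is an argument to pivot_table rather than a separate groupby call, which is why calling pivot on a groupby object fails there.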
🌐
GitHub
github.com › pola-rs › polars › issues › 11045
Use .pivot() on LazyFrame · Issue #11045 · pola-rs/polars
September 11, 2023 - I am trying to use the .pivot() function on a LazyFrame dataset, but I get the following error: "AttributeError: 'LazyFrame' object has no attribute 'pivot'" ... df = pl.DataFrame( { "foo": ["one", "one", "two", "two", "one", "two"], "bar": ["y", "y", "y", "x", "x", "x"], "baz": [1, 2, 3, 4, 5, 6], } ) df = df.lazy() df.pivot(values="baz", index="foo", columns="bar", aggregate_function="sum")
Author   angelos-stergiou
🌐
GitHub
github.com › pandas-dev › pandas › issues › 53051
BUG: `DataFrame.pivot` fails with `pyarrow` backend if variable is already encoded as `dictionary` · Issue #53051 · pandas-dev/pandas
May 3, 2023 - This bug is caused by apache/arrow#34890: currently, pyarrow's dictionary_encode is not idempotent, i.e. it fails with ArrowNotImplementedError instead of returning the data as-is if the array is already of dictionary type. So, either one waits until it is fixed upstream by pyarrow, or an additional check needs to be added to test whether the series is already of dictionary data type. Pivot should work with categorical data when using the pyarrow backend.
Author   randolf-scholz
🌐
Cloudera Community
community.cloudera.com › t5 › Support-Questions › How-to-display-pivoted-dataframe-with-PSark-Pyspark › m-p › 149375
How to display pivoted dataframe with PSark, Pyspa... - Cloudera Community - 149375
January 28, 2017 - Traceback (most recent call last): ... AttributeError: 'GroupedData' object has no attribute 'show' ... After pivoting you need to run an aggregate function (e.g. sum) to get back a DataFrame/Dataset....
🌐
GitHub
github.com › pandas-dev › pandas › issues › 17038
BUG/API: pivot_table with multi-index columns only · Issue #17038 · pandas-dev/pandas
July 20, 2017 - In [21]: df = pd.DataFrame({'k': [1, 2, 3], 'v': [4, 5, 6]}) In [22]: df.pivot_table(values='v', columns='k') Out[22]: k 1 2 3 v 4 5 6 In [23]: df.pivot_table(values='v', index='k') Out[23]: v k 1 4 2 5 3 6 In [24]: df2 = pd.DataFrame({'k1': [1, 2, 3], 'k2': [1, 2, 3], 'v': [4, 5, 6]}) In [25]: df2.pivot_table(values='v', index=('k1','k2')) Out[25]: v k1 k2 1 1 4 2 2 5 3 3 6 In [26]: df2.pivot_table(values='v', columns=('k1','k2')) --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) <ipython-input-26-80d7fdeb9743> in <mod
Author   chris-b1
🌐
Codecademy Forums
discuss.codecademy.com › get help › python
Python Pandas Pivot Table Attribute Error - Python - Codecademy Forums
September 10, 2018 - Instead of creating a new dataframe, I’m receiving the following error message: AttributeError: Cannot access callable attribute ‘pivot’ of ‘DataFrameGroupBy’ objects, try using the ‘apply’ method
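The error message suggests that category_counts was still a DataFrameGroupBy object, i.e. the grouping was never aggregated. A minimal sketch of one likely fix follows; the species frame and its values are invented, and using nunique() as the aggregation is an assumption about what the original exercise counted.

```python
import pandas as pd

# Hypothetical stand-in for the poster's data.
species = pd.DataFrame({
    "category": ["Mammal", "Mammal", "Bird", "Bird", "Bird"],
    "conservation_status": ["Endangered", "Threatened", "Endangered",
                            "Endangered", "Threatened"],
    "scientific_name": ["a", "b", "c", "d", "e"],
})

grouped = species.groupby(["category", "conservation_status"])

# A groupby object has no pivot method; only DataFrames do.
has_pivot = hasattr(grouped, "pivot")

# Aggregate first so the result is a DataFrame again, then pivot it.
category_counts = grouped["scientific_name"].nunique().reset_index()
category_pivot = category_counts.pivot(
    columns="conservation_status",
    index="category",
    values="scientific_name",
).reset_index()
print(category_pivot)
```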
🌐
Edureka Community
edureka.co › community › 42320 › python-pandas-attributeerror-dataframe-object-attribute
Python Pandas error AttributeError DataFrame object has ...
March 28, 2019
🌐
Quora
quora.com › How-do-you-fix-pandas-that-have-no-attribute-dataframe
How to fix pandas that have no attribute dataframe - Quora
... is similar to the stack method. It also works with multi-index objects in a DataFrame, producing a reshaped DataFrame with a new innermost level of column labels. ... Melt in pandas reshapes a DataFrame from wide format to long format. It uses id_vars['col_names'] to melt the DataFrame by column names. ... Reshape data (produce a "pivot" table) based on column values.
🌐
GitHub
github.com › scikit-learn-contrib › imbalanced-learn › issues › 666
AttributeError: 'DataFrame' object has no attribute 'name' · Issue #666 · scikit-learn-contrib/imbalanced-learn
December 18, 2019 - AttributeError Traceback (most recent call last) <ipython-input-32-8d38637b6cc6> in <module> 6 7 oversampler=SMOTE(random_state=42) ----> 8 smote_train, smote_target = oversampler.fit_resample(X,y) 9 10 print("Before OverSampling, counts of label '0', '1':", smote_target['label'].value_counts()) ~\Anaconda3\lib\site-packages\imblearn\base.py in fit_resample(self, X, y) 73 """ 74 check_classification_targets(y) ---> 75 X, y, binarize_y = self._check_X_y(X, y) 76 77 self.sampling_strategy_ = check_sampling_strategy( ~\Anaconda3\lib\site-packages\imblearn\base.py in _check_X_y(self, X, y, accept_
Author   islrnd
🌐
Pandas
pandas.pydata.org › docs › user_guide › reshaping.html
Reshaping and pivot tables — pandas 3.0.2 documentation
To perform time series operations with each unique variable, a better representation would be where the columns are the unique variables and an index of dates identifies individual observations. To reshape the data into this form, we use the DataFrame.pivot() method (also implemented as a top level function pivot()):
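A small sketch of that long-to-wide reshape, using both the method and the top-level function. The column names date, variable and value and the numbers are placeholders chosen for the example.

```python
import pandas as pd

# Long format: one row per (date, variable) observation.
long_df = pd.DataFrame({
    "date": pd.to_datetime(["2000-01-03", "2000-01-04",
                            "2000-01-03", "2000-01-04"]),
    "variable": ["A", "A", "B", "B"],
    "value": [0.1, 0.2, 1.1, 1.2],
})

# Wide format: dates as the index, one column per unique variable.
wide_df = long_df.pivot(index="date", columns="variable", values="value")

# The top-level function takes the DataFrame as its first argument.
wide_again = pd.pivot(long_df, index="date", columns="variable", values="value")
print(wide_df)
```

Unlike pivot_table, pivot does not aggregate, so each (date, variable) pair must appear at most once.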
🌐
Reddit
reddit.com › r/learnpython › "'dataframe' object has no attribute" issue
r/learnpython on Reddit: "'DataFrame' object has no attribute" Issue
October 30, 2020 -

I am in university and am taking a special topics class regarding AI. I have zero knowledge about Python, how it works, or what anything means.

A project for the class involves manipulating Bayesian networks to predict how many and which individuals die upon the sinking of a ship. This is the code I am supposed to manipulate:

##EDIT VARIABLES TO THE VARIABLES OF INTEREST
train_var = train.loc[:,['Survived','Sex']]  
test_var = test.loc[:,['Sex']]  
BayesNet = BayesianModel([('Sex','Survived')])

I am supposed to add another variable, 'Pclass,' to the mix, paying attention to the order for causation. I have added that variable to every line of this code in every way imaginable and consistently get an error from this line:

predictions = pandas.DataFrame({'PassengerId': test.PassengerId,'Survived': hypothesis.Survived.tolist()})
predictions

For example, the error I get for this version of the code:

train_var = train.loc[:,['Survived','Pclass','Sex']]  
test_var = test.loc[:,['Pclass']]  
BayesNet = BayesianModel([('Sex','Pclass','Survived')])

is this:

AttributeError                            Traceback (most recent call last)
<ipython-input-98-16d9eb9451f7> in <module>
----> 1 predictions = pandas.DataFrame({'PassengerId': test.PassengerId,'Survived': hypothesis.Survived.tolist()})
      2 predictions

/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5137             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5138                 return self[name]
-> 5139             return object.__getattribute__(self, name)
   5140 
   5141     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'Survived'

Honestly, I have no idea wtf any of this means. I have tried googling this issue and have come up with nothing.

Any help would be greatly appreciated. I know it's a lot.
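The traceback shows pandas falling through to object.__getattribute__, which happens when the name used in attribute-style access is not a column. A hedged sketch of how to confirm that is below; the hypothesis frame here is a made-up stand-in, and why the real one lacks a Survived column depends on model code that is not shown in the post.

```python
import pandas as pd

# Hypothetical stand-in: a prediction frame that lacks a 'Survived' column.
hypothesis = pd.DataFrame({"Pclass": [1, 3, 2]})

# Check the column list before relying on attribute-style access.
has_survived = "Survived" in hypothesis.columns
print(list(hypothesis.columns))

try:
    hypothesis.Survived  # only works when 'Survived' is a column
except AttributeError as exc:
    message = str(exc)
    print(message)
```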

🌐
Reddit
reddit.com › r/learnpython › attributeerror: 'dataframe' object has no attribute 'data'
r/learnpython on Reddit: AttributeError: 'DataFrame' object has no attribute 'data'
September 29, 2021 -
wine = pd.read_csv("combined.csv", header=0).iloc[:-1]
df = pd.DataFrame(wine)
df
dataset = pd.DataFrame(df.data, columns =df.feature_names)
dataset['target']=df.target
dataset

ERROR:

<ipython-input-27-64122078da92> in <module>
----> 1 dataset = pd.DataFrame(df.data, columns =df.feature_names)
      2 dataset['target']=df.target
      3 dataset

D:\Anaconda\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5463             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5464                 return self[name]
-> 5465             return object.__getattribute__(self, name)
   5466 
   5467     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'data'

I'm trying to set up a target to proceed with my Multi Linear Regression Project, but I can't even do that. I've already downloaded the CSV file and have it uploaded on a Jupyter Notebook. What am I doing wrong?
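The attributes data, feature_names and target look like those of the dataset objects returned by scikit-learn loaders such as load_wine(); a DataFrame read from a CSV file has none of them. A minimal sketch of selecting features and a target by column name instead, where the column names and the choice of quality as the target are hypothetical:

```python
import pandas as pd

# Stand-in for pd.read_csv("combined.csv"); the column names are invented.
wine = pd.DataFrame({
    "alcohol": [13.2, 12.4, 14.1],
    "ash": [2.1, 2.4, 2.3],
    "quality": [5, 6, 7],
})

target_column = "quality"  # assumption: whichever column you want to predict

features = wine.drop(columns=[target_column])  # every column except the target
target = wine[target_column]
print(features.columns.tolist(), target.tolist())
```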

🌐
GitHub
github.com › pandas-dev › pandas › issues › 11640
BUG AttributeError: 'DataFrameGroupBy' object has no attribute '_obj_with_exclusions' · Issue #11640 · pandas-dev/pandas
November 18, 2015 - In [2]: df = pd.DataFrame(columns=['a','b','c','d'], data=[[1,'b1','c1',3], [1,'b2','c2',4]]) In [3]: df = df.pivot_table(index='a', columns=['b','c'], values='d').reset_index() In [4]: df Out[28]: b a b1 b2 c c1 c2 0 1 3 4 · Now, the exception raised: In [5]: df.groupby('a').mean() --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) <ipython-input-29-a830c6135818> in <module>() ----> 1 df.groupby('a').mean() /home/nicolas/Git/pandas/pandas/core/groupby.py in mean(self) 764 self._set_selection_from_grouper() 765 f = lamb
Author   nbonnotte
🌐
Dataquest Community
community.dataquest.io › q&a › dq courses
Using groupby function and agg function, also, sorting pivot tables using a dictionary - DQ Courses - Dataquest Community
October 31, 2022 - Screen Link: [Learn data science with Python and R projects](https://I-94 project, with some additional exploration) There’s more detail in the version saved on my own computer, including a later attempt to do some stuff with a pivot table. Here it is. I-94 project plus some exploration.ipynb (210.7 KB) My Code: traffic_by_day=(day.groupby(['day']))['day','traffic_volume'].agg({'day','sum'}) What I expected to happen: I expected this to produce the sum of traffic volume for each...
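In the quoted code, {'day','sum'} is a set rather than a dict, and the columns after groupby are selected with a bare tuple, which recent pandas versions do not accept. Assuming the goal is the total traffic volume per day, a sketch with invented numbers could look like this:

```python
import pandas as pd

# Hypothetical stand-in for the poster's `day` DataFrame.
day = pd.DataFrame({
    "day": [1, 1, 2, 2, 3],
    "traffic_volume": [100, 150, 200, 50, 300],
})

# Select the column to aggregate, then apply the aggregation.
traffic_by_day = day.groupby("day")["traffic_volume"].sum()

# Equivalent named aggregation, which returns a DataFrame with a chosen column name.
traffic_by_day_named = day.groupby("day").agg(
    total_volume=("traffic_volume", "sum")
)
print(traffic_by_day)
```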
🌐
Stack Overflow
stackoverflow.com › questions › 46534653 › error-attributeerror-dataframegroupby-object-has-no-attribute-while-groupby
python - Error 'AttributeError: 'DataFrameGroupBy' object has no attribute' while groupby functionality on dataframe - Stack Overflow
piv = df.pivot_table(index="persons", columns="flag", values="data", aggfunc='count', fill_value=0)
piv = piv.append(piv.sum().rename('Total')).assign(Total=lambda x: x.sum(1))
piv['% selected'] = 100 * piv.yes/piv.Total
print(piv)

OUTPUT:

flag        no   yes  Total  % selected
persons
Alex      2088   233   2321   10.038776
Bob       8352   929   9281   10.009697
Chuck     1810   202   2012   10.039761
Doug     30710  3413  34123   10.002051
Total    42960  4777  47737   10.006913
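The snippet above adds its totals row with what appears to be DataFrame.append, a method that was removed in pandas 2.0. A sketch of the same steps using pd.concat follows; the five-row df is invented, so its numbers differ from the output shown in the answer.

```python
import pandas as pd

# Hypothetical stand-in data: one row per (person, flag) record.
df = pd.DataFrame({
    "persons": ["Alex", "Alex", "Alex", "Bob", "Bob"],
    "flag": ["yes", "no", "no", "yes", "no"],
    "data": [1, 2, 3, 4, 5],
})

piv = df.pivot_table(index="persons", columns="flag", values="data",
                     aggfunc="count", fill_value=0)

# Build the totals row as a one-row DataFrame and concatenate it.
total_row = piv.sum().rename("Total").to_frame().T
piv = pd.concat([piv, total_row])

# Row totals, then the share of 'yes' records per row.
piv = piv.assign(Total=lambda x: x.sum(axis=1))
piv["% selected"] = 100 * piv["yes"] / piv["Total"]
print(piv)
```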
🌐
Incorta Community
community.incorta.com › t5 › data-schemas-knowledgebase › issue-with-converting-a-pandas-dataframe-to-a-spark-dataframe › ta-p › 5279
Issue with converting a Pandas DataFrame to a Spar... - Incorta Community
November 15, 2023 - Symptoms You received the error when trying to convert a Pandas DataFrame to Spark DataFrame in a PySpark MV. Here is the error.- INC_03070101: Transformation error Error 'DataFrame' object has no attribute 'iteritems' AttributeError : 'DataFrame' object has no attribute 'iteritems' Diagnosis Since...
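The diagnosis is cut off here, so the following is an assumption rather than the article's own explanation: DataFrame.iteritems was removed in pandas 2.0, and code that still calls it, as older PySpark releases reportedly do when converting a pandas DataFrame, fails with this error. The replacement method is DataFrame.items, sketched on a made-up frame:

```python
import pandas as pd

pdf = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# items() yields (column name, Series) pairs, as iteritems() used to.
column_names = [name for name, _series in pdf.items()]
column_sums = {name: int(series.sum()) for name, series in pdf.items()}
print(column_names, column_sums)
```

When the failing call sits inside a library rather than your own code, aligning the library and pandas versions is usually the practical fix.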