You can't reference a second Spark DataFrame inside a function unless you're using a join. If I understand correctly, you can do the following to achieve your desired result.

Suppose that means is the following:

#means.show()
#+---+---------+
#| id|avg(col1)|
#+---+---------+
#|  1|     12.0|
#|  3|    300.0|
#|  2|     21.0|
#+---+---------+

Join df and means on the id column, then apply your when condition:

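from pyspark.sql.functions import when

# The join brings avg(col1) onto every row of df that has a matching id
df.join(means, on="id")\
    .withColumn(
        "col1",
        # Use the group mean where col1 is null, otherwise keep the original value
        when(
            df["col1"].isNull(),
            means["avg(col1)"]
        ).otherwise(df["col1"])
    )\
    .select(*df.columns)\
    .show()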
#+---+-----+
#| id| col1|
#+---+-----+
#|  1| 12.0|
#|  1| 12.0|
#|  1| 14.0|
#|  1| 10.0|
#|  3|300.0|
#|  3|300.0|
#|  2| 21.0|
#|  2| 22.0|
#|  2| 20.0|
#+---+-----+

But in this case, I'd actually recommend using a Window with pyspark.sql.functions.mean, which avoids the separate means DataFrame and the join:

from pyspark.sql import Window
from pyspark.sql.functions import col, mean, when

df.withColumn(
    "col1",
    # Replace nulls with the mean of col1 computed over rows sharing the same id
    when(
        col("col1").isNull(),
        mean("col1").over(Window.partitionBy("id"))
    ).otherwise(col("col1"))
).show()
#+---+-----+
#| id| col1|
#+---+-----+
#|  1| 12.0|
#|  1| 10.0|
#|  1| 12.0|
#|  1| 14.0|
#|  3|300.0|
#|  3|300.0|
#|  2| 22.0|
#|  2| 20.0|
#|  2| 21.0|
#+---+-----+
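
To make the logic of both approaches concrete, here is a minimal plain-Python sketch of what they compute: a per-id mean over the non-null values, then a fill of each missing value with its group's mean. The input rows are an assumption chosen to be consistent with the output shown above, and None stands in for a Spark null. This only illustrates the computation; it is not how Spark executes it.

```python
from collections import defaultdict

# Hypothetical input rows (id, col1); None stands in for a Spark null.
rows = [
    (1, 12.0), (1, None), (1, 14.0), (1, 10.0),
    (2, 22.0), (2, 20.0), (2, None),
    (3, 300.0), (3, None),
]

# Step 1: per-id mean over the non-null values (the role of `means`,
# or of mean("col1").over(Window.partitionBy("id"))).
values_by_id = defaultdict(list)
for row_id, value in rows:
    if value is not None:
        values_by_id[row_id].append(value)
means = {row_id: sum(vals) / len(vals) for row_id, vals in values_by_id.items()}

# Step 2: keep existing values, fill missing ones with the group's mean
# (the role of when(...isNull(), ...).otherwise(...)).
filled = [
    (row_id, means[row_id] if value is None else value)
    for row_id, value in rows
]

for row in filled:
    print(row)
```

The group means here come out as 12.0, 21.0 and 300.0 for ids 1, 2 and 3, matching the means table above.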
Answer from pault on Stack Overflow
Object has no attribute '_get_object_id' - Scripting - Eclipse Capella Forum
May 23, 2022 - requirements Packages (pkg): def get_reqPKG3(self): """ """ return create_e_list(self.get_java_object().getOwnedRequirementPkgs(), RequirementPkg) I do obtain a java list containing the requirements packages successfully: print(se.get_physical_architecture().get_reqPKG3()) → The...
AttributeError: 'DataFrame' object has no attribute 'get' · Issue #3379 · mwaskom/seaborn
June 6, 2023 -
   3185 p = _CategoricalPlotter()
   3186 p.require_numeric = plotter_class.require_numeric
-> 3187 p.establish_variables(x_, y_, hue, data, orient, order, hue_order)
   3188 if (
   3189     order is not None
   3190     or (sharex and p.orient == "v")
   3191     or (sharey and p.orient == "h")
   3192 ):
   3193     # Sync categorical axis between facets to have the same categories
   3194     order = p.group_names
...
--> 532 x = data.get(x, x)
    533 y = data.get(y, y)
    534 hue = data.get(hue, hue)
AttributeError: 'DataFrame' object has no attribute 'get'
Author   nick-youngblut
python - pyspark 'DataFrame' object has no attribute '_get_object_id' - Stack Overflow
I am trying to run some code, but I am getting the error: 'DataFrame' object has no attribute '_get_object_id'. The code: items = [(1,12),(1,float('Nan')),(1,14),(1,10),(2,22),(2,20),(2,float('Nan')),(3,3...
"'DataFrame' object has no attribute" Issue

Double check if there's a space in the column name. 'Survived ' vs 'Survived' It happens more often than you'd think especially with CSV data source.
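
A minimal sketch of that check in pandas, assuming a CSV whose header really does carry a trailing space. The inline CSV text and its values are made up for illustration; only the column names 'Survived' and 'Sex' come from the question.

```python
import io

import pandas as pd

# Hypothetical CSV whose first header has a trailing space ("Survived ").
csv_text = "Survived ,Sex\n1,female\n0,male\n"
df = pd.read_csv(io.StringIO(csv_text))

# repr() makes stray whitespace in the column names visible.
print(repr(df.columns.tolist()))

# Attribute access fails because the real column name is "Survived " (with a space).
has_attr_before = hasattr(df, "Survived")

# Strip surrounding whitespace from every column name.
df.columns = df.columns.str.strip()
has_attr_after = hasattr(df, "Survived")

print(has_attr_before, has_attr_after)
```

Printing the column list with repr() is the quick diagnostic; stripping the names once, right after loading, keeps the rest of the code free of special cases.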

AttributeError: 'DataFrame' object has no attribute 'name'; Various stack overflow / github suggested fixes not working
What happened: I perform a pipeline of transformations on a Dask dataframe originating from dd.read_sql_table() from a view in an oracle DB. In one stage that follows many successful stages, I try ...
AttributeError: 'DataFrame' object has no attribut... - Databricks Community - 61132
February 19, 2024 - Hello, I have some trouble deduplicating rows on the "id" column, with the method "dropDuplicatesWithinWatermark" in a pipeline. When I run this pipeline, I get the error message: "AttributeError: 'DataFrame' object has no attribute 'dropDuplicatesWithinWatermark'" Here is part of the code: @dl...
'dataframe' object has no attribute '_get_object_id' [FIXED]
March 28, 2023 - import pandas as pd s_df = ... '_get_object_id' in Python can be easily solved by upgrading your Pandas or using the reset_index() method instead of _get_object_id() method....
r/learnpython on Reddit: "'DataFrame' object has no attribute" Issue
October 30, 2020 -

I am in university and am taking a special topics class regarding AI. I have zero knowledge about Python, how it works, or what anything means.

A project for the class involves manipulating Bayesian networks to predict how many and which individuals die upon the sinking of a ship. This is the code I am supposed to manipulate:

##EDIT VARIABLES TO THE VARIABLES OF INTEREST
train_var = train.loc[:,['Survived','Sex']]  
test_var = test.loc[:,['Sex']]  
BayesNet = BayesianModel([('Sex','Survived')])

I am supposed to add another variable, 'Pclass,' to the mix, paying attention to the order for causation. I have added that variable to every line of this code in every way imaginable and consistently get an error from this line:

predictions = pandas.DataFrame({'PassengerId': test.PassengerId,'Survived': hypothesis.Survived.tolist()})
predictions

For example, the error I get for this version of the code:

train_var = train.loc[:,['Survived','Pclass','Sex']]  
test_var = test.loc[:,['Pclass']]  
BayesNet = BayesianModel([('Sex','Pclass','Survived')])

is this:

AttributeError                            Traceback (most recent call last)
<ipython-input-98-16d9eb9451f7> in <module>
----> 1 predictions = pandas.DataFrame({'PassengerId': test.PassengerId,'Survived': hypothesis.Survived.tolist()})
      2 predictions

/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5137             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5138                 return self[name]
-> 5139             return object.__getattribute__(self, name)
   5140 
   5141     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'Survived'

Honestly, I have no idea wtf any of this means. I have tried googling this issue and have come up with nothing.

Any help would be greatly appreciated. I know it's a lot.

AttributeError: 'numpy.int64' object has no attribute '_get_object_id' · Issue #2161 · databricks/koalas
May 19, 2021 -
a = ks.DataFrame({'source': [1,2,3,4,5]})
a.source.isin([np.int64(1), np.int64(2)])
AttributeError: 'numpy.int64' object has no attribute '_get_object_id'

But this is ok:

a = ks.DataFrame({'source': [1,2,3,4,5]})
a.source.isin([1, 2])

Ful...
Published   May 19, 2021
Author   lepmik
AttributeError: 'DataFrame' object has no attribute 'name'; Various stack overflow / github suggested fixes not working · Issue #8624 · dask/dask
January 26, 2022 - To do this, I add a method to a class. The method takes a string as an argument, returns a Dask series representing the state_id,city_id, and district_id found when querying that district (returns a row of the intended resultant dataframe).
Author   david-thrower
AttributeError: ‘function’ object has no attribute - Databricks
May 19, 2022 - If a column in your DataFrame uses a protected keyword as the column name, you will get an error message. For example, summary is a protected keyword. If you use summary as a column name, you will see the error message. This sample code uses summary as a column name and generates the error ...
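
The underlying mechanism is plain Python name shadowing: attribute access finds the DataFrame method before it finds the column. The sketch below reproduces it in pandas rather than PySpark, which is an assumption made here so the example runs without a Spark session. The column name "count" and its values are hypothetical.

```python
import pandas as pd

# Hypothetical frame with a column named like a DataFrame method.
df = pd.DataFrame({"count": [3, 5, 8]})

# Attribute access resolves to the DataFrame.count method, not the column.
attr_is_method = callable(df.count)

try:
    df.count.sum()  # treating the method as if it were a column fails
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

# Bracket access always refers to the column.
total = int(df["count"].sum())

print(attr_is_method, error_name, total)
```

Using bracket access, or renaming the column, sidesteps the clash in this pandas sketch; whether the same workaround is what the article recommends for Spark is not stated in the excerpt above.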
r/learnpython on Reddit: AttributeError: 'DataFrame' object has no attribute 'data'
September 29, 2021 -
wine = pd.read_csv("combined.csv", header=0).iloc[:-1]
df = pd.DataFrame(wine)
df
dataset = pd.DataFrame(df.data, columns =df.feature_names)
dataset['target']=df.target
dataset

ERROR:

<ipython-input-27-64122078da92> in <module>
----> 1 dataset = pd.DataFrame(df.data, columns =df.feature_names)
      2 dataset['target']=df.target
      3 dataset

D:\Anaconda\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5463             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5464                 return self[name]
-> 5465             return object.__getattribute__(self, name)
   5466 
   5467     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'data'

I'm trying to set up a target to proceed with my Multi Linear Regression Project, but I can't even do that. I've already downloaded the CSV file and have it uploaded on a Jupyter Notebook. What am I doing wrong?

[FREE] How to resolve AttributeError: 'DataFrame' object has no attribute? - brainly.com
The AttributeError: 'DataFrame' object has no attribute is an error message that indicates that the code is trying to access an attribute or method that does not exist on the DataFrame object.
Top answer
1 of 5
2

sklearn.datasets is a scikit-learn module that contains the function load_iris().

By default, load_iris() returns an object that holds data, target, and other members. To get the actual values, you have to read its data and target attributes.

'iris.csv', by contrast, holds the features and target together in one table.

FYI: If you set return_X_y to True in load_iris(), you get the features and target directly.

from sklearn import datasets
data,target = datasets.load_iris(return_X_y=True)
2 of 5
1

The Iris Dataset from Sklearn is in Sklearn's Bunch format:

print(type(iris))
print(iris.keys())

output:

<class 'sklearn.utils.Bunch'>
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])

So, that's why you can access it as:

x=iris.data
y=iris.target

But when you read the CSV file as a DataFrame, as you did:

iris = pd.read_csv('iris.csv',header=None).iloc[:,2:4]
iris.head()

output is:

    2   3
0   petal_length    petal_width
1   1.4 0.2
2   1.4 0.2
3   1.3 0.2
4   1.5 0.2

Here the column names are 2 and 3, and the real header row has ended up as the first row of data.

First of all you should read the CSV file as:

df = pd.read_csv('iris.csv')

Do not pass header=None, because your CSV file already includes the column names (the header row).

So, now what you can do is something like this:

X = df.iloc[:, [2, 3]] # Will give you columns 2 and 3 i.e 'petal_length' and 'petal_width'
y = df.iloc[:, 4] # Label column i.e 'species'

or if you want to use the column names then:

X = df[['petal_length', 'petal_width']]
y = df['species']

Also, if you want to convert the labels from strings to numbers, use sklearn's LabelEncoder:

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
y = le.fit_transform(y)
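
The header=None pitfall described above can be reproduced with a tiny in-memory file. This is a sketch under assumptions: the two data rows and the sepal column names are made up as a stand-in for iris.csv, while petal_length, petal_width and species follow the names used in the answer.

```python
import io

import pandas as pd

# Hypothetical two-row stand-in for iris.csv, header row included.
csv_text = (
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "4.9,3.0,1.4,0.2,setosa\n"
)

# header=None: columns are labelled 0..4 and the header row becomes data.
raw = pd.read_csv(io.StringIO(csv_text), header=None)
print(raw.columns.tolist())
print(raw.iloc[0].tolist())

# Default header handling: the first row supplies the column names.
df = pd.read_csv(io.StringIO(csv_text))
X = df[["petal_length", "petal_width"]]  # feature columns selected by name
y = df["species"]                        # label column selected by name
print(X.shape, y.tolist())
```

Selecting by column name rather than by position keeps the code working if the column order in the file changes.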
Solving the 'DataFrame Object Has No Attribute 'name' Error in Pandas | Saturn Cloud Blog
November 2, 2023 - Before we delve into the solution, let’s understand the error. The 'DataFrame' object has no attribute 'name' error typically occurs when you try to access a DataFrame’s 'name' attribute, which doesn’t exist.
AttributeError: 'DataFrame' object has no attribute 'iteritems' · Issue #92 · YosefLab/Compass
April 5, 2023 -
"Evaluating Reaction Pentalties:"
Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/compass", line 8, in <module>
    sys.exit(entry())
  File "/home/ubuntu/.local/lib/python3.8/site-packages/compass/main.py", line 588, in entry
    penalties = eval_reaction_penalties(args['data'], args['model'],
  File "/home/ubuntu/.local/lib/python3.8/site-packages/compass/compass/penalties.py", line 96, in eval_reaction_penalties
    reaction_penalties = eval_reaction_penalties_shared(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/compass/compass/penalties.py", line 158, in eval_reaction_penalties_shared
    for name, expression_data in expression.iteritems():
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pandas/core/generic.py", line 5989, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'iteritems'
Author   JoelHaas
Solved: AttributeError: 'DataFrame' object has no attribut... - Databricks Community - 28109
January 2, 2024 - https://stackoverflow.com/questions/28163439/attributeerror-dataframe-object-has-no-attribute-height... https://stackoverflow.com/questions/38134643/data-frame-object-has-no-attribute ... If df_boston is a DataFrame, but you still face issues, try an alternative syntax: df_boston = df_boston.rename(columns={'zn': 'Zoning'}).