A short, clean, scalable solution
Change some columns, leave the rest untouched

import pyspark.sql.functions as F

# Not part of the solution; this just creates a sample DataFrame
# df = spark.createDataFrame([(10, 1,2,3,4),(20, 5,6,7,8)],'Id int, Revenue int ,GROSS_PROFIT int ,Net_Income int ,Enterprise_Value int')

cols_to_cast = ["Revenue", "GROSS_PROFIT", "Net_Income", "Enterprise_Value"]
df = df.select([F.col(c).cast('double') if c in cols_to_cast else c for c in df.columns])

df.printSchema()

root
 |-- Id: integer (nullable = true)
 |-- Revenue: double (nullable = true)
 |-- GROSS_PROFIT: double (nullable = true)
 |-- Net_Income: double (nullable = true)
 |-- Enterprise_Value: double (nullable = true)
Answer from David דודו Markovitz on Stack Overflow
Reddit
reddit.com › r/learnpython › "'dataframe' object has no attribute" issue
r/learnpython on Reddit: "'DataFrame' object has no attribute" Issue
October 30, 2020

I am in university and am taking a special topics class regarding AI. I have zero knowledge about Python, how it works, or what anything means.

A project for the class involves manipulating Bayesian networks to predict how many and which individuals die upon the sinking of a ship. This is the code I am supposed to manipulate:

##EDIT VARIABLES TO THE VARIABLES OF INTEREST
train_var = train.loc[:,['Survived','Sex']]  
test_var = test.loc[:,['Sex']]  
BayesNet = BayesianModel([('Sex','Survived')])

I am supposed to add another variable, 'Pclass,' to the mix, paying attention to the order for causation. I have added that variable to every line of this code in every way imaginable and consistently get an error from this line:

predictions = pandas.DataFrame({'PassengerId': test.PassengerId,'Survived': hypothesis.Survived.tolist()})
predictions

For example, the error I get for this version of the code:

train_var = train.loc[:,['Survived','Pclass','Sex']]  
test_var = test.loc[:,['Pclass']]  
BayesNet = BayesianModel([('Sex','Pclass','Survived')])

is this:

AttributeError                            Traceback (most recent call last)
<ipython-input-98-16d9eb9451f7> in <module>
----> 1 predictions = pandas.DataFrame({'PassengerId': test.PassengerId,'Survived': hypothesis.Survived.tolist()})
      2 predictions

/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5137             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5138                 return self[name]
-> 5139             return object.__getattribute__(self, name)
   5140 
   5141     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'Survived'

Honestly, I have no idea wtf any of this means. I have tried googling this issue and have come up with nothing.

Any help would be greatly appreciated. I know it's a lot.
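
The traceback above ends inside pandas' `__getattr__`: attribute-style access such as hypothesis.Survived only succeeds when a column with that name exists, so the error suggests that hypothesis has no 'Survived' column. The following is a toy sketch of that lookup rule, not pandas' actual implementation; the TinyFrame class and its contents are hypothetical.

```python
class TinyFrame:
    """Toy stand-in for a DataFrame: attribute access falls back to column lookup."""

    def __init__(self, columns):
        self._columns = dict(columns)

    def __getattr__(self, name):
        # __getattr__ is called only after normal attribute lookup has failed.
        columns = self.__dict__.get("_columns", {})
        if name in columns:
            return columns[name]
        raise AttributeError(f"'TinyFrame' object has no attribute {name!r}")


# Hypothetical prediction result that contains 'Pclass' but not 'Survived'.
hypothesis = TinyFrame({"Pclass": [1, 3, 2]})

pclass_values = hypothesis.Pclass  # works: 'Pclass' is a column

try:
    hypothesis.Survived  # fails: no such column
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

With a real DataFrame, printing hypothesis.columns before the failing line is a quick way to see which columns are actually present.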

Top answer (1 of 5)

"sklearn.datasets" is a scikit-learn module that provides the function load_iris().

By default, load_iris() returns an object that holds data, target, and other members. To get the actual values, you have to read its data and target attributes.

'iris.csv', by contrast, holds the features and the target together.

FYI: If you set return_X_y to True in load_iris(), you get the features and target directly.

from sklearn import datasets
data,target = datasets.load_iris(return_X_y=True)
Answer 2 of 5

The Iris Dataset from Sklearn is in Sklearn's Bunch format:

print(type(iris))
print(iris.keys())

output:

<class 'sklearn.utils.Bunch'>
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])

So, that's why you can access it as:

x=iris.data
y=iris.target

But when you read the CSV file into a DataFrame as you described:

iris = pd.read_csv('iris.csv',header=None).iloc[:,2:4]
iris.head()

output is:

    2   3
0   petal_length    petal_width
1   1.4 0.2
2   1.4 0.2
3   1.3 0.2
4   1.5 0.2

Here the column names are 2 and 3 (positional integers), and the real header row has been read in as the first data row.

First of all you should read the CSV file as:

df = pd.read_csv('iris.csv')

Do not pass header=None, because your CSV file already includes the column names, i.e. the header row.

So, now what you can do is something like this:

X = df.iloc[:, [2, 3]] # Will give you columns 2 and 3 i.e 'petal_length' and 'petal_width'
y = df.iloc[:, 4] # Label column i.e 'species'

or if you want to use the column names then:

X = df[['petal_length', 'petal_width']]
y = df['species']  # label-based access; iloc only accepts integer positions

Also, if you want to convert the labels from strings to numbers, use sklearn's LabelEncoder:

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
y = le.fit_transform(y)
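
To see the effect of header=None described above, here is a small sketch that uses an inline CSV in place of iris.csv. The three data rows and the two sepal column names are made up for illustration; only 'petal_length', 'petal_width', and 'species' come from the answer.

```python
import io

import pandas as pd

# Hypothetical stand-in for iris.csv: a header row followed by three data rows.
csv_text = (
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "4.9,3.0,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
)

# header=None: columns get integer labels and the header row becomes data.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

# Default: the first row is used as the column names.
with_header = pd.read_csv(io.StringIO(csv_text))

X = with_header[["petal_length", "petal_width"]]  # feature columns by name
y = with_header["species"]                        # label column by name

print(list(no_header.columns))
print(list(with_header.columns))
```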
GitHub
github.com › pycaret › pycaret › issues › 195
AttributeError: 'DataFrame' object has no attribute 'dtype' · Issue #195 · pycaret/pycaret
June 3, 2020 - This is necessary when loading ... return object.__getattribute__(self, name) 5275 5276 def __setattr__(self, name: str, value) -> None: AttributeError: 'DataFrame' object has no attribute 'dtype'...
Author   sorenwacker
GitHub
github.com › mwaskom › seaborn › issues › 3379
AttributeError: 'DataFrame' object has no attribute 'get' · Issue #3379 · mwaskom/seaborn
June 6, 2023
3185 p = _CategoricalPlotter()
3186 p.require_numeric = plotter_class.require_numeric
-> 3187 p.establish_variables(x_, y_, hue, data, orient, order, hue_order)
3188 if (
3189 order is not None
3190 or (sharex and p.orient == "v")
3191 or (sharey and p.orient == "h")
3192 ):
3193 # Sync categorical axis between facets to have the same categories
3194 order = p.group_names
...
--> 532 x = data.get(x, x)
533 y = data.get(y, y)
534 hue = data.get(hue, hue)
AttributeError: 'DataFrame' object has no attribute 'get'
One must convert the polars dataframe to pandas, which requires extra computational overhead:
p = sns.catplot(data=iris.to_pandas(), x="species", y="sepal_width", kind="box", height=3, aspect=1.3)
Author   nick-youngblut
AWS re:Post
repost.aws › questions › QUvWrsRjenSrqHLJqLpy4DWg › attributeerror-dataframe-object-has-no-attribute-get-object-id
AttributeError: 'DataFrame' object has no attribute '_get_object_id' | AWS re:Post
October 11, 2018 - Using the Zeppelin notebook server, I have written the following script. The initialization is taken from the template created in Glue, but the rest of it is custom. I'm getting the error: AttributeError: 'DataFrame' object has no attribute '_get_object_id'
Cloudera Community
community.cloudera.com › t5 › Support-Questions › Pyspark-issue-AttributeError-DataFrame-object-has-no › m-p › 78093
Pyspark issue AttributeError: 'DataFrame' object has no attribute 'saveAsTextFile'
January 2, 2024 - As the error message states, the object, either a DataFrame or a List, does not have the saveAsTextFile() method. result.write.save() or result.toJavaRDD.saveAsTextFile() should do the work, or you can refer to the DataFrame or RDD API: https://spark.apache.org/docs/2.1.0/api/scala/index.html#org.apache.spark.sql.DataFrameWriter
GitHub
github.com › rapidsai › cudf › issues › 2600
[FEA] as_type() for Dataframes · Issue #2600 · rapidsai/cudf
August 15, 2019 - I am trying to cast values in a dataframe to another datatype, and get the error AttributeError: 'DataFrame' object has no attribute 'astype'. I think as_type() is available for Series, but it would be good to have it for DataFrames (lik...
Spark By {Examples}
sparkbyexamples.com › home › hbase › attributeerror: ‘dataframe’ object has no attribute ‘map’ in pyspark
AttributeError: 'DataFrame' object has no attribute 'map' in PySpark - Spark By {Examples}
March 27, 2024 - A PySpark DataFrame doesn’t have a map() transformation; map() is defined on RDDs, which is why you get the error AttributeError: ‘DataFrame’ object has no attribute ‘map’
Data Science Learner
datasciencelearner.com › attributeerror-dataframe-object-has-no-attribute-tolist-solved
AttributeError: dataframe object has no attribute tolist ( Solved )
October 2, 2023 - In this tutorial, you have learned how to resolve the error AttributeError: 'DataFrame' object has no attribute 'tolist'. Instead of calling tolist() on the entire DataFrame, call it on a specific column.
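
The advice in this summary can be sketched as follows, assuming pandas; the DataFrame contents are made up for illustration. tolist() exists on a single column (a Series) but not on the DataFrame itself, so going through the underlying NumPy array is one way to get the whole frame as nested lists.

```python
import pandas as pd

# Hypothetical two-column frame.
df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

has_tolist = hasattr(df, "tolist")  # the DataFrame itself has no tolist()
x_values = df["x"].tolist()         # a single column (Series) does
rows = df.values.tolist()           # whole frame as a list of rows, via NumPy

print(has_tolist, x_values, rows)
```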
Reddit
reddit.com › r/learnpython › attributeerror: 'dataframe' object has no attribute 'date'
r/learnpython on Reddit: AttributeError: 'DataFrame' object has no attribute 'date'
October 13, 2018

The last line in the code below gives an AttributeError (see description in title). The code below is immediately followed by a loop. Any ideas how I can properly refer to the 'Date' column, which is the first column in the CSV file? PS: I tried d0=p('Date') and that does not give me anything either. Help, please.

import numpy as np
import pandas as pd
import scipy as sp
import statsmodels.api as sm
from datetime import date

ticker = 'IBM'
begdate= date(2012,1,1)
enddate= date(2016,12,31)


p = pd.read_csv('/Users/myname/Downloads/IBM_M.csv',
                     index_col=0,
                     parse_dates=["Date"])
        

print(p.head())
#calculate log returns

p['log_ret'] = np.log(p['Adj Close']) - np.log(p['Adj Close'].shift(1))
logret = p['log_ret']  

print(logret.head())

ddate=[]
d0=p.date
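
One likely cause, assuming 'Date' really is the first column of the CSV: index_col=0 turns that column into the index, so it is no longer a column at all, and attribute names are case-sensitive in any case. A minimal sketch with made-up prices in place of IBM_M.csv:

```python
import io

import pandas as pd

# Made-up stand-in for IBM_M.csv; only the column names follow the question.
csv_text = "Date,Adj Close\n2012-01-03,100.0\n2012-02-01,110.0\n"

# Same read_csv arguments as in the question.
p = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=["Date"])

print(list(p.columns))  # 'Date' is not among the columns
print(p.index.name)     # it became the index instead

d0 = p.index                            # the dates, taken from the index
d0_as_column = p.reset_index()["Date"]  # or turn the index back into a column
```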

GitHub
github.com › scikit-learn-contrib › imbalanced-learn › issues › 666
AttributeError: 'DataFrame' object has no attribute 'name' · Issue #666 · scikit-learn-contrib/imbalanced-learn
December 18, 2019 - AttributeError Traceback (most ... -> 5067 return object.__getattribute__(self, name) 5068 5069 def __setattr__(self, name, value): AttributeError: 'DataFrame' object has no attribute 'name'...
Author   islrnd
Saturn Cloud
saturncloud.io › blog › solving-the-dataframe-object-has-no-attribute-name-error-in-pandas
Solving the 'DataFrame Object Has No Attribute 'name' Error in Pandas | Saturn Cloud Blog
November 2, 2023 - Pandas is a powerful data manipulation library in Python, widely used by data scientists and analysts. However, it's not uncommon to encounter errors while working with it. One such error is the 'DataFrame object has no attribute 'name'' error. This blog post will guide you through understanding ...
Dask Forum
dask.discourse.group › dask dataframe
AttributeError: 'DataFrame' object has no attribute 'repartition' - Dask DataFrame - Dask Forum
January 12, 2022 - Hey I am a bit new to dask so apologies if its a very basic question. I have been trying parallelize my workflow which goes along the lines of read in a big dataset → filter it → convert a few columns to tensors. While trying to use dask dataframes to filter, I found there was no way to use .iloc to filter for the rows.
GitHub
github.com › dask › dask › issues › 8624
AttributeError: 'DataFrame' object has no attribute 'name'; Various stack overflow / github suggested fixes not working · Issue #8624 · dask/dask
January 26, 2022 - { File "[redacted]/pandas/core/generic.py", line 5487, in __getattr__ return object.__getattribute__(self, name) AttributeError: 'DataFrame' object has no attribute 'name'. Did you mean: 'rename'? }
Author   david-thrower