attributeerror: 'dataframe' object has no attribute category

Pandas error "AttributeError: 'DataFrame' object has no attribute 'add_categories'" when trying to add catorical values?

stackoverflow.com › questions › 63567662 › pandas-error-attributeerror-dataframe-object-has-no-attribute-add-categorie

The error occurs because pandas.Series.cat.add_categories is a Series method, and df[['embarked', 'sex', 'pclass']] is a DataFrame.
Use pd.Categorical
pandas: Categorical data
Some of the titanic dataset columns contain NaNs, which can't be categories.
- Use .dropna() when creating the categories.

single column

df['embarked'] = pd.Categorical(df['embarked'], categories=df['embarked'].dropna().unique())

multiple columns

# looping through the columns
for col in ['embarked', 'sex', 'pclass']:
    df[col] = pd.Categorical(df[col], categories=df[col].dropna().unique())

# alternatively with .apply
df[['embarked', 'sex', 'pclass']] = df[['embarked', 'sex', 'pclass']].apply(lambda x: pd.Categorical(x, x.dropna().unique(), ordered=True))

Appending new categories

# create a sample series
s = pd.Series(["a", "b", "c", "a"], dtype="category")

# add a category
s = s.cat.add_categories([4])

Answer from Trenton McKinney on Stack Overflow
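
To make the Series-versus-DataFrame distinction concrete, here is a minimal sketch. The small frame and its values are made up as a stand-in for the Titanic columns, and "unknown" is just an illustrative new category. It reproduces the error on a DataFrame, then converts a single column and extends its categories through the .cat accessor.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic columns; the values are made up.
df = pd.DataFrame({
    "embarked": ["S", "C", None, "Q"],
    "sex": ["male", "female", "female", "male"],
})

# Calling the Series-only method on a DataFrame reproduces the error.
try:
    df[["embarked", "sex"]].add_categories(["unknown"])
except AttributeError as exc:
    error_message = str(exc)

# Convert one column at a time, leaving NaN out of the categories.
df["embarked"] = pd.Categorical(
    df["embarked"], categories=df["embarked"].dropna().unique()
)

# .cat is available because this is now a category-dtype Series.
df["embarked"] = df["embarked"].cat.add_categories(["unknown"])

print(error_message)
print(list(df["embarked"].cat.categories))
```

The missing value stays NaN after the conversion; it is simply not one of the categories.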

reddit.com › r/learnpython › "'dataframe' object has no attribute" issue

r/learnpython on Reddit: "'DataFrame' object has no attribute" Issue

October 30, 2020 -

I am in university and am taking a special topics class regarding AI. I have zero knowledge about Python, how it works, or what anything means.

A project for the class involves manipulating Bayesian networks to predict how many and which individuals die upon the sinking of a ship. This is the code I am supposed to manipulate:

##EDIT VARIABLES TO THE VARIABLES OF INTEREST
train_var = train.loc[:,['Survived','Sex']]  
test_var = test.loc[:,['Sex']]  
BayesNet = BayesianModel([('Sex','Survived')])

I am supposed to add another variable, 'Pclass,' to the mix, paying attention to the order for causation. I have added that variable to every line of this code in every way imaginable and consistently get an error from this line:

predictions = pandas.DataFrame({'PassengerId': test.PassengerId,'Survived': hypothesis.Survived.tolist()})
predictions

For example, the error I get for this version of the code:

train_var = train.loc[:,['Survived','Pclass','Sex']]  
test_var = test.loc[:,['Pclass']]  
BayesNet = BayesianModel([('Sex','Pclass','Survived')])

is this:

AttributeError                            Traceback (most recent call last)
<ipython-input-98-16d9eb9451f7> in <module>
----> 1 predictions = pandas.DataFrame({'PassengerId': test.PassengerId,'Survived': hypothesis.Survived.tolist()})
      2 predictions

/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5137             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5138                 return self[name]
-> 5139             return object.__getattribute__(self, name)
   5140 
   5141     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'Survived'

Honestly, I have no idea wtf any of this means. I have tried googling this issue and have come up with nothing.

Any help would be greatly appreciated. I know it's a lot.

Top answer

1 of 2

Double-check whether there's a space in the column name: 'Survived ' vs 'Survived'. It happens more often than you'd think, especially with a CSV data source.

2 of 2

It's an issue with how you're calling the data and if it's actually there.

train.loc[:,['Survived','Sex']]

tells me that there's a DataFrame (a pandas object, hence the wording of the error) called train, and that this line selects part of it (a DataFrame is a two-dimensional labeled table). Specifically, it selects the columns named Survived and Sex.

Similarly, this line tells me there's another DataFrame (df) called test with a column named Sex, and that it selects that column.

test.loc[:,['Sex']]

The failing line also tells me a few things:

predictions = pandas.DataFrame({'PassengerId': test.PassengerId,'Survived': hypothesis.Survived.tolist()})

This builds another df called predictions from a dict, and that dict pulls information from yet another df, hypothesis. The attribute it's trying to access for the second key of the dict is

hypothesis.Survived.tolist()

which is a way of calling a column from that df. That is, when the predictions line is executed, it's trying to pull all the values from the Survived column of the hypothesis df.

The error is that the df doesn't actually have a column named Survived. So either there's missing data, or you're calling it wrong, or there's a missing reference.

Without knowing more about your code and your question, I can't really extrapolate much more.
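
A quick way to confirm this kind of diagnosis is to look at the columns that actually exist before using attribute access. The sketch below uses a made-up stand-in for the hypothesis frame (its contents are an assumption, not the asker's real data) and checks for the column explicitly.

```python
import pandas as pd

# Hypothetical stand-in for the `hypothesis` frame from the question;
# in this made-up case it has no "Survived" column at all.
hypothesis = pd.DataFrame({"Pclass": [1, 3, 2]})

print(list(hypothesis.columns))  # see which columns really exist
has_survived = "Survived" in hypothesis.columns

if has_survived:
    survived = hypothesis["Survived"].tolist()
else:
    # Attribute access on a missing column raises the AttributeError.
    try:
        hypothesis.Survived
    except AttributeError as exc:
        message = str(exc)
    survived = None
    print(message)
```

If the membership check is False, the fix lies upstream, in whatever produced the frame, not in the line that reads from it.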

Discussions

python - I got the following error : 'DataFrame' object has no attribute 'data' - Data Science Stack Exchange

I am trying to get the 'data' and the 'target' of the iris setosa database, but I can't. For example, when I load the iris setosa directly from sklearn datasets I get a good result: Program: from More on datascience.stackexchange.com

datascience.stackexchange.com

August 26, 2018

Stack Overflow

stackoverflow.com › questions › 38134643 › how-to-resolve-attributeerror-dataframe-object-has-no-attribute

python - How to resolve AttributeError: 'DataFrame' object has no attribute - Stack Overflow

Top answer

1 of 7

Check your DataFrame with data.columns

It should print something like this

Index([u'regiment', u'company',  u'name',u'postTestScore'], dtype='object')

Check for hidden white space. Then you can rename with

data = data.rename(columns={'Number ': 'Number'})

2 of 7

I think the column name that contains "Number" is something like " Number" or "Number ". I'm assuming you might have a residual space in the column name. Please run print("<{}>".format(data.columns[1])) and see what you get. If it's something like < Number>, it can be fixed with:

data.columns = data.columns.str.strip()

See pandas.Series.str.strip

In general, AttributeError: 'DataFrame' object has no attribute '...', where ... is some column name, arises because dot notation has been used to reference a nonexistent column name or pandas method.

pandas methods are accessed with a dot (.). pandas columns can also be accessed with a dot (e.g. data.col) or with brackets (e.g. data['col'] or data[['col1', 'col2']]).

data.columns = data.columns.str.strip() is a quick way to remove leading and trailing spaces from all column names. Otherwise, verify that the column or attribute is spelled correctly.
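
As a small illustration of the whitespace case, the sketch below reads a made-up CSV whose second header carries a trailing space, then strips the column names. The CSV text and the names in it are invented for the example.

```python
import io
import pandas as pd

# Made-up CSV whose second header has a trailing space ("Number ").
csv_text = "name,Number \nalice,1\nbob,2\n"
data = pd.read_csv(io.StringIO(csv_text))

before = list(data.columns)              # the stray space is preserved
data.columns = data.columns.str.strip()  # drop leading/trailing whitespace
after = list(data.columns)

total = data.Number.sum()                # attribute access now works

print(before, after, total)
```

Printing the column list, as suggested above, is what makes the stray space visible in the first place.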

GitHub

github.com › microsoft › LightGBM › issues › 6015

AttributeError: 'DataFrame' object has no attribute 'cat' when running lgb.train() · Issue #6015 · microsoft/LightGBM

July 31, 2023 - Hello guys. It's my first time using LightGBM and I have been stuck on this issue for a couple of days.

Author EnricDataS

GitHub

github.com › dask › dask › issues › 8624

AttributeError: 'DataFrame' object has no attribute 'name'; Various stack overflow / github suggested fixes not working · Issue #8624 · dask/dask

January 26, 2022 - The method being applied should return a Dask dataframe of 3 np.int64 columns: state_id, city_id, district_id. inspect.getmembers(my_dask_df) was expected to return a list of object names and values to introspect the dask dataframe object.

Author david-thrower

Stack Overflow

stackoverflow.com › questions › 62585059 › pandas-not-recognizing-the-cat-command-when-changing-column-to-categorical-da

python - Pandas not recognizing the `.cat` command when changing column to categorical data - Stack Overflow

Top answer

1 of 1

The .cat accessor is not available on a DataFrame, so you have to apply it to each column separately as a Series. You can use .apply() with a lambda function that calls .cat

df[cat_columns] = df[cat_columns].apply(lambda x: x.cat.codes)

Or loop through the columns and use the .cat accessor

for col in cat_columns:
    df[col] = df[col].cat.codes
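
One detail worth spelling out: .cat only exists on a Series whose dtype is already category. The sketch below uses a made-up frame (the column names mirror cat_columns from the question; the values are invented), shows the accessor failing on a plain string column, and then converts each column before taking its codes.

```python
import pandas as pd

# Hypothetical frame; the column names mirror the question's cat_columns.
df = pd.DataFrame({
    "col1": ["low", "high", "low"],
    "col2": ["x", "y", "x"],
})
cat_columns = ["col1", "col2"]

# On a column that is not category dtype, the accessor itself raises.
accessor_error = None
try:
    df["col1"].cat
except AttributeError as exc:
    accessor_error = str(exc)

# Convert each column to category first, then keep its integer codes.
for col in cat_columns:
    df[col] = df[col].astype("category").cat.codes

print(accessor_error)
print(df)
```

The codes follow the sorted order of the categories, so "high" maps to 0 and "low" to 1 here.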

Stack Exchange

datascience.stackexchange.com › questions › 37435 › i-got-the-following-error-dataframe-object-has-no-attribute-data

python - I got the following error : 'DataFrame' object has no attribute 'data' - Data Science Stack Exchange

Top answer

1 of 5

"sklearn.datasets" is a scikit package, where it contains a method load_iris().

By default, load_iris() returns an object that holds data, target, and other members. To get the actual values, you have to read the data and target members themselves.

'iris.csv', by contrast, holds the features and the target together.

FYI: if you set return_X_y=True in load_iris(), you get the features and target directly.

from sklearn import datasets
data,target = datasets.load_iris(return_X_y=True)

2 of 5

The Iris Dataset from Sklearn is in Sklearn's Bunch format:

print(type(iris))
print(iris.keys())

output:

<class 'sklearn.utils.Bunch'>
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])

So, that's why you can access it as:

x=iris.data
y=iris.target

But when you read the CSV file as DataFrame as mentioned by you:

iris = pd.read_csv('iris.csv',header=None).iloc[:,2:4]
iris.head()

output is:

              2            3
0  petal_length  petal_width
1           1.4          0.2
2           1.4          0.2
3           1.3          0.2
4           1.5          0.2

Here the column labels are 2 and 3, and the real header row has been read in as data (row 0).

First of all you should read the CSV file as:

df = pd.read_csv('iris.csv')

You should not pass header=None, because your CSV file already includes the column names, i.e. the headers.

So, now what you can do is something like this:

X = df.iloc[:, [2, 3]] # Will give you columns 2 and 3 i.e 'petal_length' and 'petal_width'
y = df.iloc[:, 4] # Label column i.e 'species'

or if you want to use the column names then:

X = df[['petal_length', 'petal_width']]
y = df['species']

Also, if you want to convert labels from string to numerical format use sklearn LabelEncoder

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
y = le.fit_transform(y)
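
The effect of header=None can be seen without the real file. The sketch below uses a tiny made-up extract laid out like iris.csv (the two data rows are illustrative) and compares the two ways of reading it.

```python
import io
import pandas as pd

# Tiny made-up extract in the same layout as iris.csv (header row included).
csv_text = (
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
)

# header=None treats the header row as data, so the labels become 0..4.
raw = pd.read_csv(io.StringIO(csv_text), header=None)

# Default header handling keeps the real column names.
df = pd.read_csv(io.StringIO(csv_text))
X = df[["petal_length", "petal_width"]]
y = df["species"]

print(list(raw.columns))
print(list(df.columns))
```

Either way the result is a plain DataFrame, so it has no .data or .target members; those belong to the object returned by load_iris().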

Databricks Community

community.databricks.com › t5 › data-engineering › attributeerror-dataframe-object-has-no-attribute › td-p › 61132

AttributeError: 'DataFrame' object has no attribut... - Databricks Community - 61132

February 19, 2024 - Hello, I have some trouble deduplicating rows on the "id" column, with the method "dropDuplicatesWithinWatermark" in a pipeline. When I run this pipeline, I get the error message: "AttributeError: 'DataFrame' object has no attribute 'dropDuplicatesWithinWatermark'" Here is part of the code: @dl...

GitHub

github.com › scikit-learn › scikit-learn › issues › 32155

ColumnTransformer.fit() fails on polars.DataFrame: AttributeError: 'DataFrame' object has no attribute 'size' · Issue #32155 · scikit-learn/scikit-learn

September 11, 2025 - Describe the bug Fitting a sklearn.compose.ColumnTransformer with more than one transformer on a polars.DataFrame yields the error: AttributeError: 'DataFrame' object has no attribute '...

Author ph-ll-pp

Brainly

brainly.com › computers and technology › high school › how to resolve attributeerror: 'dataframe' object has no attribute?

[FREE] How to resolve AttributeError: 'DataFrame' object has no attribute? - brainly.com

The 'AttributeError: 'DataFrame' object has no attribute' suggests that a nonexistent attribute is being accessed on a DataFrame. Check the DataFrame's structure and ensure columns are correctly typed.

GitHub

github.com › pandas-dev › pandas › issues › 28640

AttributeError: 'Categorical' object has no attribute '_ordered' · Issue #28640 · pandas-dev/pandas

September 26, 2019 - Trying to assign a variable an ordered category type fails with AttributeError: 'Categorical' object has no attribute '_ordered'

Author jorwoods

Stack Overflow

stackoverflow.com › questions › 19392226 › attributeerror-dataframe-object-has-no-attribute

python - AttributeError: 'DataFrame' object has no attribute - Stack Overflow

Top answer

1 of 6

value_counts is a Series method rather than a DataFrame method (and you are trying to use it on a DataFrame, clean). You need to perform this on a specific column:

clean[column_name].value_counts()

It doesn't usually make sense to perform value_counts on a DataFrame, though I suppose you could apply it to every entry by flattening the underlying values array:

pd.Series(df.values.flatten()).value_counts()

2 of 6

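To count the non-missing values in every column of a dataframe, it's just df.count(). Note that this gives the number of non-NA cells per column, not a frequency table of the values.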
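
The two answers compute different things, which a small example makes clear. The frame below is made up as a stand-in for clean; the column names and values are assumptions for illustration only.

```python
import pandas as pd

# Made-up frame standing in for `clean`.
clean = pd.DataFrame({
    "color": ["red", "blue", "red", None],
    "size": ["S", "M", "M", "M"],
})

# Series method: how often each distinct value occurs in one column.
color_counts = clean["color"].value_counts()

# DataFrame method: number of non-NA cells in each column.
non_null = clean.count()

print(color_counts)
print(non_null)
```

Selecting the column first is what turns the DataFrame into a Series, and that is where value_counts is defined in the answer above.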

Databricks Community

community.databricks.com › t5 › data-engineering › attributeerror-dataframe-object-has-no-attribute-rename › td-p › 28109

Solved: AttributeError: 'DataFrame' object has no attribut... - Databricks Community - 28109

January 2, 2024 - Hello, I am doing the Data Science and Machine Learning course. The Boston housing has unintuitive column names. I want to rename them, e.g. so 'zn' becomes 'Zoning'. When I run this command: df_bostonLegible = df_boston.rename({'zn':'Zoning'}, axis='columns') Then I get the error "AttributeError: '...

Stack Exchange

datascience.stackexchange.com › questions › 46149 › dataframe-object-has-no-attribute-to-dataframe

'DataFrame' object has no attribute 'to_dataframe' - Data Science Stack Exchange

Top answer

1 of 2

pd.read_csv() already returns a DataFrame, and that kind of object does not support calling .to_dataframe().

You can check the type of your variable ds using print(type(ds)), you will see that it is a pandas DataFrame type.

2 of 2

As I understand it, you are loading loanapp_c.csv into ds using this code:

ds = pd.read_csv('desktop/python ML/loanapp_c.csv')

ds here is a DataFrame object. What you are doing is calling to_dataframe on an object that is already a DataFrame.

Removing dataset = ds.to_dataframe() from your code should solve the error.

Cloudera Community

community.cloudera.com › t5 › Support-Questions › Pyspark-issue-AttributeError-DataFrame-object-has-no › m-p › 78093

Pyspark issue AttributeError: 'DataFrame' object has no attribute 'saveAsTextFile'

January 2, 2024 -

#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/Person_Person.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('Person_Person')
myresults = spark.sql("""SELECT PersonType ,COUNT(PersonType) AS `Person Count` FROM Person_Person GROUP BY PersonType""")
myresults.collect()
result = myresults.collect()
result
result.saveAsTextFile("test")

However, I'm now getting the following error message: AttributeError: 'list' object has no attribute 'saveAsTextFile'

Saturn Cloud

saturncloud.io › blog › solving-the-dataframe-object-has-no-attribute-name-error-in-pandas

Solving the 'DataFrame Object Has No Attribute 'name' Error in Pandas | Saturn Cloud Blog

November 2, 2023 - The 'DataFrame' object has no attribute 'name' error typically occurs when you try to access a DataFrame’s 'name' attribute, which doesn’t exist.

import pandas as pd
# Create a dataframe with 2 columns named 'A' and 'B'
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
print(df.name)

Running this code will result in an AttributeError: 'DataFrame' object has no attribute 'name'.
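
For contrast, a short sketch of where a name does exist: a single column selected from the frame is a Series, and a Series carries a name. The variable names below are illustrative, and the frame is the same two-column example.

```python
import pandas as pd

# Same two-column frame as in the snippet above.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# The DataFrame has neither a 'name' attribute nor a 'name' column.
frame_has_name = hasattr(df, "name")

# A single column is a Series, and its name is the column label.
series_name = df["A"].name

print(frame_has_name, series_name)
```

Code that expects .name usually wants a Series, so the fix is often to select one column before passing the data along.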

Edureka Community

edureka.co › community › 42320 › python-pandas-attributeerror-dataframe-object-attribute

Python Pandas error AttributeError DataFrame object has ...

March 28, 2019 -

reddit.com › r/learnpython › attributeerror: 'dataframe' object has no attribute 'data'

r/learnpython on Reddit: AttributeError: 'DataFrame' object has no attribute 'data'

September 29, 2021 -

wine = pd.read_csv("combined.csv", header=0).iloc[:-1]
df = pd.DataFrame(wine)
df
dataset = pd.DataFrame(df.data, columns =df.feature_names)
dataset['target']=df.target
dataset

ERROR:

<ipython-input-27-64122078da92> in <module>
----> 1 dataset = pd.DataFrame(df.data, columns =df.feature_names)
      2 dataset['target']=df.target
      3 dataset

D:\Anaconda\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5463             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5464                 return self[name]
-> 5465             return object.__getattribute__(self, name)
   5466 
   5467     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'data'

I'm trying to set up a target to proceed with my Multi Linear Regression Project, but I can't even do that. I've already downloaded the CSV file and have it uploaded on a Jupyter Notebook. What am I doing wrong?

Top answer

1 of 3

I would recommend you start off by reading the 10min pandas Quick Guide. This should get you on the right track (at the very least you will be able to load data into pandas). Afterwards, you should also be able to tell why your current code doesn't make much sense.

2 of 3

Is there a column in your file called data? You already have a df from the first line. Maybe you just want to rename the columns: df.rename(columns={})

Python Forum

python-forum.io › thread-33991.html

AttributeError: 'DataFrame' object has no attribute 'Articles'

Purpose: I want to plot feature importance for data prediction, training, and testing. Runtime error: AttributeError: 'DataFrame' object has no attribute 'Articles' Error: Traceback (most recent call last): File 'D:/Clustering/text-cluster...

Stack Overflow

stackoverflow.com › questions › 52498934 › attributeerror-dataframe-object-has-no-attribute-class

python - AttributeError: 'DataFrame' object has no attribute 'Class' - Stack Overflow

Top answer

1 of 3

[...] The author of this link (kaggle.com/dstuerzer/optimized-logistic-regression) has used it and it is working fine with his code.

In the example you linked, the author's dataset has a column named "Class", but the dataset you have shown does not. As a result, the Class attribute does not exist on your DataFrame and therefore cannot be accessed.

Dominik Stuerzer:

   Time        V1        V2        V3        V4        V5        V6        V7  \
0   0.0 -1.359807 -0.072781  2.536347  1.378155 -0.338321  0.462388  0.239599   
1   0.0  1.191857  0.266151  0.166480  0.448154  0.060018 -0.082361 -0.078803   
2   1.0 -1.358354 -1.340163  1.773209  0.379780 -0.503198  1.800499  0.791461   

         V8        V9  ...         V21       V22       V23       V24  \
0  0.098698  0.363787  ...   -0.018307  0.277838 -0.110474  0.066928   
1  0.085102 -0.255425  ...   -0.225775 -0.638672  0.101288 -0.339846   
2  0.247676 -1.514654  ...    0.247998  0.771679  0.909412 -0.689281   

        V25       V26       V27       V28  Amount  Class  
0  0.128539 -0.189115  0.133558 -0.021053  149.62      0  
1  0.167170  0.125895 -0.008983  0.014724    2.69      0  
2 -0.327642 -0.139097 -0.055353 -0.059752  378.66      0  

[3 rows x 31 columns]

A class of 0 means that the transaction was in order, and a class of 1 means that the transaction was fraudulent. From personal experience we expect frauds to make up only a tiny fraction of all transactions. Indeed, in this dataset, for every fraud there are almost 600 non-fraudulent transactions: [...]

2 of 3

Try this,

df = df.drop(['Class'],axis=1)
df = df.drop(['Time'],axis=1) # optional
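
If it is unclear whether a column is present, it can help to check first and to drop defensively. The sketch below uses a made-up frame without a "Class" column (an assumption mirroring the asker's situation) and relies on the errors="ignore" option of DataFrame.drop.

```python
import pandas as pd

# Made-up frame without a "Class" column, mimicking the asker's situation.
df = pd.DataFrame({"Time": [0.0, 1.0], "Amount": [149.62, 2.69]})

has_class = "Class" in df.columns

# errors="ignore" skips labels that are not present instead of raising.
features = df.drop(columns=["Class", "Time"], errors="ignore")

print(has_class, list(features.columns))
```

Without errors="ignore", dropping a label that does not exist raises a KeyError, which is a different symptom of the same missing column.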