You need an instance of the DeltaTable class, but you're passing a DataFrame instead. Create one with DeltaTable.forPath (pointing to a specific path) or DeltaTable.forName (for a named table), like this:

from delta.tables import DeltaTable

DEV_Delta = DeltaTable.forPath(spark, 'some path')
DEV_Delta.alias("t").merge(df_from_pbl.alias("s"), condition_dev) \
  .whenMatchedUpdateAll() \
  .whenNotMatchedInsertAll() \
  .execute()

If you only have the data as a DataFrame, you need to write it out as a Delta table first.
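As a sketch of that write-then-merge flow (assuming an active SparkSession with Delta Lake configured; the path and the merge condition `t.id = s.id` are hypothetical):

```python
from delta.tables import DeltaTable

target_path = "/mnt/delta/dev_table"  # hypothetical path

# One-time: materialize the DataFrame as a Delta table if none exists yet.
if not DeltaTable.isDeltaTable(spark, target_path):
    df_from_pbl.write.format("delta").save(target_path)

# Now the merge runs against a DeltaTable instance, not a DataFrame.
target = DeltaTable.forPath(spark, target_path)
(target.alias("t")
    .merge(df_from_pbl.alias("s"), "t.id = s.id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())
```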

See documentation for more details.

Answer from Alex Ott on Stack Overflow
Stack Overflow
How to merge a column from df1 to df2 pyspark - Stack Overflow
I have tried df1.merge(df2) but no luck with this. throws an error AttributeError: 'DataFrame' object has no attribute 'merge'
Discussions

python - I got the following error : 'DataFrame' object has no attribute 'data' - Data Science Stack Exchange
I am trying to get the 'data' and the 'target' of the iris setosa database, but I can't. For example, when I load the iris setosa directly from sklearn datasets I get a good result: Program: from ...
datascience.stackexchange.com
August 26, 2018
DataFrame.merge raises with AttributeError
github.com
April 27, 2020
pandas - PySpark : AttributeError: 'DataFrame' object has no attribute 'values' - Stack Overflow
22 ---> 23 api_param_df = ... Column(jc) AttributeError: 'DataFrame' object has no attribute 'values' The full script follows, with comments explaining the regex applied to the http_path column in df to parse api and param and merge/concat them ...
stackoverflow.com
[BUG] 'DeltaMergeBuilder' object has no attribute 'whenNotMatchedBySource' for 2.3.0
Bug: 'DeltaMergeBuilder' object has no attribute 'whenNotMatchedBySource' for version 2.3.0 of Delta Lake and Spark 3.3.2. Describe the problem: import pandas as pd import pyspark from pyspark.s...
github.com
April 26, 2023
Cloudera Community
Pyspark issue AttributeError: 'DataFrame' object has no attribute 'saveAsTextFile'
January 2, 2024 -

#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/Person_Person.csv', inferSchema=True, header=True)
df.createOrReplaceTempView('Person_Person')
myresults = spark.sql("""SELECT PersonType, COUNT(PersonType) AS `Person Count` FROM Person_Person GROUP BY PersonType""")
result = myresults.collect()
result.saveAsTextFile("test")

However, I'm now getting the following error message: AttributeError: 'list' object has no attribute 'saveAsTextFile'
Top answer
Answer 1 of 5

sklearn.datasets is a scikit-learn package that contains the load_iris() method.

By default, load_iris() returns a Bunch object that holds data, target, and other members. To get the actual values, you have to read the data and target attributes themselves.

The 'iris.csv' file, by contrast, holds the features and the target together.

FYI: if you pass return_X_y=True to load_iris(), you get the features and target directly.

from sklearn import datasets
data,target = datasets.load_iris(return_X_y=True)
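Running that (with scikit-learn installed) returns two NumPy arrays; the iris dataset has 150 samples and 4 features:

```python
from sklearn import datasets

data, target = datasets.load_iris(return_X_y=True)
print(data.shape)    # (150, 4)
print(target.shape)  # (150,)
```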
Answer 2 of 5

The Iris Dataset from Sklearn is in Sklearn's Bunch format:

print(type(iris))
print(iris.keys())

output:

<class 'sklearn.utils.Bunch'>
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])

So, that's why you can access it as:

x=iris.data
y=iris.target

But when you read the CSV file into a DataFrame, as you did:

iris = pd.read_csv('iris.csv',header=None).iloc[:,2:4]
iris.head()

output is:

    2   3
0   petal_length    petal_width
1   1.4 0.2
2   1.4 0.2
3   1.3 0.2
4   1.5 0.2

Here the column names are '2' and '3', and the real headers ('petal_length', 'petal_width') have been pushed into the first data row.

First of all you should read the CSV file as:

df = pd.read_csv('iris.csv')

You should not pass header=None, since your CSV file already includes the column names, i.e. the headers.

So, now what you can do is something like this:

X = df.iloc[:, [2, 3]] # Will give you columns 2 and 3 i.e 'petal_length' and 'petal_width'
y = df.iloc[:, 4] # Label column i.e 'species'

or if you want to use the column names then:

X = df[['petal_length', 'petal_width']]
y = df['species']

Also, if you want to convert the labels from strings to numerical format, use sklearn's LabelEncoder:

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
y = le.fit_transform(y)
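For example, on a toy list of species labels (made up here for illustration), LabelEncoder assigns integer codes in alphabetical order of the classes:

```python
from sklearn import preprocessing

le = preprocessing.LabelEncoder()
labels = ["setosa", "versicolor", "setosa", "virginica"]
y = le.fit_transform(labels)     # one integer code per label
print(list(le.classes_))         # ['setosa', 'versicolor', 'virginica']
print(list(y))                   # [0, 1, 0, 2]
```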
GitHub
DataFrame.merge raises with AttributeError · Issue #6142 · dask/dask
April 27, 2020 -

AttributeError                            Traceback (most recent call last)
<ipython-input-2-d6a6fee6936c> in <module>
     13 r = df1.merge(df2, on="A")
---> 14 r.compute()
~/sandbox/dask/dask/base.py in compute(self, **kwargs)
--> 166 (result,) = compute(self, traverse=False, **kwargs)
~/sandbox/dask/dask/base.py in compute(*args, **kwargs)
    437 results = schedule(dsk, keys, **kwargs)
--> 438 return repack([f(r, *a) for r, (f, a)
Author   TomAugspurger
Databricks Community
AttributeError: 'DataFrame' object has no attribute... - Databricks Community
February 19, 2024 - Hello, I have some trouble deduplicating rows on the "id" column, with the method "dropDuplicatesWithinWatermark" in a pipeline. When I run this pipeline, I get the error message: "AttributeError: 'DataFrame' object has no attribute 'dropDuplicatesWithinWatermark'" Here is part of the code: @dl...
Cumulative Sum
[pyspark] AttributeError: ‘DataFrame’ object has no attribute ‘_get_object_id’
October 10, 2020 - AttributeError: ‘DataFrame’ object has no attribute ‘_get_object_id’ · The reason being that isin expects actual local values or collections but df2.select('id') returns a data frame.
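The usual fix is to collect the values locally before calling isin. A minimal, self-contained sketch (the column name `id` and the sample data are made up):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").appName("isin-demo").getOrCreate()
df1 = spark.createDataFrame([(1, "a"), (2, "b"), (3, "c")], ["id", "val"])
df2 = spark.createDataFrame([(1,), (3,)], ["id"])

# df2.select("id") is itself a DataFrame, which triggers the
# '_get_object_id' error; isin needs plain local values, so collect first.
ids = [row["id"] for row in df2.select("id").collect()]
df1.filter(df1["id"].isin(ids)).show()
```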
Apache Spark
pyspark.sql.dataframe — PySpark 4.1.1 documentation
# # mypy: disable-error-code="empty-body" import sys import random from typing import ( Any, Callable, Dict, Iterator, List, Optional, Sequence, Tuple, Union, overload, TYPE_CHECKING, ) from pyspark import _NoValue from pyspark._globals import _NoValueType from pyspark.util import is_remote_only from pyspark.storagelevel import StorageLevel from pyspark.resource import ResourceProfile from pyspark.sql.column import Column from pyspark.sql.readwriter import DataFrameWriter, DataFrameWriterV2 from pyspark.sql.merge import MergeIntoWriter from pyspark.sql.streaming import DataStreamWriter from py
GitHub
[BUG] DeltaMergeBuilder' object has no attribute 'whenNotMatchedBySource for 2.3.0 · Issue #1724 · delta-io/delta
April 26, 2023 -

def handle_merge(path, updates, current_dt):
    delta_state = DeltaTable.forPath(spark, path)
    delta_state.alias("s").merge(updates.alias("c"), "s.key = c.key and s.is_current = true") \
        .whenMatchedUpdate(
            condition="s.is_current = true AND s.value <> c.value",
            set={"is_current": "false", "valid_to": "c.date"}
        ).whenNotMatchedInsert(
            values={"key": "c.key", "value": "c.value", "value2": "c.value2",
                    "is_current": "true", "date": "c.date", "valid_to": "null"}
        ).whenNotMatchedBySource(condition="c.valid_to IS NULL") \
        .update(set={"is_current": "false", "valid_to": current_dt}) \
        .whenNotMatchedBySource().delete() \
        .execute()

delta_state = spark.read.format("delta").load("dummy_delta")
delta_state.show(50)

handle_merge('dummy_delta', d2, current_dt=2)
Author   geoHeil
Incorta Community
Issue with converting a Pandas DataFrame to a Spar... - Incorta Community
November 15, 2023 - Symptoms You received the error when trying to convert a Pandas DataFrame to Spark DataFrame in a PySpark MV. Here is the error.- INC_03070101: Transformation error Error 'DataFrame' object has no attribute 'iteritems' AttributeError : 'DataFrame' object has no attribute 'iteritems' Diagnosis Since...
GitHub
AttributeError: 'DataFrame' object has no attribute 'copy' · Issue #625 · microsoft/FLAML
July 2, 2022 - I m using autoML(FLAML) with Spark on large data. The error image is given below train = spark.read.parquet("./train.parquet") test = spark.read.parquet("./test.parquet") input_cols = [c for c in train.columns if c != 'target'] vectorAss...
Author   Shafi2016
GitHub
spark_practice_tests_code/1/34.ipynb at main · flrs/spark_practice_tests_code
(self, name) ... if name not in self.columns: raise AttributeError( ...
Author   flrs
Reddit
r/snowflake on Reddit: Snowpark merge failing
January 19, 2024 -

So my Python is not the best, but I'm pulling my hair out over why this merge statement keeps erroring. It's a simple stored proc that reads from a table, then sends out a nicely formatted email, then updates the table with the datetime the email was sent:

table_df["EMAIL_SENT"] = datetime.now()
t = snowpark_session.table("AUDIT_STATUS")

t.merge(
    table_df,
    t["AUDIT_ID"] == table_df["AUDIT_ID"],
    [when_matched().update({"EMAIL_SENT": table_df["EMAIL_SENT"]})],
)

here is the error

Traceback (most recent call last):
  File "_udf_code.py", line 42, in main
  File "/usr/lib/python_udf/5142c1f0a7c7d953c07ad96fb4bb1de20b165ade342c596f36bbd920e92cf56f/lib/python3.10/site-packages/snowflake/snowpark/column.py", line 265, in __eq__
    right = Column._to_expr(other)
  File "/usr/lib/python_udf/5142c1f0a7c7d953c07ad96fb4bb1de20b165ade342c596f36bbd920e92cf56f/lib/python3.10/site-packages/snowflake/snowpark/column.py", line 751, in _to_expr
    return Literal(expr)
  File "/usr/lib/python_udf/5142c1f0a7c7d953c07ad96fb4bb1de20b165ade342c596f36bbd920e92cf56f/lib/python3.10/site-packages/snowflake/snowpark/_internal/analyzer/expression.py", line 211, in __init__
    raise SnowparkClientExceptionMessages.PLAN_CANNOT_CREATE_LITERAL(
snowflake.snowpark.exceptions.SnowparkPlanException: (1206): Cannot create a Literal for <class 'pandas.core.series.Series'>
 in function SEND_FORMATTED_EMAIL with handler main

I'm at a loss.
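The traceback points at the join condition: `table_df` is a pandas DataFrame, so `table_df["AUDIT_ID"]` is a pandas Series, and Snowpark cannot build a Literal from a Series. One possible fix, sketched under the assumption that the table and column names from the post are correct, is to convert the pandas frame into a Snowpark DataFrame before merging:

```python
from snowflake.snowpark.functions import when_matched

# Convert the pandas DataFrame so its columns become Snowpark Columns
# rather than pandas Series (names taken from the post above).
source = snowpark_session.create_dataframe(table_df)

t = snowpark_session.table("AUDIT_STATUS")
t.merge(
    source,
    t["AUDIT_ID"] == source["AUDIT_ID"],
    [when_matched().update({"EMAIL_SENT": source["EMAIL_SENT"]})],
)
```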

Built In
AttributeError: ‘DataFrame’ Object Has No Attribute ‘Append’ Solved | Built In
July 8, 2024 - AttributeError: ‘DataFrame’ object has no attribute ‘append’ can be solved by replacing append() with the concat() method to merge two or more Pandas DataFrames together.
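A quick illustration of that replacement (assuming pandas 2.x, where `DataFrame.append` was removed):

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"a": [3]})

# df1.append(df2) raises AttributeError on pandas >= 2.0;
# pd.concat is the supported replacement.
combined = pd.concat([df1, df2], ignore_index=True)
print(combined["a"].tolist())  # [1, 2, 3]
```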