You need an instance of the DeltaTable class, but you're passing a DataFrame instead. Create one with DeltaTable.forPath (pointing to a specific path) or DeltaTable.forName (for a named table), like this:
from delta.tables import DeltaTable

DEV_Delta = DeltaTable.forPath(spark, 'some path')
DEV_Delta.alias("t").merge(df_from_pbl.alias("s"), condition_dev) \
    .whenMatchedUpdateAll() \
    .whenNotMatchedInsertAll() \
    .execute()
If your data exists only as a DataFrame, you need to write it out as a Delta table first.
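A minimal sketch of that "write first, then merge" flow, assuming the delta-spark package is installed and the Spark session is configured for Delta Lake. The helper name upsert_to_delta and its parameters are hypothetical and not part of the Delta Lake API:

```python
def upsert_to_delta(spark, source_df, path, condition):
    """Merge source_df into the Delta table at path, creating it if missing.

    Hypothetical helper: `spark` is an active SparkSession with Delta Lake
    enabled, and `condition` is a merge condition such as "t.id = s.id".
    """
    # Imported lazily so this module still loads where delta-spark is absent.
    from delta.tables import DeltaTable

    if not DeltaTable.isDeltaTable(spark, path):
        # Nothing to merge into yet: write the DataFrame as a Delta table.
        source_df.write.format("delta").save(path)
        return

    # merge() is a DeltaTable method; a plain DataFrame does not have it.
    target = DeltaTable.forPath(spark, path)
    (
        target.alias("t")
        .merge(source_df.alias("s"), condition)
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )
```

It would be called as upsert_to_delta(spark, df_from_pbl, 'some path', condition_dev), reusing the names from the snippet above.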
See documentation for more details.
Answer from Alex Ott on Stack Overflow
python - I got the following error : 'DataFrame' object has no attribute 'data' - Data Science Stack Exchange
sklearn.datasets is a scikit-learn module that contains the load_iris() function.
By default, load_iris() returns an object that holds data, target and other members. To get the actual values, you have to read the data and target members themselves.
'iris.csv', on the other hand, holds the features and the target together.
FYI: If you set return_X_y=True in load_iris(), you get the features and the target directly.
from sklearn import datasets
data,target = datasets.load_iris(return_X_y=True)
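As a rough illustration of the two return styles, assuming scikit-learn is installed (the variable names are only for this example):

```python
from sklearn import datasets

# Default: a Bunch object whose fields hold the features and the labels.
iris = datasets.load_iris()
print(iris.data.shape)    # (150, 4)
print(iris.target.shape)  # (150,)

# return_X_y=True: the same arrays, returned directly as a tuple.
data, target = datasets.load_iris(return_X_y=True)
print(data.shape, target.shape)  # (150, 4) (150,)
```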
The Iris Dataset from Sklearn is in Sklearn's Bunch format:
from sklearn import datasets
iris = datasets.load_iris()
print(type(iris))
print(iris.keys())
output:
<class 'sklearn.utils.Bunch'>
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])
So, that's why you can access it as:
x=iris.data
y=iris.target
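The error from the question can be reproduced with a small sketch like the one below. It assumes pandas and scikit-learn are installed, and the two-row DataFrame is made-up data standing in for the CSV contents:

```python
import pandas as pd
from sklearn import datasets

bunch = datasets.load_iris()
x = bunch.data  # works: a Bunch exposes its keys as attributes

# Made-up stand-in for the CSV contents; it has no column called 'data'.
frame = pd.DataFrame({"petal_length": [1.4, 1.3], "petal_width": [0.2, 0.2]})

try:
    frame.data
except AttributeError as exc:
    message = str(exc)
    print(message)  # 'DataFrame' object has no attribute 'data'
```

A DataFrame only resolves attribute access to its own methods and column names, so .data fails unless a column with that name exists.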
But when you read the CSV file into a DataFrame, as in your question:
iris = pd.read_csv('iris.csv',header=None).iloc[:,2:4]
iris.head()
output is:
              2            3
0  petal_length  petal_width
1           1.4          0.2
2           1.4          0.2
3           1.3          0.2
4           1.5          0.2
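Here the column names are the integers 2 and 3, and the real header row ('petal_length', 'petal_width') has ended up as the first row of data.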
First of all, you should read the CSV file as:
df = pd.read_csv('iris.csv')
You should not pass header=None, because your CSV file already contains the column names, i.e. the header row.
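The difference is easy to see on a small in-memory sample. This is only a sketch: the two data rows are made up, and the column order is assumed from the positions used in this answer (petal_length and petal_width at 2 and 3, species at 4):

```python
import io

import pandas as pd

# Made-up two-row sample in the assumed layout of iris.csv.
csv_text = (
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "4.9,3.0,1.4,0.2,setosa\n"
)

# header=None: columns are numbered and the header line becomes row 0.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)
print(no_header.columns.tolist())  # [0, 1, 2, 3, 4]
print(no_header.iloc[0, 2])        # petal_length

# Default: the first line supplies the column names.
with_header = pd.read_csv(io.StringIO(csv_text))
print(with_header.columns.tolist())
print(with_header["petal_length"].dtype)  # float64
```

With header=None the header text is mixed into the values, so the columns are no longer parsed as numbers.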
So, now what you can do is something like this:
X = df.iloc[:, [2, 3]] # Will give you columns 2 and 3 i.e 'petal_length' and 'petal_width'
y = df.iloc[:, 4] # Label column i.e 'species'
Or, if you want to use the column names:
X = df[['petal_length', 'petal_width']]
y = df['species']  # select by label; iloc accepts only integer positions
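A short sketch comparing positional and label-based selection, using made-up rows in the column order assumed above. It also shows that iloc rejects a column name:

```python
import pandas as pd

# Made-up rows in the assumed iris.csv column order.
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 7.0],
        "sepal_width": [3.5, 3.2],
        "petal_length": [1.4, 4.7],
        "petal_width": [0.2, 1.4],
        "species": ["setosa", "versicolor"],
    }
)

X_by_position = df.iloc[:, [2, 3]]
X_by_name = df[["petal_length", "petal_width"]]
print(X_by_position.equals(X_by_name))  # True

y_by_position = df.iloc[:, 4]
y_by_name = df["species"]
print(y_by_position.equals(y_by_name))  # True

# iloc is position-only, so passing a column name raises TypeError.
try:
    df.iloc["species"]
except TypeError as exc:
    iloc_error = exc
    print(type(exc).__name__)  # TypeError
```

Selecting by name is usually the safer choice, since it keeps working if the column order in the file changes.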
Also, if you want to convert the labels from strings to numbers, use scikit-learn's LabelEncoder:
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
y = le.fit_transform(y)
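For illustration, here is what the encoder does on a few made-up labels (assuming scikit-learn is installed). The classes are sorted, so the integer codes follow alphabetical order, and inverse_transform maps them back:

```python
from sklearn import preprocessing

labels = ["setosa", "virginica", "versicolor", "setosa"]  # made-up sample

le = preprocessing.LabelEncoder()
encoded = le.fit_transform(labels)

print(le.classes_.tolist())  # ['setosa', 'versicolor', 'virginica']
print(encoded.tolist())      # [0, 2, 1, 0]

# Recover the original string labels from the integer codes.
decoded = le.inverse_transform(encoded)
print(decoded.tolist())      # ['setosa', 'virginica', 'versicolor', 'setosa']
```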