SQLAlchemy supports ON CONFLICT through two methods on the PostgreSQL dialect's insert() construct: on_conflict_do_update() and on_conflict_do_nothing().
Copying from the documentation:
from sqlalchemy.dialects.postgresql import insert

stmt = insert(my_table).values(user_email='a@b.com', data='inserted data')
stmt = stmt.on_conflict_do_update(
    index_elements=[my_table.c.user_email],
    index_where=my_table.c.user_email.like('%@gmail.com'),
    set_=dict(data=stmt.excluded.data)
)
conn.execute(stmt)
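The second method, on_conflict_do_nothing(), is named above but not shown. Here is a minimal sketch of it, assuming a hypothetical my_table with a unique user_email column (the table definition is an illustration, not taken from the documentation). It only compiles the statement for the PostgreSQL dialect, so no database connection is needed to inspect the generated SQL:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import insert

# Hypothetical table used only for illustration.
metadata = MetaData()
my_table = Table(
    "my_table",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("user_email", String, unique=True),
    Column("data", String),
)

stmt = insert(my_table).values(user_email="a@example.com", data="inserted data")
# Skip the row silently if user_email already exists.
stmt = stmt.on_conflict_do_nothing(index_elements=["user_email"])

# Render the statement as PostgreSQL SQL without executing it.
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

To run it against a real database, pass stmt to conn.execute() as in the snippet above.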
Answer from P.R. on Stack Overflow
SQLAlchemy does have a "save-or-update" behavior, which in recent versions has been built into session.add(), but previously was the separate session.save_or_update() call. This is not an "upsert", but it may be good enough for your needs.
It is good that you are asking about a class with multiple unique keys; I believe this is precisely the reason there is no single correct way to do this. The primary key is also a unique key. If there were no unique constraints, only the primary key, it would be a simple enough problem: if nothing with the given ID exists, or if ID is None, create a new record; else update all other fields in the existing record with that primary key.
However, when there are additional unique constraints, there are logical issues with that simple approach. If you want to "upsert" an object, and the primary key of your object matches an existing record, but another unique column matches a different record, then what do you do? Similarly, if the primary key matches no existing record, but another unique column does match an existing record, then what? There may be a correct answer for your particular situation, but in general I would argue there is no single correct answer.
That would be the reason there is no built-in "upsert" operation: the application must define what this means in each particular case.
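The ambiguity described above can be reproduced with a small sketch. It uses SQLite through Python's standard sqlite3 module only because it runs without a server; the users table and its rows are made up for illustration. The ON CONFLICT clause names a single conflict target (id), so a clash on the other unique column (email) is not handled by it and surfaces as an error:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE, name TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "a@example.com", "Ann"), (2, "b@example.com", "Bob")],
)

# The upsert only covers conflicts on the primary key.
UPSERT = """
INSERT INTO users (id, email, name) VALUES (?, ?, ?)
ON CONFLICT (id) DO UPDATE SET email = excluded.email, name = excluded.name
"""

def try_upsert(row):
    """Run the upsert and report whether it succeeded."""
    try:
        conn.execute(UPSERT, row)
        return "ok"
    except sqlite3.IntegrityError:
        return "integrity error"

results = [
    try_upsert((1, "ann@example.com", "Anna")),  # id matches row 1, email is new
    try_upsert((1, "b@example.com", "Anna")),    # id matches row 1, email matches row 2
    try_upsert((3, "b@example.com", "Cara")),    # id is new, email matches row 2
]
rows = conn.execute("SELECT id, email, name FROM users ORDER BY id").fetchall()
print(results, rows)
```

Only the first call is a clean update. What should happen in the other two cases is exactly the decision the application has to make.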
I'm currently learning Python, specifically the Polars API and the interaction with SQLAlchemy.
There are functions to read data from and write data to a database (pl.read_database and DataFrame.write_database). Now I'm wondering whether it's possible to further specify the import logic and, if so, how I would do it. Specifically, I want to perform an upsert (insert or update), and as a table operation I want to define 'Create table if not exists'.
There is another function, DataFrame.write_delta, which makes it possible to define the exact import logic for Delta Lake through a chain of calls:
.when_matched_update_all() \
.when_not_matched_insert_all() \
.execute()
I assume it wasn't possible to generically include these parameters in write_database because every RDBMS handles upserts differently? ...
So, what would be the recommended/best-practice way of upserting data to SQL Server? Can I do it with SQLAlchemy taking a Polars dataframe as an input?
The complete data pipeline looks like this:
- read in flat file (xlsx/CSV/JSON) with Polars
- perform some data wrangling operations with Polars
- upsert data to SQL Server (with table operation 'Create table if not exists')
I also found this in a Stack Overflow post about upserts with Polars:
df1 = (
    df_new
    .join(df_old, on=["group", "id"], how="inner")
    .select(df_new.columns)
)
df2 = df_new.join(df_old, on=["group", "id"], how="anti")
df3 = df_old.join(df_new, on=["group", "id"], how="anti")
df_all = pl.concat([df1, df2, df3])
Or I could perform the upsert inside Polars with DataFrame.update():
df.update(new_df, left_on=["A"], right_on=["C"], how="full")
With both options, though, I would have to read the respective table from the database first, perform the upsert with Polars, and then write the output back to the database. This feels like overkill to me.
Anyways, thanks in advance for any help/suggestions!
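One way to avoid the read-modify-write round trip is to let the database do the upsert. SQL Server has no ON CONFLICT clause; its T-SQL MERGE statement plays that role. A common pattern is to write the DataFrame to a staging table and then merge the staging table into the target. The sketch below only builds the MERGE text. build_merge_sql and the table and column names are hypothetical, and the identifiers are interpolated directly into the string, so they must come from trusted code and never from user input:

```python
def build_merge_sql(target: str, staging: str, keys: list[str], columns: list[str]) -> str:
    """Build a T-SQL MERGE that upserts rows from `staging` into `target`.

    Hypothetical helper for illustration; assumes at least one non-key column.
    """
    non_keys = [c for c in columns if c not in keys]
    if not non_keys:
        raise ValueError("need at least one non-key column to update")
    on_clause = " AND ".join(f"t.[{k}] = s.[{k}]" for k in keys)
    set_clause = ", ".join(f"t.[{c}] = s.[{c}]" for c in non_keys)
    col_list = ", ".join(f"[{c}]" for c in columns)
    val_list = ", ".join(f"s.[{c}]" for c in columns)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {staging} AS s\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED BY TARGET THEN INSERT ({col_list}) VALUES ({val_list});"
    )

merge_sql = build_merge_sql(
    target="dbo.sales",
    staging="dbo.sales_staging",
    keys=["group", "id"],
    columns=["group", "id", "value"],
)
print(merge_sql)
```

The remaining steps would be to load the DataFrame into the staging table (for example with write_database) and execute the generated statement through a SQLAlchemy connection. Both steps are left out here because they need a live SQL Server instance.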