You need to use table EXCLUDED instead of value literals in your ON CONFLICT statement:
The `SET` and `WHERE` clauses in `ON CONFLICT DO UPDATE` have access to the existing row using the table's name (or an alias), and to the row proposed for insertion using the special `excluded` table.

You also don't need to re-set the conflicting values, only the rest.
INSERT INTO table (col1, col2, col3)
VALUES
(value1, value2, value3),
(value4, value5, value6)
ON CONFLICT (col1) DO UPDATE
SET (col2, col3) = (EXCLUDED.col2, EXCLUDED.col3);
For readability, you can format your in-line SQL if you triple-quote your f-strings. I'm not sure which IDEs can detect in-line SQL in Python and switch syntax highlighting, but I find the indentation helpful enough.
upsert_statement = f"""
INSERT INTO table (col1, col2, col3)
VALUES
({value1}, {value2}, {value3}),
({value4}, {value5}, {value6})
ON CONFLICT (col1) DO UPDATE
SET (col2, col3) = (EXCLUDED.col2, EXCLUDED.col3)"""
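Interpolating values directly into an f-string is vulnerable to SQL injection, though. A minimal sketch of the same upsert built with `%s` placeholders instead, so the driver handles quoting (table and column names are illustrative):

```python
# Sketch: the same upsert with placeholders instead of f-string interpolation.
# Table/column names and the sample rows are illustrative, not from the answer.
rows = [(1, "a", "b"), (2, "c", "d")]

# one "(%s, %s, %s)" group per row
placeholders = ",\n    ".join(["(%s, %s, %s)"] * len(rows))
upsert_statement = f"""
INSERT INTO table (col1, col2, col3)
VALUES
    {placeholders}
ON CONFLICT (col1) DO UPDATE
SET (col2, col3) = (EXCLUDED.col2, EXCLUDED.col3)"""

# flatten the rows into one flat parameter sequence for cursor.execute
params = [v for row in rows for v in row]
# cursor.execute(upsert_statement, params)  # uncomment with a live connection
```

The statement text stays static; only the number of placeholder groups depends on the data, and psycopg2 escapes the values itself.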
Here's a test at db<>fiddle:
drop table if exists test_70066823 cascade;
create table test_70066823 (
id integer primary key,
text_column_1 text,
text_column_2 text);
insert into test_70066823 values
(1,'first','first')
,(2,'second','second') returning *;
| id | text_column_1 | text_column_2 |
|---|---|---|
| 1 | first | first |
| 2 | second | second |
insert into test_70066823
values (1, 'third','first'),
(3, 'fourth','third'),
(4, 'fifth','fourth'),
(2, 'sixth','second')
on conflict (id) do update
set text_column_1=EXCLUDED.text_column_1,
text_column_2=EXCLUDED.text_column_2
returning *;
| id | text_column_1 | text_column_2 |
|---|---|---|
| 1 | third | first |
| 3 | fourth | third |
| 4 | fifth | fourth |
| 2 | sixth | second |
You can refer to the links below for improved insert performance. Inserts with a simple string-based execute or executemany are the two slowest approaches benchmarked there.
python - Upsert multiple rows in PostgreSQL with psycopg2 and errors logging - Stack Overflow
python - psycopg2: insert multiple rows with one query - Stack Overflow
Postgres 9.5 upsert command in pandas or psycopg2? - Stack Overflow
python - psycopg2: Update multiple rows in a table with values from a tuple of tuples - Stack Overflow
I built a program that inserts multiple lines to a server that was located in another city.
I found out that using this method was about 10 times faster than executemany. In my case tup is a tuple containing about 2000 rows. It took about 10 seconds when using this method:
# note: in Python 3, mogrify returns bytes, so decode before joining
args_str = ','.join(cur.mogrify("(%s,%s,%s,%s,%s,%s,%s,%s,%s)", x).decode('utf-8') for x in tup)
cur.execute("INSERT INTO table VALUES " + args_str)
and 2 minutes when using this method:
cur.executemany("INSERT INTO table VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s)", tup)
Psycopg 3 (as of 2026)
The executemany() method is the suggested solution in the psycopg documentation. It is shown below in the execute_many function. The alternative, demonstrated in the execute_array function, is to make psycopg adapt lists to arrays and execute() as a single query, hopefully faster.
Given the table t as:
create table t (i int, s text)
import psycopg

def execute_many(cur, the_list):
    query = '''insert into t (i, s) values (%s, %s)'''
    print('execute_many queries:\n')
    print('\n'.join([
        psycopg.ClientCursor(conn).mogrify(query, params) for params in the_list
    ]))
    cur.executemany(query, params_seq=the_list)

def execute_array(cur, the_list):
    # must be passed as lists
    i, s = [list(e) for e in zip(*the_list)]
    query = '''
        insert into t (i, s)
        select i, s
        from unnest(%(i)s, %(s)b::text[]) s (i, s)
    '''
    print('execute_array query')
    print(psycopg.ClientCursor(conn).mogrify(query, dict(i=i, s=s)))
    cur.execute(query, dict(i=i, s=s))

the_list = [(1,'a'),(2,'b'),(3,'c,w"')]

conn = psycopg.connect('')
cur = conn.cursor()
execute_array(cur, the_list)
execute_many(cur, the_list)
conn.commit()
conn.close()
Output:
execute_array query
insert into t (i, s)
select i, s
from unnest('{1,2,3}'::int2[], E'{a,b,"c,w\\""}'::text[]) s (i, s)
execute_many queries:
insert into t (i, s) values (1, 'a')
insert into t (i, s) values (2, 'b')
insert into t (i, s) values (3, 'c,w"')
To use the array query, the string list must be passed to psycopg as binary and hard-adapted as a text array with `%(s)b::text[]`.
Answer from 2015
New execute_values method in Psycopg 2.7:
data = [(1,'x'), (2,'y')]
insert_query = 'insert into t (a, b) values %s'
psycopg2.extras.execute_values(
    cursor, insert_query, data, template=None, page_size=100
)
The pythonic way of doing it in Psycopg 2.6:
data = [(1,'x'), (2,'y')]
records_list_template = ','.join(['%s'] * len(data))
insert_query = 'insert into t (a, b) values {}'.format(records_list_template)
cursor.execute(insert_query, data)
Explanation: If the data to be inserted is given as a list of tuples like in
data = [(1,'x'), (2,'y')]
then it is already in the exact format required, as the `values` syntax of the `insert` clause expects a list of records, as in

insert into t (a, b) values (1, 'x'),(2, 'y')

Psycopg adapts a Python `tuple` to a Postgresql `record`.
The only necessary work is to provide a records list template to be filled by psycopg
# We use the data list to be sure of the template length
records_list_template = ','.join(['%s'] * len(data))
and place it in the insert query
insert_query = 'insert into t (a, b) values {}'.format(records_list_template)
Printing the insert_query outputs
insert into t (a, b) values %s,%s
Now to the usual Psycopg arguments substitution
cursor.execute(insert_query, data)
Or just testing what will be sent to the server
print (cursor.mogrify(insert_query, data).decode('utf8'))
Output:
insert into t (a, b) values (1, 'x'),(2, 'y')
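The template construction from the 2.6 approach can be verified without a database connection; this just rebuilds the query string:

```python
# Reproduce the template construction above; no database needed.
data = [(1, 'x'), (2, 'y')]

# one %s placeholder per record; psycopg2 adapts each tuple to a record
records_list_template = ','.join(['%s'] * len(data))
insert_query = 'insert into t (a, b) values {}'.format(records_list_template)
print(insert_query)  # → insert into t (a, b) values %s,%s
```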
Here is my code for a bulk insert & insert-on-conflict-update query for PostgreSQL from a pandas dataframe:

Let's say `id` is the unique key for both the postgresql table and the pandas df, and you want to insert and update based on this `id`.
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://username:pass@host:port/dbname")
query = text(f"""
    INSERT INTO schema.table(name, title, id)
    VALUES {','.join([str(i) for i in list(df.to_records(index=False))])}
    ON CONFLICT (id)
    DO UPDATE SET name = excluded.name,
                  title = excluded.title
""")
# note: engine.execute() is SQLAlchemy 1.x style; with 2.x, use
# `with engine.begin() as conn: conn.execute(query)`
engine.execute(query)
Make sure your df columns are in the same order as your table's.
FYI, this is the solution I am using currently.
It seems to work fine for my purposes. I had to add a line to replace null (NaT) timestamps with None, though, because I was getting an error when loading each row into the database.
def create_update_query(table):
    """Create an upsert query which replaces existing data based on primary key conflicts."""
    columns = ', '.join([f'{col}' for col in DATABASE_COLUMNS])
    constraint = ', '.join([f'{col}' for col in PRIMARY_KEY])
    placeholder = ', '.join([f'%({col})s' for col in DATABASE_COLUMNS])
    updates = ', '.join([f'{col} = EXCLUDED.{col}' for col in DATABASE_COLUMNS])
    query = f"""INSERT INTO {table} ({columns})
                VALUES ({placeholder})
                ON CONFLICT ({constraint})
                DO UPDATE SET {updates};"""
    # collapse the internal whitespace into single spaces
    query = ' '.join(query.split())
    return query

def load_updates(df, table, connection):
    conn = connection.get_conn()
    cursor = conn.cursor()
    # replace NaN/NaT with None so psycopg2 adapts them as NULL
    df1 = df.where(pd.notnull(df), None)
    insert_values = df1.to_dict(orient='records')
    for row in insert_values:
        cursor.execute(create_update_query(table=table), row)
        conn.commit()
    row_count = len(insert_values)
    logging.info(f'Inserted {row_count} rows.')
    cursor.close()
    del cursor
    conn.close()
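For illustration, here is what the query builder above produces when run standalone. `DATABASE_COLUMNS` and `PRIMARY_KEY` are module-level names in that answer; the column names below are hypothetical:

```python
# Standalone sketch of the query builder above, with hypothetical column
# names; DATABASE_COLUMNS / PRIMARY_KEY values are assumptions for the demo.
DATABASE_COLUMNS = ['id', 'name', 'title']
PRIMARY_KEY = ['id']

def create_update_query(table):
    """Build an upsert query keyed on PRIMARY_KEY conflicts."""
    columns = ', '.join(DATABASE_COLUMNS)
    constraint = ', '.join(PRIMARY_KEY)
    placeholder = ', '.join([f'%({col})s' for col in DATABASE_COLUMNS])
    updates = ', '.join([f'{col} = EXCLUDED.{col}' for col in DATABASE_COLUMNS])
    query = f"""INSERT INTO {table} ({columns})
                VALUES ({placeholder})
                ON CONFLICT ({constraint})
                DO UPDATE SET {updates};"""
    # collapse internal whitespace into single spaces
    return ' '.join(query.split())

print(create_update_query('my_table'))
```

The named `%(col)s` placeholders are why each row can be passed to `cursor.execute` as a dict.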
This post pointed me in the right direction. The documentation for extras.execute_values also contains a great example using the UPDATE clause.
c = db.cursor()
update_query = """UPDATE my_table AS t
SET name = e.name
FROM (VALUES %s) AS e(name, id)
WHERE e.id = t.id;"""
psycopg2.extras.execute_values(
    c, update_query, new_values, template=None, page_size=100
)
In addition to @tdnelson2's answer: if you use some kind of list comprehension to generate parameters for `SET`, you can run into a problem with timestamps (this problem doesn't appear when you work with a psycopg2 insert). In that case you need to add `::timestamp` casts to the `SET` block. In my case it looked like this:
UPDATE ods.dict_adr_spb as t
SET adrspb_house_id=e.adrspb_house_id, adrspb_t_creation_date=e.adrspb_t_creation_date::timestamp
FROM (VALUES %s) as e(adrspb_house_id, adrspb_t_creation_date)
WHERE e.adrspb_house_id = t.adrspb_house_id;
To quote from psycopg2's documentation:
Warning Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
Now, for an upsert operation you can do this:
insert_sql = '''
INSERT INTO tablename (col1, col2, col3, col4)
VALUES (%s, %s, %s, %s)
ON CONFLICT (col1) DO UPDATE SET
(col2, col3, col4) = (EXCLUDED.col2, EXCLUDED.col3, EXCLUDED.col4);
'''
cur.execute(insert_sql, (val1, val2, val3, val4))
Notice that the parameters for the query are being passed as a tuple to the execute statement (this assures psycopg2 will take care of adapting them to SQL while shielding you from injection attacks).
The EXCLUDED bit allows you to reuse the values without the need to specify them twice in the data parameter.
Using:
INSERT INTO members (member_id, customer_id, subscribed, customer_member_id, phone, cust_atts) VALUES (%s, %s, %s, %s, %s, %s) ON CONFLICT (customer_member_id) DO UPDATE SET (phone) = (EXCLUDED.phone);
I received the following error:
psycopg2.errors.FeatureNotSupported: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression
LINE 1: ...ICT (customer_member_id) DO UPDATE SET (phone) = (EXCLUDED.p...
Changing to:
INSERT INTO members (member_id, customer_id, subscribed, customer_member_id, phone, cust_atts) VALUES (%s, %s, %s, %s, %s, %s) ON CONFLICT (customer_member_id) DO UPDATE SET (phone) = ROW(EXCLUDED.phone);
Solved the issue.
Python super noob here.
This is my first Python script that deals with a database. I am wondering if there is a way to do inserts without having to open and close a Postgres connection each time an insert is done. I understand I may be able to use a bulk insert, but I am interested in individual inserts. Below is my code.
#!/usr/bin/python
import psycopg2

counter = 0

# Connect to DB
conn = psycopg2.connect("dbname=testing user=postgres")
cur = conn.cursor()
cur.execute("CREATE TABLE test (id serial NOT NULL PRIMARY KEY, f_name varchar(20), l_name varchar(20))")
conn.commit()
cur.close()
conn.close()

while counter < 1000:
    conn = psycopg2.connect("dbname=testing user=postgres")
    cur = conn.cursor()
    cur.execute("INSERT INTO test (f_name, l_name) VALUES (%s, %s)", ("Bob", "Babaluba"))
    conn.commit()
    cur.close()
    conn.close()
    counter = counter + 1

Thank you folks :)