Since version 1.2, SQLAlchemy supports on_duplicate_key_update for MySQL.
Here is an example of how to use it:

```python
from sqlalchemy.dialects.mysql import insert

insert_stmt = insert(my_table).values(
    id='some_existing_id',
    data='inserted value')

on_duplicate_key_stmt = insert_stmt.on_duplicate_key_update(
    data=insert_stmt.inserted.data,  # refer to the proposed row via .inserted
    status='U'
)

conn.execute(on_duplicate_key_stmt)
```
Since version 1.1, SQLAlchemy supports on_conflict_do_update for PostgreSQL.
Example (answer from vishes_shell on Stack Overflow):

```python
from sqlalchemy.dialects.postgresql import insert

insert_stmt = insert(my_table).values(
    id='some_existing_id',
    data='inserted value')

do_update_stmt = insert_stmt.on_conflict_do_update(
    constraint='pk_my_table',
    set_=dict(data='updated value')
)

conn.execute(do_update_stmt)
```
You can try this:

```python
def get_or_increase_tag(tag_name):
    tag = session.query(Tag).filter_by(tag=tag_name).first()
    if not tag:
        tag = Tag(tag_name)
    else:
        tag.cnt += 1
    return tag
```
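A minimal runnable sketch of this pattern, assuming a hypothetical Tag model with a unique tag column and a cnt counter, plus an in-memory SQLite engine (note the added session.add for the new-row branch, which the snippet above leaves to the caller):

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Tag(Base):
    __tablename__ = "tags"
    id = Column(Integer, primary_key=True)
    tag = Column(String, unique=True)
    cnt = Column(Integer, default=1)

    def __init__(self, tag_name):
        self.tag = tag_name

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

def get_or_increase_tag(tag_name):
    tag = session.query(Tag).filter_by(tag=tag_name).first()
    if not tag:
        tag = Tag(tag_name)
        session.add(tag)   # new rows must be registered with the session
    else:
        tag.cnt += 1       # existing rows are updated in place
    return tag

get_or_increase_tag("python")
session.commit()
get_or_increase_tag("python")  # second call increments instead of duplicating
session.commit()

counts = {t.tag: t.cnt for t in session.query(Tag).all()}
```

Calling the function twice with the same name leaves exactly one row with cnt incremented to 2.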
You can also check this Stack Overflow search for more examples: https://stackoverflow.com/search?q=Insert+on+duplicate+update+sqlalchemy
assuming certain column names...

INSERT one:

```python
newToner = Toner(toner_id=1,
                 toner_color='blue',
                 toner_hex='#0F85FF')

dbsession.add(newToner)
dbsession.commit()
```

INSERT multiple:

```python
newToner1 = Toner(toner_id=1,
                  toner_color='blue',
                  toner_hex='#0F85FF')

newToner2 = Toner(toner_id=2,
                  toner_color='red',
                  toner_hex='#F01731')

dbsession.add_all([newToner1, newToner2])
dbsession.commit()
```

UPDATE:

```python
q = dbsession.query(Toner)
q = q.filter(Toner.toner_id == 1)
record = q.one()
record.toner_color = 'Azure Radiance'
dbsession.commit()
```

or, as a one-liner, using Session.merge (which updates the row if the primary key already exists and inserts it otherwise):

```python
record = dbsession.merge(Toner(**kwargs))
dbsession.commit()
```
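To see the merge behaviour end to end, here is a self-contained sketch with an assumed Toner model and an in-memory SQLite database; the first merge inserts, the second merge with the same primary key updates:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Toner(Base):
    __tablename__ = "toner"
    toner_id = Column(Integer, primary_key=True)
    toner_color = Column(String)
    toner_hex = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
dbsession = sessionmaker(bind=engine)()

# First merge: no row with toner_id=1 exists yet, so this INSERTs.
dbsession.merge(Toner(toner_id=1, toner_color='blue', toner_hex='#0F85FF'))
dbsession.commit()

# Second merge with the same primary key UPDATEs the existing row.
dbsession.merge(Toner(toner_id=1, toner_color='Azure Radiance', toner_hex='#0F85FF'))
dbsession.commit()

row = dbsession.query(Toner).get(1)
```

After both commits the table still contains a single row, now with the updated color.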
I tried lots of ways and finally settled on this (note that SQLAlchemyError needs to be imported from sqlalchemy.exc):

```python
from sqlalchemy.exc import SQLAlchemyError

def db_persist(func):
    def persist(*args, **kwargs):
        func(*args, **kwargs)
        try:
            session.commit()
            logger.info("success calling db func: " + func.__name__)
            return True
        except SQLAlchemyError as e:
            logger.error(e.args)
            session.rollback()
            return False
        finally:
            session.close()
    return persist
```

and:

```python
@db_persist
def insert_or_update(table_object):
    return session.merge(table_object)
```
I'm having trouble with the updating part. I have figured out how to use SQLAlchemy and pandas to do this with df.to_sql('example_table', engine, if_exists='append'), but, as you would expect, this appends to the table and creates duplicates. That causes issues when I want to have unique keys: if the unique key is already in the database, I want to just update the row with the new data. Is there a way to do this?
There is an upsert-esque operation in SQLAlchemy:
db.session.merge()
After I found this command, I was able to perform upserts, but it is worth mentioning that this operation is slow for a bulk "upsert".
The alternative is to get a list of the primary keys you would like to upsert, and query the database for any matching ids:
```python
# Imagine that post1, post5, and post1000 are post objects with ids 1, 5 and 1000 respectively.
# The goal is to "upsert" these posts.
# We initialize a dict which maps id to the post object.
my_new_posts = {1: post1, 5: post5, 1000: post1000}

for each in posts.query.filter(posts.id.in_(my_new_posts.keys())).all():
    # Only merge those posts which already exist in the database
    db.session.merge(my_new_posts.pop(each.id))

# Only add those posts which did not exist in the database
db.session.add_all(my_new_posts.values())

# Now we commit our modifications (merges) and inserts (adds) to the database!
db.session.commit()
```
You can leverage the on_conflict_do_update variant. A simple example would be the following:

```python
from sqlalchemy.dialects.postgresql import insert

class Post(Base):
    """
    A simple class for demonstration
    """
    __tablename__ = "post"  # added so the class maps to a table
    id = Column(Integer, primary_key=True)
    title = Column(Unicode)

# Prepare all the values that should be "upserted" to the DB
values = [
    {"id": 1, "title": "mytitle 1"},
    {"id": 2, "title": "mytitle 2"},
    {"id": 3, "title": "mytitle 3"},
    {"id": 4, "title": "mytitle 4"},
]

stmt = insert(Post).values(values)
stmt = stmt.on_conflict_do_update(
    # Let's use the constraint name which was visible in the original post's error msg
    constraint="post_pkey",
    # The columns that should be updated on conflict
    set_={
        "title": stmt.excluded.title
    }
)
session.execute(stmt)
```
See the Postgres docs for more details about ON CONFLICT DO UPDATE.
See the SQLAlchemy docs for more details about on_conflict_do_update.
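Since SQLAlchemy 1.4 the same construct is also available for SQLite via sqlalchemy.dialects.sqlite.insert, which makes the pattern easy to test locally. A self-contained sketch (the Post model and values here are illustrative; for SQLite we name the conflicting columns with index_elements instead of a Postgres constraint name):

```python
from sqlalchemy import Column, Integer, Unicode, create_engine
from sqlalchemy.dialects.sqlite import insert
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Post(Base):
    __tablename__ = "post"
    id = Column(Integer, primary_key=True)
    title = Column(Unicode)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Seed one row so the first value below actually conflicts.
session.add(Post(id=1, title="old title 1"))
session.commit()

values = [
    {"id": 1, "title": "mytitle 1"},  # conflicts with the existing row -> UPDATE
    {"id": 2, "title": "mytitle 2"},  # no conflict -> INSERT
]

stmt = insert(Post).values(values)
stmt = stmt.on_conflict_do_update(
    index_elements=["id"],
    set_={"title": stmt.excluded.title},
)
session.execute(stmt)
session.commit()

titles = [p.title for p in session.query(Post).order_by(Post.id)]
```

One statement both updated the pre-existing row and inserted the new one.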
Side-Note on duplicated column names
The above code uses the column names as dict keys both in the values list and in the argument to set_. If a column name is changed in the class definition, it needs to be changed everywhere or the code will break. This can be avoided by accessing the column definitions, making the code a bit uglier, but more robust:

```python
coldefs = Post.__table__.c

values = [
    {coldefs.id.name: 1, coldefs.title.name: "mytitle 1"},
    ...
]

stmt = stmt.on_conflict_do_update(
    ...
    set_={
        coldefs.title.name: stmt.excluded.title
        ...
    }
)
```
Unfortunately, I am not sure what the performance of merge is, but according to this post (which I highly recommend) the best option would be to upsert and then just read from the table:

```python
from sqlalchemy.dialects.postgresql import insert

insert_stmt = insert(my_table).values(id='id', data='data')
do_nothing_stmt = insert_stmt.on_conflict_do_nothing(index_elements=['id'])
db.session.execute(do_nothing_stmt)  # the statement still has to be executed

db.query(table).get("id")
```
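The same insert-or-ignore-then-read pattern can be demonstrated end to end with SQLAlchemy Core and the SQLite dialect's on_conflict_do_nothing (available since 1.4); the table and values here are illustrative:

```python
from sqlalchemy import Column, MetaData, String, Table, create_engine
from sqlalchemy.dialects.sqlite import insert

metadata = MetaData()
my_table = Table(
    "my_table", metadata,
    Column("id", String, primary_key=True),
    Column("data", String),
)

engine = create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    # Seed a row so the upsert below conflicts.
    conn.execute(my_table.insert().values(id="id", data="original"))

    # The conflicting insert is silently skipped...
    insert_stmt = insert(my_table).values(id="id", data="data")
    do_nothing_stmt = insert_stmt.on_conflict_do_nothing(index_elements=["id"])
    conn.execute(do_nothing_stmt)

    # ...and we read the (unchanged) row back.
    row = conn.execute(
        my_table.select().where(my_table.c.id == "id")
    ).fetchone()
```

Because the conflicting insert does nothing, the read returns the original row.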
It is possible to do it in one call. It wasn't easy, because insert.values(**items) doesn't allow a where clause combined with NOT EXISTS. We need a subquery, but then we also need literals for every column and the keys for the insert. The literal for datetime also caused an insert issue (formatting, maybe just Postgres); I am wondering if there is a better way to handle the date issue.

In the example below, item_dict is a dictionary with field: value entries. The row is only inserted if the invoice_number AND invoice_line_number don't already exist.

```python
# Create literals for the select. Datetime conversion was needed because
# the insert failed without it due to formatting issues.
literals = [literal(val) if key != 'invoice_date'
            else literal(val.strftime("%Y-%m-%d"))
            for key, val in item_dict.items()]

# NOT EXISTS subquery providing the literals.
insert_select = select(literals).where(
    ~exists(literal(1)).where(
        (performance_row_table.c.invoice_number == item_dict['invoice_number']) &
        (performance_row_table.c.invoice_line_number == item_dict['invoice_line_number'])
    )
)

keys = list(item_dict.keys())

# Insert using the subquery
insert_statement = performance_row_table.insert().from_select(keys, insert_select).returning(performance_row_table)

compiled = insert_statement.compile(compile_kwargs={"literal_binds": True})
print(compiled)

insert_result = session.execute(insert_statement).fetchone()
```
```python
...
art = models.Articles(title=art_title, date=datum, description=words, url=adresse, status=1)
db.session.add(art)
db.session.commit()
```

Hi there,

My code adds my entries, but duplicates everything if I reload my page. I would like to have an IF NOT EXISTS condition. Sounds very simple, but I can't find a solution :(

Obi-Wan Kenobi, you're my only hope...

Thanks
Once your form is validated etc., to add a new record:

```python
new_provider = Provider(form.rssfeed.data)
db.session.add(new_provider)
db.session.commit()
```

To update an existing record:

```python
existing_provider = Provider.query.get(1)  # or whatever
# update the rssfeed column
existing_provider.rssfeed = form.rssfeed.data
db.session.commit()
```

The trick in updating is that you just have to change the particular field and commit; the rest is taken care of by the db session. I think you are using the merge function, which isn't needed for a simple update like this.
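The point that "changing the field and committing is enough" can be checked in isolation; a sketch with an assumed Provider model and an in-memory SQLite engine (the feed URLs are made up):

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Provider(Base):
    __tablename__ = "provider"
    id = Column(Integer, primary_key=True)
    rssfeed = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
db_session = sessionmaker(bind=engine)()

db_session.add(Provider(id=1, rssfeed="http://old.example/feed"))
db_session.commit()

# Load the row, mutate the attribute, commit: the session tracks the
# change and emits an UPDATE on commit, with no explicit update call.
existing = db_session.query(Provider).get(1)
existing.rssfeed = "http://new.example/feed"
db_session.commit()

updated = db_session.query(Provider).get(1).rssfeed
```

Reading the row back confirms the UPDATE went through without any merge or explicit update statement.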
I have managed to solve the problem by making these changes to the view.py file:

```python
@app.route('/settings', methods=["GET", "POST"])
def settings():
    """ show settings """
    provider = Provider.query.get(1)
    form = SettingsForm(request.form, obj=provider)
    if request.method == "POST" and form.validate():
        if provider:
            provider.rssfeed = form.rssfeed.data
            db.session.merge(provider)
            db.session.commit()
            flash("Settings changed")
            return redirect(url_for("index"))
        else:
            provider = Provider(form.rssfeed.data)
            db.session.add(provider)
            db.session.commit()
            flash("Settings added")
            return redirect(url_for("index"))
    return render_template("settings.html", form=form)
```
Hello, I am scraping a website with pagination and inserting all the information into the database. I am using Scrapy pipelines in my Scrapy app, an sqlite3 database, and SQLAlchemy. Here is my code, and afterwards my question.
items.py

```python
import scrapy

class IndiSpiderItem(scrapy.Item):
    my_url = scrapy.Field()
    my_title = scrapy.Field()
    that_description = scrapy.Field()
    creator_name = scrapy.Field()
    backers_money = scrapy.Field()
    all_backers = scrapy.Field()
    left_days = scrapy.Field()
    tag_location = scrapy.Field()
    users_image = scrapy.Field()
```

models.py
```python
import datetime

import sqlalchemy
from sqlalchemy import Column, DateTime, Integer, String, Text
from sqlalchemy.ext.declarative import declarative_base

DeclarativeBase = declarative_base()

def db_connect():
    conn = sqlalchemy.create_engine("sqlite:///startups_gogo")
    return conn

def create_table(engine):
    DeclarativeBase.metadata.create_all(engine)

class QuoteDB(DeclarativeBase):
    __tablename__ = "camp_table"

    id = Column(Integer, primary_key=True)
    url = Column('url', String(200))
    title = Column('title', Text())
    description = Column('description', Text())
    creators_name = Column('creators_name', String(120))
    backed_money = Column('backed_money', String(20))
    backers = Column('backers', String(25))
    days_left = Column("days_left", String(15))
    tag_location = Column('tag_location', String(250))
    image = Column('image', String(250))
    date = Column(DateTime)
    # date = Column('date', DateTime, default=datetime.datetime.utcnow, nullable=True)
```

pipelines.py
```python
import datetime as dt

import sqlalchemy
from sqlalchemy.orm import sessionmaker

from .models import QuoteDB, db_connect, create_table

class IndiSpiderPipeline(object):
    def __init__(self):
        engine = sqlalchemy.create_engine("sqlite:///startup_gogo")
        create_table(engine)
        self.Session = sessionmaker(bind=engine)

    def process_item(self, item, spider):
        session = self.Session()
        quotedb = QuoteDB()
        quotedb.url = item['url']
        quotedb.description = item['description']
        quotedb.creators_name = item['creators_name']
        quotedb.backed_money = item['backed_money']
        quotedb.backers = item['backers']
        quotedb.days_left = item['days_left']
        quotedb.tag_location = item['tag_location']
        quotedb.image = item['image']
        quotedb.date = dt.datetime.now()
        try:
            session.add(quotedb)
            session.commit()
        except:
            session.rollback()
            raise
        finally:
            session.close()
        return item
```

Well, all the data goes into the database correctly, but when I scrape the website a second time I need to check whether the url already exists in the db: if it does, I need to update that row with the newly scraped information; if it doesn't, I need to insert it.
I know there is something called upsert in SQLAlchemy, but I cannot find any tutorial on the internet that helps me do that. I know SQLAlchemy has some built-in methods for MySQL and PostgreSQL, like:

```python
from sqlalchemy.dialects.postgresql import insert
```

But SQLite doesn't have such a system. How do I write "update if the 'url' exists, else insert into the db"?
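One dialect-agnostic approach, which works on SQLite too, is to query for the url first and either mutate the found row or add a new one, inside process_item. (Since SQLAlchemy 1.4 there is also sqlalchemy.dialects.sqlite.insert with on_conflict_do_update, but the query-first pattern needs nothing dialect-specific.) A hedged, self-contained sketch with a trimmed-down version of the QuoteDB model above; the upsert_item helper and the example URLs are illustrative, not the asker's actual code:

```python
import datetime as dt

from sqlalchemy import Column, DateTime, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class QuoteDB(Base):
    __tablename__ = "camp_table"
    id = Column(Integer, primary_key=True)
    url = Column(String(200))
    description = Column(String)
    date = Column(DateTime)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

def upsert_item(item):
    """Update the row matching item['url'] if it exists, else insert it."""
    session = Session()
    try:
        quotedb = session.query(QuoteDB).filter_by(url=item['url']).first()
        if quotedb is None:
            quotedb = QuoteDB()   # not found: this will be an INSERT
            session.add(quotedb)
        # Found or new, overwrite the columns with the freshly scraped values.
        quotedb.url = item['url']
        quotedb.description = item['description']
        quotedb.date = dt.datetime.now()
        session.commit()
    except:
        session.rollback()
        raise
    finally:
        session.close()

# Scraping the same url twice updates the row instead of duplicating it.
upsert_item({'url': 'http://example.com/a', 'description': 'first run'})
upsert_item({'url': 'http://example.com/a', 'description': 'second run'})

rows = Session().query(QuoteDB).all()
```

In a Scrapy pipeline the same body would go into process_item, using self.Session instead of the module-level Session.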