Answer from AsAP_Sherb on Stack Overflow: "Writing to MySQL database with pandas using SQLAlchemy, to_sql"
Using the engine in place of the raw_connection() worked:
import pandas as pd
import mysql.connector
from sqlalchemy import create_engine
engine = create_engine('mysql+mysqlconnector://[user]:[pass]@[host]:[port]/[schema]', echo=False)
data.to_sql(name='sample_table2', con=engine, if_exists='append', index=False)
I'm not clear on why this gave me the earlier error when I tried it yesterday.
Alternatively, use the pymysql package...
import pymysql
from sqlalchemy import create_engine
cnx = create_engine('mysql+pymysql://[user]:[pass]@[host]:[port]/[schema]', echo=False)
data = pd.read_sql('SELECT * FROM sample_table', cnx)
data.to_sql(name='sample_table2', con=cnx, if_exists='append', index=False)
Had an issue with this today and figured others might benefit from the solution.
I've been creating some of the tables for the Postgres database in my Flask app with the pandas to_sql method (the data source is messy and pandas handles all the issues very well with very little coding on my part). The rest of the tables are initialized with a SQLAlchemy model, which allows me to easily query the tables with session.query:
db.session.query(AppUsers).filter_by(id = uId).first()
I couldn't however query the tables created by Pandas because they weren't part of the model. Enter automap_base from sqlalchemy.ext.automap (tableNamesDict is a dict with only the Pandas tables):
metadata = MetaData()
metadata.reflect(db.engine, only=tableNamesDict.values())
Base = automap_base(metadata=metadata)
Base.prepare()
Which would have worked perfectly, except for one problem, automap requires the tables to have a primary key. Ok, no problem, I'm sure Pandas to_sql has a way to indicate the primary key... nope. This is where it gets a little hacky:
for df in dfs.keys():
    cols = dfs[df].columns
    cols = [str(col) for col in cols if 'id' in col.lower()]
    schema = pd.io.sql.get_schema(dfs[df], df, con=db.engine, keys=cols)
    db.engine.execute('DROP TABLE ' + df + ';')
    db.engine.execute(schema)
    dfs[df].to_sql(df, con=db.engine, index=False, if_exists='append')
I iterate through the dict of DataFrames, get a list of the columns to use for the primary key (i.e. those containing 'id'), use get_schema to create the empty tables, then append the DataFrame to the table.
Now that you have the models, you can explicitly name and use them (i.e. User = Base.classes.user) with session.query or create a dict of all the classes with something like this:
alchemyClassDict = {}
for t in Base.classes.keys():
    alchemyClassDict[t] = Base.classes[t]

And query with:

res = db.session.query(alchemyClassDict['user']).first()
Hope this helps someone.
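The whole workaround can be sketched end-to-end against SQLite (the `user` table, `user_id` column, and sample rows are hypothetical; the original used Postgres via Flask's db.engine):

```python
import pandas as pd
from sqlalchemy import MetaData, create_engine
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session

engine = create_engine('sqlite://')

# Hypothetical messy-source DataFrame; 'user_id' will become the primary key
df = pd.DataFrame({'user_id': [1, 2], 'name': ['alice', 'bob']})

# get_schema emits CREATE TABLE DDL including a PRIMARY KEY on the given columns
schema = pd.io.sql.get_schema(df, 'user', con=engine, keys=['user_id'])
with engine.begin() as conn:
    conn.exec_driver_sql(schema)

# Append the data into the table that now has a primary key
df.to_sql('user', con=engine, index=False, if_exists='append')

# automap can now reflect the table into a mapped class
metadata = MetaData()
metadata.reflect(engine, only=['user'])
Base = automap_base(metadata=metadata)
Base.prepare()
User = Base.classes.user

session = Session(engine)
row = session.query(User).filter_by(user_id=2).first()
```

Because the table was created with a primary key before the append, automap no longer skips it.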
I use more of the numerical/scientific computing side of the Python community and I have a project that I want to serve up as a web app. This is perfect for me since the project contains heavy use of pandas. Thanks for this!
I'm sure Pandas to_sql has a way to indicate the primary key... nope.
The pandas to_sql method does have an index_label parameter: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html
What version of pandas are you using?
Below should work in most cases:
df = pd.read_sql(query.statement, query.session.bind)
See pandas.read_sql documentation for more information on the parameters.
Just to make this clearer for novice pandas programmers, here is a concrete example:
pd.read_sql(session.query(Complaint).filter(Complaint.id == 2).statement, session.bind)
Here we select the complaint with id = 2 from the complaints table (the SQLAlchemy model is Complaint).
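A self-contained version of that example, with a minimal Complaint model and SQLite standing in for the real database (the sample rows are made up):

```python
import pandas as pd
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Complaint(Base):
    __tablename__ = 'complaints'
    id = Column(Integer, primary_key=True)
    text = Column(String)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

session = Session(engine)
session.add_all([Complaint(id=1, text='slow'), Complaint(id=2, text='broken')])
session.commit()

# The ORM query's SELECT statement plus the session's bind go straight to pandas
query = session.query(Complaint).filter(Complaint.id == 2)
df = pd.read_sql(query.statement, session.bind)
```

The key move is `query.statement`: pandas accepts the compiled SELECT, so any ORM filter chain converts to a DataFrame the same way.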
I handle data from our company using pandas just fine, and I know SQL is used specifically for data management, yet I don't know the differences between them. Queries work just fine with pandas and I'm quite confident using pandas.
Setting aside having to link to the cloud, are there any benefits to storing files in a database instead of CSVs? If so, which library should I be looking into?
» pip install SQLAlchemy
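As a small illustration of the database route (the file, table, and column names here are invented): SQLite ships with Python, and SQLAlchemy gives pandas a uniform way to talk to it, so swapping a pile of CSVs for a single-file database can be this short:

```python
import pandas as pd
from sqlalchemy import create_engine

# One database instead of a folder of CSVs (in-memory here for the demo;
# use 'sqlite:///mydata.db' for a real file on disk)
engine = create_engine('sqlite://')

df = pd.DataFrame({'ticker': ['AAA', 'BBB'], 'price': [10.0, 12.5]})
df.to_sql('prices', engine, index=False, if_exists='replace')

# Unlike a CSV, the database can filter before anything is loaded into memory
cheap = pd.read_sql('SELECT * FROM prices WHERE price < 11', engine)
```

The practical benefit over CSVs is that filtering, joining, and typing happen in the database, not after a full file read.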
I have several tables with millions of rows that need to be queried for varying criteria based on data research. Currently working on creating a programmatic approach to querying rather than raw-dogging SQL, causing database issues, and getting yelled at by DBAs.
Here's the issue: pandas.read_sql_query is not performing well. Any insight on how to tinker with it and make it more efficient would be greatly appreciated.
Here's the setup:
Remote oracle database
Partitioned table
Connection via SQLAlchemy using:
Username/password, port, dialect, driver (cx_Oracle), host, and host name.
Connection variables passed into sqlalchemy.engine.create_engine()
Resulting engine is used to .connect() along with:
execution_options(stream_results=True) (my first effort at optimization, using a server-side cursor)
Resulting connection is used in pandas.read_sql_query() using:
Aforementioned streaming connection
Chunksize of 10000 (I did some tinkering with various chunk sizes, smaller and larger; this is where I've found the best performance, which is still ~60 seconds per 100k rows.)
The read_sql_query result is then concat'd onto an initially empty DataFrame repeatedly as the query iterates through the chunks.
Here's the code snippet of the query:
for chunk in read_sql_query(query, self.__connection, chunksize=10000):
    self.queryResultContainer[database] = concat([self.queryResultContainer[database], chunk], ignore_index=True)

One noteworthy mention: the query being passed into read_sql_query performs within seconds as a standalone query.
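One likely contributor, separate from the database side: concatenating each chunk onto a growing DataFrame copies all previously fetched rows on every iteration, which makes the loop quadratic in the number of chunks. Collecting chunks in a list and calling concat once is linear. A sketch of that pattern (SQLite stands in for the Oracle connection, and the table and column names are made up):

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite://')
pd.DataFrame({'x': range(25)}).to_sql('big_table', engine, index=False)

chunks = []
with engine.connect() as conn:
    # On Oracle this enables a server-side cursor; SQLite simply ignores it
    conn = conn.execution_options(stream_results=True)
    for chunk in pd.read_sql_query('SELECT * FROM big_table', conn, chunksize=10):
        chunks.append(chunk)  # O(1) per chunk; no repeated copying

# One concat at the end instead of one per chunk
result = pd.concat(chunks, ignore_index=True)
```

Same chunked streaming, same final DataFrame, but the copying cost is paid once at the end rather than on every chunk.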