The read_sql docs say the params argument can be a list, tuple or dict (see docs).
To pass values into the SQL query, several placeholder syntaxes are possible: ?, :1, :name, %s, %(name)s (see PEP 249).
Not every database driver supports all of them; which syntax works depends on the driver you are using (psycopg2 in your case, I suppose).
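As a quick way to see which placeholder style a given driver expects, here is a minimal sketch using the standard-library sqlite3 module as a stand-in (the table t and its values are made up for illustration). PEP 249 drivers expose a module-level paramstyle attribute; sqlite3 reports its default there and also accepts :name placeholders with a dict.

```python
import sqlite3

# PEP 249 drivers expose a module-level `paramstyle` string
# naming the placeholder syntax they expect by default.
print(sqlite3.paramstyle)

# Hypothetical in-memory table, used only for this illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

# Positional "?" placeholders take a sequence of values ...
qmark_rows = con.execute(
    "SELECT x FROM t WHERE x > ? ORDER BY x", (1,)
).fetchall()

# ... while ":name" placeholders take a dict.
named_rows = con.execute(
    "SELECT x FROM t WHERE x > :low ORDER BY x", {"low": 1}
).fetchall()
```

Other drivers expose the same attribute, so checking it first should tell you which of the syntaxes listed above to write.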
In your second case, when using a dict, you are using 'named arguments'. According to the psycopg2 documentation, it supports the %(name)s style (and so not :name, I suppose); see http://initd.org/psycopg/docs/usage.html#query-parameters.
So using that style should work:
df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
                    'where "Timestamp" BETWEEN %(dstart)s AND %(dfinish)s'),
                   db,
                   params={"dstart": datetime(2014, 6, 24, 16, 0), "dfinish": datetime(2014, 6, 24, 17, 0)},
                   index_col=['Timestamp'])
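If you want to try the dict-based call without a PostgreSQL server, a rough equivalent can be run against an in-memory SQLite database. This is only a sketch under some assumptions: the sample rows are invented, the timestamps are stored as text, and the placeholders use :name because that is what the sqlite3 driver accepts (with psycopg2 you would keep %(name)s as above).

```python
import sqlite3
import pandas as pd

# Hypothetical sample data standing in for the real "MyTable".
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE "MyTable" ("Timestamp" TEXT, "Value" REAL)')
con.executemany(
    'INSERT INTO "MyTable" VALUES (?, ?)',
    [
        ("2014-06-24 15:30:00", 1.0),
        ("2014-06-24 16:30:00", 2.0),
        ("2014-06-24 17:30:00", 3.0),
    ],
)

# Named placeholders are filled from the dict passed as `params`.
df = pd.read_sql(
    'SELECT "Timestamp", "Value" FROM "MyTable" '
    'WHERE "Timestamp" BETWEEN :dstart AND :dfinish',
    con,
    params={"dstart": "2014-06-24 16:00:00", "dfinish": "2014-06-24 17:00:00"},
    index_col=["Timestamp"],
)
print(df)
```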
Answer from joris on Stack Overflow
I was having trouble passing a large number of parameters when reading from a SQLite table. It turns out that, since you pass a string to read_sql, you can just use an f-string. I tried the same with MSSQL via pyodbc and it works as well.
For SQLite, it would look like this:
# write a sample table into memory
import pandas as pd
from sqlalchemy import create_engine
df = pd.DataFrame({'Timestamp': pd.date_range('2020-01-17', '2020-04-24', periods=10), 'Value1': range(10)})
engine = create_engine('sqlite://', echo=False)
df.to_sql('MyTable', engine)
# query the table using an f-string; a tuple with two or more values renders as a valid SQL list
tpl = (1, 3, 5, 8, 9)
query = f"""SELECT Timestamp, Value1 FROM MyTable WHERE Value1 IN {tpl}"""
df = pd.read_sql(query, engine)
If the parameters are datetimes, it's a bit more complicated but calling the datetime conversion function of the SQL dialect you're using should do the job.
start, end = '2020-01-01', '2020-04-01'
query = f"""SELECT Timestamp, Value1 FROM MyTable WHERE Timestamp BETWEEN DATETIME('{start}') AND DATETIME('{end}')"""
df = pd.read_sql(query, engine)
python - Binding list to params in Pandas read_sql_query with other params - Stack Overflow
How do i use a python variable in a where clause pd.read_sql_query query
Building the query with %s string formatting or f-strings opens you up to SQL injection.
The way to pass parameters I come across most often (see documentation for module sqlite3) is cursor.execute("SELECT * FROM People WHERE Name=?", (my_name, )).
However, the pandas.read_sql_query docs say that parameters are passed via the params keyword argument, not as the 2nd positional arg.
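To make the injection risk concrete, here is a small sketch against an in-memory SQLite database. The People table, its rows, and the hostile input are all invented for illustration; the point is only to compare an f-string query with one that uses the params keyword.

```python
import sqlite3
import pandas as pd

# Hypothetical table with two rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE People (Name TEXT)")
con.executemany("INSERT INTO People VALUES (?)", [("Alice",), ("Bob",)])

# Hostile input: the embedded quote closes the string literal early.
my_name = "nobody' OR '1'='1"

# f-string: the input becomes part of the SQL text itself,
# so the OR clause matches every row.
unsafe = pd.read_sql_query(
    f"SELECT * FROM People WHERE Name = '{my_name}'", con
)

# params: the driver binds the input as a plain value,
# so it is compared literally and matches nothing.
safe = pd.read_sql_query(
    "SELECT * FROM People WHERE Name = ?", con, params=(my_name,)
)

print(len(unsafe), len(safe))
```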
Break this up into three parts to help isolate the problem and improve readability:
- Build the SQL string
- Set parameter values
- Execute pandas.read_sql_query
Build SQL
First ensure the ? placeholders are being set correctly. Use str.format with str.join and len to dynamically fill in the ?s based on the length of member_list. The examples below assume 3 member_list elements.
Example
member_list = (1,2,3)
sql = """select member_id, yearmonth
from queried_table
where yearmonth between {0} and {0}
and member_id in ({1})"""
sql = sql.format('?', ','.join('?' * len(member_list)))
print(sql)
Returns
select member_id, yearmonth
from queried_table
where yearmonth between ? and ?
and member_id in (?,?,?)
Set Parameter Values
Now ensure the parameter values are organized into a flat tuple.
Example
# generator to flatten values of irregular nested sequences,
# modified from answers http://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python
def flatten(l):
    for el in l:
        try:
            yield from flatten(el)
        except TypeError:
            yield el
params = tuple(flatten((201601, 201603, member_list)))
print(params)
Returns
(201601, 201603, 1, 2, 3)
Execute
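Finally bring the sql and params values together in the read_sql_query call. Pass params by keyword, because the third positional argument of read_sql_query is index_col, not params.
query = pd.read_sql_query(sql, db2conn, params=params)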
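Putting the three parts together, one possible shape is a small helper. This is a sketch, not the original poster's code: the function name read_members is hypothetical, the demo runs against an in-memory SQLite table with made-up rows instead of the db2conn connection, and it assumes a driver that uses ? placeholders and a non-empty member_list.

```python
import sqlite3
import pandas as pd

def read_members(con, start, end, member_list):
    """Hypothetical helper: build the SQL, flatten the params, run the query."""
    # One "?" per member id, e.g. "?,?,?" for three members.
    placeholders = ",".join("?" * len(member_list))
    sql = (
        "select member_id, yearmonth "
        "from queried_table "
        "where yearmonth between ? and ? "
        f"and member_id in ({placeholders})"
    )
    # Flat tuple in the same order as the placeholders.
    params = (start, end, *member_list)
    return pd.read_sql_query(sql, con, params=params)

# Demo data, invented for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE queried_table (member_id INTEGER, yearmonth INTEGER)")
con.executemany(
    "INSERT INTO queried_table VALUES (?, ?)",
    [(1, 201601), (2, 201602), (3, 201604), (4, 201602)],
)

df = read_members(con, 201601, 201603, (1, 2, 3))
print(df)
```

Only the placeholder string is interpolated into the SQL text; the values themselves always travel through params.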
WARNING! Although my proposed solution here works, it is prone to SQL injection attacks. Therefore, it should never be used directly in backend code! It is only safe for offline analysis.
If you're using Python 3.6+ you could also use a formatted string literal for your query (cf. https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pep498)
start, end = 201601, 201603
selected_members = (111, 222, 333, 444, 555)  # must be a tuple with at least two elements, so it renders as a valid SQL list
query = f"""
SELECT member_id, yearmonth FROM queried_table
WHERE yearmonth BETWEEN {start} AND {end}
AND member_id IN {selected_members}
"""
df = pd.read_sql_query(query, db2conn)
Hi all,
As per heading I am creating a data frame using SQL that looks something like this:
SQL_Query = pd.read_sql_query( """SELECT TOP (1000) DATA.NAME, DATA.AGE, DATA.ADDRESS, DATA2.COUNTRY FROM DATA LEFT JOIN DATA2 ON DATA.NAME=DATA2.NAME WHERE COUNTRY = (PYTHON VARIABLE )
What is the proper way to do this? I have seen the %s style used online, as below:
cursor1.execute("SELECT * FROM People WHERE Name=%s", (myName,))
but this doesn't seem to work. I'm not sure if it's because I'm using the pd.read_sql_query function.
Any help would be great :)
EDIT: worked it out :)
pd.read_sql_query("SELECT * FROM data WHERE data.field = ?", con, params=[variable])
Thank you all
When I want to use a variable's value in a string, I usually do it like this:
number = 10
string = f'Ten = {number}'
Hello Guys,
I'm trying to query data with variables using a select statement with pandas and MySQL. Is there a way I can declare the date variables and pass them to the query at runtime? I haven't come across any approach that works in my online research.
Here is my code:
from datetime import datetime
from email import encoders
import smtplib
import pandas as pd
from sqlalchemy import create_engine
from urllib.parse import quote
import mysql.connector as sql
db = create_engine('mysql://root:%s@localhost:3306/store' % quote('Mypass@12!'))
now = datetime.now()
startdate = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0).strftime("%y-%m-%d %H:%M:%S")
currentdate = now.replace(day=3,hour=23, minute=00, second=0).strftime("%y-%m-%d %H:%M:%S")
df1 = pd.read_sql("select * from orders where datecreated > %s and datecreated < %s ", con=db)
pdwriter = pd.ExcelWriter('report.xlsx', engine='xlsxwriter')
df1.to_excel(pdwriter, sheet_name='GEN')
pdwriter.save()
I would like to pass the date variables to the query.
Any suggestion is greatly appreciated. Thank you.
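One way this could look is sketched below: the two date strings are handed to read_sql through the params keyword instead of being left out of the call. The sketch makes several assumptions: it runs against an in-memory SQLite table with invented rows rather than the MySQL store database, it pins now to a fixed date so the result is reproducible, it uses a four-digit year (%Y), and it uses ? placeholders because that is what sqlite3 expects. MySQL drivers generally expect %s placeholders, as in the original query, so check your driver's paramstyle.

```python
from datetime import datetime
import sqlite3
import pandas as pd

# Fixed "now" so the example is reproducible; use datetime.now() in practice.
now = datetime(2024, 5, 15, 12, 0, 0)
fmt = "%Y-%m-%d %H:%M:%S"
startdate = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0).strftime(fmt)
currentdate = now.replace(day=3, hour=23, minute=0, second=0, microsecond=0).strftime(fmt)

# Hypothetical stand-in for the real orders table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, datecreated TEXT)")
con.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "2024-04-30 10:00:00"),
        (2, "2024-05-02 09:30:00"),
        (3, "2024-05-10 08:00:00"),
    ],
)

# The values are supplied separately from the SQL text via `params`.
df1 = pd.read_sql(
    "select * from orders where datecreated > ? and datecreated < ?",
    con=con,
    params=(startdate, currentdate),
)
print(df1)
```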