The pandas module imports read_sql from a submodule; you could try getting it from the submodule:
df = pd.io.sql.read_sql('select * from databasetable', engine, index_col = index)
You could also print pd.__dict__ to see what's available in the pd module. Do you get AttributeError if you try to use other things from the pd module, for example, pd.Series()?
Thanks a lot.
I have checked pd.__dict__ and the pandas docs for both versions.
It seems that in my version, 0.11.0, 'sql_read_frame' is the function to use, while in version 0.15.0 you can read SQL with 'read_sql'.
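Since the reader's name moved around between pandas versions, a small hedged sketch like this can probe which names the installed build actually exposes before calling one (the candidate list is an assumption covering the spellings mentioned above):

```python
import pandas as pd

# Probe which SQL reader names this pandas build exposes; the names
# changed across versions, so check before calling one of them
candidates = ["read_sql", "read_sql_query", "read_frame"]
available = [name for name in candidates if hasattr(pd, name)]
print(available)
```

On a modern pandas this prints at least 'read_sql' and 'read_sql_query'; on an old 0.1x build you would see the legacy name instead.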
So I'm trying to create a Python script that lets me perform SQL manipulations on a dataframe (masterfile) I created using pandas. The dataframe draws its contents from the CSV files found in a specific folder.
I was able to successfully create everything else, but I am having trouble with the SQL manipulation part. I am trying to use the dataframe as the "database" that my SQL query pulls data from, but I am getting an "AttributeError: 'DataFrame' object has no attribute 'cursor'" error.
I'm not really seeing many examples for pandas.read_sql_query(), so I am having a difficult time understanding how to use my dataframe with it.
I appreciate any help I can get! I am a self-taught Python programmer, by the way, so I might be missing some general rule or something. Thanks!
import os
import glob
import pandas
os.chdir("SOMECENSOREDDIRECTORY")
all_csv = [i for i in glob.glob('*.{}'.format('csv')) if i != 'Masterfile.csv']
edited_files = []
for i in all_csv:
    df = pandas.read_csv(i)
    df["file_name"] = i.split('.')[0]
    edited_files.append(df)
masterfile = pandas.concat(edited_files, sort=False)
print("Data fields are as shown below:")
print(masterfile.iloc[0])
sql_query = '''
SELECT Country, file_name as Year, Happiness_Score
FROM masterfile WHERE Country = 'Switzerland'
'''
output = pandas.read_sql_query(sql_query, masterfile)
output.to_csv('data_pull')
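For what it's worth, pandas.read_sql_query() expects a database connection as its second argument, not a DataFrame, which is why the 'cursor' AttributeError appears. One workaround (a minimal sketch with made-up sample data standing in for the masterfile built above) is to push the dataframe into an in-memory SQLite database first and run the query against that:

```python
import sqlite3

import pandas as pd

# Hypothetical stand-in for the masterfile built from the CSVs above
masterfile = pd.DataFrame({
    "Country": ["Switzerland", "Iceland"],
    "file_name": ["2015", "2015"],
    "Happiness_Score": [7.587, 7.561],
})

# read_sql_query needs a database connection, not a DataFrame,
# so load the frame into an in-memory SQLite database first
conn = sqlite3.connect(":memory:")
masterfile.to_sql("masterfile", conn, index=False)

sql_query = """
    SELECT Country, file_name AS Year, Happiness_Score
    FROM masterfile WHERE Country = 'Switzerland'
"""
output = pd.read_sql_query(sql_query, conn)
print(output)
conn.close()
```

The round trip through SQLite keeps the SQL syntax from the question working unchanged, at the cost of copying the data into the database.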
Looking at the pandas.io.sql source, there is no frame_query.
https://github.com/pydata/pandas/blob/master/pandas/io/sql.py
Documentation for pandas.io.sql is here: http://pandas.pydata.org/pandas-docs/stable/io.html#sql-queries
I've looked at pandas documentation from 0.12.0 to latest, and the only references to frame_query I've found have been to its deprecation.
I found this SO answer which may address your concerns: https://stackoverflow.com/a/14511960/1703772
However, if you are using pandas version ~ 0.10 when 0.18.1 is available, I have to ask why.
That approach is deprecated:
https://pandas.pydata.org/pandas-docs/version/0.15.2/generated/pandas.io.sql.write_frame.html
Use DataFrame.to_sql instead: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html
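A minimal sketch of the modern spelling, using an in-memory SQLite database (the table and column names here are made up for illustration):

```python
import sqlite3

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["alice", "bob"]})

# write_frame(df, ...) was the deprecated module-level helper;
# the modern equivalent is the to_sql method on the frame itself
conn = sqlite3.connect(":memory:")
df.to_sql("users", conn, index=False, if_exists="replace")

# Read it back to confirm the round trip
result = pd.read_sql_query("SELECT name FROM users ORDER BY id", conn)
print(result)
conn.close()
```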
Solution for 2024
I know this question is old, but a recent change to Pandas warrants a new solution.
Pandas 2.2.0 appears to have changed how a SQLAlchemy Engine object is handled when passed as the con argument to pd.read_sql, DataFrame.to_sql, etc. I'm unsure whether this is intentional.
My project worked perfectly fine with Pandas 2.1.4.
When I updated to 2.2.0, I received the AttributeError because the cursor attribute is not present.
Solution: use the connection attribute of a SQLAlchemy Connection object instead of the Engine object directly.
Peep this example:
from sqlalchemy import create_engine
import pandas as pd
# Engine for MS SQL Server using pymssql
conn_url = f"mssql+pymssql://{username}:{password}@{host}/{database}"
engine = create_engine(conn_url)
# Example query
query = 'select id from users'
# Works with pandas<2.2.0
df = pd.read_sql(
    sql=query,
    con=engine,
)

# Works with pandas==2.2.0
with engine.connect() as conn:
    df = pd.read_sql(
        sql=query,
        con=conn.connection,
    )
As per comments below, this works with pandas 2.2 for .read_sql(), .read_sql_query(), etc., but does not work for .to_sql() with backends other than SQLite.
Adding in a raw_connection() worked for me:
from sqlalchemy import create_engine
sql_engine = create_engine('sqlite:///test.db', echo=False)
connection = sql_engine.raw_connection()
working_df.to_sql('data', connection, index=False, if_exists='append')
I had run conda install sqlalchemy during my notebook session, so while the package was accessible, pandas had already been imported without it and behaved as if there were no sqlalchemy. Restarting the session allowed me to drop the raw_connection():
from sqlalchemy import create_engine
sql_engine = create_engine('sqlite:///test.db', echo=False)
connection = sql_engine
working_df.to_sql('data', connection, index=False, if_exists='append')