To use dtype, pass a dictionary keyed to each data frame column with corresponding sqlalchemy types. Change keys to actual data frame column names:
import sqlalchemy
import pandas as pd
...
column_errors.to_sql('load_errors',push_conn,
if_exists = 'append',
index = False,
dtype={'datefld': sqlalchemy.DateTime(),
'intfld': sqlalchemy.types.INTEGER(),
'strfld': sqlalchemy.types.NVARCHAR(length=255)
'floatfld': sqlalchemy.types.Float(precision=3, asdecimal=True)
'booleanfld': sqlalchemy.types.Boolean})
You may even be able to dynamically create this dtype dictionary given you do not know column names or types beforehand:
def sqlcol(dfparam):
dtypedict = {}
for i,j in zip(dfparam.columns, dfparam.dtypes):
if "object" in str(j):
dtypedict.update({i: sqlalchemy.types.NVARCHAR(length=255)})
if "datetime" in str(j):
dtypedict.update({i: sqlalchemy.types.DateTime()})
if "float" in str(j):
dtypedict.update({i: sqlalchemy.types.Float(precision=3, asdecimal=True)})
if "int" in str(j):
dtypedict.update({i: sqlalchemy.types.INT()})
return dtypedict
outputdict = sqlcol(df)
column_errors.to_sql('load_errors',
push_conn,
if_exists = 'append',
index = False,
dtype = outputdict)
Answer from Parfait on Stack OverflowTo use dtype, pass a dictionary keyed to each data frame column with corresponding sqlalchemy types. Change keys to actual data frame column names:
import sqlalchemy
import pandas as pd
...
column_errors.to_sql('load_errors',push_conn,
if_exists = 'append',
index = False,
dtype={'datefld': sqlalchemy.DateTime(),
'intfld': sqlalchemy.types.INTEGER(),
'strfld': sqlalchemy.types.NVARCHAR(length=255)
'floatfld': sqlalchemy.types.Float(precision=3, asdecimal=True)
'booleanfld': sqlalchemy.types.Boolean})
You may even be able to dynamically create this dtype dictionary given you do not know column names or types beforehand:
def sqlcol(dfparam):
dtypedict = {}
for i,j in zip(dfparam.columns, dfparam.dtypes):
if "object" in str(j):
dtypedict.update({i: sqlalchemy.types.NVARCHAR(length=255)})
if "datetime" in str(j):
dtypedict.update({i: sqlalchemy.types.DateTime()})
if "float" in str(j):
dtypedict.update({i: sqlalchemy.types.Float(precision=3, asdecimal=True)})
if "int" in str(j):
dtypedict.update({i: sqlalchemy.types.INT()})
return dtypedict
outputdict = sqlcol(df)
column_errors.to_sql('load_errors',
push_conn,
if_exists = 'append',
index = False,
dtype = outputdict)
You can create this dict dynamically if you do not know the column names in advance:
from sqlalchemy.types import NVARCHAR
df.to_sql(...., dtype={col_name: NVARCHAR for col_name in df})
Note that you have to pass the sqlalchemy type object itself (or an instance to specify parameters like NVARCHAR(length=10)) and not a string as in your example.
Consider using the dtype argument of pandas.DataFrame.to_sql where you pass a dictionary of SQLAlchemy types to named columns:
import sqlalchemy
...
data.to_sql(name=table_name, con=engine, if_exists='replace', index=False,
dtype={'name_of_datefld': sqlalchemy.types.DateTime(),
'name_of_intfld': sqlalchemy.types.INTEGER(),
'name_of_strfld': sqlalchemy.types.VARCHAR(length=30),
'name_of_floatfld': sqlalchemy.types.Float(precision=3, asdecimal=True),
'name_of_booleanfld': sqlalchemy.types.Boolean}
I think this has more to do with how pandas handles the table if it exists. The "replace" value to the if_exists argument tells pandas to drop your table and recreate it. But when re-creating your table, it will do it based on its own terms (and the data stored in that particular DataFrame).
While providing column datatypes will work, doing it for every such case might be cumbersome. So I would rather truncate the table in a separate statement and then just append data to it, like so:
Instead of:
df.to_sql(name=table_name, con=engine, if_exists='replace',index=False)
I'd do:
with engine.connect() as con:
con.execute("TRUNCATE TABLE %s" % table_name)
df.to_sql(name=table_name, con=engine, if_exists='append',index=False)
The truncate statement basically drops and recreates your table too, but it's done internally by the database, and the table gets recreated with the same definition.