Creating variables with dynamic names is typically a bad practice.
I think the best solution for your problem is to store your dataframes in a dictionary and dynamically generate the key name used to access each dataframe.
import copy
dict_of_df = {}
for ym in [201511, 201612, 201710]:
    key_name = 'df_new_' + str(ym)
    dict_of_df[key_name] = copy.deepcopy(df)
    to_change = df['YearMonth'] < ym
    dict_of_df[key_name].loc[to_change, 'new_col'] = ym
dict_of_df.keys()
Out[36]: ['df_new_201710', 'df_new_201612', 'df_new_201511']
dict_of_df
Out[37]:
{'df_new_201511': A B ID t YearMonth new_col
0 -a a 1 2016-12-05 07:53:35.943 201612 201612
1 1 NaN 2 2016-12-05 07:53:35.943 201612 201612
2 a c 2 2016-12-05 07:53:35.943 201612 201612,
'df_new_201612': A B ID t YearMonth new_col
0 -a a 1 2016-12-05 07:53:35.943 201612 201612
1 1 NaN 2 2016-12-05 07:53:35.943 201612 201612
2 a c 2 2016-12-05 07:53:35.943 201612 201612,
'df_new_201710': A B ID t YearMonth new_col
0 -a a 1 2016-12-05 07:53:35.943 201612 201710
1 1 NaN 2 2016-12-05 07:53:35.943 201612 201710
2 a c 2 2016-12-05 07:53:35.943 201612 201710}
# Extract a single dataframe
df_2015 = dict_of_df['df_new_201511']
Answer from FLab on Stack Overflow
There is an easier way to accomplish this using exec. The following steps create a dataframe at runtime.
1. Create the source dataframe with some sample values.
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': ['-a', 1, 'a'],
                   'B': ['a', np.nan, 'c'],
                   'ID': [1, 2, 2]})
2. Assign a variable that holds the new dataframe name. You can also pass this value in as a parameter or generate it in a loop.
new_df_name = 'df_201612'
3. Use exec to copy the data from the source dataframe into a new dataframe with that name, then assign a value to a new column on the next line.
exec(f'{new_df_name} = df.copy()')
exec(f'{new_df_name}["new_col"] = 123')
4. The dataframe df_201612 is now available in memory; you can verify this by combining print with eval.
print(eval(new_df_name))
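If the exec route is used, one way to keep the generated names out of the module's global namespace is to pass exec an explicit dictionary. This is a minimal sketch, assuming the same df and new_df_name as in the steps above; the name namespace is my own choice for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['-a', 1, 'a'],
                   'B': ['a', np.nan, 'c'],
                   'ID': [1, 2, 2]})
new_df_name = 'df_201612'

# Hypothetical dict that exec uses as its namespace instead of globals()
namespace = {'df': df}
exec(f'{new_df_name} = df.copy()', namespace)
exec(f'{new_df_name}["new_col"] = 123', namespace)

# The new dataframe is reached through the dict, not through a bare name
print(namespace[new_df_name])
```

At that point the lookup is an ordinary dictionary access, which is why assigning namespace[new_df_name] = df.copy() directly would store the same dataframe without exec.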
#define list of fields to run match for
fieldlist = ['MATTER NUMBER','MATTER NAME','CLAIM NUMBER LISTING']
#loop through each field in fieldlist
for field in fieldlist:
    #define dfname as the field with spaces replaced with underscores
    dfname = '{}'.format(field.replace(' ','_'))
    #create df with dfname
    '{}'.format(dfname) = checkdf['{}'.format(field)].dropna()
The error is on the last line. I also tried:
'{}'.format(dfname) = checkdf['{}'.format(field)].dropna()
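One way around the error is to keep the generated name as a dictionary key rather than an assignment target, since the left-hand side of = has to be a name and cannot be a string expression. This is a minimal sketch, assuming a small made-up checkdf (its contents are invented for illustration) and a hypothetical dict named field_frames.

```python
import numpy as np
import pandas as pd

# Invented sample data standing in for the real checkdf
checkdf = pd.DataFrame({
    'MATTER NUMBER': [101, np.nan, 103],
    'MATTER NAME': ['Alpha', 'Beta', None],
    'CLAIM NUMBER LISTING': [np.nan, 'C-2', 'C-3'],
})

fieldlist = ['MATTER NUMBER', 'MATTER NAME', 'CLAIM NUMBER LISTING']

# Hypothetical container: one entry per field, keyed by the underscored name
field_frames = {}
for field in fieldlist:
    dfname = field.replace(' ', '_')
    field_frames[dfname] = checkdf[field].dropna()

print(field_frames['MATTER_NAME'])
```

Each entry can then be retrieved by its generated name, for example field_frames['MATTER_NUMBER'].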
I have a data frame patient_info in a function. My global variables are year = 20 and pat_type = 'DD'. I want to rename the data frame to DD_patient_info_20.
I have been using f-strings and +, but it does not let me assign as follows:
name = pat_type+'patient_info_'+str(year) = patient_info
How can I rename the data frame based on the dynamic variables?
Edit:
Took your suggestion of not using type, so I renamed that variable to pat_type.
I ended up using globals()[name] = patient_info and that worked for me. Is there anything wrong with this method?
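On the globals() question: it works, but every later use of the dataframe has to go through globals()[name] again or hardcode the generated name. Below is a sketch of the dictionary alternative, assuming an invented patient_info and a hypothetical helper build_patient_info; neither comes from the original post.

```python
import pandas as pd

pat_type = 'DD'
year = 20

def build_patient_info():
    # Invented stand-in for the function that produces patient_info
    return pd.DataFrame({'patient_id': [1, 2], 'age': [34, 58]})

# Hypothetical registry keyed by the dynamically built name
frames = {}
name = f'{pat_type}_patient_info_{year}'
frames[name] = build_patient_info()

print(name)
print(frames['DD_patient_info_20'])
```

The dataframe object itself has no name to change; only the key under which it is stored is dynamic.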
Going to pseudo-code this out; perhaps somebody has encountered this sort of issue before. I have not had luck reading through Stack Overflow posts.
I have a list of months and a df for each month with data that includes delivery volume and a time. These are named like 'df_1701_unfiltered'.
I previously hardcoded my query logic, but I'm on mobile at the moment, so please disregard the pseudo-code aspect; that's not what I'm worried about.
I want to create a new, separate dataframe for each month that is a filtered version of the original. Here is my thought process.
months = ['1701', '1702', '1703']
for month in months:
    "df_"+month+"_filtered" = "df_"+month+"_unfiltered".query("time > start and time < end")
The loop throws a "cannot assign to operator" error each time. I'm able to do something similar within a single dataframe using .apply to create dynamic columns.
Any idea how I can do this for entire dataframes?
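A possible shape for this, sketched with dictionaries keyed by month instead of generated variable names. The sample data, the numeric start and end bounds, and the dict names unfiltered and filtered are all assumptions made for illustration.

```python
import pandas as pd

months = ['1701', '1702', '1703']

# Invented sample data: one small unfiltered dataframe per month
unfiltered = {
    month: pd.DataFrame({'time': [1, 5, 9], 'volume': [10, 20, 30]})
    for month in months
}

start, end = 2, 8  # assumed numeric bounds, for illustration only

# query() can reference Python variables in scope with the @ prefix
filtered = {}
for month, frame in unfiltered.items():
    filtered[month] = frame.query('time > @start and time < @end')

print(filtered['1701'])
```

Each filtered dataframe is then available as filtered['1701'], filtered['1702'], and so on, and the originals stay untouched in unfiltered.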
The best way is to create a dict with the dynamic names as keys:
chunks = {f'{sub}{i}':chunk for i, chunk in enumerate(np.array_split(df, 10))}
If you absolutely insist on creating the frames as individual variables, then you could assign them to the globals() dictionary, but this method is NOT advised:
for i, chunk in enumerate(np.array_split(df, 10)):
    globals()['{}{}'.format(sub, i)] = chunk
Why would you want to create variables in a loop?
- They are unnecessary: You can store everything in lists or any other type of collection
- They are hard to create and reuse: You have to use exec or globals()
Using a list is much easier:
subs = []
for chunk in np.array_split(df, 10):
    print(chunk.head(2))  # just to check
    print(chunk.tail(1))  # just to check
    subs.append(chunk.copy())
I think the best approach is to create a dictionary of DataFrames:
d = {}
for i in range(12,0,-1):
    d['t' + str(i)] = df.shift(i).add_suffix('_t' + str(i))
If you need to select specific columns first:
d = {}
cols = ['column1','column2']
for i in range(12,0,-1):
    d['t' + str(i)] = df[cols].shift(i).add_suffix('_t' + str(i))
A dict comprehension solution:
d = {'t' + str(i): df.shift(i).add_suffix('_t' + str(i)) for i in range(12,0,-1)}
print (d['t10'])
column1_t10 column2_t10
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 NaN NaN
6 NaN NaN
7 NaN NaN
8 NaN NaN
9 NaN NaN
10 0.0 19.0
11 1.0 18.0
12 2.0 17.0
13 3.0 16.0
14 4.0 15.0
15 5.0 14.0
16 6.0 13.0
17 7.0 12.0
18 8.0 11.0
19 9.0 10.0
EDIT: It is possible with globals(), but a dictionary is much better:
cols = ['column1','column2']
for i in range(12,0,-1):
    globals()['df' + str(i)] = df[cols].shift(i).add_suffix('_t' + str(i))
print (df10)
column1_t10 column2_t10
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 NaN NaN
6 NaN NaN
7 NaN NaN
8 NaN NaN
9 NaN NaN
10 0.0 19.0
11 1.0 18.0
12 2.0 17.0
13 3.0 16.0
14 4.0 15.0
15 5.0 14.0
16 6.0 13.0
17 7.0 12.0
18 8.0 11.0
19 9.0 10.0
for i in range(1, 16):
    text = f"Version{i} = pd.DataFrame()"
    exec(text)
Combining exec with an f-string lets you do that.
This helps if you need to iterate or create numbered versions of the same variable.