python dynamic dataframe name in loop

How do I create dynamic variable names inside a loop in pandas

stackoverflow.com › questions › 45114945 › how-do-i-create-dynamic-variable-names-inside-a-loop-in-pandas

I think the best approach is to create a dict of objects - see How do I create a variable number of variables?

You can use a dict of DataFrames by converting the groupby object to a dict:

d = dict(tuple(df.groupby('month')))
print (d)
{1:    month dest
0      1    a
1      1   bb, 2:    month dest
2      2   cc
3      2   dd, 3:    month dest
4      3   ee, 4:    month dest
5      4   bb}

print (d[1])
   month dest
0      1    a
1      1   bb

Another solution:

for i, x in df.groupby('month'):
    globals()['dataframe' + str(i)] = x

print (dataframe1)
   month dest
0      1    a
1      1   bb

Answer from jezrael on Stack Overflow
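
For reference, here is a self-contained sketch of the dict approach. The sample frame is reconstructed from the printed output above, so its contents are an assumption, and the name frames_by_month is only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the printed output above (an assumption).
df = pd.DataFrame({
    "month": [1, 1, 2, 2, 3, 4],
    "dest": ["a", "bb", "cc", "dd", "ee", "bb"],
})

# One sub-DataFrame per month, keyed by the month value.
frames_by_month = {month: group for month, group in df.groupby("month")}

print(sorted(frames_by_month))  # the available keys
print(frames_by_month[2])       # the rows where month == 2
```

Looking frames up by key keeps the month as ordinary data, so later code can loop over frames_by_month.items() instead of rebuilding variable names.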

reddit.com › r/learnpython › dynamically assigning name of dataframe in a loop. stuck!

r/learnpython on Reddit: Dynamically assigning name of dataframe in a loop. Stuck!

February 2, 2018 -

Going to pseudo-code this out; perhaps somebody has encountered this sort of issue before. I have not had luck reading through Stack Overflow posts.

I have a list of months and a df for each month with data that includes delivery volume and a time. These are named like 'df_1701_unfiltered'.

I previously hardcoded my query logic, but I'm on mobile at the moment, so please disregard the pseudo-code aspect. That's not what I'm worried about.

I want to create a new, separate dataframe for each month that is a filtered version of the original. Here is my thought process.

months = ['1701', '1702', '1703']

for month in months:
    "df_"+month+"_filtered" = "df_"+month+"_unfiltered".query("time > start and time < end")

I'm able to do something similar within a single dataframe using .apply to create dynamic columns, but this throws a "cannot assign to operator" error each time.

Any idea how I can do this for entire dataframes?

Top answer

1 of 1

This won't work. Use a dictionary.
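
A minimal sketch of what "use a dictionary" could look like for this question. The frames, column values, and the start/end bounds are made up for illustration; a boolean mask is used here, though .query would do the same job.

```python
import pandas as pd

# Hypothetical stand-ins for the monthly frames described in the post.
unfiltered = {
    "1701": pd.DataFrame({"time": [1, 5, 9], "volume": [10, 20, 30]}),
    "1702": pd.DataFrame({"time": [2, 6, 8], "volume": [15, 25, 35]}),
}

start, end = 3, 9  # assumed bounds for the filter

# Build the filtered frames under the same month keys.
filtered = {}
for month, frame in unfiltered.items():
    in_window = (frame["time"] > start) & (frame["time"] < end)
    filtered[month] = frame[in_window]

print(filtered["1701"])
```

The month string stays a dictionary key, so filtered["1701"] replaces the name df_1701_filtered.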

Stack Overflow

stackoverflow.com › questions › 45114945 › how-do-i-create-dynamic-variable-names-inside-a-loop-in-pandas

python - How do I create dynamic variable names inside a loop in pandas - Stack Overflow

2 of 2

You can use a list of dataframes:

dataframe = []
dataframe.append(None)  # placeholder at index 0 so that month 1 lands at index 1

group = org_dataframe.groupby('month')

# append each month's group in order
for n,g in group:
    dataframe.append(g)

dataframe[1]

Output:

   month dest
0      1    a
1      1   bb

dataframe[2]

Output:

   month dest
2      2   cc
3      2   dd


trying to dynamically name and reference a dataframe, getting error 'SyntaxError: can't assign to function call'

General rule of thumb - if you need dynamically created variable names - you are doing something wrong. Even though it is possible, doing so would be an awfully bad idea, ie how will you then reference these variables later in your code? More on reddit.com

r/learnpython

May 5, 2017


stackoverflow.com › questions › 30635145 › create-multiple-dataframes-in-loop

python - Create multiple dataframes in loop - Stack Overflow

Top answer

1 of 7


Just to underline my comment to @maxymoo's answer, it's almost invariably a bad idea ("code smell") to add names dynamically to a Python namespace. There are a number of reasons, the most salient being:

Created names might easily conflict with variables already used by your logic.
Since the names are dynamically created, you typically also end up using dynamic techniques to retrieve the data.

This is why dicts were included in the language. The correct way to proceed is:

d = {}
for name in companies:
    d[name] = pd.DataFrame()

Nowadays you can write a single dict comprehension expression to do the same thing, but some people find it less readable:

d = {name: pd.DataFrame() for name in companies}

Once d is created the DataFrame for company x can be retrieved as d[x], so you can look up a specific company quite easily. To operate on all companies you would typically use a loop like:

for name, df in d.items():
    # operate on DataFrame 'df' for company 'name'
    pass

In Python 2 you were better off writing

for name, df in d.iteritems():

because this avoids instantiating the list of (name, df) tuples that .items() creates in the older version. That's now largely of historical interest, though there will of course be Python 2 applications still extant and requiring (hopefully occasional) maintenance.
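
As a rough illustration of operating on all companies through the dict, the sketch below uses hypothetical company names and made-up revenue numbers. It also shows that pd.concat accepts a dict and uses its keys as the outer index level.

```python
import pandas as pd

companies = ["acme", "globex"]  # hypothetical company names

# One small frame per company (made-up numbers, for illustration only).
d = {
    name: pd.DataFrame({"revenue": [i, i + 1]})
    for i, name in enumerate(companies)
}

# Operate on every frame through the dict...
totals = {name: int(df["revenue"].sum()) for name, df in d.items()}

# ...or stack them; the dict keys become the outer index level.
combined = pd.concat(d, names=["company", "row"])

print(totals)
print(combined)
```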

2 of 7

You can do this (although obviously use exec with extreme caution if this is going to be public-facing code)

for c in companies:
     exec('{} = pd.DataFrame()'.format(c))

Python Forum

python-forum.io › thread-21151.html

dynamically create variables' names in python

May 14, 2021 - Hi guys, I want to create variables in the following way: assign a name (e.g. var1), then add the name to the prefix of the variable: name = 'var_1' this_is_+name = pd.DataFrame() the outcome I would l

Stack Overflow

stackoverflow.com › questions › 40973687 › create-new-dataframe-in-pandas-with-dynamic-names-also-add-new-column

python - Create new dataframe in pandas with dynamic names also add new column - Stack Overflow

Top answer

1 of 2

Creating variables with dynamic names is typically a bad practice.

I think the best solution for your problem is to store your dataframes into a dictionary and dynamically generate the name of the key to access each dataframe.

import copy

dict_of_df = {}
for ym in [201511, 201612, 201710]:

    key_name = 'df_new_'+str(ym)    

    dict_of_df[key_name] = copy.deepcopy(df)

    to_change = df['YearMonth']< ym
    dict_of_df[key_name].loc[to_change, 'new_col'] = ym   

dict_of_df.keys()
Out[36]: ['df_new_201710', 'df_new_201612', 'df_new_201511']

dict_of_df
Out[37]: 
{'df_new_201511':     A    B  ID                       t  YearMonth  new_col
 0  -a    a   1 2016-12-05 07:53:35.943     201612   201612
 1   1  NaN   2 2016-12-05 07:53:35.943     201612   201612
 2   a    c   2 2016-12-05 07:53:35.943     201612   201612,
 'df_new_201612':     A    B  ID                       t  YearMonth  new_col
 0  -a    a   1 2016-12-05 07:53:35.943     201612   201612
 1   1  NaN   2 2016-12-05 07:53:35.943     201612   201612
 2   a    c   2 2016-12-05 07:53:35.943     201612   201612,
 'df_new_201710':     A    B  ID                       t  YearMonth  new_col
 0  -a    a   1 2016-12-05 07:53:35.943     201612   201710
 1   1  NaN   2 2016-12-05 07:53:35.943     201612   201710
 2   a    c   2 2016-12-05 07:53:35.943     201612   201710}

# Extract a single dataframe
df_2015 = dict_of_df['df_new_201511']

2 of 2

There is an easier way to accomplish this using exec. The following steps create a dataframe at runtime.

1. Create the source dataframe with some sample values.

import numpy as np
import pandas as pd
    
df = pd.DataFrame({'A':['-a',1,'a'], 
                   'B':['a',np.nan,'c'],
                   'ID':[1,2,2]})

2. Assign a variable that holds the new dataframe name. You can also pass this value as a parameter or generate it in a loop.

new_df_name = 'df_201612'

3. Use exec to copy the source dataframe into a new dataframe with that name, then assign a value to a new column on the next line.

exec(f'{new_df_name} = df.copy()')
exec(f'{new_df_name}["new_col"] = 123')

4. The dataframe df_201612 is now available in memory; you can verify this with a print statement and eval.

print(eval(new_df_name))

Stack Overflow

stackoverflow.com › questions › 73303294 › how-to-dynamically-name-a-dataframe-within-this-for-loop

python - How to dynamically name a dataframe within this for loop - Stack Overflow

Top answer

1 of 2

You can use a dictionary (here with dummy values) :

names = ['first', 'second', 'third', 'fourth', 'fifth', 'sixth']
pvalues = {}
for i in range(len(names)):
    pvalues["pvalues_" + names[i]] = i+1

print(pvalues)

Output:

{'pvalues_first': 1, 'pvalues_second': 2, 'pvalues_third': 3, 'pvalues_fourth': 4, 'pvalues_fifth': 5, 'pvalues_sixth': 6}

To access (and here update) pvalues_third, for example:

pvalues["pvalues_third"] = 20
print(pvalues)

Output:

{'pvalues_first': 1, 'pvalues_second': 2, 'pvalues_third': 20, 'pvalues_fourth': 4, 'pvalues_fifth': 5, 'pvalues_sixth': 6}

2 of 2

count=0
pvalue_frames={}  # holds one results dataframe per dynamic name

#loop through the three datasets (In reality I have many more than three)
names = ["first", "second", "third"]
for feature in feature_cols:
    #define the model and fit it
    mod = smf.ols(formula='Q(feature)'+'~material', data=dataset)
    res = mod.fit()

    #create a dataframe of the pvalues
    #I would like to be able to dynamically name pvalues so that when looping through
    #the chemicals of the first dataframe it is called 'pvalues_first' and so on.
    name_str = "pvalues_"+str(names[count])
    pvalues = {'Intercept':[res.pvalues[0]], 'cap_type':[res.pvalues[1]]}
    #store the dataframe under the dynamic name instead of overwriting name_str
    pvalue_frames[name_str]=pd.DataFrame(pvalues)
    count+=1

Stack Overflow

stackoverflow.com › questions › 54990451 › dynamic-dataframe-names-creation

python - Dynamic dataframe names creation - Stack Overflow

whole_dataframes = {}
#k = int(np.floor(float(X.shape[0]) / number_folds))
weights = np.zeros((3,num_portfolios))
for lambda_ in range(0,len(tuned_parameter)-1):
    print ('....................................',lambda_)
    i=0
    appended_data = []
    for train_index, test_index in kf.split(X):
        print("Train:", train_index, "Validation:",test_index)
        X_train, X_test = X.iloc[train_index], X.iloc[test_index]
        print ('X train ............',X_train.shape)
        print ('X_test...............',X_test.shape)
        mean_returns_Train = X_train.mean()
        cov_matrix_Train=X_train.cov()
        mean_returns_Test = X_test.mean()
        cov_matrix_T


reddit.com › r/learnpython › trying to dynamically name and reference a dataframe, getting error 'syntaxerror: can't assign to function call'

r/learnpython on Reddit: trying to dynamically name and reference a dataframe, getting error 'SyntaxError: can't assign to function call'

May 5, 2017 -

#define list of fields to run match for

fieldlist = ['MATTER NUMBER','MATTER NAME','CLAIM NUMBER LISTING']

#loop through each field in fieldlist
for field in fieldlist:
    #define dfname as the field with spaces replaced with underscores
    dfname = '{}'.format(field.replace(' ','_'))
    #create df with dfname
    '{}'.format(dfname) = checkdf['{}'.format(field)].dropna()

The error is on the last line. I also tried:

'{}'.format(dfname) = checkdf['{}'.format(field)].dropna()

Top answer

1 of 3

2 of 3

You'll probably want a dict of dataframes, e.g.

dfs = {}
...
dfs[dfname] = checkdf[field].dropna()

'{}'.format(string) is just the same as string, so you don't need '{}'.format().
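
Spelled out as a runnable sketch, the suggestion might look like this. The contents of checkdf are invented stand-ins; only the column names and the dfs/dfname identifiers come from the thread.

```python
import pandas as pd

# Hypothetical stand-in for checkdf, using column names from the post.
checkdf = pd.DataFrame({
    "MATTER NUMBER": [101, None, 103],
    "MATTER NAME": ["Smith", "Jones", None],
})

fieldlist = ["MATTER NUMBER", "MATTER NAME"]

dfs = {}
for field in fieldlist:
    # key: the field name with spaces replaced by underscores
    dfname = field.replace(" ", "_")
    dfs[dfname] = checkdf[field].dropna()

print(dfs["MATTER_NAME"])
```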

IncludeHelp

includehelp.com › python › create-multiple-dataframes-in-loop.aspx

Create multiple dataframes in loop in Python

October 3, 2022 - Write a Python program to create multiple dataframes in loop · To create multiple dataframes in loop, you can create a list that contains the name of different fruits, and then loop over this list, and on each traversal of the element.

Stack Overflow

stackoverflow.com › questions › 67153826 › python-dynamic-dataframe-name

pandas - Python dynamic dataframe name - Stack Overflow

So at the end you have the result as [{"aaa.csv": df},{"bbb.csv": df},]. If your file_name values are not unique, then you need to find a way to create a key. ... Try this one. Credits to this cool video: https://www.youtube.com/watch?v=eMOA1pPVUc4

import os
import pandas as pd

path = "./path"
files = [file for file in os.listdir(path) if not file.startswith('.')] # Ignore hidden files

# Define the combined table as an empty pandas DataFrame
all_months_data = pd.DataFrame()

# Loop over the files by name and append each one to the previous data
for file in files:
    current_data = pd.read_csv(path+"/"+file)
    all_months_data = pd.concat([all_months_data, current_data])

# Write a CSV with all the data
all_months_data.to_csv("all_data_copy.csv", index=False)
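
If the goal is one frame per file, a variant is to key a single dict by file name rather than build a list of one-entry dicts. This is a sketch under that assumption; load_csvs is a hypothetical helper name, not something from the thread.

```python
from pathlib import Path

import pandas as pd


def load_csvs(folder):
    """Return {file stem: DataFrame} for every CSV directly inside folder."""
    frames = {}
    for csv_path in sorted(Path(folder).glob("*.csv")):
        if csv_path.name.startswith("."):
            continue  # skip hidden files
        frames[csv_path.stem] = pd.read_csv(csv_path)
    return frames
```

With files aaa.csv and bbb.csv in the folder, load_csvs(folder)["aaa"] would hold the first file's data. Two files with the same stem cannot occur in one folder here, but a different key scheme would be needed if several folders were merged.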

Stack Overflow

stackoverflow.com › questions › 55001472 › create-dynamic-dataframe-name-by-splitting-a-larger-dataframe › 55001592

python - Create dynamic dataframe name by splitting a larger dataframe - Stack Overflow

Top answer

1 of 2

Best way is to create a dict with the dynamic names as keys:

chunks = {f'{sub}{i}':chunk for i, chunk in enumerate(np.array_split(df, 10))}

If you absolutely insist on creating the frames as individual variables, then you could assign them to the globals() dictionary, but this method is NOT advised:

for i, chunk in enumerate(np.array_split(df, 10)):
    globals()['{}{}'.format(sub, i)] = chunk

2 of 2

Why would you want to create variables in a loop?

They are unnecessary: You can store everything in lists or any other type of collection
They are hard to create and reuse: You have to use exec or globals()

Using a list is much easier:

subs = []
for chunk in np.array_split(df, 10):
    print(chunk.head(2)) #just to check
    print(chunk.tail(1)) #just to check
    subs.append(chunk.copy())
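
A variant sketch that splits row positions and slices with iloc instead of passing the DataFrame straight to np.array_split, which can emit a deprecation warning in some recent pandas versions. The sample frame and the prefix value are assumptions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": range(25)})  # hypothetical data
sub = "sub"                              # hypothetical name prefix

# Split the row positions, then slice with iloc so each chunk is a DataFrame.
position_groups = np.array_split(np.arange(len(df)), 10)
chunks = {f"{sub}{i}": df.iloc[pos] for i, pos in enumerate(position_groups)}

print(list(chunks))
print(chunks["sub0"])
```

When the row count does not divide evenly, np.array_split makes the earlier chunks one row longer than the later ones.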

Stack Overflow

stackoverflow.com › questions › 42977487 › how-to-name-dataframes-with-a-for-loop

python - How to name dataframes with a for loop? - Stack Overflow

Top answer

1 of 2

I think it is easier to handle the dataframes in a dictionary. Try the code below:

review_categories = ["beauty", "pet"]
reviews = {}

for review in review_categories:
     df_name = review + '_reviews' # the name for the dataframe
     filename = "D:\\Library\\reviews_{}.json".format(review)

     reviews[df_name] = pd.read_json(path_or_buf=filename, lines=True)

In reviews, you will have a key with the respective dataframe to store the data. If you want to retrieve the data, just call:

reviews["beauty_reviews"]

Hope it helps.

2 of 2

You can first pack the files into a list

reviews = []
review_categories = ["beauty", "pet"]
for i in review_categories:
    filename = "D:\\Library\\reviews_{}.json".format(i)
    reviews.append(pd.read_json(path_or_buf=filename, lines=True))

and then unpack your results into the variable names you wanted:

beauty_reviews, pet_reviews = reviews

Posit Community

forum.posit.co › general

How do I assign a dataframe name dynamically - General - Posit Community

Top answer

1 of 1

merrittr: zonename<-data %>% filter(zone==numzone) assign(zonename,data %>% filter(zone==numzone) )

GeeksforGeeks

geeksforgeeks.org › create-a-column-using-for-loop-in-pandas-dataframe

Create a column using for loop in Pandas Dataframe - GeeksforGeeks

June 9, 2022 - Converting lists to DataFrames is crucial in data analysis, with Pandas enabling you to perform sophisticated data manipulations and analyses with ease. List to Dataframe Example

# Simple list
data = [1, 2, 3, 4, 5]
# Convert to DataFrame
df = pd.DataFrame(data, columns=['Numbers'])

Here we will discuss diffe ... Pandas is basically the library in Python used for Data Analysis and Manipulation.

KDnuggets

kdnuggets.com › 2022 › 08 › customize-data-frame-column-names-python.html

Customize Your Data Frame Column Names in Python - KDnuggets

After constructing the dictionary columnnames with the original and new column names, we then pass the dictionary to the rename method ...

columnnames = {}
count = 0
for i in df.columns:
    count += 1
    columnnames[i] = f"WEEK_{count}_ATTENDANCE"
columnnames

... We then use a for loop to iterate over all the columns of the Data Frame, where in every iteration the first occurrence of the underscore is replaced by no space.
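
A compact sketch of the rename step described in this snippet, using a made-up attendance table. The frame contents are assumptions, and enumerate stands in for the manual counter.

```python
import pandas as pd

# Hypothetical attendance table with generic column names.
df = pd.DataFrame({"a": [1, 0], "b": [1, 1], "c": [0, 1]})

# Map each original column name to its new name.
columnnames = {
    old: f"WEEK_{count}_ATTENDANCE"
    for count, old in enumerate(df.columns, start=1)
}

# rename returns a new frame; df itself is left unchanged.
renamed = df.rename(columns=columnnames)

print(list(renamed.columns))
```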

Stack Overflow

stackoverflow.com › questions › 47097416 › python-attribute-dynamic-name-dataframe

pandas - Python - Attribute dynamic name DataFrame - Stack Overflow

Top answer

1 of 1

Assign your DataFrames to values in a dictionary and change the key for each DataFrame. Ex

    dict_of_dfs = {}
    l=0
    for key, value in dicstagefolder.items():
        for v in value:
            l=l+1
            name = key+"-"+str(l)
            dict_of_dfs[name] = pd.read_csv(v)

Also, try using for v in value in your inner loop. Otherwise, you might face problems with the value variable getting overwritten.

Stack Overflow

stackoverflow.com › questions › 64502837 › dynamically-naming-saved-dataframes-in-loop

python - Dynamically naming saved dataframes in loop - Stack Overflow

Top answer

1 of 1

I used two answers from here. I think your best option is just storing the DataFrames in a dictionary but this should work to create your query_* variables.

query_dict = {}

for n, v in enumerate(state_abbrevs.values()):
    t = gtab.GTAB()
    t.set_options(pytrends_config={"geo": f"US-{v}", "timeframe": "2020-09-01 2020-10-01"})
    query = t.new_query("weather")
    key = "query_" + str(n)
    query_dict[key] = query

for k in query_dict.keys():
    exec("%s = query_dict['%s']" % (k,k))

Stack Overflow

stackoverflow.com › questions › 51896168 › create-new-data-frame-with-the-name-from-loop-number

python - Create new data frame with the name from loop number - Stack Overflow

Top answer

1 of 2

This isn't a pythonic thing to do. Have you thought about creating a list of dataframes instead?

df=pd.DataFrame.copy(mef_list)
form=['','_M3','_M6','_M9','_M12','_LN','_C']
list_of_df = list()
for i in range(0, len(form)):
    df=pd.DataFrame.copy(mef_list)
    df['Variable_new']=df['Variable']+str(form[i])
    list_of_df.append(df)

Then you can access 'df0' as list_of_df[0]

You also don't need to iterate through a range, you can just loop through the form list itself:

form=['','_M3','_M6','_M9','_M12','_LN','_C']
list_of_df = list()
for i in form:
    df=pd.DataFrame.copy(mef_list)
    df['Variable_new']=df['Variable']+str(i) ## You can remove str() if everything in form is already a string
    list_of_df.append(df)

2 of 2

mef_list = ["UR", "CPI", "CEI", "Farm", "PCI", "durable", "C_CVM"]
form = ['', '_M3', '_M6', '_M9', '_M12', '_LN', '_C']
Variable_new = []
foo = 0
for variable in form:
     Variable_new.append(mef_list[foo]+variable)
     foo += 1
print(Variable_new)

Stack Overflow

stackoverflow.com › questions › 66485677 › create-dynamic-names-for-pyspark-dataframe-inside-a-for-loop

apache spark - create dynamic names for pyspark dataframe inside a for loop - Stack Overflow

Top answer

1 of 1

Using a dict is probably a better option for your need. You store each dataframe with the corresponding year as a key:

import datetime

from pyspark.sql.functions import col

PROD_years = {}
year=int(datetime.datetime.today().year)

for i in range (year, 2016, -1 ):
  df = df_PROD.filter(col("Year") == i)
  if df.count() > 0:
    PROD_years[i] = df

print(PROD_years)

Stack Overflow

stackoverflow.com › questions › 43796074 › trying-to-dynamically-name-and-later-reference-dataframe-name-as-a-variable-get

python - trying to dynamically name and later reference dataframe name as a variable, getting error 'SyntaxError: can't assign to function call' - Stack Overflow

#define list of fields to run match for
fieldlist = ['MATTER NUMBER','MATTER NAME','CLAIM NUMBER LISTING']

#loop through each field in fieldlist
for field in fieldlist:
    #define dfname as the field with spaces replaced with underscores
    dfname = '{}'.format(field.replace(' ','_'))
    #create df with dfname
    '{}'.format(dfname) = checkdf['{}'.format(field)].dropna()

...

edit: apologies if this was confusing. I'm attempting to create dataframe names and columns associated with each field in fieldlist from checkdf.