create multiple dataframes in for loop python using for loop

stackoverflow.com › questions › 30635145 › create-multiple-dataframes-in-loop

python - Create multiple dataframes in loop - Stack Overflow

askpython.com › home › multiple dataframes in a loop using python


Created names might easily conflict with variables already used by your logic.
Since the names are dynamically created, you typically also end up using dynamic techniques to retrieve the data.

This is why dicts were included in the language. The correct way to proceed is:

d = {}
for name in companies:
    d[name] = pd.DataFrame()

Nowadays you can write a single dict comprehension expression to do the same thing, but some people find it less readable:

d = {name: pd.DataFrame() for name in companies}

Once d is created the DataFrame for company x can be retrieved as d[x], so you can look up a specific company quite easily. To operate on all companies you would typically use a loop like:

for name, df in d.items():
    # operate on DataFrame 'df' for company 'name'

In Python 2 it was better to write

for name, df in d.iteritems():

because this avoids instantiating a list of (name, df) tuples.
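The dict-based approach above can be sketched end to end as follows. This is a minimal illustration, not the original poster's code: the company names and the "price" column are made up for the example.

```python
import pandas as pd

# Hypothetical company names, used only for illustration.
companies = ["AA", "AAPL", "BA"]

# One DataFrame per company, keyed by the company name.
frames = {
    name: pd.DataFrame({"company": [name], "price": [0.0]})
    for name in companies
}

# Look up a single company by its key.
aapl = frames["AAPL"]

# Operate on every company in turn.
row_counts = {name: len(df) for name, df in frames.items()}
```

Because the names are ordinary dict keys, they cannot collide with variables used elsewhere in the program.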


You can do this (although obviously use exec with extreme caution if this is going to be public-facing code):

for c in companies:
    exec('{} = pd.DataFrame()'.format(c))

AskPython

Multiple Dataframes in a Loop Using Python - AskPython

March 31, 2023 - So, after printing the dictionary, we can see that empty dataframes are created for each element of the list. Here, we have not entered any data for each column so it’ll be printed as empty data columns. ... This way, we can create multiple data frames using a loop in Python language.


IncludeHelp

includehelp.com › python › create-multiple-dataframes-in-loop.aspx

Create multiple dataframes in loop in Python

October 3, 2022 - Python Loops: Loop is functionality ... a for loop to create DataFrames. Python Dictionaries: Dictionaries are used to store heterogeneous data. The data is stored in key:value pair. A dictionary is a collection that is mutable and ordered in nature and does not allow duplicates which mean there are unique keys in a dictionary. A dictionary key can have any type of data as its value, for example, a list, tuple, string, or dictionary itself. Write a Python program to create multiple dataframes ...

stackoverflow.com › questions › 48888001 › creating-multiple-dataframes-with-a-loop › 48888621

python 3.x - Creating multiple dataframes with a loop - Stack Overflow

I think you think your code is doing something that it is not actually doing.

Specifically, this line: df = pd.read_csv(file)

You might think that in each iteration through the for loop this line is being executed and modified with df being replaced with a string in dfs and file being replaced with a filename in files. While the latter is true, the former is not.

Each iteration through the for loop is reading a csv file and storing it in the variable df effectively overwriting the csv file that was read in during the previous for loop. In other words, df in your for loop is not being replaced with the variable names you defined in dfs.

The key takeaway here is that strings (e.g., 'df1', 'df2', etc.) cannot be substituted and used as variable names when executing code.

One way to achieve the result you want is to store each CSV file read by pd.read_csv() in a dictionary, where the key is the name of the dataframe (e.g., 'df1', 'df2', etc.) and the value is the dataframe returned by pd.read_csv().

list_of_dfs = {}
for df, file in zip(dfs, files):
    list_of_dfs[df] = pd.read_csv(file)
    print(list_of_dfs[df].shape)
    print(list_of_dfs[df].dtypes)
    print(list(list_of_dfs[df]))

You can then reference each of your dataframes like this:

print(list_of_dfs['df1'])
print(list_of_dfs['df2'])
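To make the difference concrete, here is a small sketch contrasting the two patterns. It assumes in-memory CSV text (via io.StringIO) in place of real files, and the names 'df1' and 'df2' are only dict keys, not variables.

```python
import io
import pandas as pd

# In-memory stand-ins for CSV files (hypothetical contents).
csv_texts = {"df1": "a,b\n1,2\n", "df2": "a,b\n3,4\n5,6\n"}

# Rebinding one name inside the loop keeps only the last frame read.
for text in csv_texts.values():
    df = pd.read_csv(io.StringIO(text))
last_shape = df.shape  # shape of the second "file" only

# Storing each frame under its own key keeps all of them.
frames = {
    name: pd.read_csv(io.StringIO(text))
    for name, text in csv_texts.items()
}
```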

You can learn more about dictionaries here:

https://docs.python.org/3.6/tutorial/datastructures.html#dictionaries

Create a for loop to make multiple data frames? - General - Posit Community

Use a dictionary to store your DataFrames and access them by name

files = ('data1.csv', 'data2.csv', 'data3.csv', 'data4.csv', 'data5.csv', 'data6.csv')
dfs_names = ('df1', 'df2', 'df3', 'df4', 'df5', 'df6')
dfs ={}
for dfn,file in zip(dfs_names, files):
    dfs[dfn] = pd.read_csv(file)
    print(dfs[dfn].shape)
    print(dfs[dfn].dtypes)
print(dfs['df3'])

Use a list to store your DataFrames and access them by index

files = ('data1.csv', 'data2.csv', 'data3.csv', 'data4.csv', 'data5.csv', 'data6.csv')
dfs = []
for file in files:
    dfs.append(pd.read_csv(file))
    print(dfs[-1].shape)   # dfs[-1] is the frame just appended
    print(dfs[-1].dtypes)
print(dfs[2])

Do not store the intermediate DataFrames; just process each one and add it to the resulting DataFrame.

files = ('data1.csv', 'data2.csv', 'data3.csv', 'data4.csv', 'data5.csv', 'data6.csv')
df = pd.DataFrame()
for file in files:
    df_n = pd.read_csv(file)
    print(df_n.shape)
    print(df_n.dtypes)
    # do what you want to do with df_n
    # DataFrame.append was removed in pandas 2.0; use pd.concat instead
    df = pd.concat([df, df_n])
print(df)
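Since DataFrame.append was removed in pandas 2.0, a common alternative is to collect the per-file frames in a list and call pd.concat once after the loop, which also avoids copying the growing result on every iteration. A minimal sketch, assuming in-memory CSV text instead of the data1.csv ... data6.csv files:

```python
import io
import pandas as pd

# Hypothetical CSV contents standing in for files on disk.
sources = ["x,y\n1,2\n", "x,y\n3,4\n", "x,y\n5,6\n"]

parts = []
for text in sources:
    part = pd.read_csv(io.StringIO(text))
    # per-file processing would go here
    parts.append(part)

# Concatenate once, renumbering the rows 0..n-1.
combined = pd.concat(parts, ignore_index=True)
```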

If you process each one differently, you do not need a general structure to store them. Simply handle each one independently.

df = pd.DataFrame()
def do_general_stuff(d):  # here we do common things with a DataFrame
    print(d.shape, d.dtypes)

df1 = pd.read_csv("data1.csv")
# do what you want with df1

do_general_stuff(df1)
# DataFrame.append was removed in pandas 2.0; use pd.concat instead
df = pd.concat([df, df1])
del df1

df2 = pd.read_csv("data2.csv")
# do what you want with df2

do_general_stuff(df2)
df = pd.concat([df, df2])
del df2

df3 = pd.read_csv("data3.csv")
# do what you want with df3

do_general_stuff(df3)
df = pd.concat([df, df3])
del df3

# ... and so on

And one geeky way, but don't ask how it works:)

from collections import namedtuple
files = ['data1.csv', 'data2.csv', 'data3.csv', 'data4.csv', 'data5.csv', 'data6.csv']

df = namedtuple('Cdfs',
                ['df1', 'df2', 'df3', 'df4', 'df5', 'df6']
               )(*[pd.read_csv(file) for file in files])

for df_n in df._fields:
    print(getattr(df, df_n).shape,getattr(df, df_n).dtypes)

print(df.df3)

Posit Community

forum.posit.co › general

June 3, 2021 - I'm having trouble making a loop that will iterate through my data and create multiple data frames. Here's some dummy data: mydf <- data.frame("color"=c("blue","yellow","red","green","pink","orange","cyan"), …

Databricks Community

community.databricks.com › t5 › data-engineering › python-generate-new-dfs-from-a-list-of-dataframes-using-for-loop › td-p › 21650

Python: Generate new dfs from a list of dataframes using for loop

December 2, 2022 - I have a list of dataframes (for this example 2) and want to apply a for-loop to the list of frames to generate 2 new dataframes. To start, here is my starting dataframe called df_final: First, I create 2 dataframes: df2_b2c_fast, df2_b2b_fast: for x in df_final['b2b_b2c_prod'].unique(): local...

stackoverflow.com › questions › 52598090 › how-can-i-create-a-multiple-new-dataframes-inside-a-for-loop

python - How can I create a multiple new dataframes inside a for loop? - Stack Overflow

I would generally discourage you from creating lots of variables with related names which is a dangerous design pattern in Python (although it's common in SAS for example). A better option would be to create a dictionary of dataframes with the key as your 'variable name'

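# 'frames_by_year' is an assumed dict mapping a name to each source DataFrame,
# e.g. {"2011": df_2011, "2012": df_2012, "2013": df_2013}
df_dict = dict()
for name, df in frames_by_year.items():
    df_dict["pivot_" + name] = pd.pivot_table(df, index=["income"], columns=["area"], values=["id"], aggfunc='count')

I'm assuming here that your dataframes are stored under the names "2011", "2012", "2013"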
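A runnable sketch of the same idea, under stated assumptions: the yearly data below is invented, the source frames live in a hypothetical dict called frames_by_year, and the column names "id", "income" and "area" follow the answer above.

```python
import pandas as pd

# Hypothetical yearly data keyed by year label.
frames_by_year = {
    "2011": pd.DataFrame({
        "id": [1, 2, 3],
        "income": ["low", "low", "high"],
        "area": ["N", "S", "N"],
    }),
    "2012": pd.DataFrame({
        "id": [4, 5],
        "income": ["high", "high"],
        "area": ["S", "S"],
    }),
}

# One pivot table per year, stored under a derived key.
pivots = {
    "pivot_" + year: pd.pivot_table(
        df, index="income", columns="area", values="id", aggfunc="count"
    )
    for year, df in frames_by_year.items()
}
```

Each pivot can then be retrieved by key, for example pivots["pivot_2011"].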

cnasolution.com › questions › 196553 › create-multiple-dataframes-in-loop

I don't see any other way but to create a list or a dictionary of data frames, you'd have to name them manually otherwise.

# df_2011, df_2012 and df_2013 are assumed names for the yearly DataFrames
df_list = [pd.pivot_table(df, index=["income"], columns=["area"], values=["id"], aggfunc='count') for df in (df_2011, df_2012, df_2013)]

You can find an example here.

Cnasolution

cna solution | Create multiple dataframes in loop

February 20, 2020 - To operate on all companies you would typically use a loop like: for name, df in d.items(): # operate on DataFrame 'df' for company 'name' In Python 2 you are better writing for name, df in d.iteritems(): because this avoids instantiating a list of (name, df) tuples. Adding to the above great answers. The above will work flawless if you need to create empty data frames but if you need to create multiple dataframe based on some filtering: Suppose the list you got is a column of some dataframe and you want to make multiple data frames for each unique companies fro the bigger data frame:- First t


Kaggle

kaggle.com › questions-and-answers › 93779

Looping multiple dataframes?


reddit.com › r/learnpython › python looping multiple dataframes

r/learnpython on Reddit: Python Looping multiple dataframes

November 3, 2021 -

I am learning Python and am having trouble accessing data from multiple dataframes.

I want to make multiple bar plots from different dataframes. All of the dataframes have the same columns. So I thought, instead of writing the code one by one, maybe I could somehow iterate through the dataframes. But I haven't found the right way to do it. Could anyone advise me? I am curious whether it can be done in one go instead of writing it out for every dataframe.

I recommend using matplotlib when plotting, specifically its objected-oriented approach. From here you should be able to save the axes object for later.


for df in [df1, df2, df3]:
    df["test"].plot.bar()

stackoverflow.com › questions › 46805851 › create-multiple-dataframe-in-loops

python - Create multiple dataframe in loops - Stack Overflow

This should do it:

for i in province_id:
    for j in year:
        locals()['sub_data_{}_{}'.format(i,j)] = data[(data.provid==i) & (data.wave==j)]

I initially suggested using exec, which is not usually considered best practice for safety reasons. Having said so, if your code is not exposed to anyone with malicious intentions, it should be OK, and I'll leave it here for the sake of completeness:

for i in province_id:
    for j in year:
        exec("sub_data_{}_{} = data[(data.provid==i) & (data.wave==j)]".format(i, j))

Nevertheless, for most use cases, it's probably better to use a collection of some sort, e.g. a dictionary, because it will be cumbersome to reference dynamically generated variable names in subsequent parts of your code. It's also a one-liner:

data_dict = {key:g for key,g in data.groupby(['provid','wave'])}
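As a small illustration of that one-liner, the sketch below builds the dictionary from made-up data. When grouping by two columns, each key is a (provid, wave) tuple; the "value" column is an assumption added for the example.

```python
import pandas as pd

# Hypothetical data with the two grouping columns from the answer.
data = pd.DataFrame({
    "provid": ["a", "a", "b", "b"],
    "wave": [2004, 2005, 2004, 2004],
    "value": [10, 20, 30, 40],
})

# One sub-frame per (provid, wave) combination that occurs in the data.
data_dict = {key: group for key, group in data.groupby(["provid", "wave"])}

# Retrieve one combination by its tuple key.
subset = data_dict[("b", 2004)]
```

Only combinations that actually appear in the data become keys, so there are no empty frames to handle.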

medium.com › @zeebrockeraa › create-list-of-dataframes-for-loop-python-b64acb9369e2

I think the best approach is to create a dictionary of DataFrames with groupby, filtering first by boolean indexing:

df = pd.DataFrame({'A':list('abcdef'),
                   'wave':[2004,2005,2004,2005,2005,2004],
                   'C':[7,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'E':[5,3,6,9,2,4],
                   'provid':list('aaabbb')})

print (df)
   A  C  D  E provid  wave
0  a  7  1  5      a  2004
1  b  8  3  3      a  2005
2  c  9  5  6      a  2004
3  d  4  7  9      b  2005
4  e  2  1  2      b  2005
5  f  3  0  4      b  2004


province_id = ['a','b']
year = [2004]
df = df[(df.provid.isin(province_id)) &(df.wave.isin(year))]
print (df)
   A  C  D  E provid  wave
0  a  7  1  5      a  2004
2  c  9  5  6      a  2004
5  f  3  0  4      b  2004

dfs = {'{0[0]}_{0[1]}'.format(i) : x for i, x in df.groupby(['provid','wave'])}

Another solution:

dfs = dict(tuple(df.groupby(df['provid'] + '_' + df['wave'].astype(str))))

print (dfs)
{'a_2004':    A  C  D  E provid  wave
0  a  7  1  5      a  2004
2  c  9  5  6      a  2004, 'b_2004':    A  C  D  E provid  wave
5  f  3  0  4      b  2004}

Finally, you can select each DataFrame:

print (dfs['b_2004'])
   A  C  D  E provid  wave
5  f  3  0  4      b  2004

Your answer should be changed to:

sub_data = {}
province_id = ['a','b']
year = [2004]
for i in province_id:
    for j in year:
         sub_data[i + '_' + str(j)] = df[(df.provid==i) &(df.wave==j)]

print (sub_data)
{'a_2004':    A  C  D  E provid  wave
0  a  7  1  5      a  2004
2  c  9  5  6      a  2004, 'b_2004':    A  C  D  E provid  wave
5  f  3  0  4      b  2004}

Medium

Create List of dataframes For Loop Python | by Zeeshan Ali | Medium

July 15, 2023 - Create List of dataframes For Loop Python To create a list of DataFrames in Python, you can use the pandas library. Here’s an example of how you can create a list of DataFrames and then loop …

stackoverflow.com › questions › 62085424 › create-multiple-dataframes-using-loop-function

python 3.x - Create Multiple Dataframes using Loop & function - Stack Overflow

IIUC, I was able to achieve what you wanted.

import pandas as pd
import numpy as np

# source data for the dataframe
data = {
"ID":["x","y","z","x","y","z","x","y","a","b","x"],
"Date":["May 01","May 02","May 04","May 01","May 01","May 02","May 01","May 05","May 06","May 08","May 10"],
"Amount":[10,20,30,40,50,60,70,80,90,100,110]
}

df = pd.DataFrame(data)

# convert the Date column to datetime and still maintain the format like "May 01"
df['Date'] = pd.to_datetime(df['Date'], format='%b %d').dt.strftime('%b %d')

# sort the values on ID and Date
df.sort_values(by=['ID', 'Date'], inplace=True)
df.reset_index(inplace=True, drop=True)

print(df)

Original Dataframe:

    Amount    Date ID
0       90  May 06  a
1      100  May 08  b
2       10  May 01  x
3       40  May 01  x
4       70  May 01  x
5      110  May 10  x
6       50  May 01  y
7       20  May 02  y
8       80  May 05  y
9       60  May 02  z
10      30  May 04  z

# create a list of unique ids
list_id = sorted(list(set(df['ID'])))

# create an empty list that would contain dataframes
df_list = []

# count of iterations that must be separated out
# for example if we want to record 3 entries for 
# each id, the iter would be 3. This will create
# three new dataframes that will hold transactions
# respectively. 
iter = 3
for i in range(iter):
    df_list.append(pd.DataFrame())


for val in list_id:
    tmp_df = df.loc[df['ID'] == val].reset_index(drop=True)

    # consider only the top iter(=3) values to be distributed
    counter = np.minimum(tmp_df.shape[0], iter)
    for idx in range(counter):
        # DataFrame.append was removed in pandas 2.0; use pd.concat instead
        df_list[idx] = pd.concat([df_list[idx], tmp_df.loc[tmp_df.index == idx]])

for df in df_list:
    df.reset_index(drop=True, inplace=True)
    print(df)

Transaction #1:

   Amount    Date ID
0      90  May 06  a
1     100  May 08  b
2      10  May 01  x
3      50  May 01  y
4      60  May 02  z

Transaction #2:

   Amount    Date ID
0      40  May 01  x
1      20  May 02  y
2      30  May 04  z

Transaction #3:

   Amount    Date ID
0      70  May 01  x
1      80  May 05  y

Note that in your data, there are four transactions for 'x'. If, let's say, you wanted to track the 4th iterative transaction as well, all you need to do is change the value of 'iter' to 4, and you will get the fourth dataframe as well, with the following value:

   Amount    Date ID
0     110  May 10  x
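A more compact way to get "the n-th transaction of every ID" is to number the rows within each ID using groupby().cumcount() and then group on that number. This is an alternative sketch, not the answer's own method, and it uses a reduced, made-up version of the data.

```python
import pandas as pd

# Reduced, hypothetical version of the transaction data.
df = pd.DataFrame({
    "ID": ["x", "y", "z", "x", "y", "x"],
    "Amount": [10, 20, 30, 40, 50, 70],
})

# Position of each row within its ID group: 0 for the first row, 1 for the second, ...
position = df.groupby("ID").cumcount()

# One frame per position, keyed by that position.
nth_transactions = {
    n: group.reset_index(drop=True) for n, group in df.groupby(position)
}
```

Here nth_transactions[0] holds the first transaction of every ID, nth_transactions[1] the second, and so on, without fixing the number of frames in advance.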

stackoverflow.com › questions › 35374995 › create-multiple-dataframe-using-for-loop-in-python-2-7 › 35394283

pandas - Create multiple dataframe using for loop in python 2.7 - Stack Overflow

I got the answer which I was looking for

import pandas as pd
gbl = globals()
for i in locations:
    gbl['df_'+i] = df[df.Start_Location==i]

This will create 3 data frames: df_HOME, df_OFFICE and df_SHOPPING

Thanks,

Use groupby() and then call its get_group() method:

import pandas as pd
import io

text = b"""Start_Location  End_Location    Date
OFFICE          HOME            3-Apr-15
OFFICE          HOME            3-Apr-15
HOME            SHOPPING    3-Apr-15
HOME            SHOPPING    4-Apr-15
HOME            SHOPPING    4-Apr-15
SHOPPING    HOME            5-Apr-15
SHOPPING    HOME            5-Apr-15
HOME            SHOPPING    5-Apr-15"""

locations = ["HOME", "OFFICE", "SHOPPING"]

df = pd.read_csv(io.BytesIO(text), sep=r"\s+")  # delim_whitespace is deprecated in recent pandas
g = df.groupby("Start_Location")
for name, df2 in g:
    globals()["df_" + name.lower()] = df2

but I think adding global variables in a for loop isn't a good approach; you can convert the groupby to a dict with:

d = dict(iter(g))

then you can use d["HOME"] to get the data.
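A minimal sketch of the groupby-to-dict conversion, using a shortened, made-up version of the trip data:

```python
import pandas as pd

# Shortened, hypothetical version of the trip data.
trips = pd.DataFrame({
    "Start_Location": ["OFFICE", "HOME", "HOME", "SHOPPING"],
    "End_Location": ["HOME", "SHOPPING", "SHOPPING", "HOME"],
})

# Iterating a GroupBy yields (key, sub-frame) pairs, which dict() accepts directly.
by_start = dict(iter(trips.groupby("Start_Location")))

# Look up one location by name.
home = by_start["HOME"]
```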

stackoverflow.com › questions › 61490942 › python-import-multiple-dataframes-using-for-loop

pandas - Python: Import multiple dataframes using for loop - Stack Overflow

I have the following code which works to import a dataframe. #read tblA tbl = 'a' cols = 'imp_a' usecols = dfDD[dfDD[cols].notnull()][cols].values.tolist() dfa = getdf(tbl, dfRT, sfsession) dfa =...

stackoverflow.com › questions › 70069743 › create-multiple-dataframes-with-a-for-loop

python - Create multiple DataFrames with a for loop - Stack Overflow

I'm sorry to say this is where I'm up to. Could use of dicts help here? If there's any other info I can provide please let me know :)

stackoverflow.com › questions › 70354036 › how-to-create-multiple-data-frames-in-for-loop

python - How to Create Multiple Data Frames in For Loop - Stack Overflow

thewebdev.info › home › how to create multiple dataframes in loop with python pandas?

You could save the dataframes to a list:

list_of_dfs = list()

for x in names:
    df = data[data['Name'].eq(x)].groupby('Part No.')['Quantity'].sum().rename("QTY").to_frame()
    list_of_dfs.append(df)

The Web Dev

How to create multiple dataframes in loop with Python Pandas? - The Web Dev

November 15, 2022 - To create multiple dataframes in loop with Python Pandas, we can use dictionary comprehension.

stackoverflow.com › questions › 52861562 › creating-multiple-dataframes-at-once-in-a-for-loop

python - Creating multiple dataframes at once in a for loop - Stack Overflow

datascience.stackexchange.com › questions › 70325 › iterate-over-multiple-dataframe-rows-at-the-same-time

Consider using a dictionary of data frames with dst as the key building an inner list of data frames that are concatenated outside the loop:

df_dict = {}

for dst, dstkey in zip(Dst, DstKey):
    inner = []
    with arcpy.da.SearchCursor(dst, ("OBJECTID", dstkey)) as cursor:
        # returns an iterator of tuples
        for row in cursor:
            if (row[1] is None or not str(row[1]).strip()):
                inner.append(pd.DataFrame({dst.split("\\").pop(): str(row[0])}, index=[0]))

    df_dict[dstkey] = pd.concat(inner, ignore_index=True)

Alternatively with list comprehension:

df_dict = {}

for dst, dstkey in zip(Dst, DstKey):
    with arcpy.da.SearchCursor(dst, ("OBJECTID", dstkey)) as cursor:
        # returns an iterator of tuples
        inner = [pd.DataFrame({dst.split("\\").pop(): str(row[0])}, index=[0]) 
                 for row in cursor if (row[1] is None or not str(row[1]).strip())]

    df_dict[dstkey] = pd.concat(inner, ignore_index=True)

For Excel export using dictionary of data frames:

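# ExcelWriter.save() was removed in pandas 2.0; the context manager
# writes and closes the file when the block exits
with pd.ExcelWriter('/path/to/output.xlsx') as writer:
    for i, df in df_dict.items():
        df.to_excel(writer, sheet_name=i)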

Stack Exchange

python - Iterate over multiple dataframe rows at the same time - Data Science Stack Exchange