I think that pandas offers better alternatives to what you're suggesting (rationale below).
For one, there was the pandas.Panel data structure, which was meant for things like you're doing here (note that Panel has since been deprecated and removed from pandas).
However, as Wes McKinney (the Pandas author) noted in his book Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, multi-dimensional indices, to a large extent, offer a better alternative.
Consider the following alternative to your code:
import pandas as pd

dfs = []
for year in range(1967, 2014):
    # ... some code that generates df1, df2 and df3
    df1['year'] = year
    df1['origin'] = 'df1'
    df2['year'] = year
    df2['origin'] = 'df2'
    df3['year'] = year
    df3['origin'] = 'df3'
    dfs.extend([df1, df2, df3])
df = pd.concat(dfs)
This gives you a DataFrame with 4 columns: 'firm', 'price', 'year', and 'origin'.
This gives you the flexibility to:

- Organize hierarchically by, say, 'year' and 'origin' (df.set_index(['year', 'origin'])), or by, say, 'origin' and 'price' (df.set_index(['origin', 'price']))
- Do groupbys according to different levels
- In general, slice and dice the data along many different ways.
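For instance, here is a minimal sketch of the hierarchical indexing and level-based groupby (the firm/price values are made up for illustration):

```python
import pandas as pd

# Toy data standing in for the concatenated result above
df = pd.DataFrame({
    'firm': ['A', 'B', 'A', 'B'],
    'price': [10, 20, 11, 21],
    'year': [1967, 1967, 1968, 1968],
    'origin': ['df1', 'df1', 'df2', 'df2'],
})

# Organize hierarchically by year and origin
indexed = df.set_index(['year', 'origin'])

# Group by an index level, e.g. mean price per year
mean_price = indexed.groupby(level='year')['price'].mean()
print(mean_price)
```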
What you're suggesting in the question makes one dimension (origin) arbitrarily different, and it's hard to think of an advantage to this. If a split along some dimension is necessary due to, e.g., performance, you can combine DataFrames better with standard Python data structures:
A dictionary mapping each year to a Dataframe with the other three dimensions.
Three DataFrames, one for each origin, each having three dimensions.
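A minimal sketch of the first option, a dictionary mapping each year to a DataFrame (the data and names here are made up for illustration):

```python
import pandas as pd

# One DataFrame per year, keyed by year (toy data)
frames_by_year = {
    year: pd.DataFrame({'firm': ['A', 'B'],
                        'price': [10 + year % 5, 20 + year % 5],
                        'origin': ['df1', 'df2']})
    for year in range(1967, 1970)
}

# Recombine into a single DataFrame when needed:
# the dict keys become an index level, which we turn into a 'year' column
combined = pd.concat(frames_by_year, names=['year']).reset_index(level='year')
print(combined.columns.tolist())
```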
I had the same problem and resorted to dataclasses. In a nutshell, something like:
from dataclasses import asdict, dataclass

import pandas as pd


@dataclass
class OrderDataFields:
    """
    data field names aka columns
    """
    order_id: str = 'order_id'
    order_amount: str = 'order_amount'
    order_date: str = 'order_date'


@dataclass
class OrderData:
    data: pd.DataFrame
    columns: OrderDataFields = OrderDataFields()
Now, when you are using your dataframe, wrap it in the class:
order_data = OrderData(data=order_df)
If you want to perform column checks every time you instantiate a Data object, you can define a BaseData class and inherit from it in, for example, your OrderData class:
@dataclass
class BaseData:
    """
    Base class for all Data dataclasses.

    Expected child attributes:
        `data` : pd.DataFrame -- data
        `columns`: object -- dataclass with column names

    Raises
    ------
    ValueError
        If columns in `columns` and `data` dataframe don't match
    """
    data: pd.DataFrame
    columns: object

    def __post_init__(self):
        self._check_columns()

    def _check_columns(self):
        """
        Check if columns in dataframe match the columns in `columns` attribute

        Raises
        ------
        ValueError
            If columns don't match
        """
        data_columns = set(self.data.columns)
        expected_columns = set(asdict(self.columns).values())
        if data_columns != expected_columns:
            raise ValueError(
                f"Dataframe columns {data_columns} don't match "
                f"expected columns {expected_columns}"
            )


@dataclass
class OrderData(BaseData):
    data: pd.DataFrame
    columns: OrderDataFields = OrderDataFields()
And now when you do some wrangling you can use the dataframe and its columns:
df = order_data.data
c = order_data.columns
df[c.order_amount] ....
....
Along those lines; adjust for your case.
There is also the library pandera: https://pandera.readthedocs.io/en/stable/
My first idea would be to include type hints and descriptive docstrings to functions responsible for loading a pandas DataFrame, e.g.:
import pandas as pd


def load_client_data(input_path: str) -> pd.DataFrame:
    """Loads the client DataFrame from a csv file, performs data preprocessing and returns it.

    The client DataFrame is formatted as follows:
        customer_id (string): represents a customer.
        order_id (string): represents an order.
        order_amount (int): represents the number of items bought.
        order_date (string): the date on which the order was made (YYYY-MM-DD).
        order_time (string): the time at which the order was made (HH:mm:ss).
    """
    client_data = pd.read_csv(input_path)
    preprocessed_client_data = do_preprocessing(client_data)
    return preprocessed_client_data
Ideally, all functions responsible for loading the datasets would be bundled together in a module, so that at the very least you know where to look whenever you're in doubt. Good/consistent variable names for your datasets will also help you keep track of what dataset you're working with in a downstream function.
Of course, this all adds a bit of coupling: if you decide to change the columns of a dataset, you need to remember to update the docstring, too. At the end of the day, however, it's a choice between flexibility and reliability: once your program grows in size and becomes more stable, I think it's a fair compromise.
You'll also want to perform any operations to the dataset itself (adding new columns, parsing the date into day/month/year columns, etc) as soon as possible, so that the docstring reflects these in-memory changes as well. If your datasets are being transformed all the way down in another function, ask yourself if you could do this earlier. If that's not possible, at least initialize the dataframe with empty columns that expect future data, and reflect this information on the docstring.
If you want to take this a step further, you can wrap all functions related to loading datasets into a DatasetManager class, which unifies the information of the datasets' signatures. You could even add a helper function to quickly view a docstring for a specific dataset: writing dataset_manager.get_info('client_data') could print out the docstring for the load_client_data function, for example.
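A minimal sketch of what such a manager could look like (all names here are assumptions for illustration, not an established API):

```python
import pandas as pd


class DatasetManager:
    """Bundles the dataset-loading functions and exposes their docstrings."""

    def __init__(self):
        # Map dataset names to their loader functions (hypothetical loaders)
        self._loaders = {'client_data': self.load_client_data}

    def load_client_data(self, input_path: str) -> pd.DataFrame:
        """Client DataFrame: customer_id, order_id, order_amount, order_date, order_time."""
        return pd.read_csv(input_path)

    def get_info(self, name: str) -> str:
        """Return the docstring of the loader for the given dataset."""
        return self._loaders[name].__doc__


manager = DatasetManager()
print(manager.get_info('client_data'))
```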
Lastly, there are a couple third-party modules that help you enforce data types in pandas DataFrames, if you're okay with that. An example is dataenforce, but as a disclaimer I've never used it personally.
I fear this is more a philosophical question than a technical one, but I am a bit confused. I've been thinking a lot about what makes something a DataFrame, not just in terms of syntax or library, but from a conceptual standpoint.
My current definition is as such:
A DataFrame is a language-native, programmable interface for querying and transforming tabular data. It's designed to be embedded directly in general-purpose programming workflows.
I like this because it focuses on what a DataFrame is for, rather than what specific tools or libraries implement it.
I think, however, that this definition is too general and could lead to anything tabular with an API being described as a DataFrame.
Properties that are not exclusive across DataFrames, which I previously thought defined them:

- Mutability
  - pandas: mutable, you can add/remove/overwrite columns directly.
  - Spark DataFrames: immutable, transformations return new logical plans.
  - Polars (lazy mode): immutable, transformations build a new plan.
- Execution model
  - pandas: eager, executes immediately.
  - Spark / Polars (lazy): lazy, builds DAGs and executes on trigger.
- In memory
  - pandas / Polars: usually in-memory.
  - Spark: can spill to disk or operate on distributed data.
  - Ibis: abstract, backend might not be memory-bound at all.
Curious how others would describe and define DataFrames.