pandas groupby write to file

Python pandas - writing groupby output to file

stackoverflow.com › questions › 35025917 › python-pandas-writing-groupby-output-to-file

Use reset_index():

testdf.reset_index().to_csv('CCCC_output_summary.txt', sep='\t', header=True, index=False)

Answer from Mike Müller on Stack Overflow
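A minimal sketch of why reset_index() matters here, assuming a small made-up frame (the column names `sample` and `value` are illustrative and do not come from the question). After an aggregation the group key sits in the index, so `index=False` on its own would drop it from the file.

```python
import io

import pandas as pd

# Hypothetical data; the frame from the original question is not shown.
df = pd.DataFrame({"sample": ["a", "a", "b"], "value": [1, 2, 3]})

# After aggregating, the group key "sample" is the index, not a column.
testdf = df.groupby("sample").sum()

# reset_index() turns the key back into a column, so index=False keeps it.
buf = io.StringIO()
testdf.reset_index().to_csv(buf, sep="\t", header=True, index=False)
tsv_text = buf.getvalue()
print(tsv_text)
```

An in-memory buffer stands in for 'CCCC_output_summary.txt' so the sketch leaves no file behind; passing a path works the same way.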

2 of 3

Recently, I had to work with an Excel file that has 2 columns, with headers 'Dog Breed' and 'Dog Name'. I came up with the following code (tested with Python 3.11.0) that uses groupby() and prints the grouped data into a .csv file.

from pathlib import Path
import pandas as pd

p = Path(__file__).with_name('data.xlsx')
q = Path(__file__).with_name('data-grouped.csv')

df = pd.read_excel(p)
groups = df.groupby('Dog Breed', sort=False)

with q.open('w') as foutput:
    for g in groups: # For each group (g[0] is the breed, g[1] is the group's dataframe)
        foutput.write(f"{g[0]}, {len(g[1])}") # Record the number of dogs in each group
        for e, (index, row) in enumerate(g[1].iterrows()): # Iterating over the group's dataframe
            name = str(row['Dog Name'])
            if e == 0:
                mystr = f",{name}\n" # First name goes on the same line as the breed and count
            else:
                mystr = f",,{name}\n" # Later names go on their own lines, in the third column
            foutput.write(mystr)
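The layout this loop produces is easier to see on a tiny example. The following is a sketch only, assuming an in-memory frame in place of data.xlsx (the breeds and names are made up) and an in-memory buffer in place of data-grouped.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.xlsx.
df = pd.DataFrame({
    "Dog Breed": ["Poodle", "Beagle", "Poodle"],
    "Dog Name": ["Rex", "Max", "Bella"],
})

out = io.StringIO()
# sort=False keeps breeds in order of first appearance.
for breed, group in df.groupby("Dog Breed", sort=False):
    names = [str(n) for n in group["Dog Name"]]
    # First line of a group: breed, group size, first name.
    out.write(f"{breed}, {len(group)},{names[0]}\n")
    # Remaining names go in the third column, one per line.
    for name in names[1:]:
        out.write(f",,{name}\n")

grouped_text = out.getvalue()
print(grouped_text)
```

Unpacking each group as `breed, group` is equivalent to indexing `g[0]` and `g[1]`.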

Stack Overflow

stackoverflow.com › questions › 47602097 › pandas-groupby-to-to-csv

python - Pandas groupby to to_csv - Stack Overflow

Top answer

1 of 7

Try doing this:

week_grouped = df.groupby('week')
week_grouped.sum().reset_index().to_csv('week_grouped.csv')

That'll write the entire dataframe to the file. If you only want those two columns then,

week_grouped = df.groupby('week')
week_grouped.sum().reset_index()[['week', 'count']].to_csv('week_grouped.csv')

Here's a line by line explanation of the original code:

# This creates a "groupby" object (not a dataframe object) 
# and you store it in the week_grouped variable.
week_grouped = df.groupby('week')

# This instructs pandas to sum up all the numeric type columns in each 
# group. This returns a dataframe where each row is the sum of the 
# group's numeric columns. You're not storing this dataframe in your 
# example.
week_grouped.sum() 

# Here you're calling the to_csv method on a groupby object... but
# that object type doesn't have that method. Dataframes have that method. 
# So we should store the previous line's result (a dataframe) into a variable 
# and then call its to_csv method.
week_grouped.to_csv('week_grouped.csv')

# Like this:
summed_weeks = week_grouped.sum()
summed_weeks.to_csv('...')

# Or with less typing simply
week_grouped.sum().to_csv('...')
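The difference between the two object types can be checked directly. A minimal sketch, assuming a made-up frame with the `week` and `count` columns mentioned in the answer:

```python
import pandas as pd

# Hypothetical data using the column names from the answer.
df = pd.DataFrame({"week": [1, 1, 2], "count": [10, 5, 7]})

week_grouped = df.groupby("week")   # a DataFrameGroupBy, not a DataFrame
summed_weeks = week_grouped.sum()   # an ordinary DataFrame indexed by week

print(type(week_grouped).__name__)
print(type(summed_weeks).__name__)

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = summed_weeks.to_csv()
print(csv_text)
```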

2 of 7

Iterating over a groupby object yields (key, value) pairs: the key identifies the group, and the value is the group itself, i.e. the subset of the original df that matches the key.

In your example, week_grouped = df.groupby('week') is a set of groups (a pandas.core.groupby.DataFrameGroupBy object), which you can explore in detail as follows:

for k, gr in week_grouped:
    # do your stuff instead of print
    print(k)
    print(type(gr)) # This will output <class 'pandas.core.frame.DataFrame'>
    print(gr)
    # You can save each 'gr' in a csv as follows
    gr.to_csv('{}.csv'.format(k))

Alternatively, you can apply an aggregation function to the grouped object:

result = week_grouped.sum()
# This will be already one row per key and its aggregation result
result.to_csv('result.csv')

In your example you need to assign the result to a variable, because sum() returns a new DataFrame and leaves the groupby object unchanged.

some_variable = week_grouped.sum() 
some_variable.to_csv('week_grouped.csv') # This will work

So result.csv and week_grouped.csv will contain the same data.

Stack Overflow

stackoverflow.com › questions › 70949942 › python-write-a-dataframe-groupby-to-a-file

pandas - python: write a dataframe groupby to a file - Stack Overflow

You can call .reset_index() on your groupby count aggregation, then write that to a (text) csv file directly.

Stack Overflow

stackoverflow.com › questions › 25449784 › pandas-groupby-and-file-writing-problems

python - Pandas groupby and file writing problems - Stack Overflow

Top answer

1 of 2

See the docs on apply. Older pandas versions call the function twice on the first group (to choose between a fast and a slow code path), so any side effects of the function (such as I/O) happen twice for that group.

Your best bet here is probably to iterate over the groups directly, like this:

for group_name, group_df in df.head(1000).groupby('iid'):
    item_grouper(group_df)
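A small sketch of the explicit loop, assuming made-up data and a hypothetical `item_grouper` that writes one line per group. The `calls` list is only there to show that the function runs exactly once per group.

```python
import io

import pandas as pd

# Hypothetical data; "iid" matches the column name in the answer.
df = pd.DataFrame({"iid": [1, 1, 2, 3], "tag": ["x", "y", "x", "z"]})

out = io.StringIO()
calls = []

def item_grouper(group_df):
    # Side effect: record the call and write one output line per group.
    iid = int(group_df["iid"].iloc[0])
    calls.append(iid)
    out.write(f"{iid}\t{len(group_df)}\n")

# An explicit loop runs the function exactly once per group.
for group_name, group_df in df.groupby("iid"):
    item_grouper(group_df)

print(out.getvalue())
```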

2 of 2

I agree with chrisb's diagnosis of the problem. A cleaner approach is to have your user_grouper() function return its values instead of saving them, with a structure like this:

def user_grouper(df, ...):
    (...)
    df['max_tag_count'] = some_calculation
    return df

results = df.groupby(...).apply(user_grouper, ...)
for i,row in results.iterrows():
    # calculate raw score
    raw_score = (tag_counts[row['tag']]-1) / row['max_tag_count']
    # write to file
    out.write('\t'.join(map(str,[row['uid'],row['iid'],row['tag'],raw_score,weight]))+'\n')
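One way to follow the "return, then write" idea without a custom function is `transform`, which is a different technique from the `apply` call above. This is a sketch under assumptions: `tag_count` is a hypothetical column, the score formula is simplified to use it directly instead of the `tag_counts` lookup, and the `weight` field is left out.

```python
import io

import pandas as pd

# Hypothetical data; tag_count is an assumed column.
df = pd.DataFrame({
    "uid": [1, 1, 2],
    "iid": [10, 10, 20],
    "tag": ["a", "b", "a"],
    "tag_count": [3, 5, 2],
})

# transform("max") returns one value per row, aligned with the original frame.
df["max_tag_count"] = df.groupby("iid")["tag_count"].transform("max")
df["raw_score"] = (df["tag_count"] - 1) / df["max_tag_count"]

# Write everything in one call instead of row by row.
buf = io.StringIO()
df[["uid", "iid", "tag", "raw_score"]].to_csv(
    buf, sep="\t", index=False, header=False
)
scores_text = buf.getvalue()
print(scores_text)
```

Because no I/O happens inside the grouped function, it does not matter how many times pandas evaluates a group.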

GitHub

github.com › pandas-dev › pandas › issues › 4882

pandas.core.groupby.DataFrameGroupBy to_csv method doesn't ouput csv file as expected · Issue #4882 · pandas-dev/pandas

September 19, 2013 -

>>> df1 = pd.DataFrame( { "Name" : ["Alice", "Bob", "Mallory", "Mallory", "Bob" , "Mallory"] , "City" : ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"] } )
>>> g1 = df1.groupby( [ "Name" ] )
>>> print g1.head()
               City     Name
Name
Alice   0   Seattle    Alice
Bob     1   Seattle      Bob
        4   Seattle      Bob
Mallory 2  Portland  Mallory
        3   Seattle  Mallory
        5  Portland  Mallory
>>> g1.to_csv('out.csv')
Out[10]:
Name
Alice      None
Bob        None
Mallory    None
dtype: object

Author c0indev3l

Stack Overflow

stackoverflow.com › questions › 21333832 › how-to-save-pandas-groups-to-separate-files

python - How to save pandas groups to separate files - Stack Overflow

Top answer

1 of 2

You were almost there

for name, group in grouped:
    group.to_csv(path_to_disk)

2 of 2

This answer was very helpful to me - thanks @mkln.

I just wanted to add something specific to my own use case, which relates to the original point about file-naming ('Some Text' + name = group).

You can add the group name and some additional text, for example the current date, to each CSV filename. Here a small function returns the current date, which is then used in the filename.

Therefore:

from datetime import datetime

def cur_date():
    return datetime.now().strftime("%Y-%m-%d")

for name, group in grouped:
    group.to_csv('{}_{}.csv'.format(name, cur_date()))
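A self-contained version of the same idea might look like the sketch below. The frame, the `team` column, and the use of a temporary directory are assumptions made for illustration; `cur_date()` is taken from the answer.

```python
import tempfile
from datetime import datetime
from pathlib import Path

import pandas as pd

# Hypothetical data; "team" is an assumed grouping column.
df = pd.DataFrame({"team": ["red", "blue", "red"], "score": [1, 2, 3]})

def cur_date():
    return datetime.now().strftime("%Y-%m-%d")

out_dir = Path(tempfile.mkdtemp())  # temporary directory, just for the sketch
written = []
for name, group in df.groupby("team"):
    # One file per group, named after the group key and the current date.
    path = out_dir / f"{name}_{cur_date()}.csv"
    group.to_csv(path, index=False)
    written.append(path)

print(sorted(p.name for p in written))
```

If group keys can contain characters that are not valid in filenames, they would need to be cleaned before being used in the path.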

Real Python

realpython.com › pandas-groupby

pandas GroupBy: Your Guide to Grouping Data in Python – Real Python

January 19, 2025 - You can use custom functions with pandas .groupby() to perform specific operations on groups. This tutorial assumes that you have some experience with pandas itself, including how to read CSV files into memory as pandas objects with read_csv(). If you need a refresher, then check out Reading CSVs With pandas and pandas: How to Read and Write Files.

Saturn Cloud

saturncloud.io › blog › how-to-use-pandas-groupby-tocsv

How to Use Pandas Groupby tocsv | Saturn Cloud Blog

September 9, 2023 - We then group our data by region and product using the groupby function. Next, we calculate the total sales for each group using the sum function. Finally, we export our grouped data to a CSV file using the to_csv function.

Stack Overflow

stackoverflow.com › questions › 40899021 › output-groupby-to-csv-file-pandas

python - output groupby to csv file pandas - Stack Overflow

Top answer

1 of 2

The problem is that you are trying to apply a function to_csv which doesn't exist. Anyway, groupby also doesn't have a to_csv method. pd.Series and pd.DataFrame do.

What you should really use is drop_duplicates here and then export the resulting dataframe to csv:

df.drop_duplicates(['AA1','AA2']).to_csv('merged.txt')

PS: If you really wanted a groupby solution, there's this one that happens to be 12 times slower than drop_duplicates...:

df.groupby(['AA1','AA2']).agg(lambda x:x.value_counts().index[0]).to_csv('merged.txt')

2 of 2

You can use groupby with head:

df.groupby(['AA1', 'AA2']).head(1)
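On data without missing keys, `drop_duplicates` and `groupby(...).head(1)` both keep the first row for each key combination. A quick sketch to check this, assuming a made-up frame that reuses the `AA1` and `AA2` column names (`val` is hypothetical):

```python
import pandas as pd

# Hypothetical data; only the AA1/AA2 names come from the question.
df = pd.DataFrame({
    "AA1": ["x", "x", "y", "x"],
    "AA2": [1, 1, 2, 3],
    "val": [10, 20, 30, 40],
})

deduped = df.drop_duplicates(["AA1", "AA2"])
first_rows = df.groupby(["AA1", "AA2"]).head(1)

# Both keep the first row per key pair, in original order with original index.
pd.testing.assert_frame_equal(deduped, first_rows)

csv_text = deduped.to_csv(index=False)
print(csv_text)
```

The two can differ when key columns contain NaN, since groupby drops those rows by default.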

Medium

medium.com › @emarsja › python-pandas-groupby-tutorial-50049f660336

Python Pandas Groupby Tutorial. This post was originally published at… | by Erik Marsja | Medium

December 12, 2019 - In the last section of this Pandas groupby tutorial, we are going to learn how to write the grouped data to CSV and Excel files. We are going to work with Pandas to_csv and to_excel to save the groupby object as CSV and Excel files, respectively.

Rip Tutorial

riptutorial.com › export groups in different files

pandas Tutorial => Export groups in different files

# Same example data as in the previous example.
import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame({'Age': np.random.randint(20, 70, 100),
                   'Sex': np.random.choice(['Male', 'Female'], 100),
                   'number_of_foo': np.random.randint(1, 20, 100)})

# Export to Male.csv and Female.csv files.
for sex, data in df.groupby('Sex'):
    data.to_csv("{}.csv".format(sex))

Seaborn Line Plots

marsja.se › home › programming › python pandas groupby tutorial

Python Pandas Groupby Tutorial - Erik Marsja

February 24, 2025 - In the last section of this Pandas groupby tutorial, we will learn how to write the grouped data to CSV and Excel files. We are going to work with Pandas to_csv and to_excel, to save the groupby object as CSV and Excel file, respectively.

GeeksforGeeks

geeksforgeeks.org › python-pandas-dataframe-groupby

Pandas dataframe.groupby() Method - GeeksforGeeks

In this example, we will demonstrate how to group data by a single column using the groupby method. We will work with NBA-dataset that contains information about NBA players, including their teams, points scored, and assists. We'll group the data by the Team column and calculate the total points scored for each team. ...

import pandas as pd
df = pd.read_csv("https://media.geeksforgeeks.org/wp-content/uploads/nba.csv")
team = df.groupby('Team')
print(team.first()) # Let's print the first entries in all the groups formed.

Published December 3, 2024

Stack Overflow

stackoverflow.com › questions › 55835489 › python-dataframe-group-by-and-save-to-a-file-each-group

pandas - Python - dataframe group by and save to a file each group - Stack Overflow

April 25, 2019 - The data is in a pandas DataFrame. When I do a groupby on a specific column, this is what I get:

df.groupby('REPORT_DATE').count()

REPORT_DATE
2016-07-01    2079
2016-07-05    2079
2016-07-06    2079
2016-07-07    2079

How can I automatically save each of these 2079 values in a separate file?

Stack Overflow

stackoverflow.com › questions › 54532087 › how-to-write-groupby-grouped-table-from-dataframe-to-word-document

python - How to write groupby grouped table from dataFrame to word document? - Stack Overflow

Top answer

1 of 1

It looks like you are not writing the index values of the dataframe to file. I have amended your script to show you can use df.index to access these and then write them to the first column.

#...

t = doc.add_table(df.shape[0]+2, df.shape[1]+1) #df.shape[1]+1 to add one extra column for row names

for j in range(df.shape[-1]):
    t.cell(0,j+1).text = df.columns[j] 

for i in range(df.shape[0]):
    for j in range(df.shape[-1]):
        t.cell(i+2,j+1).text = str(df.values[i,j])

# write columns name to file
t.cell(0,0).text = df.columns.name

# write the index values to file
t.cell(1,0).text = df.index.name

for i, my_index in enumerate(df.index):
    t.cell(i+2,0).text = str(my_index)  # convert in case the index values are not strings

Python.org

discuss.python.org › python help

Pandas Dataframes - exporting using the groupby statement into separate sheets in asc/dsc order - Python Help - Discussions on Python.org

February 8, 2024 - Hi, I created a dataframe, and it's all in one Excel sheet. So I used groupby to export each value into a separate worksheet, as such:

with pd.ExcelWriter(full_path) as writer:
    for key, g in df.groupby('Field Name'): …

AskPython

askpython.com › home › pandas groupby: split, aggregate, and transform data with python

Pandas groupby: Split, aggregate, and transform data with Python - AskPython

January 25, 2026 - When you write groupby('department')['salary'], you create a SeriesGroupBy object that operates only on the salary column. This approach runs faster than computing statistics across all columns because pandas processes less data.
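A short sketch of selecting one column before aggregating and then writing the result. The `department` and `salary` names follow the snippet; the `age` column and the values are made up.

```python
import pandas as pd

# Hypothetical data; age is an extra column that the aggregation never touches.
df = pd.DataFrame({
    "department": ["eng", "eng", "ops"],
    "salary": [100, 120, 90],
    "age": [30, 40, 50],
})

# Selecting a single column yields a SeriesGroupBy restricted to salary.
salary_by_dept = df.groupby("department")["salary"]
print(type(salary_by_dept).__name__)

# The aggregated Series is indexed by department; reset_index() makes it a column.
mean_salary = salary_by_dept.mean()
csv_text = mean_salary.reset_index().to_csv(index=False)
print(csv_text)
```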

Pandas

pandas.pydata.org › docs › user_guide › groupby.html

Group by: split-apply-combine — pandas 3.0.2 documentation

The name GroupBy should be quite familiar to those who have used a SQL-based tool (or itertools), in which you can write code like: SELECT Column1, Column2, mean(Column3), sum(Column4) FROM SomeTable GROUP BY Column1, Column2 · We aim to make operations like this natural and easy to express using pandas...

Pandas

pandas.pydata.org › docs › reference › api › pandas.DataFrame.to_csv.html

pandas.DataFrame.to_csv — pandas 3.0.2 documentation

String, path object (implementing os.PathLike[str]), or file-like object implementing a write() function. If None, the result is returned as a string. If a non-binary file object is passed, it should be opened with newline='', disabling universal newlines.
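Both behaviours described in this parameter note can be tried in a few lines. This is a sketch with a made-up frame and a temporary file; the file name `weeks.csv` is hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical aggregated data.
df = pd.DataFrame({"week": [1, 2], "count": [15, 7]})

# No path given: the CSV comes back as a string.
csv_text = df.to_csv(index=False)

# Text-mode file object: open it with newline="" as the documentation advises.
path = Path(tempfile.mkdtemp()) / "weeks.csv"
with open(path, "w", newline="") as handle:
    df.to_csv(handle, index=False)

round_trip = pd.read_csv(path)
print(round_trip.equals(df))
```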
