The first part is similar to Constantine's answer: you can get a boolean Series marking which rows have changed*:

In [21]: ne = (df1 != df2).any(axis=1)

In [22]: ne
Out[22]:
0    False
1     True
2     True
dtype: bool
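
For readers who want to reproduce this step, here is a minimal sketch. The sample frames are an assumption: they are made-up data chosen to match the outputs shown in this answer, not necessarily the data from the question.

```python
import pandas as pd

# Hypothetical sample data, chosen to be consistent with the outputs above.
df1 = pd.DataFrame({
    "id": [111, 112, 113],
    "Name": ["Jack", "Nick", "Zoe"],
    "score": [2.17, 1.11, 4.12],
    "isEnrolled": [True, False, True],
    "Comment": ["He was late to class", "Graduated", None],
})
df2 = df1.copy()
df2.loc[1, "score"] = 1.21
df2.loc[2, "isEnrolled"] = False
df2.loc[2, "Comment"] = "On vacation"

# True for every row that has at least one differing cell.
ne = (df1 != df2).any(axis=1)

# Row labels of the rows that changed.
changed_rows = ne[ne].index.tolist()
print(changed_rows)
```

Note that `!=` treats two missing values in the same cell as different, so a row with a missing value in both frames would also be flagged.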

Then we can see which entries have changed:

In [23]: ne_stacked = (df1 != df2).stack()

In [24]: changed = ne_stacked[ne_stacked]

In [25]: changed.index.names = ['id', 'col']

In [26]: changed
Out[26]:
id  col
1   score         True
2   isEnrolled    True
    Comment       True
dtype: bool

Here the first index level is the row label and the second is the column that has changed. Next, we can pull the old and new values at those locations:

In [27]: difference_locations = np.where(df1 != df2)

In [28]: changed_from = df1.values[difference_locations]

In [29]: changed_to = df2.values[difference_locations]

In [30]: pd.DataFrame({'from': changed_from, 'to': changed_to}, index=changed.index)
Out[30]:
               from           to
id col
1  score       1.11         1.21
2  isEnrolled  True        False
   Comment     None  On vacation
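
The steps above can be wrapped into one helper. This is only a sketch under the assumption that both frames carry identical row and column labels; `diff_table` and the sample data are hypothetical names made up for illustration. Instead of stacking, it builds the index from the same `np.where` positions used to pick the values, so labels and values stay in step.

```python
import numpy as np
import pandas as pd

def diff_table(left, right):
    """Return a from/to table of differing cells (assumes identical labels)."""
    mask = (left != right).to_numpy()
    rows, cols = np.where(mask)  # positions of differing cells, row by row
    index = pd.MultiIndex.from_arrays(
        [left.index[rows], left.columns[cols]], names=["id", "col"]
    )
    return pd.DataFrame(
        {"from": left.to_numpy()[rows, cols], "to": right.to_numpy()[rows, cols]},
        index=index,
    )

# Hypothetical sample data, consistent with the outputs shown above.
df1 = pd.DataFrame({
    "id": [111, 112, 113],
    "Name": ["Jack", "Nick", "Zoe"],
    "score": [2.17, 1.11, 4.12],
    "isEnrolled": [True, False, True],
    "Comment": ["He was late to class", "Graduated", None],
})
df2 = df1.copy()
df2.loc[1, "score"] = 1.21
df2.loc[2, "isEnrolled"] = False
df2.loc[2, "Comment"] = "On vacation"

table = diff_table(df1, df2)
print(table)
```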

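* Note: it's important that df1 and df2 share the same index here. To overcome this ambiguity, you can restrict the comparison to the shared labels using df1.index.intersection(df2.index) (in recent pandas versions the older df1.index & df2.index spelling no longer performs a set intersection), but I think I'll leave that as an exercise.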
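
One way to approach the exercise in the note, as a rough sketch: restrict both frames to the row labels they share before comparing. The frames `old` and `new` are hypothetical and are assumed to be indexed by a unique id and to have the same columns.

```python
import pandas as pd

# Hypothetical frames indexed by id; only ids 112 and 113 appear in both.
old = pd.DataFrame({"score": [2.17, 1.11, 4.12]}, index=[111, 112, 113])
new = pd.DataFrame({"score": [1.21, 4.12, 3.30]}, index=[112, 113, 114])

# Row labels present in both frames.
shared = old.index.intersection(new.index)

# Compare only the shared rows, which are now identically labelled.
ne = (old.loc[shared] != new.loc[shared]).any(axis=1)
changed_ids = ne[ne].index.tolist()
print(changed_ids)
```

If the columns can differ as well, the same idea applies to `old.columns.intersection(new.columns)`.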

Answer from Andy Hayden on Stack Overflow

2 of 16
143

Highlighting the difference between two DataFrames

It is possible to use the DataFrame style property to highlight the background color of the cells where there is a difference.

Using the example data from the original question

The first step is to concatenate the DataFrames horizontally with the concat function and distinguish each frame with the keys parameter:

df_all = pd.concat([df.set_index('id'), df2.set_index('id')], 
                   axis='columns', keys=['First', 'Second'])
df_all

It's probably easier to swap the column levels and put the same column names next to each other:

df_final = df_all.swaplevel(axis='columns')[df.columns[1:]]
df_final

Now it's much easier to spot the differences between the frames. But we can go further and use the style property to highlight the cells that are different. To do this, we define a custom function, following the pattern described in the pandas styling documentation.

def highlight_diff(data, color='yellow'):
    attr = 'background-color: {}'.format(color)
    other = data.xs('First', axis='columns', level=-1)
    return pd.DataFrame(np.where(data.ne(other, level=0), attr, ''),
                        index=data.index, columns=data.columns)

df_final.style.apply(highlight_diff, axis=None)

Note that this also highlights cells where both frames have missing values, because NaN never compares equal to NaN. You can either fill the missing values first or add extra logic so that those cells don't get highlighted.
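
As a sketch of the "extra logic" option, the variant below skips cells that are missing in both frames. It assumes the same column layout as df_final (column name on the first level, 'First'/'Second' on the second); `highlight_diff_skip_nan` and the sample frame are hypothetical names made up for illustration. Unlike the function above, it only ever marks cells in the 'Second' columns.

```python
import pandas as pd

def highlight_diff_skip_nan(data, color="yellow"):
    """Style 'Second' cells that differ from 'First', ignoring NaN/NaN pairs."""
    attr = f"background-color: {color}"
    styles = pd.DataFrame("", index=data.index, columns=data.columns)
    for col in data.columns.get_level_values(0).unique():
        first = data[(col, "First")]
        second = data[(col, "Second")]
        # Different values, but not when both sides are missing.
        differs = first.ne(second) & ~(first.isna() & second.isna())
        styles.loc[differs, (col, "Second")] = attr
    return styles

# Hypothetical frame laid out like df_final.
columns = pd.MultiIndex.from_product([["score", "Comment"], ["First", "Second"]])
df_final = pd.DataFrame(
    [[2.17, 2.17, "Late", "Late"],
     [1.11, 1.21, "Graduated", "Graduated"],
     [4.12, 4.12, None, None]],
    index=pd.Index([111, 112, 113], name="id"),
    columns=columns,
)

styles = highlight_diff_skip_nan(df_final)
print(styles)
```

It can be passed to the Styler in the same way as before: df_final.style.apply(highlight_diff_skip_nan, axis=None).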
