check if two dataframes are equal pandas

Comparing two pandas dataframes for differences

stackoverflow.com › questions › 19917545 › comparing-two-pandas-dataframes-for-differences

You also need to be careful to create a copy of the DataFrame, otherwise the csvdata_old will be updated with csvdata (since it points to the same object):

csvdata_old = csvdata.copy()
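As a rough illustration of why the copy matters, the sketch below contrasts plain assignment with `.copy()`. The frame contents and the names `alias` and `snapshot` are made up for the example and are not from the original answer.

```python
import pandas as pd

# Hypothetical data standing in for the csvdata frame from the answer.
csvdata = pd.DataFrame({"x": [1, 2, 3]})

alias = csvdata            # plain assignment: both names refer to one object
snapshot = csvdata.copy()  # independent copy of the data

# Modify the original frame in place.
csvdata.loc[0, "x"] = 99

print(alias.loc[0, "x"])     # the alias sees the change
print(snapshot.loc[0, "x"])  # the copy keeps the old value
```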

To check whether they are equal, you can use assert_frame_equal as in this answer:

from pandas.testing import assert_frame_equal
assert_frame_equal(csvdata, csvdata_old)

You can wrap this in a function with something like:

def frames_are_equal(csvdata, csvdata_old):  # helper name is illustrative
    try:
        assert_frame_equal(csvdata, csvdata_old)
        return True
    except Exception:  # apparently AssertionError doesn't catch all cases
        return False

There was discussion of a better way...

Answer from Andy Hayden on Stack Overflow

Pandas

pandas.pydata.org › docs › reference › api › pandas.DataFrame.equals.html

pandas.DataFrame.equals — pandas 3.0.1 documentation

This function allows two Series or DataFrames to be compared against each other to see if they have the same shape and elements. NaNs in the same location are considered equal.
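A small sketch of the NaN behaviour described above, using made-up data: element-wise `==` treats NaN as unequal to itself, while `equals()` treats NaNs in the same position as equal.

```python
import numpy as np
import pandas as pd

# Two frames with identical contents, including a NaN in the same cell.
df1 = pd.DataFrame({"a": [1.0, np.nan], "b": [3.0, 4.0]})
df2 = pd.DataFrame({"a": [1.0, np.nan], "b": [3.0, 4.0]})

# Element-wise comparison: NaN == NaN is False, so not every cell matches.
elementwise_all_equal = bool((df1 == df2).all().all())

# equals() considers NaNs in the same location equal.
frames_equal = bool(df1.equals(df2))

print(elementwise_all_equal, frames_equal)
```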

Stack Overflow

stackoverflow.com › questions › 19917545 › comparing-two-pandas-dataframes-for-differences

python - Comparing two pandas dataframes for differences - Stack Overflow

2 of 10

50

Not sure if this is helpful or not, but I whipped together this quick python method for returning just the differences between two dataframes that both have the same columns and shape.

def get_different_rows(source_df, new_df):
    """Returns just the rows from the new dataframe that differ from the source dataframe"""
    merged_df = source_df.merge(new_df, indicator=True, how='outer')
    changed_rows_df = merged_df[merged_df['_merge'] == 'right_only']
    return changed_rows_df.drop('_merge', axis=1)
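A possible usage of the function above, with two small made-up frames (the `id` and `value` columns are assumptions for illustration). Because no `on` argument is given, the merge joins on all shared columns, so a row whose value changed appears as `right_only`.

```python
import pandas as pd

def get_different_rows(source_df, new_df):
    """Return the rows of new_df that have no exact match in source_df."""
    merged_df = source_df.merge(new_df, indicator=True, how="outer")
    changed_rows_df = merged_df[merged_df["_merge"] == "right_only"]
    return changed_rows_df.drop("_merge", axis=1)

# Hypothetical example data: only the row with id 2 differs.
source_df = pd.DataFrame({"id": [1, 2, 3], "value": [10, 20, 30]})
new_df = pd.DataFrame({"id": [1, 2, 3], "value": [10, 25, 30]})

changed = get_different_rows(source_df, new_df)
print(changed)
```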

Discussions

Is this the most efficient way to compare 2 Pandas dataframes? Finding rows unique to one dataframe.

The pd.merge() function with the argument how='right' will perform a right join on the two dataframes, which means all the rows from the right_test_df and any common rows from the left_test_df will be in the result. The indicator=True parameter adds a column called _merge to the output DataFrame that can have three possible values: left_only, right_only, or both, indicating the source of each row. By filtering for rows where _merge is right_only, you get all rows that are unique to the right_test_df.

However, if you want to find rows that are unique to either DataFrame, you would set how='outer' and then filter for rows where _merge is either right_only or left_only. Here is an example:

    merged_df = pd.merge(left_test_df, right_test_df, how='outer', indicator=True)
    final_df = merged_df[merged_df['_merge'] != 'both']

In this version of the code, final_df will contain all rows that are unique to either the left_test_df or the right_test_df. As with many things in programming (and Python/Pandas in particular), there are often multiple valid ways to accomplish a task. This is one efficient way, but depending on the specific requirements of your task, other methods may be more suitable. More on reddit.com
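The outer-merge approach described in this reply can be tried on toy data. In the sketch below the single `key` column and its values are assumptions; only the frame names come from the thread.

```python
import pandas as pd

# Hypothetical frames sharing one column; keys 2 and 3 appear in both.
left_test_df = pd.DataFrame({"key": [1, 2, 3]})
right_test_df = pd.DataFrame({"key": [2, 3, 4]})

# The outer join keeps every row; _merge records where each row came from.
merged_df = pd.merge(left_test_df, right_test_df, how="outer", indicator=True)

# Keep only rows that appear in exactly one of the two frames.
final_df = merged_df[merged_df["_merge"] != "both"]
print(final_df)
```

Filtering on `== 'left_only'` or `== 'right_only'` instead would narrow the result to rows unique to one particular side.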

r/learnpython

1

2

July 20, 2023

How to compare 2 pandas dataframes that have the same column/on the same column?

You are on the right track! You can merge df1 and df2 on a shared key ("student_id"). This creates a new dataframe (let's call it df3). df3 will have the column values from both df1 and df2, so you can then check whether the two "passed" columns in df3 are equal: df3 = pd.merge(df1, df2, on='student_id') More on reddit.com
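A minimal sketch of that suggestion, assuming both frames have `student_id` and `passed` columns. The data, the suffixes, and the `same` column name are invented for the example.

```python
import pandas as pd

# Hypothetical data: student 2 has a different "passed" value in df2.
df1 = pd.DataFrame({"student_id": [1, 2, 3], "passed": [True, False, True]})
df2 = pd.DataFrame({"student_id": [1, 2, 3], "passed": [True, True, True]})

# Merge on the shared key; suffixes keep the two "passed" columns apart.
df3 = pd.merge(df1, df2, on="student_id", suffixes=("_df1", "_df2"))

# Row-wise comparison of the two "passed" columns.
df3["same"] = df3["passed_df1"] == df3["passed_df2"]
print(df3)
```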

r/learnpython

2

3

August 17, 2021

Pandas: Compare two dataframes that are different lengths

You want to try a merge, so you can have one dataframe with columns from both years. In the how parameter, use 'inner' if you only want players who played in both seasons. For the on parameter, use their unique key, which will be name (if no two players share the same name). More on reddit.com
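The inner merge suggested here might look roughly like this for two frames of different lengths. The player names, the `points` column, and the frame names are made up.

```python
import pandas as pd

# Hypothetical season tables of different lengths.
season_a = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "points": [10, 20, 30]})
season_b = pd.DataFrame(
    {"name": ["Bob", "Cy", "Dee", "Eli"], "points": [25, 28, 5, 7]}
)

# The inner join keeps only players present in both seasons.
both = pd.merge(season_a, season_b, on="name", how="inner", suffixes=("_a", "_b"))
print(both)
```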

r/learnpython

5

1

December 16, 2020

How to test if all values in pandas dataframe column are equal?

Don't think there's a built-in functionality to quickly do that, but it can be done in two steps:

    In [19]: df
    Out[19]:
       A  B
    0  h  h
    1  h  h
    2  h  i

Count the number of uniques in each column:

    In [20]: uniques = df.apply(lambda x: x.nunique())

    In [21]: uniques
    Out[21]:
    A    1
    B    2
    dtype: int64

Use boolean indexing on uniques to filter out rows where the number of uniques is not equal to one. Use the result's index to drop the columns in the original dataframe.

    In [22]: df = df.drop(uniques[uniques==1].index, axis=1)

    In [23]: df
    Out[23]:
       B
    0  h
    1  h
    2  i

More on reddit.com
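The same idea can be written as a short script. This sketch uses `DataFrame.nunique()` directly in place of the `apply` call and reuses the small frame from the reply; the variable name `constant_columns` is an assumption.

```python
import pandas as pd

# Same toy frame as in the reply: column A is constant, column B is not.
df = pd.DataFrame({"A": ["h", "h", "h"], "B": ["h", "h", "i"]})

# Number of distinct values per column.
uniques = df.nunique()

# Columns whose values are all equal have exactly one distinct value.
constant_columns = uniques[uniques == 1].index.tolist()
print(constant_columns)
```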

r/learnpython

4

1

March 9, 2017

Videos

youtube.com

Compare Two pandas DataFrames in Python Explained With Example ...

May 20, 2023

06:25

YouTube

Pandas DataFrame eq() & equals() function in Python | Comparison ...

Find Differences Between Two Columns of pandas DataFrame in Python ...

August 24, 2022

03:49

YouTube

Compare Two pandas DataFrames in Python (Example) | Find Differences ...

How to Check If Two pandas DataFrames are Equal in Python (Example) ...

May 27, 2022

Spark By {Examples}

sparkbyexamples.com › home › pandas › pandas dataframe equals() method

Pandas DataFrame equals() Method - Spark By {Examples}

July 29, 2024 - In Pandas, the equals() method is used to determine if two DataFrame objects contain the same data. It returns True if the DataFrames are exactly equal (i.e., they have the same shape, elements, and data types), otherwise it returns False.
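To illustrate the data-type part of that description, the sketch below compares an integer column with a float column holding the same numbers. The frames are invented for the example.

```python
import pandas as pd

# Same numeric values, but int64 in one frame and float64 in the other.
df_int = pd.DataFrame({"a": [1, 2]})
df_float = pd.DataFrame({"a": [1.0, 2.0]})

# The element-wise comparison only looks at values.
same_values = bool((df_int == df_float).all().all())

# equals() also requires matching dtypes.
strict_equal = bool(df_int.equals(df_float))

print(same_values, strict_equal)
```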

W3Schools

w3schools.com › python › pandas › ref_df_equals.asp

Pandas DataFrame equals() Method

Check if two DataFrames are equal: ... The equals() method compares two DataFrames and returns True if they are equal, in both shape and content, otherwise False....

Pandas

pandas.pydata.org › docs › reference › api › pandas.testing.assert_frame_equal.html

pandas.testing.assert_frame_equal — pandas 3.0.1 documentation

Check DataFrame equality. ... This example shows comparing two DataFrames that are equal but with columns of differing dtypes. >>> from pandas.testing import assert_frame_equal >>> df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}) >>> df2 = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})
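Continuing with the two frames from the documentation snippet, a sketch of how the dtype difference plays out: the default check raises an `AssertionError`, while `check_dtype=False` lets the comparison pass. The `strict_passed` flag is just a helper for this example.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})

# The default comparison is strict about dtypes (int64 vs float64 in "b").
try:
    assert_frame_equal(df1, df2)
    strict_passed = True
except AssertionError:
    strict_passed = False

# With check_dtype=False only the values are compared; no exception here.
assert_frame_equal(df1, df2, check_dtype=False)
print(strict_passed)
```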

Moonbooks

en.moonbooks.org › Articles › How-to-check-if-two-columns-are-equal-identical-with-pandas-

How to check if two columns are equal (identical) with pandas ?

March 3, 2021 - import pandas as pd import numpy ... C Score D 0 5 4 7 7 1 5 9 7 7 2 1 2 6 6 3 5 2 5 5 4 4 4 4 4 · To check if two columns are equal a solution is to use pandas.DataFrame.equals, example: df['Score A'].equals(df['Score B']) returns ·...

Saturn Cloud

saturncloud.io › blog › how-to-confirm-equality-of-two-pandas-dataframes

How to Confirm Equality of Two Pandas DataFrames | Saturn Cloud Blog

December 28, 2023 - Let’s explore each of these methods in detail. The simplest method of comparing two pandas DataFrames is to use the equals() method. This method returns True if the two DataFrames are equal and False otherwise.

Pandas

pandas.pydata.org › docs › reference › api › pandas.DataFrame.compare.html

pandas.DataFrame.compare — pandas 3.0.1 documentation

DataFrame.compare(other, align_axis=1, keep_shape=False, keep_equal=False, result_names=('self', 'other'))[source]# Compare to another DataFrame and show the differences. ... Object to compare with. align_axis{0 or ‘index’, 1 or ‘columns’}, default 1 · Determine which axis to align the comparison on. 0, or ‘index’ : Resulting differences are stacked vertically with rows drawn alternately from self and other.
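A short sketch of `compare()` with its default arguments, on two made-up frames that differ in a single cell. Note that `compare()` expects both frames to have identical row and column labels.

```python
import pandas as pd

# Hypothetical frames: only col2 in row 1 differs.
df1 = pd.DataFrame({"col1": [1, 2, 3], "col2": [10, 20, 30]})
df2 = pd.DataFrame({"col1": [1, 2, 3], "col2": [10, 99, 30]})

# By default only differing rows and columns are kept, with "self" and
# "other" sub-columns showing the value from each frame.
diff = df1.compare(df2)
print(diff)
```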

Medium

medium.com › @tzhaonj › comparing-dataframes-in-pandas-equal-and-compare-9648e8441c5c

Comparing dataframes in Pandas .equal and .compare | by Tim Zhao | Medium

July 30, 2024 - Two powerful methods in Pandas, equals() and compare(), are designed for such tasks. While equals() checks for complete equality between two DataFrames, compare() helps pinpoint specific differences.

Statology

statology.org › home › pandas: how to check if two dataframes are equal

Pandas: How to Check if Two DataFrames Are Equal

October 10, 2022 - #check if two DataFrames are equal df1.equals(df2) False

Finxter

blog.finxter.com › 5-best-ways-to-check-if-two-pandas-dataframes-are-exactly-the-same

5 Best Ways to Check if Two Pandas DataFrames are Exactly the Same – Be on the Right Side of Change

By running df1.compare(df2), it returns an empty DataFrame if the two are equal, and by further checking if comparison.empty is True, we validate that the DataFrames are indeed the same. The assert_frame_equal() function from Pandas’ testing module can be used in test cases to assert that ...

Data Science Parichay

datascienceparichay.com › home › blog › compare two dataframes for equality in pandas

Compare Two DataFrames for Equality in Pandas - Data Science Parichay

October 6, 2020 - Let’s see using some examples of how the equals() function works and what to expect when using it to compare two dataframes. import pandas as pd # two identical dataframes df1 = pd.DataFrame({'A': [1,2], 'B': ['x', 'y']}) df2 = pd.DataFrame({'A': ...

Towards Data Science

towardsdatascience.com › home › latest › comparing pandas dataframes to one another

Comparing Pandas Dataframes To One Another | Towards Data Science

January 18, 2025 - The output of .eq lists out each cell position and tells us whether the values in that cell position were equal between the two dataframes (note that rows 1 and 3 contain errors). ... We can use boolean indexing and the .all method to print out the rows that are erroneous. Boolean indexing is using a set of conditions to decide which rows to print (the rows where our boolean index is equal to True get printed). In: # .all returns True for a row if all values are True
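The `.eq` plus `.all` pattern described above might look like the sketch below. The data is invented (it is not the article's dataset) and is arranged so that rows 1 and 3 differ.

```python
import pandas as pd

# Hypothetical reference frame and a copy with errors in rows 1 and 3.
df1 = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]})
df2 = pd.DataFrame({"a": [1, 0, 3, 4], "b": [5, 6, 7, 0]})

# Cell-by-cell equality: a boolean frame with the same shape.
cell_equal = df1.eq(df2)

# A row is fine only if every cell in it matches.
row_ok = cell_equal.all(axis=1)

# Boolean indexing: keep the rows where at least one cell differs.
bad_rows = df2[~row_ok]
print(bad_rows)
```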

Databricks

databricks.com › blog › simplify-pyspark-testing-dataframe-equality-functions

Simplify PySpark testing with DataFrame equality functions | Databricks Blog

March 6, 2024 - Since we specified that assert_frame_equal should ignore column data types, it now considers the two DataFrames equal. These functions also allow comparisons between Pandas API on Spark objects and pandas objects, facilitating compatibility checks between different DataFrame libraries, as illustrated in this example:

Pandas

pandas.pydata.org › pandas-docs › version › 0.25.0 › reference › api › pandas.DataFrame.equals.html

pandas.DataFrame.equals — pandas 0.25.0 documentation

Compare two DataFrame objects of the same shape and return a DataFrame where each element is True if the respective element in each DataFrame is equal, False otherwise. ... Return True if left and right Series are equal, False otherwise.

GeeksforGeeks

geeksforgeeks.org › pandas › how-to-compare-two-dataframes-with-pandas-compare

How To Compare Two Dataframes with Pandas compare? - GeeksforGeeks

July 23, 2025 - If it is set to False, the equal values are displayed as NaNs. Returns another DataFrame with the differences between the two DataFrames. Before starting, note that the pandas version must be at least 1.1.0. To check that, run this on your cmd or Anaconda navigator cmd. ... If it is 1.1.0 or greater than that, you are ...

LabEx

labex.io › tutorials › pandas-dataframe-equals-method-68617

Pandas DataFrame Equals Method | Data Comparison | LabEx

In this tutorial, you learned how to use the equals() method in Pandas DataFrame to compare two DataFrames. This method is useful when you want to check if two DataFrames are equal in terms of shape and elements.

TutorialsPoint

tutorialspoint.com › python-pandas-check-if-two-dataframes-are-exactly-same

Python Pandas – Check if two Dataframes are exactly same

Create DataFrame2 with two columns − · dataFrame2 = pd.DataFrame( { "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Mercedes', 'Jaguar'], "Reg_Price": [7000, 1500, 5000, 8000, 9000, 6000] } ) To check for equality, use the equals() method − · dataFrame1.equals(dataFrame2) Following is the ...

GeeksforGeeks

geeksforgeeks.org › python › pandas-find-the-difference-between-two-dataframes

Pandas - Find the Difference between two Dataframes - GeeksforGeeks

July 23, 2025 - import pandas as pd # first dataframe ... '20', '24', '40', '22'], 'Weight': [55, 59, 73, 85, 56]}) display(df2) ... By using equals() function we can directly check if df1 is equal to df2. This function is used to determine if ...

reddit.com › r/learnpython › is this the most efficient way to compare 2 pandas dataframes? finding rows unique to one dataframe.

r/learnpython on Reddit: Is this the most efficient way to compare 2 Pandas dataframes? Finding rows unique to one dataframe.

July 20, 2023 -

Still trying to wrap my head around Pandas (and continuing to be blown away by its capabilities every day)...

Say I have 2 dataframes (let's call them left_test_df and right_test_df), and I want to see which rows are only present in one of them. Is something like this the best way to go about doing it?

    merged_df=pd.merge(left_test_df, right_test_df, how='right', indicator=True)
    final_df=merged_df[merged_df['_merge']=='right_only']