I think I understand what you are trying to do. I have no idea of the relative performance of our two machines so maybe you can benchmark it yourself.

from PIL import Image
import numpy as np

# Load images, convert to RGB, then to numpy arrays and ravel into long, flat things
a=np.array(Image.open('a.png').convert('RGB')).ravel()
b=np.array(Image.open('b.png').convert('RGB')).ravel()

# Calculate the sum of the absolute differences divided by number of elements
MAE = np.sum(np.abs(np.subtract(a, b, dtype=np.float64))) / a.shape[0]

The only "tricky" thing in there is forcing the result type of np.subtract() to a float. The image arrays are unsigned 8-bit, so without that the subtraction would wrap around instead of producing negative numbers. It may be worth trying dtype=np.int16 on your hardware to see if that is faster.
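To make the wrap-around concrete, here is a minimal sketch on two tiny hand-made arrays (the values are made up for illustration and are not from the images above). It compares a plain uint8 subtraction with one forced to a signed type:

```python
import numpy as np

# Two tiny stand-ins for flattened 8-bit image data (illustrative values only)
a = np.array([10, 200], dtype=np.uint8)
b = np.array([20, 100], dtype=np.uint8)

# Plain subtraction stays uint8, so 10 - 20 wraps around to 246
wrapped = a - b

# Forcing a signed result type keeps the negative difference
signed = np.subtract(a, b, dtype=np.int16)

# Mean absolute error computed from the signed differences
mae = np.abs(signed).mean()

print(wrapped.tolist())  # [246, 100]
print(signed.tolist())   # [-10, 100]
print(mae)               # 55.0
```

Any signed or floating type wide enough for the range -255 to 255 avoids the problem, which is why int16 is a candidate alongside float.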


A fast way to benchmark it is as follows. Start ipython and then type in the following:

from PIL import Image
import numpy as np

a=np.array(Image.open('a.png').convert('RGB')).ravel()
b=np.array(Image.open('b.png').convert('RGB')).ravel()

Now you can time my code with:

%timeit np.sum(np.abs(np.subtract(a,b,dtype=np.float64))) / a.shape[0]
6.72 µs ± 21.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Or, you can try an int16 version like this:

%timeit np.sum(np.abs(np.subtract(a,b,dtype=np.int16))) / a.shape[0]
6.43 µs ± 30.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

If you want to time your code, paste in your function then use:

%timeit compare_images_pil(img1, img2)
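If IPython is not available, roughly the same comparison can be done with the standard timeit module. This is only a sketch: the arrays are random stand-ins rather than real images, and the helper names mae_float and mae_int16 are made up for the example.

```python
import timeit

import numpy as np

# Random stand-ins for two flattened RGB images (not real image data)
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=30_000, dtype=np.uint8)
b = rng.integers(0, 256, size=30_000, dtype=np.uint8)


def mae_float(a, b):
    # Differences computed as 64-bit floats
    return np.sum(np.abs(np.subtract(a, b, dtype=np.float64))) / a.shape[0]


def mae_int16(a, b):
    # Differences computed as signed 16-bit integers
    return np.sum(np.abs(np.subtract(a, b, dtype=np.int16))) / a.shape[0]


runs = 1000
for fn in (mae_float, mae_int16):
    seconds = timeit.timeit(lambda: fn(a, b), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e6:.2f} µs per call")
```

The absolute numbers depend on the machine and the array size, so only the relative ordering on your own hardware is meaningful.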
Answer from Mark Setchell on Stack Overflow
2 of 2
2

Digging a bit, I found this repository that takes a different approach that is more based on Pillow itself and seems to give similar results.

from PIL import Image
from PIL import ImageChops, ImageStat


def compare_images_pil(img1, img2):
    '''Calculate the difference between two images of the same size
    by comparing channel values at the pixel level.

    Returns the difference as a percentage (0 to 100), or None if the
    images differ in mode, size, or bands.

    Adapted from Nicolas Hahn:
    https://github.com/nicolashahn/diffimg/blob/master/diffimg/__init__.py
    '''

    # Don't compare if images are of different modes or different sizes.
    if (img1.mode != img2.mode) \
            or (img1.size != img2.size) \
            or (img1.getbands() != img2.getbands()):
        return None

    # Generate diff image in memory.
    diff_img = ImageChops.difference(img1, img2)

    # Calculate difference as a ratio.
    stat = ImageStat.Stat(diff_img)

    # Can be [r,g,b] or [r,g,b,a].
    sum_channel_values = sum(stat.mean)
    max_all_channels = len(stat.mean) * 255
    diff_ratio = sum_channel_values / max_all_channels

    return diff_ratio * 100

For my test images sample, the results seem to be the same (except for a few minor float rounding errors) and it runs considerably faster than the first version I had above.
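One likely reason the results agree: averaging the per-channel means and dividing by 255 is the mean absolute error rescaled to a percentage. The sketch below checks that relationship with NumPy only, on random arrays standing in for two images. It does not call Pillow, and the variable names are illustrative.

```python
import numpy as np

# Random stand-ins for two small RGB images (height 4, width 5, 3 channels)
rng = np.random.default_rng(0)
img1 = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)
img2 = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# Per-pixel, per-channel absolute difference, computed in a signed type
diff = np.abs(img1.astype(np.int16) - img2.astype(np.int16))

# Mean absolute error over every channel value
mae = diff.mean()

# Same arithmetic as the Pillow-based function: per-channel means,
# summed, divided by (number of channels * 255), times 100
channel_means = diff.reshape(-1, 3).mean(axis=0)
percent = channel_means.sum() / (len(channel_means) * 255) * 100

print(percent, mae / 255 * 100)  # the two values agree up to float rounding
```

So the percentage is MAE / 255 * 100, which would account for the small float rounding differences mentioned above.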

python - Sum of absolute differences of a number in an array - Stack Overflow
I want to calculate the sum of absolute differences of a number at index i with all integers up to index i-1 in o(n). But I am not able to think of any approach better than o(n^2) . For example: [3... More on stackoverflow.com
🌐 stackoverflow.com
May 22, 2017
python - Sum of absolute difference of values and corresponding indices of an array - Code Review Stack Exchange
I am solving questions on arrays from here. Problem: You are given an array of N integers, A1, A2 ,…, AN. Return maximum value of: f(i, j) for all 1 ≤ i, j ≤ N. f(i, j) is defined as |A... More on codereview.stackexchange.com
🌐 codereview.stackexchange.com
May 13, 2018
Top answer
1 of 2
16

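I can offer an O(n log n) solution for a start: Let fi be the i-th number of the result. Splitting the earlier elements aj (j < i) into those no larger than ai and those larger than ai, we have:

fi = sum_{j<i} |ai - aj| = ai * (count of aj <= ai) - (sum of aj <= ai) + (sum of aj > ai) - ai * (count of aj > ai)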

When walking through the array from left to right while maintaining a binary search tree of the elements a0 to ai-1, we can evaluate every part of the formula in O(log n):

  • Keep subtree sizes to count the elements larger than/smaller than a given one
  • Keep cumulative subtree sums to answer the sum queries for elements larger than/smaller than a given one

We can replace the augmented search tree with some simpler data structures if we want to avoid the implementation cost:

  • Sort the array beforehand. Assign every number its rank in the sorted order
  • Keep a binary indexed tree of 0/1 values to calculate the number of elements smaller than a given value
  • Keep another binary indexed tree of the array values to calculate the sums of elements smaller than a given value

TBH I don't think this can be solved in O(n) in the general case. At the very least you would need to sort the numbers at some point. But maybe the numbers are bounded or you have some other restriction, so you might be able to implement the sum and count operations in O(1).

An implementation:

# binary-indexed tree, allows point updates and prefix sum queries
class Fenwick:
  def __init__(self, n):
    self.tree = [0]*(n+1)
    self.n = n
  def update_point(self, i, val):  # O(log n)
    i += 1
    while i <= self.n:
      self.tree[i] += val
      i += i & -i
  def read_prefix(self, i):        # O(log n)
    i += 1
    total = 0
    while i > 0:
      total += self.tree[i]
      i -= i & -i
    return total

def solve(a):
  rank = { v : i for i, v in enumerate(sorted(a)) }
  res = []
  counts, sums = Fenwick(len(a)), Fenwick(len(a))
  total_sum = 0
  for i, x in enumerate(a):
    r = rank[x]
    num_smaller = counts.read_prefix(r)
    sum_smaller = sums.read_prefix(r)
    res.append(total_sum - 2*sum_smaller + x * (2*num_smaller - i))
    counts.update_point(r, 1)
    sums.update_point(r, x)
    total_sum += x
  return res

print(solve([3,5,6,7,1]))  # [0, 2, 4, 7, 17]
print(solve([2,0,1]))      # [0, 2, 2]
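A quadratic reference implementation is a handy way to cross-check the outputs quoted above on small inputs. This is only a sketch, and the name brute_force is made up here:

```python
def brute_force(a):
    """For each i, sum |a[i] - a[j]| over all j < i (quadratic reference)."""
    return [sum(abs(x - y) for y in a[:i]) for i, x in enumerate(a)]


print(brute_force([3, 5, 6, 7, 1]))  # [0, 2, 4, 7, 17]
print(brute_force([2, 0, 1]))        # [0, 2, 2]
```

Comparing this against the faster version on random inputs, including ones with repeated values, is a cheap way to catch off-by-one mistakes in the rank handling.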
2 of 2
5

Here's an Omega(n log n)-comparison lower bound in the linear decision tree model. This rules out the possibility of a "nice" o(n log n)-time algorithm (two now-deleted answers both were in this class).

There is a trivial reduction to this problem from the problem of computing

f(x1, ..., xn) = sum_i sum_j |xi - xj|.

The function f is totally differentiable at x1, ..., xn if and only if x1, ..., xn are pairwise distinct. The set where f is totally differentiable thus has n! connected components, of which each leaf of the decision tree can handle at most one.
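For comparison with the lower bound, f itself can be computed in O(n log n) by sorting. After sorting, the element at position k is added k times and subtracted n - 1 - k times across the unordered pairs. The sketch below is not part of the answer and its function names are made up. It includes a naive double loop as a cross-check.

```python
def pairwise_abs_sum(xs):
    """Sum of |xi - xj| over all ordered pairs (i, j), via sorting."""
    ordered = sorted(xs)
    n = len(ordered)
    # Over unordered pairs, ordered[k] is added k times and subtracted n-1-k times
    half = sum(x * (2 * k - n + 1) for k, x in enumerate(ordered))
    # Each unordered pair appears twice in the double sum
    return 2 * half


def pairwise_abs_sum_naive(xs):
    """Direct O(n^2) evaluation of the same double sum."""
    return sum(abs(x - y) for x in xs for y in xs)


print(pairwise_abs_sum([3, 5, 6, 7, 1]))        # 60
print(pairwise_abs_sum_naive([3, 5, 6, 7, 1]))  # 60
```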

Top answer
1 of 1
6

First of all, I want to say this is really well presented. It just looks neat, with docstrings and comments and such. If you really want to perfect it, check the Python conventions in PEP 8 and linked documents. For example,

An inline comment is a comment on the same line as a statement. Inline comments should be separated by at least two spaces from the statement. They should start with a # and a single space.

and

The docstring is a phrase ending in a period. It prescribes the function or method's effect as a command ("Do this", "Return that"), not as a description; e.g. don't write "Returns the pathname ...".


I also very much appreciate the case analysis that has gone into the code to define those four cases, and so to reduce the naively quadratic function to a linear one. (Referring back to comments, that case analysis would not be at all amiss in a comment just after the docstring explaining the internal logic of the function.)

Because, per the code and your case analysis, max1 and min1 always contain the same data, you could make the code shorter and reduce its space requirements by only having one of those (and likewise for max2 and min2). However, see the next section.


In terms of the code itself, I suggest getting rid of those arrays. If they are only filled so that you can reduce them to a single maximum or minimum value, you might as well just track the rolling maximum or minimum as you go. This definitely saves space and probably saves time.

I would name the rolling reductions max_sum and max_difference rather than max1 and max2.
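A minimal sketch of the rolling-reduction idea follows. It assumes the objective is f(i, j) = |A[i] - A[j]| + |i - j|, which the review itself does not spell out. The function name and the choice to return 0 for an empty list are also assumptions.

```python
def max_value_index_distance(values):
    """Maximum of |values[i] - values[j]| + |i - j| over all i, j.

    Assumed objective; returns 0 for an empty list (an arbitrary choice).
    """
    if not values:
        return 0

    # Rolling extremes of (value + index) and (value - index)
    max_sum = min_sum = values[0]
    max_difference = min_difference = values[0]

    for i, v in enumerate(values):
        max_sum = max(max_sum, v + i)
        min_sum = min(min_sum, v + i)
        max_difference = max(max_difference, v - i)
        min_difference = min(min_difference, v - i)

    # The best pair maximises either the spread of sums or of differences
    return max(max_sum - min_sum, max_difference - min_difference)


print(max_value_index_distance([1, 3, -1]))  # 5
```

This keeps four scalars instead of four arrays, so the extra space is constant while the running time stays linear.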


In terms of testing examples, do look for pathological cases. One case that always deserves to be checked, although it's not always obvious what the behaviour should be, is the case of an empty list.


Finally, it is generally worth using Python 3 in preference to Python 2 for new code, and especially practice code.
