Based on this post, we could create sliding windows to form a 2D array in which each row is one window. These windows are merely views into the data array, so they consume no extra memory and are quite efficient. We would then simply apply the desired reduction along each row (axis=1).

Thus, for example, the sliding median could be computed like so -

np.median(strided_app(data, window_len, 1), axis=1)

For the other functions, just substitute the respective NumPy name: np.min, np.max, or np.mean. Please note that this is meant as a generic solution for any NumPy reduction that accepts an axis argument.
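The strided_app helper comes from the linked post and is not shown here. As a minimal sketch of the same idea, assuming NumPy 1.20 or newer, the built-in numpy.lib.stride_tricks.sliding_window_view can stand in for it; the wrapper name sliding_reduce and the sample data are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sliding_reduce(data, window_len, func=np.median):
    """Apply func over every full window of length window_len in a 1D array."""
    # windows has shape (len(data) - window_len + 1, window_len) and is a
    # read-only view into data, so no window contents are copied here.
    windows = sliding_window_view(data, window_len)
    return func(windows, axis=1)

data = np.array([1.0, 5.0, 2.0, 8.0, 3.0, 7.0])  # hypothetical sample data
sliding_median = sliding_reduce(data, 3)
sliding_max = sliding_reduce(data, 3, np.max)
```

Note that only full windows are produced, so the result is shorter than the input by window_len - 1 elements.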

For the best performance, one must still look into specific functions that are built for those purposes. For the four requested functions, we have the builtins, like so -

Median : scipy.signal.medfilt.

Max : scipy.ndimage.maximum_filter1d.

Min : scipy.ndimage.minimum_filter1d.

Mean : scipy.ndimage.uniform_filter1d.
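A brief sketch of how these four built-ins might be called on a 1D array follows. The sample data and variable names are hypothetical, and the functions are imported from the public scipy.signal and scipy.ndimage namespaces.

```python
import numpy as np
from scipy.signal import medfilt
from scipy.ndimage import maximum_filter1d, minimum_filter1d, uniform_filter1d

data = np.array([1.0, 5.0, 2.0, 8.0, 3.0, 7.0])  # hypothetical sample data
window_len = 3  # medfilt requires an odd kernel size

med = medfilt(data, kernel_size=window_len)    # zero-pads at the edges
mx = maximum_filter1d(data, size=window_len)   # default edge mode is 'reflect'
mn = minimum_filter1d(data, size=window_len)
avg = uniform_filter1d(data, size=window_len)
```

Unlike the sliding-window approach, each of these returns an array of the same length as the input, so the edge values depend on how each function pads the data; only the interior elements are directly comparable.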

Answer from Divakar on Stack Overflow
🌐
SciPy
docs.scipy.org › doc › scipy › reference › generated › scipy.signal.medfilt2d.html
medfilt2d — SciPy v1.17.0 Manual
A 2-dimensional input array. ... A scalar or a list of length 2, giving the size of the median filter window in each dimension. Elements of kernel_size should be odd. If kernel_size is a scalar, then this scalar is used as the size in each dimension.
🌐
SciPy
docs.scipy.org › doc › scipy › reference › generated › scipy.ndimage.median_filter.html
median_filter — SciPy v1.17.0 Manual
Either size or footprint must be defined. size gives the shape that is taken from the input array, at every element position, to define the input to the filter function. footprint is a boolean array that specifies (implicitly) a shape, but also which of the elements within this shape will get passed to the filter function. Thus size=(n,m) is equivalent to footprint=np.ones((n,m)). We adjust size to the number of dimensions of the input array, so that, if the input array is shape (10,10,10), and size is 2, then the actual size used is (2,2,2).
🌐
Stack Overflow
stackoverflow.com › questions › 49740518 › python-median-filter-applied-to-3d-array-to-produce-2d-result
Python median filter applied to 3D array to produce 2D result - Stack Overflow
If you were applying a mean-filter, this problem would be trivial: you would take the mean over the z-axis and then apply the mean filter in 2D; this would be exactly equivalent to computing the mean over the full (x,y,z) neighbourhood in one go as the mean operation is associative (if that is the term; I mean: f(f(a,b), c) = f(a, b, c)). In principle, this is not true for the median.
🌐
SciPy
docs.scipy.org › doc › scipy › reference › generated › scipy.signal.medfilt.html
medfilt — SciPy v1.17.0 Manual
An N-dimensional input array. ... A scalar or an N-length list giving the size of the median filter window in each dimension. Elements of kernel_size should be odd. If kernel_size is a scalar, then this scalar is used as the size in each dimension.
2 of 2
1

Because a median filter with a window size of 1 leaves the array unchanged along that axis, we are free to apply the median filter row-wise or column-wise.

For example, this code

from scipy.ndimage import median_filter
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
median_filter(arr, size=3, cval=0, mode='constant')
# mode='constant' with cval=0 extends the input array with zeros
# where the window overlaps the edges, just for visibility and ease of calculation

outputs the expected array, filtered with a (3, 3) window:

array([[0., 2., 0.],
       [2., 5., 3.],
       [0., 5., 0.]])

because median_filter automatically extends size to all dimensions, so we get the same effect with:

median_filter(arr, size=(3, 3), cval=0, mode='constant')

Now, we can also apply median_filter row-wise by setting the first element of size to 1:

median_filter(arr, size=(1, 3), cval=0, mode='constant')

Output:

array([[1., 2., 2.],
       [4., 5., 5.],
       [7., 8., 8.]])

And column-wise, with the same logic:

median_filter(arr, size=(3, 1), cval=0, mode='constant')

Output:

array([[1., 2., 3.],
       [4., 5., 6.],
       [4., 5., 6.]])
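
To check that a window of height 1 really filters each row independently, the single call can be compared with an explicit per-row loop. This is a small sketch; the random test array and the names vectorized and looped are hypothetical.

```python
import numpy as np
from scipy.ndimage import median_filter

rng = np.random.default_rng(0)
arr2d = rng.normal(size=(20, 50))  # hypothetical data: 20 series of 50 samples

# One call: the window has height 1, so rows never mix.
vectorized = median_filter(arr2d, size=(1, 5), mode='constant', cval=0)

# Reference: filter each row separately in a Python loop.
looped = np.stack([
    median_filter(row, size=5, mode='constant', cval=0) for row in arr2d
])
```

If the two results match, the Python-level loop over rows can be dropped in favor of the single call.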
Top answer
1 of 2
2

It looks like you're trying to implement a two-dimensional median filter. The straightforward way to implement such a filter is to have four nested loops: two outer loops over the x and y coordinates of the whole image, and two inner loops over the neighborhood of the center pixel.

It's perhaps easier to describe this in code than in text, so here's some Python code to illustrate:

import statistics

def median_filter_2d(image, size):
    # assumptions:
    #  * image is a height x width list of lists containing source pixel values
    #  * size is an odd number giving the diameter of the filter region
    height, width = len(image), len(image[0])
    # filtered is a height x width array to store result pixel values in
    filtered = [[0] * width for _ in range(height)]
    radius = (size - 1) // 2   # size = 3 -> radius = 1

    for y in range(height):
        top = max(y - radius, 0)
        bottom = min(y + radius, height - 1)

        for x in range(width):
            left = max(x - radius, 0)
            right = min(x + radius, width - 1)
            values = []

            # collect the neighborhood, clipped at the image edges
            for v in range(top, bottom + 1):
                for u in range(left, right + 1):
                    values.append(image[v][u])

            filtered[y][x] = statistics.median(values)

    return filtered

Translating this code into C is left as an exercise.

It's also possible to optimize this code by noting that the neighborhoods of adjacent array cells overlap significantly, so that the values of those neighboring cells can be reused across successive iterations of the outer loops. Since the performance of this algorithm on modern CPUs is essentially limited by RAM access latency, such reuse can provide a significant speedup, especially for large filter sizes.
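
As a simplified illustration of that reuse, here is a 1D sketch (not the 2D case discussed above) that keeps the current window sorted and updates it incrementally instead of rebuilding it at every position. The function name running_median is hypothetical, and only full windows are produced.

```python
from bisect import bisect_left, insort

def running_median(values, size):
    """Median over each full window of odd length `size` in a 1D sequence."""
    window = sorted(values[:size])   # sorted copy of the first window
    mid = size // 2
    medians = [window[mid]]
    # Slide by one: drop the value leaving the window, add the one entering.
    for old, new in zip(values, values[size:]):
        del window[bisect_left(window, old)]
        insort(window, new)
        medians.append(window[mid])
    return medians

result = running_median([1, 5, 2, 8, 3, 7], 3)
```

Each step reuses the size - 1 values shared with the previous window, so only one removal and one insertion are needed per position.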

2 of 2
0

this:

for(i=0;i<size_filter;i++)
for(j=0;j<size_filter;j++)
      temp[i][j]=a[i][j];

is a good starting point. You are just iterating over every pixel of your input array, determining the median of its neighborhood, and writing it to an output array. So instead of temp[i][j]=a[i][j]; you need some function such as WhatEverType calcMedianAt(const WhatEverType a[100][100], int r, int c, int size);

So you can call temp[i][j]=calcMedianAt(a, i,j, 3);

The function itself has to extract the values into a list (with proper border handling), find the median of that list (for example by calling some median function WhatEverType calcMedian(const WhatEverType* data, int len);), and return it.

🌐
TutorialsPoint
tutorialspoint.com › scipy › scipy_median_filter.htm
SciPy - Median Filter
The following example shows how to apply a simple median filter to a 2D image or array to remove noise with the help of the scipy.ndimage.median_filter() function.
Top answer
1 of 2
5

Code review

Peilonrays points out a mixup with the out-of-bounds testing that is valid. The statement if j ... must be within the loop for k .... One of the results is that you add a different number of elements to temp depending on which boundary you're at. But there are better ways to avoid out-of-bounds indexing, see below.

Your biggest bug, however, is that you write the result of the filter into the image you are processing. Median filtering cannot be done in-place. When you update data[i][j], you'll be reading the updated value to compute data[i][j+1]. You need to allocate a new image, and write the result there.
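
A tiny 1D sketch makes the in-place problem concrete. The helper median3 and the sample signal are hypothetical, and edge samples are simply left unchanged to keep the example short.

```python
def median3(values, out):
    """Width-3 median filter reading from `values` and writing into `out`.

    The first and last samples are left as they are in `out`.
    """
    for i in range(1, len(values) - 1):
        out[i] = sorted(values[i - 1:i + 2])[1]
    return out

signal = [1, 9, 2, 8, 3]

# Correct: read from the untouched input, write into a separate copy.
separate = median3(signal, list(signal))

# Buggy: input and output are the same list, so later windows
# read values that were already overwritten.
in_place_data = list(signal)
in_place = median3(in_place_data, in_place_data)
```

Here the two results differ at index 2, where the in-place version reads the already filtered neighbor instead of the original 9.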

I would suggest not adding zeros for out-of-bounds pixels at all, because it introduces a bias to the output pixels near the boundary. The clearest example is for the pixels close to any of the corners. At the corner pixel, with a 3x3 kernel, you'll have 4 image pixels covered by the kernel. Adding 5 zeros for the out-of-bounds pixels guarantees that the output will be 0. For larger kernels this happens in more pixels of course. Instead, it is easy to simply remove the temp.append(0) statements, leading to a non-biased result. Other options are to read values from elsewhere in the image, for example mirroring the image at the boundary or extending the image by extrapolation. For median filtering this has little effect, IMO.
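
The corner bias can be seen with a small sketch that computes one output pixel both ways. The helper median_at and its pad_value parameter are hypothetical names used only for this illustration.

```python
from statistics import median

def median_at(image, y, x, size, pad_value=None):
    """Median of the size x size window centred on (y, x).

    Out-of-bounds samples are skipped when pad_value is None,
    otherwise they are replaced by pad_value.
    """
    r = size // 2
    values = []
    for v in range(y - r, y + r + 1):
        for u in range(x - r, x + r + 1):
            if 0 <= v < len(image) and 0 <= u < len(image[0]):
                values.append(image[v][u])
            elif pad_value is not None:
                values.append(pad_value)
    return median(values)

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
zero_padded_corner = median_at(image, 0, 0, 3, pad_value=0)  # 5 zeros + 4 pixels
cropped_corner = median_at(image, 0, 0, 3)                   # 4 pixels only
```

With zero padding the five added zeros outvote the four real pixels, so the corner output is 0 regardless of the image content. Note that statistics.median averages the two middle values when the cropped window holds an even number of samples.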

You set temp = [] at the very beginning of your function, then reset it when you're done using it, in preparation for the next loop. Instead, initialize it once inside the main double-loop over the pixels:

for i in range(len(data)):
   for j in range(len(data[0])):
      temp = []
      # ...

You're looping over i and j as image indices, then over z and c or k for filter kernel indices. c and k have the same function in two different loops, I would suggest using the same variable for that. z doesn't really fit in with either c or k. I would pick two names that are related in the way that i and j are, such as m and n. The choice of variable names is always very limited if it's just one letter. Using longer names would make this code clearer: for example img_column, img_row, kernel_column, kernel_row.


Out-of-bounds checking

This concludes my comments on your code. Now I'd like to offer some alternatives for out-of-bounds checking. These tests are rather expensive when performed for every pixel: it's a test that is done $nk$ times (with $n$ pixels in the image and $k$ pixels in the kernel). Maybe in Python the added cost is relatively small, it's an interpreted language after all, but for a compiled language these tests can easily amount to doubling processing time. There are 3 common alternatives that I know of. I will use border = filter_size // 2, and presume filter_size is odd. It is possible to adjust all 3 methods to even-sized filters.

Separate loops for image border pixels

The idea here is that the loop over the first and last border pixels along each dimension are handled separately from the loop over the core of the image. This avoids all tests. But it does require some code duplication (all in the name of speed!).

for i in range(border):
   # here we loop over the kernel from -i to border+1
for i in range(border, len(data)-border):
   # here we loop over the full kernel
for i in range(len(data)-border, len(data)):
   # here we loop over the kernel from -border to len(data)-i

Of course, within each of those loops, a similar set of 3 loops is necessary to loop over j. The filter logic is thus repeated 9 times. In a compiled language, where this is the most efficient method, code duplication can be avoided with inlined functions or macros. I don't know how a Python function call compares to a bunch of tests for out-of-bounds access, so can't comment on the usefulness of this method in Python.

A separate code path for border pixels

The idea here is to do out-of-bounds checking only for those pixels that are close to the image boundary. For pixels within the border, you use a version of the filtering logic with out-of-bounds checking. For the pixels in the core of the image (which is the big majority of pixels), you use a second version of the logic without out-of-bounds checking.

for i in range(len(data)):
   i_border = i < border or i >= len(data)-border
   for j in range(len(data[0])):
      j_border = j < border or j >= len(data[0])-border
      if i_border or j_border:
         # filtering with bounds checking
      else:
         # filtering without bounds checking

Padding the image

The simplest solution, and also the most flexible one, is to create a temporary image that is larger than the input image by 2*border along each dimension, and copy the input image into it. The "new" pixels can be filled with zeros (to replicate what OP intended to do), or with values taken from the input image (for example by mirroring the image at the boundary or extrapolating in some other way).

The filter now never needs to check for out-of-bounds reads. When the filtering kernel is placed over any of the input image pixels, all samples fall within the padded image.

Since for this type of filtering it is necessary to create a new output image anyway (it is not possible to compute it in-place, as I mentioned before), this is not a huge cost: the original input image can now be re-used as output image.

This solution leads to the simplest code, allows for all sorts of boundary extension methods without complicating the filtering code, and often results in the fastest code too.
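
A minimal sketch of the padding approach, assuming NumPy and an odd filter size; the function name median_filter_padded and its pad_mode parameter are hypothetical, and pad_mode is passed straight to np.pad.

```python
import numpy as np

def median_filter_padded(image, size, pad_mode='constant'):
    """Median filter that pads first, so the inner loop needs no bounds checks."""
    border = size // 2
    # 'constant' pads with zeros by default; other np.pad modes such as
    # 'reflect' or 'edge' change only this line, not the filtering loop.
    padded = np.pad(image, border, mode=pad_mode)
    out = np.empty_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # The window for output pixel (i, j) starts at (i, j) in the padded image.
            out[i, j] = np.median(padded[i:i + size, j:j + size])
    return out

img = np.arange(1.0, 10.0).reshape(3, 3)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result = median_filter_padded(img, 3)
```

Switching the boundary treatment is then a one-argument change, which is the flexibility described above.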

2 of 2
3

You seem to have a few bugs.

  1. if i + z - indexer < 0 or i + z - indexer > len(data) - 1:
    

    If i and z are 0, where indexer is 1, then you'll have 0 + 0 - 1 < 0. This would mean that you'd replace the data in (-1, j), (0, j) and (1, j) to 0. Since 0 and 1 probably do contain data this is just plain wrong.

  2. if j + z - indexer < 0 or j + indexer > len(data[0]) - 1:
          temp.append(0)
    

    This removes some data, meaning that the median is shifted. Say you should have (0, 0, 0, 1, 2, 3), however you removed the first three because of this you'd have (0, 1, 2, 3). Now the median is 1 rather than 0.


Your code would be simpler if you:

  1. Made a window list, that contained all the indexes that you want to move to.
  2. Have an if to check if the data in that index is out of bounds.
  3. If it's out of bounds default to 0.
  4. If it's not out of bounds use the data.

This could become:

from copy import deepcopy

def median_filter(data, filter_size):
    indexer = filter_size // 2
    # offsets of every kernel cell relative to the centre pixel
    window = [
        (i, j)
        for i in range(-indexer, filter_size-indexer)
        for j in range(-indexer, filter_size-indexer)
    ]
    index = len(window) // 2
    # write into a copy so that already-filtered pixels are never read back
    filtered = deepcopy(data)
    for i in range(len(data)):
        for j in range(len(data[0])):
            filtered[i][j] = sorted(
                0 if (
                    min(i+a, j+b) < 0
                    or len(data) <= i+a
                    or len(data[0]) <= j+b
                ) else data[i+a][j+b]
                for a, b in window
            )[index]
    return filtered
🌐
Sarnold
sarnold.github.io › medians-1D › median_search.html
Fast median search: an ANSI C implementation
There is now an example Python implementation of an adaptive median image filter. The arguments allow for both a variable window size and an adaptive threshold parameter. Aside from argument and error handling, the Python code uses both numpy and pillow (formerly Python Imaging Library) to manipulate image files and pixel arrays/vectors.
🌐
scikit-image
scikit-image.org › skimage-tutorials › lectures › 1_image_filters.html
Image filtering — Image analysis in Python
As the name implies, this filter takes a set of pixels (i.e. the pixels within a kernel or “structuring element”) and returns the median value within that neighborhood.
🌐
NVIDIA
docs.nvidia.com › vpi › python › build › vpi.Image.median_filter.html
vpi.Image.median_filter — VPI Python API Reference 4.0 documentation
Runs a 2D median filter over the image. ... Refer to the algorithm explanation for more details and usage examples. ... kernel (Tuple[int, int] or 2D array of int) – The kernel defines the neighborhood of the operation, as either a tuple of the kernel size, e.g.
🌐
GitHub
github.com › scipy › scipy › issues › 13509
signal.medfilt2d vs ndimage.median_filter · Issue #13509 · scipy/scipy
February 5, 2021 - The subsequent PR #9685 added a note in the docs suggesting the use of median_filter instead. However, it also added the same note to signal.medfilt2d, but medfilt2d appears to be faster than median_filter. Taking the example from #9680 and adapting it to a 2D array of approximately the same ...
Author   scipy
🌐
GitHub
github.com › MeteHanC › Python-Median-Filter
GitHub - MeteHanC/Python-Median-Filter: Simple implementation of median filter in python to remove noise from the images. · GitHub
Median_Filter method takes 2 arguments, Image array and filter size. Lets say you have your Image array in the variable called img_arr, and you want to remove the noise from this image using 3x3 median filter.
Starred by 58 users
Forked by 21 users
Languages   Python