Code review

Peilonrays points out a mixup with the out-of-bounds testing that is valid. The statement if j ... must be within the loop for k .... One of the results is that you add a different number of elements to temp depending on which boundary you're at. But there are better ways to avoid out-of-bounds indexing, see below.

Your biggest bug, however, is that you write the result of the filter into the image you are processing. Median filtering cannot be done in-place. When you update data[i][j], you'll be reading the updated value to compute data[i][j+1]. You need to allocate a new image, and write the result there.
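
A minimal sketch of writing the result into a separate image, assuming `data` is a list of equal-length rows of numbers as in the question. The name `median_filter_copy` is hypothetical, out-of-bounds neighbours are simply skipped, and for an even number of samples the upper of the two middle values is taken:

```python
def median_filter_copy(data, filter_size):
    """Median-filter `data` (a list of equal-length rows) into a new image."""
    border = filter_size // 2
    rows, cols = len(data), len(data[0])
    # Allocate a separate output so already-filtered values are never read back.
    result = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            temp = []
            for m in range(-border, border + 1):
                for n in range(-border, border + 1):
                    # Keep only the samples that fall inside the image.
                    if 0 <= i + m < rows and 0 <= j + n < cols:
                        temp.append(data[i + m][j + n])
            temp.sort()
            result[i][j] = temp[len(temp) // 2]
    return result


image = [[1, 1, 1], [1, 9, 1], [1, 1, 1]]
filtered = median_filter_copy(image, 3)
```

The input `image` is left untouched, so every output pixel is computed from original values only.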

I would suggest not adding zeros for out-of-bounds pixels at all, because it introduces a bias to the output pixels near the boundary. The clearest example is for the pixels close to any of the corners. At the corner pixel, with a 3x3 kernel, you'll have 4 image pixels covered by the kernel. Adding 5 zeros for the out-of-bounds pixels guarantees that the output will be 0. For larger kernels this happens in more pixels of course. Instead, it is easy to simply remove the temp.append(0) statements, leading to a non-biased result. Other options are to read values from elsewhere in the image, for example mirroring the image at the boundary or extending the image by extrapolation. For median filtering this has little effect, IMO.
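
The corner case can be checked with a few lines of arithmetic. This is only an illustration: the four pixel values are made up, and `statistics.median` from the standard library stands in for the sort-and-pick step of the filter:

```python
from statistics import median

# The four image pixels under a 3x3 kernel centred on a corner pixel
# (values are made up for illustration).
corner_pixels = [200, 210, 190, 205]

# Padding with zeros adds five extra samples, so zero becomes the majority.
with_zeros = median(corner_pixels + [0] * 5)

# Dropping the out-of-bounds samples keeps the median among real pixel values.
without_zeros = median(corner_pixels)
```

With five zeros among nine samples the middle value is necessarily zero, whatever the four real pixels are (as long as they are non-negative).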

You set temp = [] at the very beginning of your function, then reset it when you're done using it, in preparation for the next loop. Instead, initialize it once inside the main double-loop over the pixels:

for i in range(len(data)):
   for j in range(len(data[0])):
      temp = []
      # ...

You're looping over i and j as image indices, then over z and c or k for filter kernel indices. c and k serve the same purpose in two different loops, so I would suggest using the same variable for both. z doesn't really fit in with either c or k. I would pick two names that are related in the way that i and j are, such as m and n. The choice of names is always very limited with single letters. Longer names would make this code clearer: for example img_column, img_row, kernel_column, kernel_row.


Out-of-bounds checking

This concludes my comments on your code. Now I'd like to offer some alternatives for out-of-bounds checking. These tests are rather expensive when performed for every pixel -- it's a test that is done $nk$ times (with $n$ pixels in the image and $k$ pixels in the kernel). Maybe in Python the added cost is relatively small, it's an interpreted language after all, but for a compiled language these tests can easily amount to doubling processing time. There are 3 common alternatives that I know of. I will use border = filter_size // 2, and presume filter_size is odd. It is possible to adjust all 3 methods to even-sized filters.

Separate loops for image border pixels

The idea here is that the loop over the first and last border pixels along each dimension are handled separately from the loop over the core of the image. This avoids all tests. But it does require some code duplication (all in the name of speed!).

for i in range(border):
   # here we loop over the kernel from -i to border+1
for i in range(border, len(data)-border):
   # here we loop over the full kernel
for i in range(len(data)-border, len(data)):
   # here we loop over the kernel from -border to len(data)-i

Of course, within each of those loops, a similar set of 3 loops is necessary to loop over j. The filter logic is thus repeated 9 times. In a compiled language, where this is the most efficient method, code duplication can be avoided with inlined functions or macros. I don't know how a Python function call compares to a bunch of tests for out-of-bounds access, so can't comment on the usefulness of this method in Python.
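
To make the three ranges concrete, here is a one-dimensional sketch of the same idea. The names `median_1d_three_loops` and `median_of` are hypothetical, the row is assumed to be at least `filter_size` long, and the small helper is there for readability rather than speed:

```python
def median_1d_three_loops(row, filter_size):
    """1-D median filter that clips the kernel by loop ranges, not by tests."""
    border = filter_size // 2
    size = len(row)
    out = [0] * size

    def median_of(i, lo, hi):
        # Median of row[i + lo] .. row[i + hi - 1]; upper middle if even count.
        window = sorted(row[i + m] for m in range(lo, hi))
        return window[len(window) // 2]

    # Leading border: the kernel is clipped on the left.
    for i in range(border):
        out[i] = median_of(i, -i, border + 1)
    # Core: the full kernel always fits, so no clipping is needed.
    for i in range(border, size - border):
        out[i] = median_of(i, -border, border + 1)
    # Trailing border: the kernel is clipped on the right.
    for i in range(size - border, size):
        out[i] = median_of(i, -border, size - i)
    return out


smoothed = median_1d_three_loops([5, 1, 9, 2, 8, 3, 7], 3)
```

No index is ever compared against the array bounds inside the loops; the bounds are encoded entirely in the `range` arguments.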

A separate code path for border pixels

The idea here is to do out-of-bounds checking only for those pixels that are close to the image boundary. For pixels within the border, you use a version of the filtering logic with out-of-bounds checking. For the pixels in the core of the image (which is the big majority of pixels), you use a second version of the logic without out-of-bounds checking.

for i in range(len(data)):
   i_border = i < border or i >= len(data)-border
   for j in range(len(data[0])):
      j_border = j < border or j >= len(data[0])-border
      if i_border or j_border:
         # filtering with bounds checking
      else:
         # filtering without bounds checking
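
A runnable sketch of this two-path structure, assuming a list-of-rows image and an odd `filter_size`. The function name `median_filter_two_paths` is hypothetical, and the bounds-checked path simply drops out-of-bounds samples:

```python
def median_filter_two_paths(data, filter_size):
    """Median filter that only bounds-checks pixels near the image edge."""
    border = filter_size // 2
    rows, cols = len(data), len(data[0])
    result = [[0] * cols for _ in range(rows)]
    offsets = range(-border, border + 1)
    for i in range(rows):
        i_border = i < border or i >= rows - border
        for j in range(cols):
            j_border = j < border or j >= cols - border
            if i_border or j_border:
                # Near the boundary: keep only samples inside the image.
                temp = [data[i + m][j + n] for m in offsets for n in offsets
                        if 0 <= i + m < rows and 0 <= j + n < cols]
            else:
                # Core of the image: every sample is known to be in bounds.
                temp = [data[i + m][j + n] for m in offsets for n in offsets]
            temp.sort()
            result[i][j] = temp[len(temp) // 2]
    return result


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
filtered = median_filter_two_paths(image, 3)
```

Only the four interior pixels of this small test image take the unchecked path; in a realistically sized image that path handles nearly all pixels.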

Padding the image

The simplest solution, and also the most flexible one, is to create a temporary image that is larger than the input image by 2*border along each dimension, and copy the input image into it. The "new" pixels can be filled with zeros (to replicate what OP intended to do), or with values taken from the input image (for example by mirroring the image at the boundary or extrapolating in some other way).

The filter now never needs to check for out-of-bounds reads. When the filtering kernel is placed over any of the input image pixels, all samples fall within the padded image.

Since for this type of filtering it is necessary to create a new output image anyway (it is not possible to compute it in-place, as I mentioned before), this is not a huge cost: the original input image can now be re-used as output image.

This solution leads to the simplest code, allows for all sorts of boundary extension methods without complicating the filtering code, and often results in the fastest code too.
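
A pure-Python sketch of the padding approach, using mirroring at the boundary as the fill rule. The names `pad_mirror` and `median_filter_padded` are hypothetical, `filter_size` is assumed odd, and `border` is assumed smaller than both image dimensions. For simplicity the sketch allocates a fresh result rather than writing back into the input:

```python
def pad_mirror(data, border):
    """Return a copy of `data` grown by `border` pixels on every side,
    filled by mirroring at the boundary (the edge pixel is not repeated)."""
    rows, cols = len(data), len(data[0])

    def mirror(index, size):
        # Reflect an out-of-range index back into [0, size).
        if index < 0:
            return -index
        if index >= size:
            return 2 * (size - 1) - index
        return index

    return [
        [data[mirror(i, rows)][mirror(j, cols)]
         for j in range(-border, cols + border)]
        for i in range(-border, rows + border)
    ]


def median_filter_padded(data, filter_size):
    border = filter_size // 2
    padded = pad_mirror(data, border)
    rows, cols = len(data), len(data[0])
    result = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Pixel (i, j) sits at (i + border, j + border) in `padded`,
            # so its window starts at (i, j) there and never leaves the array.
            temp = sorted(padded[i + m][j + n]
                          for m in range(filter_size)
                          for n in range(filter_size))
            result[i][j] = temp[len(temp) // 2]
    return result


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
padded = pad_mirror(image, 1)
filtered = median_filter_padded(image, 3)
```

Switching to zero padding or another extension rule only means swapping `pad_mirror` for a different padding function; the filtering loop stays the same.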

Answer from Cris Luengo on Stack Exchange
🌐
GitHub
github.com › MeteHanC › Python-Median-Filter
GitHub - MeteHanC/Python-Median-Filter: Simple implementation of median filter in python to remove noise from the images. · GitHub
Median_Filter method takes 2 arguments, Image array and filter size. Lets say you have your Image array in the variable called img_arr, and you want to remove the noise from this image using 3x3 median filter.
Starred by 58 users
Forked by 21 users
Languages   Python
🌐
SciPy
docs.scipy.org › doc › scipy › reference › generated › scipy.ndimage.median_filter.html
median_filter — SciPy v1.17.0 Manual
>>> from scipy import ndimage, datasets
>>> import matplotlib.pyplot as plt
>>> fig = plt.figure()
>>> plt.gray()  # show the filtered result in grayscale
>>> ax1 = fig.add_subplot(121)  # left side
>>> ax2 = fig.add_subplot(122)  # right side
>>> ascent = datasets.ascent()
>>> result = ndimage.median_filter(ascent, size=20)
>>> ax1.imshow(ascent)
>>> ax2.imshow(result)
>>> plt.show()
🌐
ResearchGate
researchgate.net › figure › Median-Filter-implementation-using-Python_fig9_332574579
Median Filter implementation using Python. | Download Scientific Diagram
The research concerns the validation of the effectiveness of image filtering methods including Wiener Filter and Median Filter. The filters were implemented in Python and the source code is available at: https://github.com/tranleanh/Wiener-Median-Comparison
🌐
PyiHub
pyihub.org › home › how to implement the median filter in image processing with python?
How to Implement the Median Filter in Image Processing with Python? - PyiHub
July 4, 2024 - In conclusion, the median filter is a valuable tool for processing noisy images while preserving important image details. This nonlinear smoothing filter is particularly effective at eliminating salt and pepper noise. In this tutorial, we learned how to implement a median filter in Python using the OpenCV library.
2 of 2
3

You seem to have a few bugs.

  1. if i + z - indexer < 0 or i + z - indexer > len(data) - 1:
    

    If i and z are 0, where indexer is 1, then you'll have 0 + 0 - 1 < 0. This would mean that you'd replace the data at (-1, j), (0, j) and (1, j) with 0. Since rows 0 and 1 do contain data, this is just plain wrong.

  2. if j + z - indexer < 0 or j + indexer > len(data[0]) - 1:
          temp.append(0)
    

    This removes some data, meaning that the median is shifted. Say you should have (0, 0, 0, 1, 2, 3), however you removed the first three because of this you'd have (0, 1, 2, 3). Now the median is 1 rather than 0.


Your code would be simpler if you:

  1. Made a window list, that contained all the indexes that you want to move to.
  2. Have an if to check if the data in that index is out of bounds.
  3. If it's out of bounds default to 0.
  4. If it's not out of bounds use the data.

This could become:

def median_filter(data, filter_size):
    indexer = filter_size // 2
    window = [
        (i, j)
        for i in range(-indexer, filter_size-indexer)
        for j in range(-indexer, filter_size-indexer)
    ]
    index = len(window) // 2
    # Write into a new image; filtering in place would feed already-filtered
    # values into the windows of later pixels.
    result = [[0] * len(data[0]) for _ in data]
    for i in range(len(data)):
        for j in range(len(data[0])):
            result[i][j] = sorted(
                0 if (
                    min(i+a, j+b) < 0
                    or len(data) <= i+a
                    or len(data[0]) <= j+b
                ) else data[i+a][j+b]
                for a, b in window
            )[index]
    return result
🌐
GitHub
github.com › RamSriKorukonda › Median-Filter-from-Scratch
GitHub - RamSriKorukonda/Median-Filter-from-Scratch: Creating a Median Filter for denoising the images from scratch using Python · GitHub
Creating a Median Filter for denoising the images from scratch using Python - RamSriKorukonda/Median-Filter-from-Scratch
Author   RamSriKorukonda
🌐
LinuxTut
linuxtut.com › en › 05c4911c0b2f726bb00a
Image processing with Python 100 knock # 10 median filter
September 29, 2020 -

import numpy as np
import cv2
import matplotlib.pyplot as plt

def medianFilter(img, k):
    w, h, c = img.shape
    size = k // 2
    # 0 padding process
    _img = np.zeros((w+2*size, h+2*size, c), dtype=float)
    _img[size:size+w, size:size+h] = img.copy().astype(float)
    dst = _img.copy()
    # Filtering process
    for x in range(w):
        for y in range(h):
            for z in range(c):
                dst[x+size, y+size, z] = np.median(_img[x:x+k, y:y+k, z])
    dst = dst[size:size+w, size:size+h].astype(np.uint8)
    return dst

# Image reading
img = cv2.imread('image.jpg')
# Median filter
# Second argument: Filter size
img = medianFilter(img, 15)
# Save image
cv2.imwrite('result.jpg', img)
# Image display
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()
🌐
DEV Community
dev.to › enzoftware › how-to-build-amazing-image-filters-with-python-median-filter---sobel-filter---5h7
How to build amazing image filters with Python— Median filter 📷 , Sobel filter ⚫️ ⚪️ - DEV Community
December 9, 2017 - Taken from http://artemhlezin.com/2016/09/04/median.html This filter is used to eliminate the ‘noise’ in images, mainly salt-and-pepper noise. There is not much theory beyond the one in the picture. This is how the filter works: it gets all the values inside a mask, sorts them, and then assigns the median value to the coordinate. This is how an image with salt-and-pepper noise looks: In Python 🐍 the filter works like this, enter to check the result:
🌐
GeeksforGeeks
geeksforgeeks.org › python-pil-medianfilter-and-modefilter-method
Python PIL | MedianFilter() and ModeFilter() method | GeeksforGeeks
June 29, 2019 - The ImageFilter module contains definitions for a pre-defined set of filters, which can be used with the Image.filter() method. PIL.ImageFilter.MedianFilter() method creates a median filter.
🌐
GitHub
github.com › sarnold › adaptive-median
GitHub - sarnold/adaptive-median: Adaptive-median image filter in pure python - use with medians-1D · GitHub
This is just a python implementation of an adaptive median image filter, which is essentially a despeckling filter for grayscale images. The other piece (which you can disable by commenting out the import line for medians_1D) is a set of example C median filters and swig wrappers (see the medians-1D repo for that part).
Starred by 13 users
Forked by 15 users
Languages   Jupyter Notebook 98.7% | Python 1.3%
Top answer
1 of 2
1

Most of the answers here seem to center on performance optimizations of the naive median filtering algorithm. It's worth noting that the median filters you would find in imaging packages like OpenCV/scikit-image/MATLAB/etc. implement faster algorithms.

http://nomis80.org/ctmf.pdf

If you are median filtering uint8 data, there are a lot of clever tricks to be played with reusing histograms as you move from neighborhood to neighborhood.

I would use the median filter in an imaging package rather than trying to roll one yourself if you care about speed.
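
As one example of the library route, here is a short sketch using `scipy.ndimage.median_filter`. The test image is made up, and nothing here depends on which algorithm SciPy uses internally:

```python
import numpy as np
from scipy import ndimage

# A small uint8 test image with a single "salt" pixel (made-up data).
image = np.full((5, 5), 10, dtype=np.uint8)
image[2, 2] = 255

# size=3 requests a 3x3 neighbourhood; mode="reflect" is SciPy's default
# boundary handling and is spelled out here only for clarity.
filtered = ndimage.median_filter(image, size=3, mode="reflect")
```

The call returns a new array of the same shape and dtype, so the input is not modified.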

2 of 2
1

I think you want to replace all pixels at a given radius from the centre of the image with the median of the pixels at that same radius in the input image.

I propose to warp the image to polar coordinates, calculate the median along each radius, and then warp back to Cartesian coordinates.

I generated some test data of a decent size like this:

#!/usr/bin/env python3

import cv2
import numpy as np
from PIL import Image

w, h = 600, 600
a = np.zeros((h,w),np.uint8)

# Generate some arcs
for s in range(1,6):
    radius = int(s*w/14)
    centre = (int(w/2), int(w/2))
    axes = (radius, radius)
    angle = 360
    startAngle = 0
    endAngle = 72*s

    cv2.ellipse(a, centre, axes, angle, startAngle, endAngle, 255, 2)

That gives this:

Image.fromarray(a.astype(np.uint8)).save('start.png')

def orig(a):
    b = a.copy().flatten()
    y,x = np.indices((a.shape))
    center = [len(x)//2, len(y)//2]
    r = np.hypot(x-center[0],y-center[1])
    r = r.astype(int) # integer part of radii (bin size = 1)
    set_r = set(r.flatten()) # get the list of r without duplication
    max_r = max(set_r) # determine the maximum r
    median_r = np.array([0.]*len(r.flatten())) # array of median I for each r
    for j in set_r:
        result = np.where(r.flatten() == j) 
        median_r[result[0]] = np.median(b[result[0]])
    return median_r

def me(a):
    h, w = a.shape
    centre = (int(h/2), int(w/2))
    maxRad = np.sqrt(((h/2.0)**2.0)+((w/2.0)**2.0))
    pol = cv2.warpPolar(a.astype(float), a.shape, centre, maxRad, flags=cv2.WARP_POLAR_LINEAR+cv2.WARP_FILL_OUTLIERS)
    polmed = np.median(pol,axis=0,keepdims=True)
    polmed = np.broadcast_to(polmed,a.shape)
    res = cv2.warpPolar(polmed, a.shape, centre,  maxRad, cv2.WARP_INVERSE_MAP)
    return res.astype(np.uint8)

a_med = orig(a).reshape(a.shape)

Image.fromarray(a_med.astype(np.uint8)).save('result.png')

r = me(a)
Image.fromarray(r).save('result-me.png')

The result is the same as yours, i.e. it removes all arcs less than 180 degrees and fills all arcs over 180 degrees:

But the timing for mine is 10x faster:

In [58]: %timeit a_med = orig(a).reshape(a.shape)                                                                               
287 ms ± 17.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [59]: %timeit r = me(a)                                                                                                      
29.9 ms ± 107 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In case you are having difficulty imagining what I get after warpPolar(), it looks like this. Then I use np.median() to take the median down the columns, i.e. axis=0:

Keywords: Python, radial mean, radial median, cartesian coordinates, polar coordinates, rectangular, warpPolar, linearPolar, OpenCV, image, image processing

🌐
YouTube
youtube.com › watch
Tutorial 33 - Image filtering in python - Median filter for denoising images - YouTube
In microscopy, noise arises from many sources including electronic components such as detectors and sensors. Salt & pepper noise may also show up due to erro...
Published   May 28, 2020
🌐
Medium
medium.com › @enzoftware › how-to-build-amazing-images-filters-with-python-median-filter-sobel-filter-️-️-22aeb8e2f540
How to build amazing image filters with Python— Median filter 📷 , Sobel filter ⚫️ ⚪️ | by Enzo Lizama Paredes | Medium
July 21, 2020 - How to build amazing image filters with Python— Median filter 📷 , Sobel filter ⚫️ ⚪️ This post is published too on my personal blog. Take a look at it 👀. Nowadays, I’m starting in a …
🌐
GitHub
github.com › topics › median-filter
median-filter · GitHub Topics · GitHub
python image-processing median-filter wiener-filter ... Implementation of various image processing methods from scratch in python.
🌐
Medium
medium.com › @sarves021999 › noise-filtering-mean-median-mid-point-filter-72ab3be76da2
Noise filtering (Mean, Median &Mid-point filter) without OpenCV Library | by Sarves | Medium
February 17, 2023 - The article provides a step-by-step guide to implementing a linear filter using Python programming language, which can apply various filters such as edge detection, blurring, and sharpening to images.
🌐
GitHub
github.com › savan77 › Median-Filter
GitHub - savan77/Median-Filter: Implementation of median filter in python to remove noise from an image.
Median Filtering is a digital filtering technique, used to remove noise from an image. This type of filtering can be a pre-processing step for further processing like object/edge detection. Detailed explanation of median filter can be found here (http://blog.savanvisalpara.com/median-filter.html) #Run Dependency for this script is python 2.7, numpy and scipy.
Author   savan77
🌐
Algorithmexamples
python.algorithmexamples.com › web › digital_image_processing › filters › median_filter.html
2000+ Algorithm Examples in Python, Java, Javascript, C, C++, Go, Matlab, Kotlin, Ruby, R and Scala
By using the median instead of the mean, the algorithm inherently discards outlier values, ensuring that only the relevant information is used for filtering. This makes median filters highly effective in removing noise while preserving essential features like edges and textures.
🌐
Literateprograms
literateprograms.org › median_filter__python_.html
Median filter (Python) - LiteratePrograms
Now, the actual image filtering function: The first thing we have to do is get the pixels out of the image objects for faster access (Image.GetPixel is convenient, but slow, and worse, results in a Python-to-C-call which, you guessed it, can't be optimized by Psyco): ... """filters a region of an image with a circular-shaped median/quantile filter.