algorithm - python image recognition - Stack Overflow
What is the best library for simple image recognition?
I want to learn Image Processing. Reddit, what are some great resources?
Does python have any good image manipulation libraries that could be used to create a "splitDepthGif" image filter? Here's an example:
Hello, I would like to learn enough that I can ultimately run a Python script that looks at whatever is on my monitor, locates photos, and interacts with them.
For example, I have a folder with pictures of four animals: fish.png, deer.png, bear.png, horse.png.
I want the script to scan whatever is on my monitor and find out whether any of the pictures I am interested in are present. Ideally I will also be able to get the X,Y coords on the screen so I can automate some mouse movement and clicking as well.
These are static images. I don't need to be able to identify ANY deer, or ANY fish. I just need to be able to find the known photo.
Not asking for code help, but can anyone point me in the right direction on what to research or any guides?
So far I have been able to use NumPy and Matplotlib following this series https://pythonprogramming.net/image-recognition-python/ and I am able to pull in my saved image and work with its array, but I am really not sure how to examine my screen and get the x, y, resolution coords of a positive match.
A typical Python tool chain would be:
- read your images with PIL
- transform them into NumPy arrays
- use SciPy's image filters (linear, rank, and morphological) to implement your solution
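The three steps above can be sketched roughly as follows. This is only an illustration, not the answerer's code: the synthetic white square stands in for a real file (you would normally call Image.open on your own image), and the particular filters and parameters (sigma=2, size=3, iterations=2) are arbitrary choices picked to show one linear, one rank, and one morphological filter from scipy.ndimage.

```python
import numpy as np
from PIL import Image
from scipy import ndimage

# Build a small synthetic grayscale image so the sketch runs without a file.
# In practice you would use Image.open("shape.png") (hypothetical filename).
img = Image.new("L", (64, 64), color=0)
img.paste(255, (16, 16, 48, 48))  # white 32x32 square on a black background

# Step 2: PIL image -> NumPy array
arr = np.asarray(img, dtype=float)

# Step 3a: linear filter (Gaussian smoothing)
smoothed = ndimage.gaussian_filter(arr, sigma=2)

# Step 3b: rank filter (median)
denoised = ndimage.median_filter(arr, size=3)

# Step 3c: morphological filter (erode the binary silhouette)
mask = arr > 128
eroded = ndimage.binary_erosion(mask, iterations=2)

print(arr.shape, int(mask.sum()), int(eroded.sum()))
```

Erosion shrinks the silhouette, so the last number printed is smaller than the pixel count of the original mask.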
As far as differentiating the shapes goes, I would obtain each shape's silhouette by separating it from the background. I would then count its corners using a corner detection algorithm (e.g. Harris). A triangle has 3 corners, a square 4, and a smiley none. Here's a Python implementation of the Harris corner detection with SciPy.
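To make the corner-counting idea concrete, here is a minimal sketch of a Harris-style response with a simple peak count. It is not the linked blog post's code: it swaps in scipy.ndimage.sobel and scipy.ndimage.gaussian_filter for the gradients and the smoothing window, and the names harris_response and count_corners as well as the parameters (sigma, k, rel_threshold, min_distance) are assumptions chosen for illustration.

```python
import numpy as np
from scipy import ndimage

def harris_response(image, sigma=1.5, k=0.05):
    """Harris corner response det(M) - k * trace(M)**2 for every pixel."""
    image = np.asarray(image, dtype=float)
    # Image gradients along rows (y) and columns (x)
    iy = ndimage.sobel(image, axis=0)
    ix = ndimage.sobel(image, axis=1)
    # Gaussian-weighted entries of the structure tensor M
    sxx = ndimage.gaussian_filter(ix * ix, sigma)
    syy = ndimage.gaussian_filter(iy * iy, sigma)
    sxy = ndimage.gaussian_filter(ix * iy, sigma)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2

def count_corners(image, rel_threshold=0.1, min_distance=5):
    """Count local maxima of the response above a relative threshold."""
    r = harris_response(image)
    # A pixel is a candidate if it is the maximum of its neighbourhood
    local_max = ndimage.maximum_filter(r, size=2 * min_distance + 1) == r
    strong = r > rel_threshold * r.max()
    return int(np.count_nonzero(local_max & strong))

# Synthetic silhouette: a filled square on an empty background
square = np.zeros((60, 60))
square[15:45, 15:45] = 1.0
print(count_corners(square))
```

For this synthetic square the count should come out as 4. Real silhouettes with jagged or anti-aliased edges will likely need the threshold and min_distance tuned before the count is reliable.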
Edit:
As you mention in the comments, the blog post didn't present the function that produces the Gaussian kernel needed in the algorithm. Here's an example of such a function from the SciPy Cookbook (great resource btw):
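from numpy import exp, mgrid

def gauss_kern(size, sizey=None):
    """ Returns a normalized 2D gauss kernel array for convolutions """
    size = int(size)
    # Use a square kernel unless a separate y size is given
    if not sizey:
        sizey = size
    else:
        sizey = int(sizey)
    # Coordinate grids spanning [-size, size] and [-sizey, sizey]
    x, y = mgrid[-size:size+1, -sizey:sizey+1]
    g = exp(-(x**2/float(size)+y**2/float(sizey)))
    # Normalize so the kernel sums to 1
    return g / g.sum()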
OpenCV has blob analysis tools. They will give you metrics about the shape, which you can feed to your favourite pattern recognition algorithm :) E.g. a rectangle has a ratio of 1.0 for area / (height * width), while a circle's ratio is about 0.78 (pi/4).
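The area ratio mentioned above can be checked without OpenCV. The sketch below uses plain NumPy on synthetic binary masks rather than OpenCV's blob tools; the function name extent and the test shapes are made up for illustration, and the bounding box is assumed to be axis-aligned.

```python
import numpy as np

def extent(mask):
    """Ratio of blob area to the area of its axis-aligned bounding box."""
    rows, cols = np.nonzero(mask)
    height = rows.max() - rows.min() + 1
    width = cols.max() - cols.min() + 1
    return mask.sum() / (height * width)

# Synthetic binary blobs (hypothetical test shapes)
rect = np.zeros((500, 500), dtype=bool)
rect[100:300, 50:450] = True

yy, xx = np.mgrid[:500, :500]
circle = (yy - 250) ** 2 + (xx - 250) ** 2 <= 200 ** 2

print(extent(rect), extent(circle), np.pi / 4)
```

The rectangle fills its bounding box completely, while the circle's ratio approaches pi/4 as the radius grows. A rotated rectangle would score lower with this axis-aligned box, so the metric works best on shapes with a known orientation.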
I'm writing a script that requires me to figure out how many of an object are on the screen. The objects are kinda low resolution, think the ducks from Duck Hunt. I want to be able to track them on the screen and know their RGB value, even if the screen orientation changes.
However, I don't know much Python, let alone machine learning, so I'd like a library that has some capabilities built in.
I've tried pyautogui, but it doesn't always register the objects on screen, and I can't figure out how its confidence setting works. I've heard of OpenCV, but it looks like it requires some machine learning knowledge.
Cheers.