If you use Chrome or Chromium as your browser, there is a much easier and more stable approach using ONLY pyautogui:
- Press Ctrl + F with pyautogui and type the search text
- Press Ctrl + Enter to 'click' on the search result, i.e. open the link related to the result
With other browsers you have to check whether these keyboard shortcuts also exist.
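A minimal sketch of these two shortcuts, assuming the Chrome/Chromium window already has keyboard focus. The helper name `find_and_open`, the `delay` value and the optional `gui` parameter (which lets you pass a stand-in object instead of pyautogui) are illustrative assumptions, not part of the original answer.

```python
import time


def find_and_open(text, gui=None, delay=0.5):
    """Search the focused page for `text` and open the highlighted hit.

    `gui` defaults to the pyautogui module; it is a parameter only so the
    key sequence can be checked without a desktop session (assumption).
    """
    if gui is None:
        import pyautogui as gui  # imported lazily, needs a desktop session

    gui.hotkey('ctrl', 'f')      # open the browser's find bar
    time.sleep(delay)            # give the find bar time to appear
    gui.write(text)              # type the search term
    time.sleep(delay)            # let the browser highlight the first hit
    gui.hotkey('ctrl', 'enter')  # 'click' the highlighted result
```

Calling `find_and_open('Contact')` while the browser is focused would search for "Contact" and follow the first matching link, if the hit is a link.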
Answer from Ulrich on Stack Overflow
Yes, you can do that, but you additionally need Tesseract (and the Python module pytesseract) for text recognition and PIL for taking screenshots.
Then perform the following steps:
- Open the page
- Open and perform the search (ctrl+f with pyautogui) - the view changes to the first result
- Take a screenshot (with PIL)
- Convert the image to text and data (with Tesseract) and find the text and the position
- Use pyautogui to move the mouse and click on it
Here is the needed code for getting the image and the related data:
from PIL import ImageGrab  # screenshot
import pytesseract
from pytesseract import Output

# Needed on Windows: point pytesseract at the Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r"C:\...\AppData\Local\Programs\Tesseract-OCR\tesseract"

screen = ImageGrab.grab()  # take a screenshot
cap = screen.convert('L')  # convert to grayscale
# image_to_boxes returns one bounding box per recognized character
data = pytesseract.image_to_boxes(cap, output_type=Output.DICT)
print(data)
data contains all the information you need to move the mouse and click on the text.
The downside of this approach is the resource-consuming OCR step, which takes a few seconds on slower machines.
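For the last two steps (find the text, then click it), word-level boxes are usually easier to work with than character boxes. The following is only a sketch that swaps in `pytesseract.image_to_data`, which is not what the answer above uses. The helper names `find_word_center` and `click_word` are hypothetical. It also assumes that screenshot pixels and mouse coordinates match, which may not hold on displays with scaling.

```python
def find_word_center(data, word):
    """Return the (x, y) centre of the first OCR word equal to `word`.

    `data` is the dict returned by pytesseract.image_to_data with
    output_type=Output.DICT. Returns None if the word is not found.
    """
    for i, text in enumerate(data['text']):
        if text.strip().lower() == word.lower():
            x = data['left'][i] + data['width'][i] // 2
            y = data['top'][i] + data['height'][i] // 2
            return x, y
    return None


def click_word(word):
    """Screenshot the screen, OCR it, and click on `word` if it is found."""
    # Imported lazily: these need a desktop session and a Tesseract install.
    from PIL import ImageGrab
    import pytesseract
    from pytesseract import Output
    import pyautogui

    screen = ImageGrab.grab().convert('L')  # grayscale screenshot
    data = pytesseract.image_to_data(screen, output_type=Output.DICT)
    center = find_word_center(data, word)
    if center is not None:
        pyautogui.click(*center)  # move the mouse to the word and click
    return center
```

The matching is case-insensitive and only handles a single word; a phrase would need the boxes of neighbouring words to be combined.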
Read words using pyautogui, open-cv python and pytesseract
Trying to use pyautogui and locateOnScreen
pyautogui - Reading the text on screen using python - Stack Overflow
pyautogui locate on screen returns a wrong coordinates
I can successfully take a screenshot of a part of my screen using pyautogui and I can get open-cv to read the screenshot. The problem is I want pytesseract to read the screenshot without saving it as a file locally. Would there be any way to do this?
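One possible way to do this, sketched under a few assumptions: `pyautogui.screenshot` returns a PIL image held in memory, and `pytesseract.image_to_string` accepts such an image directly, so no file has to be written. The helper name `read_region` and its optional `screenshot` / `to_string` parameters are illustrative only.

```python
def read_region(region, screenshot=None, to_string=None):
    """OCR a screen region given as (left, top, width, height), no file on disk.

    `screenshot` and `to_string` default to pyautogui.screenshot and
    pytesseract.image_to_string; they are parameters only so the flow can
    be tried without a display or a Tesseract install (assumption).
    """
    if screenshot is None:
        import pyautogui  # imported lazily, needs a desktop session
        screenshot = pyautogui.screenshot
    if to_string is None:
        import pytesseract  # needs the Tesseract binary to be installed
        to_string = pytesseract.image_to_string

    image = screenshot(region=region)  # PIL image, kept in memory only
    return to_string(image).strip()    # recognized text without padding
```

If the image has to go through OpenCV first, `numpy.array(image)` gives an array OpenCV can work with (in RGB channel order), and pytesseract accepts NumPy arrays as well as PIL images.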
So, I am VERY new to Python and programming in general and have been researching how to make my program work for the past couple hours.
I am trying to create a program that will automate a repetitive task by clicking particular buttons for me on a particular internet site and printing (to a printer) what shows up when those buttons are pressed.
These buttons do not have the same coordinates on the screen each time, so I am trying to combine a click command with a locateOnScreen command. The program is running from the same directory as the screencaps I took for locateOnScreen - Python gave me a syntax error when I used the full path for the image, so I just used the file name.
When I run the program from IDLE, it will actually try to use that last command and print to the printer - if I confirm, it will simply print the ensuing shell message that pops up -
Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:14:34) [MSC v.1900 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>>
================ RESTART: C:\Users\***\Desktop\program.py ================
>>>
Here is my program's code:
#!\C:\Users\***\AppData\Local\Programs\Python\Python36-32\
import pyautogui
pyautogui.PAUSE = 1
pyautogui.FAILSAFE = True
pyautogui.locateOnScreen('Show Explanation.png')
pyautogui.click
pyautogui.hotkey('ctrl', 'p')
pyautogui.locateOnScreen('Next.png')
pyautogui.click
It seems to me that I either messed up semantically and/or locateOnScreen is not working and trying to find the buttons in the shell window rather than the internet window that is right behind it (which is weird because I made sure the buttons are visible when I run it). I'm sure my code probably has at least one dumb error in it though...
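Two things in the code above look like likely culprits. `pyautogui.click` without parentheses only names the function and never calls it, and the result of `locateOnScreen` is thrown away, so nothing is ever clicked. The Ctrl + P hotkey then goes to whichever window has focus, which would explain why the IDLE shell gets printed. Below is a sketch of how the locate-then-click step might look. The helper name `click_image` and the optional `gui` parameter are assumptions for illustration.

```python
def click_image(image_path, gui=None):
    """Click the centre of `image_path` if it is visible on screen.

    Returns True if the image was found and clicked, else False.
    `gui` defaults to the pyautogui module (parameter is an assumption,
    added so the logic can be checked without a display).
    """
    if gui is None:
        import pyautogui as gui  # imported lazily, needs a desktop session

    try:
        box = gui.locateOnScreen(image_path)  # (left, top, width, height)
    except gui.ImageNotFoundException:
        box = None  # newer PyAutoGUI versions raise instead of returning None

    if box is None:
        return False

    x, y = gui.center(box)  # middle of the matched region
    gui.click(x, y)         # note the parentheses: this actually clicks
    return True
```

In the script this would be used as `click_image('Show Explanation.png')`, then `pyautogui.hotkey('ctrl', 'p')`, then `click_image('Next.png')`. Clicking inside the browser first should also give it focus, so the print shortcut goes to the page rather than the shell. A short `time.sleep` before the first call leaves time to bring the browser to the front.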