PyAutoGUI is a Python module for automating tasks on Windows, Linux and macOS. Here "automating" mainly means controlling the mouse and keyboard, although the module also includes other tools, such as dialogs and screenshots.
The module API is quite simple, as we'll see. Unfortunately, PyAutoGUI functions do not follow the naming convention recommended by the PEP 8 – Style Guide for Python Code.
PyAutoGUI can be installed along with all its dependencies via pip. Open a terminal and run:
pip install pyautogui
If you have problems installing with pip, download the source code, unzip it and run:
python setup.py install
In order to successfully run PyAutoGUI on Linux, the following dependencies must also be installed:
pip install python3-xlib
sudo apt-get install scrot
sudo apt-get install python3-tk
sudo apt-get install python3-dev
On macOS the following is required:
pip install pyobjc-core
pip install pyobjc
On Windows the package does not require any additional installation.
To get mouse position:
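A minimal sketch of this call, using `pyautogui.position()`. The wrapper function name is only illustrative, and the import is placed inside the function on the assumption that loading PyAutoGUI may need a graphical session.

```python
def get_mouse_position():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # position() returns the cursor's current X and Y coordinates.
    x, y = pyautogui.position()
    return x, y
```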
To move the mouse to a certain position:
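A minimal sketch using `pyautogui.moveTo()`; the wrapper function name is illustrative.

```python
def move_cursor():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Move the cursor to X=300, Y=200.
    pyautogui.moveTo(300, 200)
```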
The code places the cursor at position X=300 Y=200, with the origin of coordinates being the upper left corner.
By default the transition occurs instantly. By passing the duration parameter you can tell the module that the movement should take a certain number of seconds.
You can also configure how the transition from the current position to the target position is made via a fourth parameter called tween, which takes one of the tweening functions the module provides. By default the movement is linear.
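As a rough illustration of both parameters: the sketch below passes a duration as the third argument and a tweening function as the fourth. `pyautogui.easeInOutQuad` is one of the easing functions PyAutoGUI exposes (others include `pyautogui.linear`, `pyautogui.easeInQuad`, `pyautogui.easeOutQuad`, `pyautogui.easeInBounce` and `pyautogui.easeInElastic`); that list is taken from the library, not necessarily from the original article. The function names are illustrative.

```python
def glide_to(x, y, seconds=2.0):
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Third argument: how long the movement should take, in seconds.
    pyautogui.moveTo(x, y, seconds)


def glide_to_eased(x, y, seconds=2.0):
    import pyautogui

    # Fourth argument: the tweening function (start slow, speed up, end slow).
    pyautogui.moveTo(x, y, seconds, pyautogui.easeInOutQuad)
```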
The moveRel() function is similar to the previous one, but the X and Y values are relative to the current cursor position, so negative values can also be used.
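A small sketch of a relative movement. Since the origin is the upper left corner, a negative Y offset moves the cursor up. The function name is illustrative.

```python
def nudge_cursor():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # 100 pixels to the right and 50 pixels up from the current position.
    pyautogui.moveRel(100, -50)
```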
To simulate a mouse click:
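A minimal sketch, assuming the three functions meant here are `click()`, `doubleClick()` and `tripleClick()`. The wrapper function name is illustrative.

```python
def click_examples():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    pyautogui.click()        # single click at the current position
    pyautogui.doubleClick()  # two clicks
    pyautogui.tripleClick()  # three clicks
```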
These three functions simulate a left click by default. To change this, specify the button parameter.
You can also specify where on the screen the click should happen:
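A sketch combining a position with the button parameter. PyAutoGUI accepts the strings "left", "middle" and "right" for button; the function name is illustrative.

```python
def right_click_at(x, y):
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Click with the right button at the given screen coordinates.
    pyautogui.click(x, y, button="right")
```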
A click is simulated by generating a mouse-down and a mouse-up event. These events can also be generated individually:
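A minimal sketch of the two separate events, using `mouseDown()` and `mouseUp()`. The wrapper function name is illustrative.

```python
def press_and_release():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    pyautogui.mouseDown()  # press the button and keep it held
    pyautogui.mouseUp()    # release it
```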
To drag and drop:
By default the mouse is dragged by holding down the left button.
dragTo() accepts the same parameters as moveTo().
It is generally recommended that the operation does not occur immediately so that it works correctly on all platforms:
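A sketch of a drag that takes some time to complete. The half-second default is an arbitrary choice for illustration, and the function name is hypothetical.

```python
def drag_slowly(x, y, seconds=0.5):
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Third argument: duration of the drag, in seconds.
    pyautogui.dragTo(x, y, seconds)
```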
To specify the position in relative terms, use dragRel().
Drag and drop is essentially equivalent to the following sequence, as long as the duration parameter is not specified:
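The sequence presumably meant here is a mouse-down, a move and a mouse-up. A sketch under that assumption, with an illustrative function name:

```python
def drag_by_hand(x, y):
    # Imported here so that defining the function does not need a display.
    import pyautogui

    pyautogui.mouseDown()   # hold the button at the current position
    pyautogui.moveTo(x, y)  # move while the button is held
    pyautogui.mouseUp()     # release to drop
```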
To scroll up or down:
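A minimal sketch using `pyautogui.scroll()`, where a positive argument scrolls up and a negative one scrolls down. The amounts are arbitrary and the function name is illustrative.

```python
def scroll_examples():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    pyautogui.scroll(5)   # positive value: scroll up
    pyautogui.scroll(-5)  # negative value: scroll down
```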
The meaning of the argument is highly platform dependent, so I recommend trying out some arbitrary numbers to see which one suits your needs.
To press and release a key:
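A minimal sketch using `pyautogui.press()`; the key chosen and the function name are illustrative.

```python
def press_enter():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Press and release the Enter key.
    pyautogui.press("enter")
```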
To write text:
pyautogui.typewrite("Python Assets")
# "\n" causes the Enter key to be pressed.
pyautogui.typewrite("Python\nAssets")
To perform key combinations:
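A minimal sketch using `pyautogui.hotkey()`, which presses the given keys in order and releases them in reverse order. The Ctrl+C combination and the function name are illustrative.

```python
def copy_selection():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Hold Ctrl, press C, then release both.
    pyautogui.hotkey("ctrl", "c")
```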
We can see a list of available keys via:
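A minimal sketch, assuming the list meant here is `pyautogui.KEYBOARD_KEYS`. The function name is illustrative.

```python
def available_keys():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # KEYBOARD_KEYS holds the key names that can be passed to press().
    return list(pyautogui.KEYBOARD_KEYS)
```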
Use the interval parameter to specify the pause (in seconds) between one key and the next.
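A sketch of the interval parameter with `typewrite()`. The quarter-second default and the function name are illustrative choices.

```python
def type_slowly(text, seconds_between_keys=0.25):
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Wait the given number of seconds after each character.
    pyautogui.typewrite(text, interval=seconds_between_keys)
```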
Message Boxes and Dialogs
PyAutoGUI includes the PyMsgBox module, which allows you to create message boxes and dialogs by internally using Tk.
The alert() function displays a message on the screen with a title, and returns the text of the pressed button ("OK" by default) or None if the window was closed. To change the button text, pass a third argument.
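A minimal sketch of the three arguments described above: message, title and button text. The strings and the function name are illustrative.

```python
def notify():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Arguments: message, window title, and the text shown on the button.
    return pyautogui.alert("The task has finished.", "Notice", "Got it")
```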
If you want the message to show several options, use confirm():
import pyautogui

# Buttons.
OPT_CLOSE = "Yes, close"
OPT_SAVE_AND_CLOSE = "Save All and Close"
OPT_KEEP_WORKING = "No, keep working"

# Create the message.
opt = pyautogui.confirm(
    "Do you want to close the program?",
    "Confirmation",
    [OPT_CLOSE, OPT_SAVE_AND_CLOSE, OPT_KEEP_WORKING]
)

# Make a decision based on the button pressed.
if opt == OPT_CLOSE:
    pass  # ...
elif opt == OPT_SAVE_AND_CLOSE:
    pass  # ...
elif opt == OPT_KEEP_WORKING:
    pass  # ...
To ask the user to enter text, use prompt():
# The function returns None if the window was closed or "Cancel" was pressed.
user = pyautogui.prompt("Enter your username", "Login")
It is possible to pass a default value and a timeout in milliseconds:
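A sketch using the `default` and `timeout` keyword arguments of `prompt()`. The values ("guest", 5000 ms) and the function name are illustrative.

```python
def ask_username():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Pre-fill the text field with "guest" and close the dialog after 5000 ms.
    return pyautogui.prompt(
        "Enter your username", "Login", default="guest", timeout=5000
    )
```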
To take a screenshot:
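A minimal sketch using `pyautogui.screenshot()`. The wrapper function name is illustrative.

```python
def capture_screen():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Capture the whole screen.
    screenshot = pyautogui.screenshot()
    return screenshot
```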
Since PyAutoGUI works internally with PIL/Pillow, screenshot will be an instance of PIL.Image.Image. Thus you can use screenshot.save("output.png") to save the image or screenshot.show() to view it.
Capture a portion of the screen:
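A minimal sketch using the `region` argument of `screenshot()`. The coordinates and the function name are illustrative.

```python
def capture_region():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Capture a 300x400 rectangle starting at the upper left corner.
    return pyautogui.screenshot(region=(0, 0, 300, 400))
```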
region must be a (X, Y, Width, Height) tuple.
To get the color of a pixel as an RGB tuple:
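A minimal sketch using `pyautogui.pixel()`. The coordinates match the ones used later in the text, and the function name is illustrative.

```python
def color_at(x=500, y=400):
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # pixel() returns the color at the given coordinates.
    return pyautogui.pixel(x, y)
```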
To find out if a pixel equals a certain color, you can use:
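A minimal sketch using `pixelMatchesColor()`, with the coordinates and color that appear in the comparison discussed next. The function name is illustrative.

```python
def pixel_is_expected_color():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # True if the pixel at (500, 400) has exactly this RGB color.
    return pyautogui.pixelMatchesColor(500, 400, (225, 228, 229))
```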
Although this code is similar to screenshot.getpixel((500, 400)) == (225, 228, 229), pixelMatchesColor() accepts an extra argument called tolerance that controls how much the color may vary, making the comparison less strict.
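To make the idea of tolerance concrete, here is a pure-Python sketch of such a comparison. It assumes the check is done per channel, with each of the red, green and blue values allowed to differ by at most `tolerance`; the function is hypothetical and not part of PyAutoGUI.

```python
def matches_with_tolerance(actual, expected, tolerance=0):
    """Return True if every RGB channel differs by at most `tolerance`."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected))


# The pixel is (225, 228, 229); the expected color is close but not equal.
exact = matches_with_tolerance((225, 228, 229), (220, 230, 232))
close = matches_with_tolerance((225, 228, 229), (220, 230, 232), tolerance=10)
```

With these values, `exact` is False and `close` is True, since the largest difference on any channel is 5.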
# The color does not match exactly but it is close.
>>> pyautogui.pixelMatchesColor(500, 400, (220, 230, 232), tolerance=10)
True
Locate an image within the screen:
# Returns a (X, Y, Width, Height) tuple if image.png
# is found on screen.
>>> pyautogui.locateOnScreen("image.png")
(759, 44, 96, 48)
The function also accepts images opened by Pillow:
>>> from PIL import Image
>>> image = Image.open("image.png")
>>> pyautogui.locateOnScreen(image)
(759, 44, 96, 48)
If the image appears multiple times on the screen, locateAllOnScreen() returns an iterator with every match:
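A sketch that collects every match into a list. The file name is whatever the caller passes, and the function name is illustrative.

```python
def find_all(image_path):
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # Consume the iterator so that all matches are available at once.
    return list(pyautogui.locateAllOnScreen(image_path))
```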
Locate an image within a region of the screen:
# Also applies to locateAllOnScreen().
>>> pyautogui.locateOnScreen("image.png", region=(0, 0, 1100, 300))
(759, 44, 96, 48)
Specifying a region makes the function considerably faster, since a smaller area has to be searched.
Locate an image considering only grayscale:
# Also applicable to locateAllOnScreen().
>>> pyautogui.locateOnScreen("image.png", grayscale=True)
(759, 44, 96, 48)
Passing grayscale=True speeds up the search by about 30%.
To get the size of the screen:
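A minimal sketch using `pyautogui.size()`, which returns the width and height of the screen in pixels. The function name is illustrative.

```python
def screen_size():
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # size() returns the screen width and height.
    width, height = pyautogui.size()
    return width, height
```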
To determine whether a pair of coordinates is within the screen limits:
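A minimal sketch, assuming the function meant here is `pyautogui.onScreen()`. The wrapper function name is illustrative.

```python
def is_on_screen(x, y):
    # Imported here so that defining the function does not need a display.
    import pyautogui

    # True if (x, y) lies inside the screen boundaries.
    return pyautogui.onScreen(x, y)
```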