PyAutoGUI - Cross-Platform Automation Module

Python Assets

2022-09-21

PyAutoGUI is a Python module for automating tasks on Windows, Linux and macOS. "Automating" usually means controlling the mouse and keyboard, although PyAutoGUI also includes other tools, such as dialogs and screenshots.

The module API is quite simple, as we'll see. Unfortunately, PyAutoGUI function names use camelCase (for example, moveTo()) instead of the snake_case naming convention recommended by PEP 8 – Style Guide for Python Code.

Installing PyAutoGUI

PyAutoGUI can be installed along with all its dependencies via pip. Open a terminal and run:

pip install pyautogui

If you have problems installing with pip, download the source code, unzip it and run:

python setup.py install

In order to successfully run PyAutoGUI on Linux, the following dependencies must also be installed:

pip install python3-xlib
sudo apt-get install scrot
sudo apt-get install python3-tk
sudo apt-get install python3-dev

On macOS the following is required:

pip install pyobjc-core
pip install pyobjc

On Windows the package does not require any additional installation.

Mouse Automation

To get mouse position:

>>> import pyautogui
>>> pyautogui.position()
Point(x=913, y=444)

To move the mouse to a certain position:

>>> pyautogui.moveTo(300, 200)

The code places the cursor at position X=300 Y=200, with the origin of coordinates being the upper left corner.

By default the transition occurs instantly. By passing the duration parameter you can tell the module that the movement should take a certain number of seconds.

# Take 3 seconds to move the mouse.
>>> pyautogui.moveTo(300, 200, 3)

By default the cursor moves linearly from the current position to the target position. A fourth parameter, tween, changes how that transition is made. The options are:

linear (default value)
easeInBack
easeInBounce
easeInCirc
easeInCubic
easeInElastic
easeInExpo
easeInOutBack
easeInOutBounce
easeInOutCirc
easeInOutCubic
easeInOutElastic
easeInOutExpo
easeInOutQuad
easeInOutQuart
easeInOutQuint
easeInOutSine
easeInQuad
easeInQuart
easeInQuint
easeInSine
easeOutBack
easeOutBounce
easeOutCirc
easeOutCubic
easeOutElastic
easeOutExpo
easeOutQuad
easeOutQuart
easeOutQuint
easeOutSine

For example:

>>> pyautogui.moveTo(300, 200, 3, pyautogui.easeInBack)
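
To get a feel for what a tween does, it helps to think of it as a plain function that receives the fraction of elapsed time (from 0.0 to 1.0) and returns the fraction of the distance covered. The following is only a sketch of that idea and does not call PyAutoGUI at all; ease_in_quad() and interpolate() are names made up for this illustration, not part of the module.

```python
def linear(n):
    """Constant speed: progress equals elapsed time."""
    return n


def ease_in_quad(n):
    """Start slowly and accelerate (quadratic curve)."""
    return n * n


def interpolate(start, end, steps, tween=linear):
    """Return the points visited when moving from start to end.

    start and end are (x, y) tuples; steps is the number of
    moves to make (at least 1).
    """
    (x0, y0), (x1, y1) = start, end
    points = []
    for i in range(1, steps + 1):
        progress = tween(i / steps)
        points.append((round(x0 + (x1 - x0) * progress),
                       round(y0 + (y1 - y0) * progress)))
    return points


# Same start and end, different intermediate points.
print(interpolate((0, 0), (300, 200), 4))
print(interpolate((0, 0), (300, 200), 4, ease_in_quad))
```

Both calls end at (300, 200); only the spacing of the intermediate points changes. Since a tween is an ordinary callable, a custom one such as ease_in_quad can presumably be passed to moveTo() in place of the built-in options, although I would test that on your platform first.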

The moveRel() function is similar to the previous one, but the X and Y values are specified relative to the current cursor position, so negative values can also be used.

# Move the mouse up and to the right.
>>> pyautogui.moveRel(50, -100, 1)

To simulate a mouse click:

>>> pyautogui.click()
>>> pyautogui.doubleClick()
>>> pyautogui.tripleClick()

These three functions simulate a left click by default. To change this, specify the button parameter, which can take left (default), middle or right.

# Double right click.
>>> pyautogui.doubleClick(button="right")

You can also specify where the click should be done within the screen:

# Click at X=50, Y=100
>>> pyautogui.click(50, 100)

A click is simulated by generating a mouse-down and a mouse-up event. These events can also be generated individually:

# Simulate pyautogui.click().
>>> pyautogui.mouseDown()
>>> pyautogui.mouseUp()

To drag and drop:

>>> pyautogui.dragTo(500, 400)

By default the mouse is dragged by holding down the left button. dragTo() accepts the same duration and tween parameters as moveTo(), plus the button parameter described for click().

It is generally recommended that the operation does not occur immediately so that it works correctly on all platforms:

# Recommended.
>>> pyautogui.dragTo(500, 400, duration=0.3)

To specify the position in relative terms, use dragRel():

>>> pyautogui.dragRel(100, 50)
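
Relative drags are handy for drawing shapes. As a rough sketch, the code below computes the offsets of a shrinking square spiral and feeds them to dragRel(). The function names spiral_offsets() and draw_spiral() and their default values are invented for this example, and draw_spiral() assumes a drawing program is focused with the cursor over its canvas.

```python
def spiral_offsets(start_length, step):
    """Return (dx, dy) offsets that trace a shrinking square spiral."""
    directions = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # right, down, left, up
    offsets = []
    length = start_length
    side = 0
    while length > 0:
        dx, dy = directions[side % 4]
        offsets.append((dx * length, dy * length))
        side += 1
        if side % 2 == 1:
            # Shorten the side after every "right" and every "left" stroke.
            length -= step
    return offsets


def draw_spiral(start_length=200, step=20, duration=0.2):
    """Drag the mouse along the spiral. Requires a graphical session."""
    # Imported here so spiral_offsets() can be used without a display.
    import pyautogui

    for dx, dy in spiral_offsets(start_length, step):
        pyautogui.dragRel(dx, dy, duration=duration)


print(spiral_offsets(200, 50))
```

Keeping the geometry in a separate function makes it possible to inspect the offsets before letting the script take over the mouse.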

As long as the duration parameter is not specified, a drag and drop is roughly equivalent to the following, where x and y are the target coordinates:

pyautogui.mouseDown()
pyautogui.moveTo(x, y)
pyautogui.mouseUp()
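
If you build a drag by hand like this, it is worth making sure the button is always released. Here is a minimal sketch under that assumption; manual_drag() and RecordingGui are hypothetical names, and the recorder only stands in for the pyautogui module so the order of calls can be seen without moving the real mouse.

```python
def manual_drag(gui, x, y, button="left"):
    """Press the button, move to (x, y) and release, in that order.

    gui is any object exposing mouseDown(), moveTo() and mouseUp(),
    for instance the pyautogui module itself.
    """
    gui.mouseDown(button=button)
    try:
        gui.moveTo(x, y)
    finally:
        # Release the button even if the move fails, so it is not left pressed.
        gui.mouseUp(button=button)


class RecordingGui:
    """Stand-in that records calls instead of controlling the mouse."""

    def __init__(self):
        self.calls = []

    def mouseDown(self, button="left"):
        self.calls.append(("mouseDown", button))

    def moveTo(self, x, y):
        self.calls.append(("moveTo", x, y))

    def mouseUp(self, button="left"):
        self.calls.append(("mouseUp", button))


recorder = RecordingGui()
manual_drag(recorder, 500, 400)
print(recorder.calls)
```

In a real script you would call manual_drag(pyautogui, 500, 400) instead of passing the recorder.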

To scroll up or down:

# Scroll up.
>>> pyautogui.scroll(400)
# Scroll down.
>>> pyautogui.scroll(-400)

The meaning of the argument is highly platform dependent, so I recommend trying out some arbitrary numbers to see which one suits your needs.

Keyboard Automation

To press and release a key:

pyautogui.press("a")
pyautogui.press("enter")

To write text:

pyautogui.typewrite("Python Assets")
# "\n" causes the Enter key to be pressed.
pyautogui.typewrite("Python\nAssets")

To perform key combinations:

# CTRL + F5.
pyautogui.hotkey("ctrl", "f5")
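
As far as I understand it, hotkey() presses the keys in the order given and then releases them in reverse order, which is what you would do with pyautogui.keyDown() and pyautogui.keyUp() by hand. The sketch below only lists that sequence of events without touching the keyboard; hotkey_events() is a name invented for this illustration.

```python
def hotkey_events(*keys):
    """List the events of a key combination: press in order, release in reverse."""
    presses = [("keyDown", key) for key in keys]
    releases = [("keyUp", key) for key in reversed(keys)]
    return presses + releases


for event, key in hotkey_events("ctrl", "shift", "esc"):
    print(event, key)
```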

We can see a list of available keys via:

>>> pyautogui.KEYBOARD_KEYS

Both typewrite() and hotkey() accept an interval parameter that sets the pause (in seconds) between one key and the next.

pyautogui.typewrite("Python Assets", interval=0.2)

Message Boxes and Dialogs

PyAutoGUI includes the PyMsgBox module, which allows you to create message boxes and dialogs by internally using Tk.

# Display a message.
pyautogui.alert("Hello world!", "From PyAutoGUI")

The alert() function displays a message on the screen with a title, and returns the text of the pressed button ("OK" by default) or None if the window was closed. To change the button text, pass a third argument.

pyautogui.alert("Hello world!", "From PyAutoGUI", button="OK")

If you want the message to show several options, use confirm():

# Buttons.
OPT_CLOSE = "Yes, close"
OPT_SAVE_AND_CLOSE = "Save All and Close"
OPT_KEEP_WORKING = "No, keep working"

# Create the message.
opt = pyautogui.confirm(
    "Do you want to close the program?",
    "Confirmation",
    [OPT_CLOSE, OPT_SAVE_AND_CLOSE, OPT_KEEP_WORKING]
)

# Make decision based on button pressed.
if opt == OPT_CLOSE:
    pass  # ...
elif opt == OPT_SAVE_AND_CLOSE:
    pass  # ...
elif opt == OPT_KEEP_WORKING:
    pass  # ...
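
When the number of buttons grows, a dictionary can replace the chain of comparisons. This is just one possible arrangement: the action names are hypothetical, and the sketch assumes that confirm(), like alert(), returns None when the window is closed without choosing a button.

```python
OPT_CLOSE = "Yes, close"
OPT_SAVE_AND_CLOSE = "Save All and Close"
OPT_KEEP_WORKING = "No, keep working"

# Hypothetical actions, named only for this example.
ACTIONS = {
    OPT_CLOSE: "close",
    OPT_SAVE_AND_CLOSE: "save-and-close",
    OPT_KEEP_WORKING: "continue",
}


def action_for(opt):
    """Translate the text returned by confirm() into an action name.

    Anything unexpected, including None, is treated as "continue",
    which is the safest choice for a "close the program?" question.
    """
    return ACTIONS.get(opt, "continue")


print(action_for(OPT_SAVE_AND_CLOSE))
print(action_for(None))
```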

To ask the user to enter text, use the prompt() function.

# The function returns None if the window was closed or "Cancel" was pressed.
user = pyautogui.prompt("Enter your username", "Login")

It is possible to pass a default value and a timeout in milliseconds:

# Default value "user1" and close after 5 seconds.
user = pyautogui.prompt("Enter your username", "Login",
                        default="user1", timeout=5000)

Screenshots

To take a screenshot:

>>> screenshot = pyautogui.screenshot()

Since PyAutoGUI works internally with PIL/Pillow, screenshot will be an instance of PIL.Image.Image. Thus you can use screenshot.save("output.png") to save the image or screenshot.show() to view it.

Capture a portion of the screen:

>>> screenshot = pyautogui.screenshot(region=(50, 50, 400, 300))

region must be a (X, Y, Width, Height) tuple.
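
Note that this differs from the box used by Pillow methods such as Image.crop(), which take (left, upper, right, lower) corners rather than a width and a height. A small conversion helper, written here as an illustration (box_to_region() is not part of PyAutoGUI), avoids mixing the two up:

```python
def box_to_region(left, top, right, bottom):
    """Convert corner coordinates into an (X, Y, Width, Height) region."""
    if right <= left or bottom <= top:
        raise ValueError("right/bottom must be greater than left/top")
    return (left, top, right - left, bottom - top)


# Corners (50, 50) and (450, 350) describe the region used above.
print(box_to_region(50, 50, 450, 350))
```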

To get the color of a pixel as an RGB tuple:

# Get pixel color at X=500, Y=400 (screenshot must be a full-screen capture).
>>> screenshot.getpixel((500, 400))
(225, 228, 229)

To find out if a pixel equals a certain color, you can use:

>>> pyautogui.pixelMatchesColor(500, 400, (225, 228, 229))
True

Although this code is similar to screenshot.getpixel((500, 400)) == (225, 228, 229), pixelMatchesColor() accepts an extra argument called tolerance that sets how far each of the red, green and blue values may deviate from the expected color, making the comparison less restrictive.

# The color does not match exactly but it is close.
>>> pyautogui.pixelMatchesColor(500, 400, (220, 230, 232), tolerance=10)
True
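
My understanding is that the tolerance is applied to each color component separately. The function below is a sketch of that comparison under this assumption, not PyAutoGUI's actual code; matches_with_tolerance() is a name made up for the example.

```python
def matches_with_tolerance(actual, expected, tolerance=0):
    """Return True if every RGB component differs by at most tolerance."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected))


pixel = (225, 228, 229)
print(matches_with_tolerance(pixel, (220, 230, 232), tolerance=10))
print(matches_with_tolerance(pixel, (220, 230, 232)))
```

The largest difference between the two colors is 5 (red), so any tolerance of 5 or more accepts the match, while the default of 0 demands an exact match.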

Locate an image within the screen:

# Returns a (X, Y, Width, Height) tuple if image.png
# is found on screen.
>>> pyautogui.locateOnScreen("image.png")
(759, 44, 96, 48)
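
Most of the time you locate an image in order to click on it, so you need the center of the match rather than its corner. PyAutoGUI ships pyautogui.center() and pyautogui.locateCenterOnScreen() for this; the arithmetic is simple enough to sketch by hand (box_center() is a hypothetical helper):

```python
def box_center(box):
    """Return the (x, y) center of an (X, Y, Width, Height) tuple."""
    left, top, width, height = box
    return (left + width // 2, top + height // 2)


print(box_center((759, 44, 96, 48)))
```

The resulting pair can then be passed to click(), for example pyautogui.click(*box_center(box)).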

The function also accepts images opened by Pillow:

>>> from PIL import Image
>>> image = Image.open("image.png")
>>> pyautogui.locateOnScreen(image)
(759, 44, 96, 48)

If the image appears multiple times on the screen, locateAllOnScreen() returns an iterator with every match:

>>> list(pyautogui.locateAllOnScreen("image2.png"))
[(1060, 629, 17, 20), (1177, 629, 17, 20)]

Locate an image within a region of the screen:

# Also applies to locateAllOnScreen().
>>> pyautogui.locateOnScreen("image.png", region=(0, 0, 1100, 300))
(759, 44, 96, 48)

Restricting the search to a region makes the function considerably faster, since less of the screen has to be scanned.

Locate an image considering only grayscale:

# Also applicable to locateAllOnScreen().
>>> pyautogui.locateOnScreen("image.png", grayscale=True)
(759, 44, 96, 48)

Passing grayscale=True speeds up the search by roughly 30%.
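
In practice the image you are looking for may take a moment to appear, for example while a window is still opening. A common approach is to retry the search until a deadline. The sketch below assumes a search that returns None when nothing is found; some PyAutoGUI versions may raise an exception instead, in which case the callable you pass would need to catch it. wait_for() and its parameters are invented for this example.

```python
import time


def wait_for(locate, timeout=10.0, interval=0.5):
    """Call locate() repeatedly until it returns a match or time runs out.

    locate is any zero-argument callable that returns None on a miss,
    e.g. lambda: pyautogui.locateOnScreen("image.png", grayscale=True).
    Returns the match, or None if timeout seconds pass without one.
    """
    deadline = time.monotonic() + timeout
    while True:
        match = locate()
        if match is not None:
            return match
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Simulated search: two misses, then a match.
attempts = iter([None, None, (759, 44, 96, 48)])
print(wait_for(lambda: next(attempts), timeout=1.0, interval=0.01))
```

Passing the search as a callable keeps the retry logic independent of which locate function, region or grayscale setting you choose.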

Miscellaneous

To get the size of the screen:

>>> pyautogui.size()
Size(width=1360, height=768)

To check whether a pair of coordinates is within the screen limits:

>>> pyautogui.onScreen(100, 500)
True
>>> pyautogui.onScreen(1400, 700)
False
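
The check boils down to comparing the coordinates against the screen size. Here is a sketch of the same test written by hand, assuming a 1360x768 screen like the one above; is_on_screen() is a hypothetical name, not a PyAutoGUI function.

```python
def is_on_screen(x, y, width, height):
    """Return True if (x, y) lies inside a screen of the given size."""
    return 0 <= x < width and 0 <= y < height


print(is_on_screen(100, 500, 1360, 768))
print(is_on_screen(1400, 700, 1360, 768))
print(is_on_screen(1360, 768, 1360, 768))
```

Since the origin is (0, 0), the last valid pixel is at (width - 1, height - 1), which is why the third call is expected to report that (1360, 768) falls outside the screen.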