Find Biggest File in Directory Tree

The following cross-platform Python script recursively searches a directory and all its subfolders for the biggest file. Pass the directory path as the first argument when invoking the script. Optionally, pass a file extension (for example, .py) as the second argument to limit the search to files whose names end with it; the comparison is case-insensitive.

import os
import pathlib
import sys
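# The directory to search is required; exit with a usage hint if missing.
if len(sys.argv) < 2:
    sys.exit("Usage: find_biggest_file.py <directory> [extension]")
target_path = pathlib.Path(sys.argv[1])
# Optional second argument: only consider files with this extension.
target_ext = sys.argv[2].lower() if len(sys.argv) > 2 else None
# Start without a candidate so that empty trees and trees containing
# only zero-byte files are handled correctly.
biggest_file = None
current_max_size = -1
print("Searching...")
# Recursively iterate over the target directory tree.
for dirpath, dirnames, filenames in os.walk(target_path):
    for filename in filenames:
        # Ignore files that lack the specified target extension, if any.
        if (target_ext is not None and
                not filename.lower().endswith(target_ext)):
            continue
        p = pathlib.Path(dirpath, filename)
        try:
            size = p.stat().st_size
        except OSError:
            # Skip files that cannot be inspected (e.g. broken symlinks
            # or insufficient permissions).
            continue
        if size > current_max_size:
            # Update the variables that record the biggest file
            # found up to now.
            current_max_size = size
            biggest_file = p
            print("Biggest file found up to now:", filename)
if biggest_file is None:
    print("No matching files found.")
else:
    print(f"Final biggest file: {biggest_file.absolute()} "
          f"({current_max_size // 1024} KB)")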
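
The same search can also be wrapped in a function so it can be reused from other code. The following is only a minimal sketch and not part of the original script: the name find_biggest_file and its parameters are hypothetical, and it swaps os.walk for pathlib.Path.rglob combined with max(). It returns None when no file matches.

```python
import pathlib


def find_biggest_file(root, ext=None):
    """Return (path, size_in_bytes) of the biggest file under root, or None.

    Hypothetical helper for illustration; `ext` is an optional,
    case-insensitive filename ending such as ".py".
    """
    candidates = []
    # rglob("*") yields every entry (files and directories) below root.
    for p in pathlib.Path(root).rglob("*"):
        if ext is not None and not p.name.lower().endswith(ext.lower()):
            continue
        try:
            if p.is_file():
                candidates.append((p, p.stat().st_size))
        except OSError:
            # Skip entries that cannot be inspected (e.g. permissions).
            continue
    # max() with a key picks the entry with the largest size; the default
    # covers the case where nothing matched.
    return max(candidates, key=lambda item: item[1], default=None)
```

Under these assumptions, calling find_biggest_file(r"C:\Python310", ".py") would give the same file the script reports for the second example below, as a (path, size) tuple rather than printed text.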

For example, to find the biggest file inside C:\Python310 and its subfolders, run in the terminal:

py find_biggest_file.py C:\Python310

To find the largest Python (.py) file:

py find_biggest_file.py C:\Python310 .py

On macOS and Linux distributions, use python or python3 instead of py.
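
One detail about the optional extension argument: the script compares it with str.endswith(), so passing py without the leading dot would also match a file named happy. If stricter matching is wanted, one option is to compare against the suffix attribute that pathlib provides. The sketch below assumes a hypothetical helper named has_extension; it is not part of the original script.

```python
import pathlib


def has_extension(filename, ext):
    """Return True if filename's final suffix equals ext, ignoring case.

    Hypothetical helper: accepts ext with or without the leading dot.
    """
    wanted = "." + ext.lower().lstrip(".")
    return pathlib.PurePath(filename).suffix.lower() == wanted


print(has_extension("script.PY", "py"))        # True
print(has_extension("happy", "py"))            # False
print(has_extension("archive.tar.gz", ".gz"))  # True
```

Because suffix only holds the last component, this sketch would not match a compound extension such as .tar.gz; the endswith() approach used by the script does.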