pathfinder – a simpler os.walk


Stable releases of pathfinder can be installed with pip, or you can download a .tgz source archive from PyPI. See the Installation page for more detailed instructions.

Basic find

from pathfinder import find_paths

# all files and directories
paths = find_paths(".")

# all files
paths = find_paths(".", just_files=True)

# all directories
paths = find_paths(".", just_dirs=True)
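
For readers more familiar with os.walk, the three calls above correspond roughly to the loop below. This is a standard-library sketch of equivalent behaviour, not pathfinder’s implementation; the helper name walk_paths and the temporary demo tree are assumptions made for illustration.

```python
import os
import tempfile


def walk_paths(root, just_files=False, just_dirs=False):
    """Hypothetical stand-in for find_paths, built directly on os.walk."""
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        if not just_files:
            # keep the directories seen at this level
            results.extend(os.path.join(dirpath, d) for d in dirnames)
        if not just_dirs:
            # keep the files seen at this level
            results.extend(os.path.join(dirpath, f) for f in filenames)
    return results


# Build a tiny throwaway tree so the sketch can run anywhere.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "docs"))
    open(os.path.join(root, "docs", "a.txt"), "w").close()

    def rel(paths):
        # report results relative to the temporary root, in a stable order
        return sorted(os.path.relpath(p, root) for p in paths)

    everything = rel(walk_paths(root))
    files_only = rel(walk_paths(root, just_files=True))
    dirs_only = rel(walk_paths(root, just_dirs=True))
```

The single find_paths call replaces the nested loop and the manual os.path.join bookkeeping shown here.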

By default, find_paths prefixes each result with the path you searched. If you prefer, you can ask for results that contain only absolute paths:

paths = find_paths(".", abspath=True)
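
The difference between the two result styles can be illustrated with os.path alone. The path docs/report.pdf below is a made-up example, and the sketch does not call pathfinder; it only shows what a result prefixed with the search path looks like next to its absolute form.

```python
import os

# A result as it would look when prefixed with the search path "."
prefixed = os.path.join(".", "docs", "report.pdf")

# The same location expressed as an absolute path
# (resolved against the current working directory)
absolute = os.path.abspath(prefixed)
```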

Filtering the results

Having a full listing is useful, but wouldn’t it be great if we could filter the results?

There are a number of ways we can do this. Let’s start with the Unix shell-style pattern approach:

# all PDF files
paths = find_paths(".", fnmatch="*.pdf")
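
Shell-style patterns of this kind are what Python’s standard fnmatch module implements: * matches any run of characters, ? matches a single character, and [seq] matches one character from a set. The sketch below applies the same pattern to a few made-up paths with fnmatchcase. It assumes the pattern is tested against the whole path, which this page does not state explicitly.

```python
from fnmatch import fnmatchcase

candidates = ["./report.pdf", "./docs/manual.pdf", "./notes.txt", "./pdf-notes.md"]

# In fnmatch, "*" also matches "/", so files in subdirectories match too
pdfs = [p for p in candidates if fnmatchcase(p, "*.pdf")]
```

fnmatchcase is used here because it is case-sensitive on every platform; plain fnmatch.fnmatch normalises case on some operating systems.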

fnmatch patterns provide some power, but for more flexibility let’s have a look at the regular expression support:

# all PDF files
paths = find_paths(".", regex=r".*\.pdf")

# all PDF files with four-character base names
paths = find_paths(".", regex=r".*/.{4}\.pdf")
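
To see how a pattern like the four-character one behaves, it can be tried against a few made-up paths with Python’s re module. This sketch assumes the expression is applied with re.match to the whole path, which is an assumption and not something this page confirms. Under that assumption, the unanchored pattern also accepts longer names that merely start with a match, and .{4} can span a directory separator; the stricter variant below is one hypothetical way to tighten it.

```python
import re

candidates = [
    "./docs/spec.pdf",
    "./docs/manual.pdf",
    "./x/ab.pdf",
    "./docs/spec.pdf.bak",
]

# The pattern from the example above
loose = re.compile(r".*/.{4}\.pdf")

# Stricter variant: no "/" inside the base name, and anchored at the end
strict = re.compile(r".*/[^/]{4}\.pdf$")

loose_hits = [p for p in candidates if loose.match(p)]
strict_hits = [p for p in candidates if strict.match(p)]
```

Here loose_hits contains ./docs/spec.pdf, ./x/ab.pdf and ./docs/spec.pdf.bak, while strict_hits contains only ./docs/spec.pdf.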

pathfinder provides the ability to ignore certain paths too:

# match every file with a three-character extension,
# but use an ignore filter to leave out the PDF files
from pathfinder import FnmatchFilter
ignore = FnmatchFilter("*.pdf")
find_paths(".", regex=r".*/.*\..{3}$", ignore=ignore)

# ignore all files and directories that begin with .
from pathfinder import RegexFilter
ignore = RegexFilter(r"\..*")
find_paths(".", ignore=ignore)
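
Conceptually, an ignore filter is a second test that removes paths the main filter would otherwise keep. The sketch below reproduces the first ignore example with plain functions over a made-up list of paths. The helper name select and the use of re.match and fnmatchcase are assumptions for illustration, not pathfinder’s internals.

```python
import re
from fnmatch import fnmatchcase


def select(paths, accept, ignore):
    """Keep paths that pass accept() and do not pass ignore() (hypothetical helper)."""
    return [p for p in paths if accept(p) and not ignore(p)]


def three_char_ext(path):
    # same regular expression as the example above
    return re.match(r".*/.*\..{3}$", path) is not None


def is_pdf(path):
    # same shell-style pattern as the ignore filter above
    return fnmatchcase(path, "*.pdf")


candidates = ["./a.pdf", "./b.txt", "./c.jpeg", "./d.png"]
kept = select(candidates, accept=three_char_ext, ignore=is_pdf)
```

Only ./b.txt and ./d.png survive: ./c.jpeg fails the three-character test and ./a.pdf is ignored.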

Extra support for images

Let’s find some images in the directory:

# all of the images
from pathfinder import ImageFilter
find_paths(".", filter=ImageFilter())

That is just a shortcut for matching multiple file extensions, but we can also filter the results based on the dimensions of the image:

# only images less than 20 pixels tall
from pathfinder import ImageDimensionFilter
find_paths(".", filter=ImageDimensionFilter(max_height=20))

# only images less than 10 pixels tall and wide
from pathfinder import ImageDimensionFilter
find_paths(".", filter=ImageDimensionFilter(max_height=10, max_width=10))

And we can also search for images based on their color palette:

# only color images
from pathfinder import ColorImageFilter
find_paths(".", filter=ColorImageFilter())

# only greyscale images
from pathfinder import GreyscaleImageFilter
find_paths(".", filter=GreyscaleImageFilter())

Combining filters

Filters can also be combined to create even more complex filters (just in case you need them). pathfinder supports AND, OR and NOT operations.

# color images AND no larger than 400 bytes
from pathfinder import ColorImageFilter
from pathfinder import SizeFilter
color = ColorImageFilter()
size = SizeFilter(max_bytes=400)
find_paths(".", filter=color & size)

# pdf OR txt files
from pathfinder import FnmatchFilter
txt = FnmatchFilter("*.txt")
pdf = FnmatchFilter("*.pdf")
find_paths(".", filter=txt | pdf)

# txt files, but NOT ones beginning with a
from pathfinder import NotFilter
from pathfinder import FnmatchFilter
txt = FnmatchFilter("*.txt")
not_afiles = NotFilter(FnmatchFilter("*/a*"))
find_paths(".", filter=txt & not_afiles)
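
One common way to support the & and | syntax in Python is operator overloading on a shared base class. The sketch below shows that technique with small stand-in classes (Filter, And, Or, Not, Glob); these names and the accepts method are hypothetical and are not taken from pathfinder’s source.

```python
from fnmatch import fnmatchcase


class Filter:
    """Hypothetical base class; subclasses implement accepts(path)."""

    def accepts(self, path):
        raise NotImplementedError

    def __and__(self, other):
        # enables: left & right
        return And(self, other)

    def __or__(self, other):
        # enables: left | right
        return Or(self, other)


class And(Filter):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def accepts(self, path):
        return self.left.accepts(path) and self.right.accepts(path)


class Or(Filter):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def accepts(self, path):
        return self.left.accepts(path) or self.right.accepts(path)


class Not(Filter):
    def __init__(self, inner):
        self.inner = inner

    def accepts(self, path):
        return not self.inner.accepts(path)


class Glob(Filter):
    """Shell-style pattern test against the whole path."""

    def __init__(self, pattern):
        self.pattern = pattern

    def accepts(self, path):
        return fnmatchcase(path, self.pattern)


paths = ["./apple.txt", "./beta.txt", "./gamma.pdf", "./delta.png"]
txt, pdf = Glob("*.txt"), Glob("*.pdf")

txt_or_pdf = [p for p in paths if (txt | pdf).accepts(p)]
txt_not_a = [p for p in paths if (txt & Not(Glob("*/a*"))).accepts(p)]
```

Because each combination is itself a Filter, combined filters can be nested to any depth, for example (txt | pdf) & Not(Glob("*/a*")).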

find shortcut

You can also run a find directly from a filter:

