searchpath

Python 3.10+ | License: MIT

searchpath finds files across prioritized directories and tracks where each match comes from. Use it for config cascades (project overrides user overrides system), plugin discovery, or any scenario where files might exist in more than one location and you need to know which one you found.

Note

This package is under active development. The API may change before the 1.0 release.

Features

  • Search across directories with priority: Find files in config cascades, plugin directories, or any ordered set of paths
  • Track where matches come from: Every match includes provenance, telling you exactly which directory contains the file
  • Filter with patterns: Use glob patterns (default), full regex, or gitignore-style rules with negation (using pathspec)
  • Load patterns from files: Support hierarchical pattern files that cascade like .gitignore
  • Simple one-liners for common cases: searchpath.first("config.toml", project_dir, user_dir)
  • Minimal footprint: Only requires typing-extensions on Python < 3.12; pathspec optional for gitignore support
  • Safe for concurrent use: Immutable objects with no global state
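
To make the first two features concrete, here is a minimal standard-library sketch of a prioritized lookup that reports which directory produced the hit. It is not searchpath's implementation; `first_with_scope` is a hypothetical helper written only for illustration.

```python
from __future__ import annotations

from pathlib import Path


def first_with_scope(
    name: str, entries: list[tuple[str, Path]]
) -> tuple[str, Path] | None:
    """Return (scope, path) for the first directory that contains `name`.

    Hypothetical helper: `entries` is an ordered list of (scope, directory)
    pairs, and earlier entries take priority over later ones.
    """
    for scope, directory in entries:
        candidate = directory / name
        if candidate.is_file():
            return scope, candidate
    return None
```

With entries ordered project, user, system, a file present in both the project and user directories resolves to the project copy, and the returned scope says so.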

Installation

pip:

pip install searchpath

# With gitignore-style pattern support via pathspec
pip install searchpath[gitignore]

poetry:

poetry add searchpath

# With gitignore-style pattern support via pathspec
poetry add searchpath[gitignore]

uv:

uv add searchpath

# With gitignore-style pattern support via pathspec
uv add searchpath[gitignore]

Quick start

import searchpath

# Find the first config.toml in project or user directories
config = searchpath.first("config.toml", "/project", "~/.config")

# Find all Python files
py_files = searchpath.all("**/*.py", "/src")

# Get match with provenance information
match = searchpath.match("settings.json", ("project", "/project"), ("user", "~/.config"))
if match:
    print(f"Found in {match.scope}: {match.path}")

Usage

The SearchPath class

from searchpath import SearchPath

# Create with named scopes
sp = SearchPath(
    ("project", "/project/.config"),
    ("user", "~/.config/myapp"),
    ("system", "/etc/myapp"),
)

# Find first matching file
config = sp.first("config.toml")

# Find all matching files with provenance
matches = sp.matches("**/*.toml")
for m in matches:
    print(f"{m.scope}: {m.relative}")

Pattern filtering

# Exclude patterns (the include parameter accepts patterns the same way)
sp.all("**/*.py", exclude=["test_*", "**/tests/**"])

# Load patterns from files
sp.all(exclude_from="exclude_patterns.txt")

# Ancestor pattern files (like gitignore cascading)
sp.all(exclude_from_ancestors=".searchignore")
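
As a rough illustration of what exclude filtering does, the sketch below drops paths that match any exclude pattern, using only `fnmatch` from the standard library. It is an approximation under stated assumptions: `is_excluded` and `filter_paths` are hypothetical helpers, matching against both the full relative path and the file name is an assumption, and `fnmatch` lets `*` cross `/`, which searchpath's own glob matcher may not.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_excluded(relative_path: str, exclude: list[str]) -> bool:
    """True if the relative path or its file name matches any exclude pattern."""
    name = PurePosixPath(relative_path).name
    return any(
        fnmatchcase(relative_path, pattern) or fnmatchcase(name, pattern)
        for pattern in exclude
    )


def filter_paths(paths: list[str], exclude: list[str]) -> list[str]:
    """Keep only the paths that no exclude pattern matches, preserving order."""
    return [p for p in paths if not is_excluded(p, exclude)]


candidates = ["src/app.py", "src/test_app.py", "pkg/tests/helpers.py"]
kept = filter_paths(candidates, ["test_*", "**/tests/**"])
```

Here `kept` holds only `src/app.py`: the second path is dropped by `test_*` on its file name and the third by `**/tests/**` on its full path.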

Manipulating search paths

# Append path components
config_sp = sp.with_suffix(".config", "myapp")

# Concatenate search paths (project_sp and user_sp are existing SearchPath instances)
combined = project_sp + user_sp

# Filter to existing directories
sp.existing()
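
The three operations above can be pictured with plain lists of (scope, directory) pairs. This is a conceptual sketch, not the library's code: `append_parts` and `only_existing` are hypothetical stand-ins, and it assumes `with_suffix` joins the given parts onto every directory and `existing` keeps only directories present on disk.

```python
from pathlib import Path

Entries = list[tuple[str, Path]]


def append_parts(entries: Entries, *parts: str) -> Entries:
    """Return new entries with `parts` joined onto each directory."""
    return [(scope, directory.joinpath(*parts)) for scope, directory in entries]


def only_existing(entries: Entries) -> Entries:
    """Return new entries limited to directories that exist on disk."""
    return [(scope, directory) for scope, directory in entries if directory.is_dir()]


def concatenate(first: Entries, second: Entries) -> Entries:
    """Return a new list with `first` taking priority over `second`."""
    return first + second
```

Each helper returns a new list instead of mutating its input, mirroring the immutable-object design listed under Features.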

Custom matchers

from searchpath import RegexMatcher, GitignoreMatcher

# Regex patterns
sp.all(r".*\.py$", matcher=RegexMatcher())

# Gitignore-style patterns (requires pathspec)
sp.all(exclude=["*.pyc", "__pycache__/"], matcher=GitignoreMatcher())
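
To show how a glob pattern and a regex can express the same filter, here is a small standard-library comparison. It does not use searchpath's matcher classes: `glob_select` and `regex_select` are hypothetical helpers, and anchoring the regex with `fullmatch` is an assumption about how a regex matcher might behave.

```python
import re
from fnmatch import fnmatchcase


def glob_select(names: list[str], pattern: str) -> list[str]:
    """Keep names matching a shell-style glob pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


def regex_select(names: list[str], pattern: str) -> list[str]:
    """Keep names that the regular expression matches in full."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


names = ["app.py", "app.pyc", "notes.txt"]
glob_hits = glob_select(names, "*.py")
regex_hits = regex_select(names, r".*\.py$")
```

Both calls select only `app.py`. The regex form is more verbose for this case, but it can express things globs cannot, such as alternation or counted repetition.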

API

Module-level functions

def first(
    pattern: str = "**",
    *entries: Entry,
    kind: Literal["files", "dirs", "both"] = "files",
    include: str | Sequence[str] | None = None,
    include_from: Path | str | Sequence[Path | str] | None = None,
    include_from_ancestors: str | None = None,
    exclude: str | Sequence[str] | None = None,
    exclude_from: Path | str | Sequence[Path | str] | None = None,
    exclude_from_ancestors: str | None = None,
    matcher: PathMatcher | None = None,
    follow_symlinks: bool = True,
) -> Path | None

Find the first matching path across directories. Returns Path or None.

def match(...) -> Match | None  # Same parameters as first()

Find the first matching path with provenance information.

def all(
    pattern: str = "**",
    *entries: Entry,
    kind: Literal["files", "dirs", "both"] = "files",
    dedupe: bool = True,  # Additional parameter
    include: ...,  # Same as first()
    ...
) -> list[Path]

Find all matching paths across directories.

def matches(...) -> list[Match]  # Same parameters as all()

Find all matching paths with provenance information.

The SearchPath class

class SearchPath:
    def __init__(self, *entries: Entry) -> None: ...

    @property
    def dirs(self) -> list[Path]: ...

    @property
    def scopes(self) -> list[str]: ...

    def first(self, pattern: str = "**", ...) -> Path | None: ...
    def match(self, pattern: str = "**", ...) -> Match | None: ...
    def all(self, pattern: str = "**", ...) -> list[Path]: ...
    def matches(self, pattern: str = "**", ...) -> list[Match]: ...

    def with_suffix(self, *parts: str) -> SearchPath: ...
    def filter(self, predicate: Callable[[Path], bool]) -> SearchPath: ...
    def existing(self) -> SearchPath: ...
    def items(self) -> Iterator[tuple[str, Path]]: ...

Match dataclass

@dataclass(frozen=True, slots=True)
class Match:
    path: Path      # Absolute path to the matched file
    scope: str      # Scope name (e.g., "user", "project")
    source: Path    # The search path directory

    @property
    def relative(self) -> Path: ...  # Path relative to source

Entry type

Entry = tuple[str, Path | str | None] | Path | str | None

Pattern matchers

  • GlobMatcher - Default glob-style patterns (*, **, ?, [abc])
  • RegexMatcher - Full Python regex syntax
  • GitignoreMatcher - Full gitignore compatibility (requires pathspec)

Exceptions

SearchPathError          # Base exception
├── PatternError         # Pattern-related errors
│   ├── PatternSyntaxError(pattern, message, position)
│   └── PatternFileError(path, message, line_number)
└── ConfigurationError   # Invalid configuration
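
Because the exceptions form a hierarchy, a handler for a parent class also catches its children. The sketch below demonstrates that with locally defined stand-in classes that copy the names in the tree above; the constructor arguments and attributes are assumptions based on the listed signature, not the library's actual definitions.

```python
from __future__ import annotations


class SearchPathError(Exception):
    """Stand-in for the documented base exception."""


class PatternError(SearchPathError):
    """Stand-in for pattern-related errors."""


class PatternSyntaxError(PatternError):
    """Stand-in; attribute names follow the documented argument list."""

    def __init__(self, pattern: str, message: str, position: int | None = None) -> None:
        super().__init__(f"{message} (pattern={pattern!r}, position={position})")
        self.pattern = pattern
        self.message = message
        self.position = position


class ConfigurationError(SearchPathError):
    """Stand-in for invalid-configuration errors."""


def describe(exc: SearchPathError) -> str:
    """Classify an error by re-raising it through ordered handlers."""
    try:
        raise exc
    except PatternError:  # also catches PatternSyntaxError
        return "pattern problem"
    except SearchPathError:  # anything else in the hierarchy
        return "other searchpath problem"
```

Ordering the handlers from most specific to most general lets a caller treat bad patterns separately while still catching every other library error with a single `except SearchPathError`.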

Development

# Clone and install
git clone https://github.com/tbhb/searchpath
cd searchpath
just install

# Common commands
just test          # Run tests
just lint          # Run linters
just format        # Format code

See CONTRIBUTING.md for more details.

AI disclosure

The development of this library involved AI language models, specifically Claude. AI tools contributed to drafting code, tests, and documentation. Human authors made all design decisions and final implementations, and they reviewed, edited, and validated AI-generated content. The authors take full responsibility for the correctness of this software.

Acknowledgments

This library optionally uses the pathspec library by Caleb P. Burns for gitignore-compatible pattern matching.

License

MIT License. See LICENSE for details.
