pyrfs provides a uniform, chainable interface to file system operations, porting the UX of R’s fs package to Python. It is pure Python over the standard library — no compiled extension, no dependencies — adding the ergonomics the stdlib leaves out: one consistent namespace, tidy paths, typed self-describing values, and explicit failure.
📖 Documentation: https://pyrfs.netlify.app
Pre-release — install from GitHub:
pip install "pyrfs @ git+https://github.com/Lightbridge-KS/pyrfs"
# with the pandas integration:
pip install "pyrfs[pandas] @ git+https://github.com/Lightbridge-KS/pyrfs"

Python’s file APIs accreted across os, os.path, shutil, glob, pathlib, and tempfile. pyrfs smooths over the seams:
- One namespace, predictable names. Functions are grouped by the noun they act on: path_* (pure string algebra), file_*, dir_*, link_*. Typing dir_ + Tab shows every directory operation. No more remembering that creating is os.makedirs but removing is shutil.rmtree.
- Predictable, path-carrying returns. Mutating verbs return the new path (so calls chain); queries return typed values. Compare os.path.getsize() → bare int vs file_size() → Bytes that displays 444.5K and compares against "10KB".
- Explicit failure, safe defaults. shutil.copy2() silently overwrites its target; file_copy() raises FileExistsError unless you pass overwrite=True. Traversals raise on unreadable entries unless you soften them with fail=False.
- Polymorphic over scalars, lists, and pandas Series. Every path function accepts one path or many: vectorization like fs, the Pythonic way.
- Tidy paths. Always / separators, never doubled or trailing. In a terminal, paths are coloured by file type via LS_COLORS (degrading automatically on non-TTY or NO_COLOR).
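The "explicit failure" contrast can be reproduced with the standard library alone. The sketch below is not pyrfs's implementation: copy_no_clobber is a hypothetical helper that only mirrors the behaviour described in the list (raise FileExistsError unless overwrite=True, and return the destination so the call can chain).

```python
import shutil
import tempfile
from pathlib import Path


def copy_no_clobber(src, dst, overwrite=False):
    """Hypothetical helper: copy src to dst, refusing to replace an existing file."""
    src, dst = Path(src), Path(dst)
    if dst.exists() and not overwrite:
        raise FileExistsError(f"{dst} already exists; pass overwrite=True to replace it")
    shutil.copy2(src, dst)
    return dst  # hand back the new path so the call can be chained


with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "a.txt"
    b = Path(tmp) / "b.txt"
    a.write_text("new")
    b.write_text("old")

    # Standard library: the existing target is replaced without any warning.
    shutil.copy2(a, b)
    silently_replaced = b.read_text()

    # Sketch: the same copy is refused until the caller opts in.
    b.write_text("old")
    try:
        copy_no_clobber(a, b)
        refused = False
    except FileExistsError:
        refused = True
    kept = b.read_text()

    replaced = copy_no_clobber(a, b, overwrite=True).read_text()
```

The check-then-copy in this sketch is not atomic, so it illustrates the default rather than a race-free implementation.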
pyrfs functions are divided into four main families:
- path_ for manipulating and constructing paths
- file_ for files
- dir_ for directories
- link_ for links

Directories and links are special types of files, so file_ functions
generally also work on them (there is deliberately no dir_move() — use
file_move()).
import pyrfs as fs
# construct a path with path()
fs.path("foo", "bar", "a", ext="txt")
FsPath('foo/bar/a.txt')
# list files
fs.dir_ls("pyrfs", glob="*.py")
[FsPath('pyrfs/__init__.py'),
FsPath('pyrfs/display.py'),
FsPath('pyrfs/errors.py'),
FsPath('pyrfs/fspath.py'),
FsPath('pyrfs/info.py'),
FsPath('pyrfs/values.py')]
# create a new directory
tmp = fs.dir_create(fs.file_temp())
tmp
FsPath('/tmp/pyrfs-demo/data')
# create new files in that directory
fs.file_create(tmp / "my-file.txt")
fs.dir_ls(tmp)
[FsPath('/tmp/pyrfs-demo/data/my-file.txt')]
# remove files from the directory
fs.file_delete(tmp / "my-file.txt")
fs.dir_ls(tmp)
[]
# remove the directory
fs.dir_delete(tmp);

Where R pipes
file_temp() |> dir_create() |> path(letters[1:3]) |> file_create(),
pyrfs chains: every mutating verb returns the resulting FsPath (which
is a str, so it drops into open(), pd.read_csv(), or any API
expecting a path).
from pyrfs import FsPath
(FsPath(fs.file_temp())
    .mkdir()
    .touch_file("a").touch_file("b").touch_file("c")
    .ls())
[FsPath('/tmp/pyrfs-demo/chain/a'),
FsPath('/tmp/pyrfs-demo/chain/b'),
FsPath('/tmp/pyrfs-demo/chain/c')]
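To see why a str subclass makes this kind of chaining cheap, here is a minimal standard-library sketch. ChainPath and its methods are hypothetical and are not pyrfs's FsPath; the behaviour of touch_file (create a child, keep returning the directory) is an assumption inferred from the listing above.

```python
import os
import tempfile


class ChainPath(str):
    """Hypothetical str subclass whose mutating methods return a path."""

    def __truediv__(self, other):
        # join with "/" separators regardless of platform
        return ChainPath(os.path.join(self, other).replace(os.sep, "/"))

    def mkdir(self):
        os.makedirs(self, exist_ok=True)
        return self  # the created directory, ready for the next call

    def touch_file(self, name):
        # create an empty child file, then stay on the directory (assumed behaviour)
        with open(self / name, "a"):
            pass
        return self

    def ls(self):
        return [self / entry for entry in sorted(os.listdir(self))]


with tempfile.TemporaryDirectory() as tmp:
    root = ChainPath(tmp) / "chain"
    listing = root.mkdir().touch_file("a").touch_file("b").touch_file("c").ls()
    names = [os.path.basename(p) for p in listing]

    # Because each element is a str, it can be passed straight to open().
    with open(listing[0], "w") as fh:
        fh.write("hello")
    with open(listing[0]) as fh:
        content = fh.read()
```

Every method returns a path, so the calls compose left to right without intermediate variables, and the results still work with any API that expects a plain string.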
With the [pandas] extra, dir_info() returns a DataFrame whose
path, size, and permissions columns are real ExtensionDtypes — so
string literals work inside .query(), exactly like fs’s typed tibble
columns:
import pandas as pd
big = (fs.dir_info("pyrfs", recurse=True, glob="*.py")
       .query("size > '4KB' and type == 'file'")
       .sort_values("size", ascending=False)
       .loc[:, ["path", "permissions", "size"]])
print(big.to_string(index=False))
path permissions size
pyrfs/_engine/dirops.py rw-r--r-- 14.3K
pyrfs/_engine/paths.py rw-r--r-- 13.5K
pyrfs/_engine/fileops.py rw-r--r-- 12.9K
pyrfs/fspath.py rw-r--r-- 10.9K
pyrfs/_pandas/arrays.py rw-r--r-- 7.26K
pyrfs/display.py rw-r--r-- 7.22K
pyrfs/values.py rw-r--r-- 6.71K
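The comparison against a string literal such as '4KB' relies on the size value knowing how to parse units. The sketch below shows the idea with a plain int subclass; SizeSketch and parse_size are hypothetical names, and both the 1024-based units and the three-significant-digit display are assumptions, not a description of how pyrfs's Bytes is implemented.

```python
import re

# Assumption: binary units (1K = 1024 bytes).
_UNITS = {"": 1, "B": 1, "K": 1024, "KB": 1024,
          "M": 1024**2, "MB": 1024**2, "G": 1024**3, "GB": 1024**3}


def parse_size(text):
    """Turn a string such as '10KB' or '4.5M' into a byte count."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([A-Za-z]*)\s*", text)
    if match is None or match.group(2).upper() not in _UNITS:
        raise ValueError(f"cannot parse size: {text!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2).upper()])


class SizeSketch(int):
    """Hypothetical int subclass that prints compactly and compares with strings."""

    @staticmethod
    def _coerce(other):
        return parse_size(other) if isinstance(other, str) else other

    def __gt__(self, other):
        return int(self) > self._coerce(other)

    def __lt__(self, other):
        return int(self) < self._coerce(other)

    def __eq__(self, other):
        return int(self) == self._coerce(other)

    __hash__ = int.__hash__  # defining __eq__ would otherwise disable hashing

    def __str__(self):
        value, unit = float(self), ""
        for unit in ("", "K", "M", "G"):
            if value < 1024 or unit == "G":
                break
            value /= 1024
        return f"{value:.3g}{unit}"


size = SizeSketch(14643)
print(size)           # 14.3K
print(size > "10KB")  # True
print(size < "4KB")   # False
```

Because the value is still an int underneath, arithmetic and sorting keep working; only display and string comparison are customised.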
The .fs accessor vectorizes path operations over a Series:
paths = pd.Series(fs.dir_ls("pyrfs/_engine", glob="*ops.py"))
print(paths.fs.size())
0 14.3K
1 12.9K
2 3.7K
dtype: bytes
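The "one path or many" behaviour can be approximated without pandas. This sketch uses only the standard library; file_size_sketch is a hypothetical helper, and it returns a plain list for any iterable input rather than a typed Series as in the output above.

```python
import os
import tempfile


def file_size_sketch(paths):
    """Hypothetical helper: one size for one path, a list of sizes for many."""
    if isinstance(paths, (str, os.PathLike)):
        return os.path.getsize(paths)
    return [os.path.getsize(p) for p in paths]


with tempfile.TemporaryDirectory() as tmp:
    small = os.path.join(tmp, "small.txt")
    large = os.path.join(tmp, "large.txt")
    with open(small, "w") as fh:
        fh.write("abc")
    with open(large, "w") as fh:
        fh.write("x" * 100)

    one = file_size_sketch(small)            # scalar in, scalar out
    many = file_size_sketch([small, large])  # list in, list out
```

The string check has to come first, since a str is itself iterable and would otherwise be treated as a collection of one-character paths.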
And reading a collection of files into one frame — the
purrr::map_df(.id=) trick — is a dict comprehension away, because
dir_ls() returns paths whose .name() you can key on:
tsv_dir = fs.dir_create(fs.file_temp())
df = pd.DataFrame({"species": ["adelie", "adelie", "gentoo"], "mass": [3800, 3250, 5000]})
for species, d in df.groupby("species"):
    d.to_csv(tsv_dir / f"{species}.tsv", sep="\t", index=False)
files = fs.dir_ls(tsv_dir, glob="*.tsv")
combined = pd.concat({f.name(): pd.read_csv(f, sep="\t") for f in files}, names=["file"])
print(combined)
species mass
file
adelie.tsv 0 adelie 3800
1 adelie 3250
gentoo.tsv 0 gentoo 5000
The functional names are identical, so muscle memory transfers:
| R fs | pyrfs |
|---|---|
| dir_ls("d", recurse = TRUE) | dir_ls("d", recurse=True) |
| file_copy("a", "b") | file_copy("a", "b") or FsPath("a").copy_to("b") |
| dir_info("d") \|> filter(size > "10KB") | dir_info("d").query("size > '10KB'") |
See the full translation guide and the honest list of deliberate differences.
The docs site has guides, an executable tour notebook, and the API reference — plus llms.txt / llms-full.txt for AI agents. Bug reports and feature requests: GitHub issues.
MIT © Lightbridge-KS — see LICENSE.md. Release history: CHANGELOG.md.