Thanks for your interest in contributing.
We actively use tidylake in day-to-day data platform work, and we continuously extend it based on real project needs. Contributions from the community are welcome, including bug fixes, documentation improvements, plugins, demos, and feature proposals.
This project uses a standard Python toolchain:
- uv for virtual environments and dependency management
- Taskfile (task) as the main automation entry point
- pytest for unit, integration, and acceptance tests
- ruff for linting and formatting
- pre-commit for local quality gates
- Material for MkDocs for documentation
In practice, most local development starts with task ... commands.
- Clone the repository:
git clone <repo-url>
cd tidylake
- Install development dependencies:
task deps:install
This task:
  - Creates a fresh .venv
  - Syncs dependencies with uv (--all-groups)
  - Installs pre-commit hooks
- Optional: activate the environment manually:
source .venv/bin/activate
Use task to list all commands:
task
Common commands:
task deps:install
task pack:install
task lint:check
task lint:format
task docs:serve
task docs:build
task docs:build:github-pages
task test:unit
task test:integration
task test:acceptance
task test:run
task demo:pandas:local
task demo:pandas:iceberg:local
task demo:spark
task spark:start
task spark:stop
Notes:
- task test:run BADGE=true also updates docs/img/coverage.svg
- task demo:spark starts and stops Spark Connect automatically; it is used during demos and testing.
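As a rough illustration of how the commands above can be chained before opening a pull request, here is a minimal sketch. The `run_checks` function, the `RUNNER` variable, and the choice and order of targets are assumptions made for this example; they are not part of the project's Taskfile.

```shell
# Sketch: run lint and test targets in order and stop at the first failure.
# RUNNER is a hypothetical override; it defaults to the real `task` binary.
run_checks() {
  runner="${RUNNER:-task}"
  for target in lint:check test:unit test:integration test:acceptance; do
    echo "==> $target"
    # Abort on the first failing target so fast checks fail early.
    $runner "$target" || { echo "failed: $target" >&2; return 1; }
  done
  echo "all checks passed"
}
```

Calling `run_checks` from the repository root runs the four targets in sequence. Setting `RUNNER=echo` first turns it into a dry run that only prints the targets.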
This is the contributor-oriented shape of the repository:
- src/: Python package source code
- tests/: automated test suites
- docs/: documentation source for MkDocs
First-level package layout in src/tidylake/:
- core/: core framework abstractions and runtime behavior
- cli/: Typer-based CLI commands and entry points
- plugins/: built-in extension points (for example, compute engine integration)
- demo/: runnable examples used for learning and validation
- scaffold/: project/template scaffolding support
- utils/, visualization/: shared helpers and visualization output
src/ contains both the framework internals and the user-facing CLI. Because tidylake is designed to be user-friendly and extensible, demos, plugins, and other CLI-exposed capabilities are organized into dedicated modules.
Testing is organized by scope:
- tests/unit: isolated behavior and module-level logic
- tests/integration: interactions between components and CLI flows
- tests/acceptance: end-to-end user-facing behavior
Run tests with Taskfile:
task test:unit
task test:integration
task test:acceptance
task test:run
A dedicated testing guide can be added later for deeper conventions, fixtures, and patterns.
The docs/ directory contains project documentation written in Markdown and built with Material for MkDocs.
You can preview docs locally with:
task docs:serve
Build the static documentation site:
task docs:build
This generates the publishable site in site/.
For GitHub Pages-oriented builds (strict mode enabled):
task docs:build:github-pages
Documentation contributions are encouraged: clarify concepts, improve examples, and add missing guides.
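When editing docs, it can help to confirm that a build actually produced output. The sketch below wraps the docs build described above and checks for a generated entry page. The `check_docs_build` function and the `RUNNER` and `SITE_DIR` variables are hypothetical names for this example, and it assumes the docs have a top-level index page so that MkDocs writes `site/index.html`.

```shell
# Sketch: build the docs, then verify that the entry page exists.
# RUNNER and SITE_DIR are hypothetical overrides with sensible defaults.
check_docs_build() {
  runner="${RUNNER:-task}"
  site_dir="${SITE_DIR:-site}"
  $runner docs:build || return 1
  if [ -f "$site_dir/index.html" ]; then
    echo "docs build OK: $site_dir/index.html"
  else
    echo "docs build produced no $site_dir/index.html" >&2
    return 1
  fi
}
```

Run `check_docs_build` from the repository root after changing files under docs/.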
Before implementing major changes, align with the framework proposal and design direction.
The best way to propose improvements is to build a real use case, share it, and discuss the outcome. Plugins and extensions can also be contributed as community packages.