-
Notifications
You must be signed in to change notification settings - Fork 1.3k
build: integrate prek #15629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build: integrate prek #15629
Conversation
Integrates prek as a repository scanner. It has some built-in rust checks that run across the repository, and can also integrate external tools. External tools are managed as python dependencies with `uv`, just explicitly instead of using prek's built-in uv support. You need uv either way. shellcheck, actionlint, and zizmor tools are initially integrated. We can migrate checks from the gradle build here (e.g. ast-grep, java-based checks) with follow-up commits. Closes apache#15626
|
I think this one is ok, although I didn't move a huge portion of stuff out of the gradle build yet, since some of it is a bit hairy :) Basically just added the easy-wins, keeping everything fast (none of the tooling here is actually in python). I think we should try to keep the precommit hook subsecond so that people will want to actually use it. |
dweiss
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh... I like it very much. Especially how it's all consolidated in one config file. I guess Windows users may suffer a bit... but these days you'd go through a PR which runs this for you anyway, so I don't see any problem here.
I'm +1 to add this and replace the hand-coded stuff.
|
It is configured with the default which is a universal lockfile. Check out the |
Add self-checks for the configuration. They fail if hook doesn't apply to any files in the repository, or if the excludes contain extraneous files that wouldn't be included anyway. It is helpful for detecting simple mistakes. For this reason, remove the symlink target check since we have no symlinks in the repository.
|
I plan to followup to address the TODOs in the (long) exclusion lists. Purpose of adding all these exclusions was to avoid reformatting/fixing tons of the codebase in this PR. |
|
Ah figured it out but does not work on Windows, so it can't run on Windows. So I assume also the ast-grep "uv run" does not work. I have no idea what all this means, as I have no idea where it looks for the binary. |
|
Aaah figured out. So it works!
You should read documentation. @rmuir: So to basically run the checks on jenkins just add a job and call |
|
P.S.: It also survived the "whitespace in username" hardcore check! |
@uschindler yes. If you want to be very pedantic, you can improve it for CI: Either of these does same thing: $ uv run --project dev-tools --frozen prek all-files
$ UV_PROJECT=dev-tools UV_FROZEN=1 uv run prek all-filesThis will definitely ensure it uses exact version of prek in our lockfile and is good for CI. If you look at the github workflow in here and the precommit configuration, you can see everything is done that way internally. edited: add "frozen" argument for more safety. |
nice!, question is, does it work fast on windows? |
The full checks take a few seconds (feels like 8 seconds on my device but there's terminal animations). The preconmit is milliseconds. Can't measure, not sure how to measure time on windows. |
|
Yeah the full checks across all files are several seconds on my machine too. It is really for CI purposes only. Glad to hear precommit works fast if you only change a few files, that is the whole purpose :) |
There's no desire to update the uv.lock in source code control when executing these commands: it can only trigger noise and confusion. This can happen if you have some custom uv config on your machine which would resolve packages differently (e.g. UV_DEFAULT_REGISTRY=)
|
I tested this on MacOS just now as well. |
|
@uschindler I updated those suggested CI commands (it matches what the github workflow does) with the extra safety. |
This delivers a clear error message if the user happens to have a too-old prek, rather than something confusing. We require the 0.2.29 which supports includes/excludes as yaml glob lists rather than the previous regexp approach, this is just a massive improvement to the maintainers of these files: https://github.com/j178/prek/releases/tag/v0.2.29
Improves the output from the github workflow if something had to be reformatted or fixed.
|
I will add the check-all-files command to the main branch Jenkins job on Policeman's instance. |
| priority: 0 | ||
| language: system | ||
| entry: uv | ||
| args: [ 'run', 'ast-grep', '-c', './gradle/validation/ast-grep/sgconfig.yml', 'scan', '--format', 'github' ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we no longer run ast-grep from withing Gradle we should move the config files to the dev tools folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@uschindler I agree. I am working followups for a lot of this. I was trying to avoid reformatting / messing with entire codebase in one big PR.
| priority: 0 | ||
| language: system | ||
| entry: uv | ||
| args: [ 'run', 'ast-grep', '-c', './gradle/validation/ast-grep/sgconfig.yml', 'test', '--skip-snapshot-tests' ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
See the install link. There are many ways, including |
That's known. On ASF Jenkins we should check if it's available by default like Java. |
|
The PREK execution is now executed at end of build, temporarily for testing it was at beginning. I think this also makes sure the repo has no uncommitted changes. |
Integrates prek as a repository scanner and pre-commit tool. It has some built-in rust checks that run across the repository, and can also integrate external tools. External tools are managed as pypi dependencies with uv, just explicitly instead of using prek's built-in uv support. You need uv either way. shellcheck, actionlint, zizmor, ruff, rumdl, and ast-grep tools are initially integrated. We can migrate other checks from the gradle build here with follow-up commits. The CI is reorganized to run this "quick check" before any of the heavy java jobs. Current job in the worst case (cache miss) runs in less than a minute: so it does not slow down feedback to the user, but saves a lot of CI minutes. good task for ubuntu slim runner. To try the functionality out locally: You need uv package manager: https://docs.astral.sh/uv/getting-started/installation/ You can then change some code and try out a precommit-run with `uvx prek` To do this automatically as git-hook, use `uvx prek install`. Locally, the checks are fast, since generally only the files you touched are being passed to the checker. So the pre-commit git hook is actually not annoying (< 1s). The checker is configured with some autofix support. All of the current "Check" steps run in parallel, then the "Fix" steps run sequentially in the order they are defined in the `.pre-commit-config.yml` file. `Fix` solution is always preferred to make it easier on the user, the individual steps will use multiple cores anyway. Fixes will still "fail" the `git commit`, but they'll make the changes for you locally. You just need to review and `git add` the fixes made, then `git commit` again. Closes apache#15626



Integrates prek as a repository scanner and pre-commit tool. It has some built-in rust checks that run across the repository, and can also integrate external tools.
External tools are managed as pypi dependencies with uv, just explicitly instead of using prek's built-in uv support. You need uv either way.
shellcheck, actionlint, zizmor, ruff, rumdl, and ast-grep tools are initially integrated.
We can migrate other checks from the gradle build here with follow-up commits.
Closes #15626
The CI is reorganized to run this "quick check" before any of the heavy java jobs. Current job in the worst case (cache miss) runs in less than a minute: so it does not slow down feedback to the user, but saves a lot of CI minutes. good task for ubuntu slim runner.
To try the functionality out locally:
You need uv package manager
You can then change some code and try out a precommit-run with
uvx prekTo do this automatically as git-hook, use
uvx prek install.Locally, the checks are fast, since generally only the files you touched are being passed to the checker. So the pre-commit git hook is actually not annoying. On my 2-core with modifications to
IndexWriter.javastaged:The checker is configured with some autofix support. All of the current
Checksteps run in parallel, then theFixsteps run sequentially in the order they are defined in the.pre-commit-config.ymlfile.Fixsolution is always preferred to make it easier on the user, the individual steps will use multiple cores anyway.Fixes will still "fail" the
git commit, but they'll make the changes for you locally. You just need to review andgit addthe fixes made, thengit commitagain.