feature: cross-format pdf-standard selection
#13857
Following our meeting discussion, related issues on PDF accessibility
21d0394 to 71ab9c4
Clarification: LaTeX does not do compliance validation; that's why it only produces warnings. We'd have to integrate something like veraPDF to actually validate LaTeX PDFs. Compliance validation is integrated for Typst.
Probably closing:
b79e590 to c53b00a
cderv
left a comment
Some quick comments for now, as I tried the veraPDF install on Windows.
async function installed(): Promise<boolean> {
  const dir = verapdfInstallDir();
  const verapdfBin = isWindows
    ? join(dir, "verapdf.bat")
Just a heads-up that we know .bat files are a problem in some environments. We had some reports about dart-sass not working inside Quarto (because it is launched through a .bat file).
So it's a limitation to know about with veraPDF, though quite a specific one.
Ok let’s talk about the right way. We can probably launch directly. Hopefully we don’t need another Rust launcher. 😉
If you manage to run quarto install verapdf on windows with java then I think we're good.
Yes! I installed a random Java 17 in Windows 11, installed verapdf, and see a nice big validation failure for an example
[verapdf]: Validating ua2-missing-title.pdf against PDF/UA2...FAILED
WARN: PDF validation failed for ua-2:
The Metadata stream as specified in ISO 32000-2:2020, 14.3 in the document catalog dictionary shall contain a dc:title entry
Output created: ua2-missing-title.pdf
Still want to get verapdf testing in windows ci, but I think it can wait.
I've changed this to look in the following order:
- If `QUARTO_VERAPDF` is set, it's assumed to be a verapdf command prefix, like `docker run` or however people need to run verapdf
- Run the JAR from the Quarto installation of veraPDF directly using `java` (no batch file). This turns out to be safe because the scripts already look for the veraPDF JAR in a forward-compatible way, so we just do the same thing.
- Fall back to `verapdf` on the `PATH`, if any
Tested on Mac, Windows - let's see how it does in CI!
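For readers following along, the lookup order described above could be sketched roughly like this. It is only an illustration, not the code in this PR: `VerapdfEnv`, `resolveVerapdfCommand`, the whitespace splitting of the override, and the `java -jar` arguments are all assumptions made for the example.

```typescript
// Hypothetical sketch of the three-step lookup; names are not from the PR.
interface VerapdfEnv {
  quartoVerapdf?: string; // value of QUARTO_VERAPDF, if set
  installedJar?: string; // path to the Quarto-installed veraPDF JAR, if any
  verapdfOnPath: boolean; // whether `verapdf` was found on the PATH
}

function resolveVerapdfCommand(env: VerapdfEnv): string[] | undefined {
  // 1. An explicit override wins and is treated as a command prefix.
  //    Splitting on whitespace is a simplification (no quoting support).
  const override = env.quartoVerapdf?.trim();
  if (override) {
    return override.split(/\s+/);
  }
  // 2. Launch the installed JAR with java directly, so no .bat wrapper is
  //    involved. The exact java arguments here are an assumption.
  if (env.installedJar) {
    return ["java", "-jar", env.installedJar];
  }
  // 3. Fall back to whatever `verapdf` is on the PATH.
  if (env.verapdfOnPath) {
    return ["verapdf"];
  }
  // Nothing available: the caller would skip validation.
  return undefined;
}
```

The point of returning a command prefix rather than a single path is that the same call site can serve a Docker wrapper, a direct Java launch, or a system install.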
> Still want to get verapdf testing in windows ci, but I think it can wait.
Let's open an issue, and we can discuss it. I am happy to take this over when we decide what we want the CI to do (only test the verapdf install command and that our integration is working, or also use it to test for accessibility regressions).
cscheid
left a comment
Looks great aside from one small question on a comment.
c53b00a to 1e64da5
Tests need some help now that I understand standards better. Typst only supports ua-1 and LaTeX only supports ua-2 so shared tests will have to look like this:
Most people won't target both formats, so this is mainly a concern for testing, but we'll have to note the differing standards in the documentation.
Adds `pdf-standard` option for LaTeX and Typst PDF output supporting:
- PDF versions: 1.4, 1.5, 1.6, 1.7, 2.0
- PDF/A: a-1b, a-2a, a-2b, a-2u, a-3a, a-3b, a-3u, a-4, a-4e, a-4f
- PDF/UA: ua-1, ua-2 (LaTeX only)
- PDF/X: x-4, x-4p, x-5g, x-5n, x-5pg, x-6, x-6n, x-6p (LaTeX only)
Features:
- Automatic PDF version inference from standard requirements
- LaTeX image alt text propagation for PDF/UA compliance
- Tagging enabled only for standards that require it
- `quarto install verapdf` for PDF/A and PDF/UA validation
- Automatic validation when verapdf is available
Closes #4426, #13782, #13248
Co-Authored-By: Claude Opus 4.5 <[email protected]>
Workaround for Pandoc not passing alt text to Typst image() calls. When an image has alt text (from caption or fig-alt), emits raw Typst with image(..., alt: "...") for accessibility compliance. Temporary fix until pandoc#11394 is merged upstream. Closes #13868 Co-Authored-By: Claude Opus 4.5 <[email protected]>
Preserve terms element structure in definition lists for PDF/UA-1 validation. Uses nested show rule instead of .map().join() which destroyed the list structure. Suggested by @mcanouil (PR #13249 discussion) Co-Authored-By: Claude Opus 4.5 <[email protected]>
1e64da5 to 5bb21a7
- Add QUARTO_VERAPDF environment variable support for custom verapdf invocation (Docker, custom Java setups, etc.)
- Use direct Java invocation with quarto-installed JAR by default, avoiding .bat file issues on Windows
- Fall back to system verapdf on PATH if JAR not installed
- Log info message when QUARTO_VERAPDF override is used
- Warn if specified QUARTO_VERAPDF path doesn't exist
- Add smoke test validating env var override and validation warnings
Co-Authored-By: Claude Opus 4.5 <[email protected]>
When LaTeX doesn't support a PDF standard (e.g., ua-1), it warns and ignores it. Previously, verapdf would still try to validate against the unsupported standard, causing spurious failures. Now the applied standards are stored in metadata during format extras and used for validation, so only standards that were actually applied get validated.
Changes:
- Add kPdfStandardApplied constant for filtered standards
- Store applied standards in extras.metadata during format processing
- Read from pandocOptions.format.metadata for validation
- Add latex-ua1-filtered.qmd test for filtering behavior
- Update latex-multi-standard.qmd to use a-4f + ua-2 (compatible)
- Update latex-combined.qmd to use 1.7 + a-2b (simpler test)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
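As a rough picture of the filtering this commit describes, the sketch below splits a requested list into the standards that were applied and the ones that were dropped with a warning, so that only the applied list reaches validation. The names `filterStandards` and `isSupported` are hypothetical, and the support rules are an assumption pieced together from this thread (LaTeX ignores ua-1; ua-2 and PDF/X are LaTeX only), not the actual table in the code.

```typescript
// Hypothetical sketch; not the implementation in this PR.
type OutputFormat = "latex" | "typst";

// Assumed support rules, taken from the discussion in this thread.
function isSupported(format: OutputFormat, standard: string): boolean {
  if (format === "latex") {
    return standard !== "ua-1";
  }
  // typst: ua-2 and the PDF/X family are listed as LaTeX only.
  return standard !== "ua-2" && !standard.startsWith("x-");
}

interface FilteredStandards {
  applied: string[]; // what is passed on and later validated
  warnings: string[]; // one message per ignored standard
}

function filterStandards(
  format: OutputFormat,
  requested: string[],
): FilteredStandards {
  const applied: string[] = [];
  const warnings: string[] = [];
  for (const standard of requested) {
    if (isSupported(format, standard)) {
      applied.push(standard);
    } else {
      warnings.push(`${format} does not support pdf-standard ${standard}; ignoring it`);
    }
  }
  return { applied, warnings };
}
```

Under these assumed rules, `filterStandards("latex", ["a-2b", "ua-1"])` keeps only `a-2b` in `applied`, so a validator driven by `applied` never checks against ua-1.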
Match pandoc's template by removing the conditional that skipped pdflang when pdfstandard was set. Having pdflang in both \hypersetup and \DocumentMetadata is harmless (just redundant). This simplifies comparison for the passthrough branch. Co-Authored-By: Claude Opus 4.5 <[email protected]>
The test expected both standards (a-2b, ua-1) to appear in DocumentMetadata, but quarto filters out unsupported standards before passing to pandoc. Updated test to expect only the supported standard (a-2b) in the output. Co-Authored-By: Claude Opus 4.5 <[email protected]>
Fixes #4426
Also includes
Adds basic support in Typst and LaTeX for PDF standard selection.
PDF versions and standards can be independently specified, so this takes a list. Version specifies the output format; standard specifies what validation is applied to the output.
Here are the rules: you can combine standards from different families:
The constraint is you can only have one from each category:
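A minimal sketch of these two rules, assuming the categories are the PDF version plus the PDF/A, PDF/UA, and PDF/X families, and that the family can be read off the value's prefix. The function names and error messages are hypothetical and only illustrate the check:

```typescript
// Hypothetical sketch of the "one per category" rule; not Quarto's code.
type Family = "version" | "PDF/A" | "PDF/UA" | "PDF/X";

// Assumption: the family is determined by the prefix of the value.
function familyOf(standard: string): Family | undefined {
  if (/^\d\.\d$/.test(standard)) return "version";
  if (standard.startsWith("ua-")) return "PDF/UA";
  if (standard.startsWith("a-")) return "PDF/A";
  if (standard.startsWith("x-")) return "PDF/X";
  return undefined;
}

// Returns a list of error messages; an empty list means the combination is fine.
function checkPdfStandards(standards: string[]): string[] {
  const errors: string[] = [];
  const seen = new Map<Family, string>();
  for (const standard of standards) {
    const family = familyOf(standard);
    if (family === undefined) {
      errors.push(`unknown pdf-standard: ${standard}`);
      continue;
    }
    const previous = seen.get(family);
    if (previous !== undefined) {
      errors.push(`only one ${family} entry is allowed: ${previous} and ${standard}`);
    } else {
      seen.set(family, standard);
    }
  }
  return errors;
}
```

With this sketch, `checkPdfStandards(["1.7", "a-2b", "ua-2"])` returns an empty list, while `checkPdfStandards(["a-2b", "a-3b"])` returns one error because both entries belong to PDF/A.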
It would be equally valid to take `pdf-version` and `pdf-standard` as separate options, as LaTeX does, but I chose to follow Typst here and use one option for both.
Once the options have been validated, these go to the command line `--pdf-standard` for Typst, and to the new `\DocumentMetadata{}` command for LaTeX.
The metadata includes `tagging=on` for any standard in LaTeX.
Auto-installs required LaTeX packages (latex-lab, colorprofiles) when missing.
This includes a test which will fail UA-1 in Typst and warn in LaTeX; these warnings are surfaced from the LaTeX log.