Introduce raw0 format for zero-terminated strings #388

Merged
01mf02 merged 8 commits into 01mf02:main from pjungkamp:fmts/raw0 on Feb 4, 2026

@pjungkamp
Contributor

This implements a raw0 format that can be used for zero-terminated strings as both input and output of jaq. I also added a jq-compatible --raw-output0 switch, which is essentially just an alias for --to raw0.

Use case

I want to be able to safely use UNIX utilities with jaq. In particular, safely enumerating files with find -print0 and batch-executing processes with xargs -0 would make me more confident in using jaq for shell scripting.
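
To illustrate why NUL termination matters for this use case, here is a minimal Rust sketch. It is not code from this PR, and `split_records` is a hypothetical helper. It splits the same two file names once on newlines and once on NUL bytes, assuming one of the names contains a newline.

```rust
/// Split a byte stream into records on a one-byte delimiter.
/// A trailing delimiter terminates the last record instead of
/// starting a new, empty one.
/// (Hypothetical helper for illustration; not part of jaq.)
fn split_records(input: &[u8], delim: u8) -> Vec<&[u8]> {
    let mut records: Vec<&[u8]> = input.split(|&b| b == delim).collect();
    // `split` yields an empty final slice when the input ends with the
    // delimiter (or is empty), so drop that slice.
    if records.last().map_or(false, |r| r.is_empty()) {
        records.pop();
    }
    records
}
```

For the two paths `dir/odd\nname.txt` and `dir/plain.txt`, splitting a newline-delimited listing gives three records, whereas splitting the NUL-delimited listing gives back exactly the two original paths.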

Examples

roundtrip conversion

$ echo '["some string\nwith a newline", "another string"]' | jaq --to raw0 ".[]" | jaq --from raw0
"some string\nwith a newline"
"another string"
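
The roundtrip above can be sketched in a few lines of Rust. This is only an illustration of the format, not jaq's implementation: `to_raw0` and `from_raw0` are hypothetical names, and the sketch assumes UTF-8 input and tolerates a missing final terminator, which may differ from what jaq does.

```rust
use std::str::Utf8Error;

/// Encode: every string is followed by exactly one NUL byte.
/// (Hypothetical helper; assumes no string contains NUL.)
fn to_raw0(strings: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    for s in strings {
        out.extend_from_slice(s.as_bytes());
        out.push(0);
    }
    out
}

/// Decode: cut the input after each NUL byte and strip the terminator.
/// Assumption of this sketch: the records must be valid UTF-8.
fn from_raw0(bytes: &[u8]) -> Result<Vec<String>, Utf8Error> {
    bytes
        .split_inclusive(|&b| b == 0)
        .map(|chunk| {
            // A last record without terminator is kept as-is.
            let body = chunk.strip_suffix(&[0u8]).unwrap_or(chunk);
            std::str::from_utf8(body).map(str::to_owned)
        })
        .collect()
}
```

Because the newline inside the first string is ordinary payload here, decoding the encoded bytes returns the same two strings.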

jq compatibility

$ echo '["some string\nwith a newline", "another string", "a\u0000b"]' | jq --raw-output0 ".[]" | target/debug/jaq --from raw0
jq: error (at <stdin>:1): Cannot dump a string containing NUL with --raw-output0 option
"some string\nwith a newline"
"another string" 
$ echo '["some string\nwith a newline", "another string", "string with nul \u0000b"]' | target/debug/jaq --raw-output0 ".[]" | target/debug/jaq --from raw0
"some string\nwith a newline"
"another string"
Error: Cannot dump a string containing NUL with `--to raw0` or `--raw-output0`
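
The behaviour shown above, where valid strings are written before the offending one fails, is consistent with a per-value check. A minimal sketch of such a check follows. `write_raw0` is a hypothetical function, and the choice of `io::ErrorKind::InvalidData` is an assumption of this sketch; only the message text is taken from the output above.

```rust
use std::io::{self, Write};

/// Write one string in raw0 form, refusing strings that contain NUL,
/// since such a string could not be told apart from two records.
/// (Hypothetical sketch; not jaq's actual writer.)
fn write_raw0<W: Write>(w: &mut W, s: &str) -> io::Result<()> {
    if s.as_bytes().contains(&0u8) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData, // assumed error kind
            "Cannot dump a string containing NUL with `--to raw0` or `--raw-output0`",
        ));
    }
    w.write_all(s.as_bytes())?;
    // The terminator follows every record.
    w.write_all(&[0u8])
}
```

Checking each value just before writing it means that nothing is emitted for the rejected string, while earlier output stays intact.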

find -print0

$ find jaq/tests -print0 | target/debug/jaq --from raw0 --slurp
[
  "jaq/tests",
  "jaq/tests/golden.rs",
  "jaq/tests/b.jq",
  "jaq/tests/a.jq",
  "jaq/tests/mods",
  "jaq/tests/mods/d.jq",
  "jaq/tests/mods/c.jq",
  "jaq/tests/256.bin",
  "jaq/tests/data.json"
]
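
As the output above shows, combining --from raw0 with --slurp yields a single array of strings. A rough Rust sketch of that difference follows. The `Value` enum and `read_raw0` are hypothetical stand-ins rather than jaq's types, and invalid UTF-8 is replaced lossily purely to keep the sketch short.

```rust
/// Tiny stand-in for a JSON value (hypothetical; jaq has its own type).
#[derive(Debug, PartialEq)]
enum Value {
    Str(String),
    Arr(Vec<Value>),
}

/// Read NUL-terminated records. Without `slurp`, every record is its
/// own input value; with `slurp`, all records form one array.
fn read_raw0(input: &[u8], slurp: bool) -> Vec<Value> {
    let strings = input
        .split_inclusive(|&b| b == 0)
        .map(|chunk| chunk.strip_suffix(&[0u8]).unwrap_or(chunk))
        .map(|body| Value::Str(String::from_utf8_lossy(body).into_owned()));
    if slurp {
        vec![Value::Arr(strings.collect())]
    } else {
        strings.collect()
    }
}
```

With `slurp` set, a filter sees exactly one input, which is convenient for operations over the whole file list such as sorting or counting.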

Resolves #101

Owner

@01mf02 01mf02 left a comment

This already looks quite good! Thanks for your effort!

@pjungkamp pjungkamp force-pushed the fmts/raw0 branch 3 times, most recently from 7224222 to 902bab8 Compare January 1, 2026 18:34
@pjungkamp
Contributor Author

Okay. I removed the write::raw0 module and cleaned up the read function.

I've also refactored the jaq-fmts::write::write function because the code around the main match statement had become too hard to read: it was unclear which conditionals applied to which format. I can also revert it and just add an additional if and tweak the other conditionals if you like that better.

@01mf02
Owner

01mf02 commented Jan 2, 2026

> Okay. I removed the write::raw0 module and cleaned up the read function.

This is great! I'm now happy with the functionality (except for --slurp, where I would like to wait for feedback).

> I've also refactored the jaq-fmts::write::write function because the code around the main match statement had become too hard to read: it was unclear which conditionals applied to which format. I can also revert it and just add an additional if and tweak the other conditionals if you like that better.

I think that the write was easier to read before. I really like having the match arms as one-liners. Lines of code matter to me, because having fewer lines facilitates browsing files (at least for me). I'd appreciate it if you would revert the refactoring.

@01mf02
Owner

01mf02 commented Jan 2, 2026

I would also eliminate the functions raw_parse_many and raw0_parse_many, because they are essentially "one-liners" as well. You could also extract the |line| ... / |slice| ... closure and make it into an inline function, such as let slice_bytes = |line| Ok(...). That way, you avoid some code duplication.
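
The suggestion above could look roughly like the following Rust sketch. It is not the code from this PR: `Format`, `records`, `slice_str`, and `parse` are hypothetical names, and the sketch returns plain `String`s where jaq would produce its own values. The point is only that one closure serves both match arms, so each arm stays a one-liner.

```rust
use std::str::Utf8Error;

/// The two raw input formats (hypothetical stand-in for jaq's format enum).
enum Format {
    Raw,
    Raw0,
}

/// Cut `input` after each delimiter and strip the delimiter.
fn records(input: &[u8], delim: u8) -> impl Iterator<Item = &[u8]> {
    input
        .split_inclusive(move |&b| b == delim)
        .map(move |chunk| chunk.strip_suffix(&[delim]).unwrap_or(chunk))
}

fn parse(format: Format, input: &[u8]) -> Result<Vec<String>, Utf8Error> {
    // Shared conversion, written once instead of once per format.
    let slice_str = |slice: &[u8]| std::str::from_utf8(slice).map(str::to_owned);
    match format {
        Format::Raw => records(input, b'\n').map(slice_str).collect(),
        Format::Raw0 => records(input, 0).map(slice_str).collect(),
    }
}
```

For example, `parse(Format::Raw0, b"a\nb\0c\0")` keeps the newline inside the first record, while `parse(Format::Raw, b"a\nb\n")` splits on it.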

@pjungkamp
Contributor Author

Better now? I'll squash the fixups and wait for the feedback on --raw-input0 from jq before marking this as ready.

@01mf02
Owner

01mf02 commented Jan 2, 2026

> Better now? I'll squash the fixups and wait for the feedback on --raw-input0 from jq before marking this as ready.

This is definitely going in the right direction. I left you some more comments. Thank you so much already for your dedication. And a happy new year, by the way. ;)

// end of YAML document
writeln!(w, "...")?
}
Ok(())
Owner

@01mf02 01mf02 Jan 3, 2026

This Ok(()) can be omitted because of the flush before.
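
One possible reading of this comment, sketched under assumptions: if the last fallible call in a function already returns `io::Result<()>`, as `flush` does, it can serve as the tail expression and no separate `Ok(())` is needed. `write_doc` below is a hypothetical function and does not reproduce the reviewed code.

```rust
use std::io::{self, Write};

/// Write a document body followed by an end-of-document marker.
/// (Hypothetical function for illustration only.)
fn write_doc<W: Write>(w: &mut W, body: &str) -> io::Result<()> {
    writeln!(w, "{}", body)?;
    // end of YAML document
    writeln!(w, "...")?;
    // `flush` returns `io::Result<()>`, so it doubles as the return
    // value; a trailing `Ok(())` would be redundant here.
    w.flush()
}
```

Ending on the fallible call also propagates a flush error to the caller without an extra `?`.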

@01mf02
Owner

01mf02 commented Jan 3, 2026

I went ahead and made the last changes myself. Now I think that this PR is complete. Let's leave it as it is and wait for the comments on slurping. Have a nice weekend!

@pjungkamp
Contributor Author

I've squashed all the fixup commits. Let's see what happens on the jq side.

@01mf02
Owner

01mf02 commented Jan 26, 2026

Hi @pjungkamp! There is no activity on jqlang/jq#3456, so I think I'm fine to go ahead with the --slurp solution that you proposed, because it is more useful that way.

Do you have any other thoughts about this?

@pjungkamp
Contributor Author

I just noticed that you had already added tests and documentation. I consider the PR ready.

@01mf02
Owner

01mf02 commented Jan 27, 2026

> I just noticed that you had already added tests and documentation. I consider the PR ready.

Thanks for the heads-up.

Because the documentation did not cover --from raw0 --slurp, I went ahead and added a separate section for --from raw0. To mirror the documentation structure for --raw-output0, I also implemented --raw-input0. What do you think about this?

@pjungkamp
Contributor Author

> To mirror the documentation structure for --raw-output0, I also implemented --raw-input0. What do you think about this?

I took some time to check the features and issue trackers of other jq clones. None of gojq, fq, or yq implements --raw-input0 or has an issue or PR for such functionality. So we'd at least be safe in the sense that we wouldn't introduce new semantics that might conflict with other jq implementations.

The active discussion on jq itself is limited to the issues/feature requests mentioned here: jqlang/jq#3456 (comment). I also found this issue which is closed but still provides some interesting discussion on the topic.

I think we're pretty safe here. The --raw-output0 flag is part of upstream jq and thus here to stay, which makes --raw-input0 the likely path forward should jq decide to implement zero-delimited input. I doubt that the configurable-delimiter ideas would influence the --raw-{output,input}0 flags. I'm fairly certain that the way we've implemented --raw-input0 --slurp is also the most likely path for jq to take if they decide to implement it.

I can't predict the future and may be totally wrong here but I hope I at least did my research properly.

@pjungkamp pjungkamp marked this pull request as ready for review January 27, 2026 11:53
@01mf02
Owner

01mf02 commented Feb 4, 2026

> I took some time to check the features and issue trackers of other jq clones. None of gojq, fq, or yq implements --raw-input0 or has an issue or PR for such functionality. So we'd at least be safe in the sense that we wouldn't introduce new semantics that might conflict with other jq implementations.
>
> The active discussion on jq itself is limited to the issues/feature requests mentioned here: jqlang/jq#3456 (comment). I also found this issue which is closed but still provides some interesting discussion on the topic.
>
> I think we're pretty safe here. The --raw-output0 flag is part of upstream jq and thus here to stay, which makes --raw-input0 the likely path forward should jq decide to implement zero-delimited input. I doubt that the configurable-delimiter ideas would influence the --raw-{output,input}0 flags. I'm fairly certain that the way we've implemented --raw-input0 --slurp is also the most likely path for jq to take if they decide to implement it.
>
> I can't predict the future and may be totally wrong here but I hope I at least did my research properly.

Thanks a lot for your thorough research! I will now merge this.

@01mf02 01mf02 merged commit 291ba17 into 01mf02:main Feb 4, 2026
4 checks passed

Development

Successfully merging this pull request may close these issues.

Support --raw-output0

2 participants