Add full support of format string parsing in compile-time API #2129
vitaut
left a comment
There was a problem hiding this comment.
Looks great, thanks for another high quality PR!
    constexpr void on_error(const char* message) { throw format_error(message); }

    constexpr int on_arg_id() {
      throw format_error("handler cannot be used for empty arg_id");
There was a problem hiding this comment.
"for empty arg_id" -> "with automatic indexing"
Also can this be an assert?
There was a problem hiding this comment.
But it actually can be used for automatic indexing with named identifiers. Both the runtime and (now) the compile-time APIs keep automatic indexing when a named argument identifier is used. So it just cannot be used for an unnamed argument identifier in the automatic indexing mode, which is what this message is trying to say.
By the way, this function wouldn't be used under normal conditions, because the code that invokes this handler ensures that it is used only for numeric or named arguments. As long as that holds, no one will see this message, but if someone breaks the parsing code, they will get it.
> Also can this be an assert?
As I said, it just indicates an internal error, so the cause of this compile-time error can be anything that is not compile-time friendly. I saw several uses of throw format_error(...) and used it too.
There was a problem hiding this comment.
I'm not entirely sure what you mean by "unnamed argument identifier". Both "{}" and "{:...}" denote automatic indexing which is why I'm suggesting this minor wording change. It doesn't matter much since it's an internal error but a bit more consistent with the wording elsewhere.
> it just indicates an internal error
Right, and this is exactly why I'm suggesting using an assert if possible. This will distinguish an internal error from a user error even though both result in a compilation error. If assert doesn't work for some reason, then throw is OK.
There was a problem hiding this comment.
Yes, "unnamed argument identifier" sounds a bit strange. 🙂
But the problem is probably my misunderstanding of how named arguments work. After I update this PR (as I wrote here), this wording problem will probably be eliminated.
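For illustration, here is a minimal sketch of the distinction discussed in this thread, independent of {fmt}. In a constant-evaluated context both a reached `throw` and a failed `assert` stop compilation, but the `assert` marks the condition as an internal invariant rather than a user-facing error. The helpers `checked_digit` and `internal_digit` are hypothetical names made up for this example, and C++14 or later is assumed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper that reports a *user* error. If it is evaluated at
// compile time with a bad argument, reaching the throw makes the call a
// non-constant expression, so compilation fails.
constexpr int checked_digit(char c) {
  if (c < '0' || c > '9') throw std::invalid_argument("not a digit");
  return c - '0';
}

// Hypothetical helper that documents an *internal* invariant. A failed
// assert is not a constant expression either, so it also fails compilation,
// but the intent (a bug in the calling code) is explicit.
constexpr int internal_digit(char c) {
  assert(c >= '0' && c <= '9');
  return c - '0';
}

// Valid input is accepted at compile time by both variants.
static_assert(checked_digit('7') == 7, "valid input works at compile time");
static_assert(internal_digit('7') == 7, "valid input works at compile time");
// static_assert(checked_digit('x') == 0, "");   // would not compile
// static_assert(internal_digit('x') == 0, "");  // would not compile
```

At run time the two variants differ: the first throws an exception that the caller can handle, while the second aborts (unless `NDEBUG` is defined).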
    template <typename Char> struct parse_arg_id_result {
      arg_ref<Char> arg_id;
      const Char* arg_id_end;
    };
There was a problem hiding this comment.
Can we pass begin by reference in parse_arg_id and avoid introducing this struct?
There was a problem hiding this comment.
Hmm... it would be a reference to the pointer, or (IMHO better) a pointer to the pointer, is it ok?
There was a problem hiding this comment.
Sure, I think reference is better unless it can be null.
There was a problem hiding this comment.
Actually, it's probably impossible, because arg_id_end needs to be a constexpr variable or, more importantly, because begin would have to be a non-constexpr variable in that case while still being used in a constexpr context.
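The trade-off in this thread can be sketched outside of {fmt} as follows. This is only an illustration under stated assumptions: `parse_uint_result`, `parse_uint`, `parse_uint_by_ref`, and `parse_in_wrapper` are hypothetical names, not library code, and C++14 or later is assumed. Returning a struct lets both the value and the end position be `constexpr` variables, whereas an in/out reference parameter needs a mutable object, which only works inside another constexpr function.

```cpp
#include <cassert>

// Hypothetical result type, similar in spirit to parse_arg_id_result.
struct parse_uint_result {
  int value;
  const char* end;
};

// Variant 1: return the value together with the position where parsing stopped.
constexpr parse_uint_result parse_uint(const char* begin, const char* end) {
  int value = 0;
  while (begin != end && *begin >= '0' && *begin <= '9') {
    value = value * 10 + (*begin - '0');
    ++begin;
  }
  return {value, begin};
}

// Variant 2: advance `begin` through a reference (in/out parameter).
constexpr int parse_uint_by_ref(const char*& begin, const char* end) {
  int value = 0;
  while (begin != end && *begin >= '0' && *begin <= '9') {
    value = value * 10 + (*begin - '0');
    ++begin;
  }
  return value;
}

constexpr char input[] = "42}";

// With variant 1, both parts of the result are usable as constants.
constexpr parse_uint_result result = parse_uint(input, input + 3);
static_assert(result.value == 42, "parsed value is a constant");
static_assert(result.end == input + 2, "end position is a constant");

// With variant 2, the pointer being advanced must be a mutable local, so the
// call has to be wrapped; the end position is lost unless it is returned too.
constexpr int parse_in_wrapper() {
  const char* p = input;
  return parse_uint_by_ref(p, input + 3);
}
static_assert(parse_in_wrapper() == 42, "works only via a mutable local");
```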
    struct test_custom_formattable {};

    FMT_BEGIN_NAMESPACE
    template <> struct formatter<test_custom_formattable> {
      enum class output_type { two, four } type{output_type::two};

      FMT_CONSTEXPR auto parse(format_parse_context& ctx) -> decltype(ctx.begin()) {
        auto it = ctx.begin(), end = ctx.end();
        while (it != end && *it != '}') {
          ++it;
        }
        auto spec = string_view(ctx.begin(), static_cast<size_t>(it - ctx.begin()));
        auto tag = string_view("custom");
        if (spec.size() == tag.size()) {
          bool is_same = true;
          for (size_t index = 0; index < spec.size(); ++index) {
            if (spec[index] != tag[index]) {
              is_same = false;
              break;
            }
          }
          type = is_same ? output_type::four : output_type::two;
        } else {
          type = output_type::two;
        }
        return it;
      }

      template <typename FormatContext>
      auto format(const test_custom_formattable&, FormatContext& ctx) const
          -> decltype(ctx.out()) {
        return format_to(ctx.out(), type == output_type::two ? "{:>2}" : "{:>4}",
                         42);
      }
    };
    FMT_END_NAMESPACE
There was a problem hiding this comment.
I suggest using one of the existing formatters such as duration formatter instead of introducing a new one here.
There was a problem hiding this comment.
One problem here is that the chrono::duration formatter is not ready to be used with compile-time API because of that format() constness requirement. Should I update it in this PR or the separate one?
There was a problem hiding this comment.
> Should I update it in this PR or the separate one?
This PR is OK since it should be a small change.
There was a problem hiding this comment.
Done with the weirdest looking format string from chrono-test
    FMT_BEGIN_NAMESPACE
    template <> struct formatter<test_dynamic_formattable> {
      size_t amount = 0;
      detail::arg_ref<char> width_refs[3];

      FMT_CONSTEXPR auto parse(format_parse_context& ctx) -> decltype(ctx.begin()) {
        amount = static_cast<size_t>(*ctx.begin() - '0');
        if (amount >= 1) {
          width_refs[0] = detail::arg_ref<char>(ctx.next_arg_id());
        }
        if (amount >= 2) {
          width_refs[1] = detail::arg_ref<char>(ctx.next_arg_id());
        }
        if (amount >= 3) {
          width_refs[2] = detail::arg_ref<char>(ctx.next_arg_id());
        }
        return ctx.begin() + 1;
      }

      template <typename FormatContext>
      auto format(const test_dynamic_formattable&, FormatContext& ctx) const
          -> decltype(ctx.out()) {
        int widths[3]{};
        for (size_t i = 0; i < amount; ++i) {
          detail::handle_dynamic_spec<detail::width_checker>(widths[i],
                                                             width_refs[i], ctx);
        }
        if (amount == 1) {
          return format_to(ctx.out(), "{:{}}", 41, widths[0]);
        } else if (amount == 2) {
          return format_to(ctx.out(), "{:{}}{:{}}", 41, widths[0], 42, widths[1]);
        } else if (amount == 3) {
          return format_to(ctx.out(), "{:{}}{:{}}{:{}}", 41, widths[0], 42,
                           widths[1], 43, widths[2]);
        } else {
          throw format_error("formatting error");
        }
      }
    };
    FMT_END_NAMESPACE
There was a problem hiding this comment.
Same here. duration formatter has dynamic field support.
There was a problem hiding this comment.
Actually, the previous one (about custom formatter) and this are not the same.
Yes, it has dynamic field support. But as far as I can see, it supports the same set of nested replacement fields as the default formatter, {:{}.{}}. So handling 2 dynamic fields for the default formatter would probably be enough to pass the test with the chrono::duration formatter.
This custom formatter, by contrast, has its own syntax for nested replacement fields (not {:{}.{}}), and it has 3 of them. So handling the default dynamic fields wouldn't be enough to pass the test.
There was a problem hiding this comment.
I don't think we need to test the implementation of exotic formatter specializations here.
There was a problem hiding this comment.
Done with format string from chrono-test that uses dynamic specs
Actually, I'm going to make this PR a draft (yep, again 😄).
…replacement fields
… instead of `throw`
vitaut
left a comment
There was a problem hiding this comment.
Two minor comments, otherwise looks good.
    const T& arg = get<N>(args...);
    return write<Char>(out, arg);
    if constexpr (is_named_arg<typename std::remove_cv<T>::type>::value) {
      decltype(T::value) arg = get<N>(args...).value;
There was a problem hiding this comment.
I think decltype(T::value) can be replaced with a bit simpler const auto&
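A small sketch of why the two spellings are interchangeable here, assuming the wrapper stores its value as a const reference. The `named_value` struct and `same_binding` function below are hypothetical stand-ins made up for this example, not the {fmt} types.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a named-argument wrapper that holds a reference.
template <typename T> struct named_value {
  const char* name;
  const T& value;
};

// Binds the wrapped value twice, once per spelling, and checks that both
// names have the same type and refer to the same object (no copy is made).
template <typename Named> bool same_binding(const Named& n) {
  decltype(Named::value) a = n.value;  // spelled-out member type: const T&
  const auto& b = n.value;             // deduced: const T&
  static_assert(std::is_same<decltype(a), decltype(b)>::value,
                "both spellings yield the same reference type");
  return &a == &b && &a == &n.value;
}
```

Note that the equivalence depends on the member being a reference: if `value` were stored by value, `decltype(Named::value)` would name the object type and make a copy, while `const auto&` would still bind a reference.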
    if constexpr (str.size() == 2 && str[0] == '{' && str[1] == '}')
      return fmt::to_string(detail::first(args...));
    if constexpr (str.size() == 2 && str[0] == '{' && str[1] == '}') {
      auto first = detail::first(args...);
There was a problem hiding this comment.
This may introduce an extra copy. Please use const auto& instead of auto.
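The extra copy can be demonstrated with a small self-contained sketch. It assumes the helper returns a reference to its first argument; the `first` function below is a simplified stand-in written for this example, and `copy_counter`, `copies_with_auto`, and `copies_with_const_ref` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical type that counts how many times it is copy-constructed.
struct copy_counter {
  static int copies;
  copy_counter() = default;
  copy_counter(const copy_counter&) { ++copies; }
};
int copy_counter::copies = 0;

// Simplified stand-in for a helper returning a reference to its first argument.
template <typename T, typename... Tail>
const T& first(const T& value, const Tail&...) {
  return value;
}

// `auto` drops the reference and const, so a new object is copy-constructed.
int copies_with_auto(const copy_counter& c) {
  copy_counter::copies = 0;
  auto local = first(c, 1, 2);
  (void)local;
  return copy_counter::copies;
}

// `const auto&` binds directly to the referenced object, so nothing is copied.
int copies_with_const_ref(const copy_counter& c) {
  copy_counter::copies = 0;
  const auto& local = first(c, 1, 2);
  (void)local;
  return copy_counter::copies;
}
```

For cheap types such as `int` the difference is negligible, but for arguments like strings or containers the copy made by plain `auto` can be costly.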
|
Merged, thanks! |
Compile-time API functionality extended to support manual ordering and named arguments. Unlike my first attempt to do this in #2111, where I tried to use a format part array, here I'm just reusing the existing recursion of the functions compile_format_string() and parse_tail().

Some points for the changes:

- {0} and automatic indexing with {name} work exactly as they work in the runtime API
- static_asserts fail with corresponding messages
- unknown_format() is returned from the string compilation procedure, thus we fall back to the runtime API for this string (but this fallback is currently broken)