Define the concept of "parser" in X3 #807
Also modernizes the entry points and the most fundamental "x3::parser" base class. Includes an overhaul of `any_parser`, as it operates on iterator/sentinel now. Deprecates `any_parser`. The reasoning is described in the comments.
|
Please ignore the build failures on AppVeyor. The workflow configuration for Windows is currently poorly implemented and requires non-trivial changes to enable C++23 support in X3. I'll send a follow-up PR for that, and it's definitely off-topic. I'm developing this PR on Windows, and running tests locally every time before I push commits. The tests are passing. |
|
Go for it! |
|
Did a quick review. Looks good to me. |
|
Thanks for your review! I'll immediately start working on the Windows CI fix, then continue with the follow-up PRs. Note that the CI is failing for purely operational reasons (it causes no harm to end users even if merged with "failing" status), and the tests are 100% passing on my local Windows machine. The workflow fix requires non-trivial changes, and I'm merging this PR now to avoid an unnecessary rebase nightmare. |
|
CI is fixed in #808. |
|
@saki7 There's been some backlash with this PR, mostly due to breaking changes. We should probably ease up and rethink what we are doing here. Boost thrives on consensus, and the discussions seem to be going in a negative direction, which should be addressed both here and in the mailing list. I'll urge the parties to raise the issues here. |
|
@djowel Could you post your opinion on the mailing list? Most of the discussions are occurring on the mailing list and people are accusing me of these changes on the mailing list. |
I wrote a reply. I am urging people to have the technical discussions done here as well. The problem is communication. Discussions are happening everywhere. I don't even see the affected parties in the list, actually, esp. the MQTT authors. |
|
FYI, the MQTT authors have already decided to use a more stable alternative.
I respect their choice and I'm not pushing X3 toward acceptance by such target-specific frameworks. |
I am not really sure how many other libraries depend on Spirit X3. I know I have a LOT that was written many years ago and would require porting if I need to bring it up to date. But it's a balance, and X3 was really originally conceptualized as eXperimental and pushing the boundaries of c++ (at the time C++14). I'd prefer to keep that original objective. But at the same time, I also do not want to alienate users of X3. The problem is the lack of communication. It used to be that there was just the Boost dev and users list. Now, communication is VERY fragmented. |
I truly agree with this concern. This is the reason why I've never sent a single mail to the list, even though I'm a heavy reader who has been subscribed to the list for more than 10 years. The problem is that all the political discussions occur on the Boost Developers Mailing List, and the core devs are almost always making important decisions on the list. Whenever a core dev posts some personal thoughts on the mailing list, if we don't reply to them, we lose the chance to provide information to correct misunderstandings. |
This contradicts the statement here, specifically that X3 is currently the official, main version of Spirit. Unless you're saying that Spirit as a whole is an experimental library not suitable for production usage, I think you (Spirit maintainers) need to make a decision and announce which versions of Spirit are actively maintained and serve which purposes. And in particular, which version of Spirit should be considered stable and production-ready by users. This announcement should probably be made not here but on Boost ML and Spirit release notes, and the same should be clear from the documentation. |
|
My understanding is that the spirit of Spirit is "eXperimental", while the current library status for X3 is indeed the main version. These two facts do not contradict each other. An enthusiastic, "eXperimental" library which pushes the boundaries of C++ can technically be the main version at the same time. V2 (Qi & Karma & Lex) is in fact not actively maintained, thus we can't say they are the "main" version. |
Very good point, and this needs some elaboration. What I am saying is that originally, rewind 10-15 years, there was Qi and Karma, which were the main libraries of Spirit. X3 was the experimental branch at the time. X3 pushed into C++14. It has since matured and become stable. Yes, it was originally conceptualized as eXperimental and pushing the boundaries of C++. Now, fast forward to 2025, whether or not it should keep the "X" brand is subject to discussion and debate, but, historically, Spirit, as a whole, always pushed the boundaries of C++. Always, from the very start. This discussion is somewhat nostalgic, in the sense that this was also the situation when X3 came to be: there were mature libraries (Qi, Karma) that needed to maintain backward compatibility and not introduce breakage because there were dependent libraries, and we needed to be very careful. But that also prevented me from exploring new C++ features (C++14 at the time and onwards to C++17). I simply overlooked the fact that X3 already has dependent libraries and users, and I regret that. For that matter, the suggestion to make X3 stable and move to X4 is starting to be appealing. But that, of course, depends on @saki7. |
|
I really think this new effort from @saki7 should be x4, or v4. It is kind of hard to break ABI and other dependencies when you have others depending on you. But also, to do this there has to be some smart documentation that explains that classic and v2 are not maintained, that x3/v3 will maybe be bugfix only, and that v4 will be the new world requiring C++23 and later. But also, has anybody thought about what the new C++26 features can bring to the mix? Maybe v4 should be C++26. And for v3 we keep the ABI in smart ways but 'sprinkle' in some C++23 magic if the compiler supports it? Just my thoughts. |
|
@jpyllman I value your opinion. Actually, the suggestion I just posted on the mailing list is heading in the same direction as you mentioned. Regarding the C++26 support: there were two reasons that I chose C++23 for the initial "modernization" attempt. (1) To make sure the change is somewhat less aggressive. I personally use Don't worry. I will officially bump the version (of Spirit.X4) to C++26 as soon as we feel it is appropriate. Now is not the time. |
|
@saki7, when the first discussion about splitting the different versions and what to name them took place, I did not want to add even more confusion to the discussion. But when the talk turned to compiler support and different ABIs, I felt I needed to say something. I am very happy you are putting all this effort into Spirit. There are great things in newer versions of C++ that can help Spirit. But at the same time, I feel that existing code already using Spirit might not handle changes that are too big. Some things like constexpr and noexcept, and maybe conditionally enabled concepts, are possible. And then I thought: if you are going to break things anyway, maybe look now at what could be great from C++26 and what design implications that would bring. And that brings a new major version that does not need to support older versions of C++. |
This can be considered one of the valid directions for applications. However, we're building a library here, and we should be at least as conservative as I mentioned. This applies to Spirit.X4, even though it is indeed going to be a brand new library. We should also note that GCC, Clang, and MSVC all tend to emit false-positive errors on bleeding-edge features. I use all three major compilers, GCC/Clang/MSVC, for my applications. I use MSVC to develop games. I use GCC/Clang to test whether my code conforms to the standard. For this purpose I really need 3 toolchains to make sure I'm writing legit code. |
|
The Boost developers have reached a consensus that new development will take place on Spirit.X4. X3 is reverted to the C++17 baseline, officially feature-frozen, and will accept security patches only. For details, see: https://lists.boost.org/archives/list/boost@lists.boost.org/thread/K3EQLEQJHEUBROB6ODUKTZHFP2FDUS2E/?sort=date |
|
Hello, I have come to Spirit's repository to report a suspected regression bug. Some of the users here might remember me for reporting multiple bugs a few years ago and for other collaboration, Joel especially, although I departed from the Elements library in pursuit of something different in a quite ungraceful way, for which I apologize. I just read all the things about X3/X4 in the project's readme and it left me somewhat sad and confused.
|
Boost.Parser and Boost.Spirit are different, unrelated libraries. You can use whichever suits your needs best.
There are several channels of communication, including Slack. See here. Though I'm not sure if Boost.Spirit or Boost.Parser maintainers are active on Slack. In general, official and community-wide discussions are better held on the devs mailing list. Discussions pertaining to specific issues and PRs are typically in the issues and PR comments. Slack is for more interactive communication. BTW, I think you can use a forum-like web interface to the mailing list. See here. |
|
Thank you for your comment, I'll reply to your points one by one.
X4 is official part of the Boost. It's just not added to the
Spirit.Classic, Spirit.V2, and Spirit.X3 are unmaintained. Spirit.X4 is maintained.
X3 cannot be maintained, for both political and technical reasons. Before the modernization work got reverted, I was the primary maintainer of X3. I decided to choose the hard path because I assumed that maintaining X3 while retaining the C++17 codebase would introduce a serious maintenance burden, and even I, as the primary maintainer, can't afford the time for such unconstructive work. As of now, X4 has a significantly diverged (better) codebase which cannot interoperate with the X3 codebase. If we were to have an independent maintainer for X3, that would make the Boost Libraries have two independent Spirits, which is very much unacceptable. I believe that you can agree with this point, as it is even worse than the situation you mentioned, given that we would have three different parser libraries (Spirit.X3 & Spirit.X4 & Boost.Parser) in the Boost Libraries. Even if someone volunteers to maintain X3, that would ultimately divide the community into two factions. Then it can cause more and more drama over time, and it will cause pain for both the Spirit developers and the end users. This is why I stated that Spirit.V2 and Spirit.X3 are feature-frozen and only accept security patches.
No offense, but I must admit that saying this is easy. I know there are many friendly users who are supporting Spirit. But maintaining a library is not a trivial task. I truly appreciate your past bug reports, but there is considerable distance between bug reports and maintenance work.
No. Boost.Parser differs from X3 in very significant ways. You may consider them as two independent libraries. It's the user who chooses which one to use, not us. You should take a look at its codebase for a couple of hours, and I'm sure you'll understand what I mean (really, look at its mega header file that consists of 9,790 lines). They even claim that X3 has some traits that are "quite difficult to get working", "suffer from limitations", "deal-breaker", "could see no way to fix it". I consider that statement a very hostile attitude coming from its maintainer, and I don't think we can have a constructive discussion on this matter. If you insist on continuing this theme, please understand that I must lock the conversation as off-topic if the argument goes in an undesired direction.
The official place for building organization-wide consensus is the Boost Developers Mailing List. I personally agree that email is not the best place for daily conversations, but I think the mailing list is still a good place for building important consensus. |
Do I understand it correctly that X3 and X4 cannot both be used in the same program, including when X3 and X4 don't interoperate? If so, this is a problem because there may be different parts of the program (e.g. different libraries) that independently use X3 and X4, and IMO that should not cause issues to the user.
I see nothing wrong with X3 (and v2 for that matter) being maintained while X4 is under active development. In fact, I would view that as the preferred state of things. If everything pre-X4 is unsupported (including no bug fixes) then Spirit as a whole is effectively unsuitable for production code bases. Because X4 is permanently the bleeding edge/experimental and therefore is unsuitable by definition. I realize this would require more efforts on the maintainers, but that's just how I see it. |
Don't worry, X3 and X4 are usable in the same application. X3's headers are inside
If I understand your comment correctly, your comment is based on your point of view, which itself is a legitimate opinion. The problem is, your thoughts do not empathize with the facts that Joel mentioned: "Spirit, as a whole, always pushed the boundaries of C++. Always, from the very start.". If you can't agree with this policy, it's okay. We're not even trying to change your opinion. It's just that we (Spirit maintainers) must choose some direction based on our own policy, and that's the outcome we have had in the mailing list. |
|
"Pushing the boundaries of C++" doesn't mean leaving previous (stable) versions unsupported. That's the purpose of bumping the versions in the first place - to branch off the bleeding edge version from the previous stable version that gained external users in order to avoid breaking them. I don't have a problem with branching off the bleeding edge version, quite the contrary. I do have a problem with leaving stable versions, with existing users, purposely unsupported, including without bug fixes. |
If you have bug fixes then just go ahead and submit the PR right now. I promise to merge it, as long as it does not change the existing semantics and does not add any new features. |
So bug reports for pre-X4 won't be closed as "unsupported" and bug fixes will be accepted going forward, correct? If so, that's good to hear. Please, update the readme accordingly (it currently says "security patches only"). Thanks. |
Sigh. A bug fix that "does not change the existing semantics and does not add any new features" is technically a security patch. There's no margin between the security patch and the "bug fixes" I mentioned. Seriously, we're building a parser combinator library. If there exist some "bug fixes" that do not change the semantics of the parser combinator library, they are strictly equal to security patches. The Boost Organization has a very strong consensus that sublibraries decide what to maintain. We (Spirit maintainers) have decided to make the obsolete versions feature-frozen. That's the end of the story. |
|
There's a wide range of potential problems that are not security issues. From compiler warnings (including, from yet to be released compilers) to compilation errors to runtime misbehavior (i.e. behavior not according to the documentation). Of course, you can decide to maintain whatever you want. But that has a consequence for your users, existing and potential, and the responsible thing to do is to tell your users up-front whether they are supported (and whether you want users at all). For example, I have used Spirit in a few places, and right now I'm deciding what to do with that code and what the support status is. If you're saying that v2 and X3 are unsupported, I'll probably have to rewrite that code and not bother with Spirit again because of this support policy. Please, understand me correctly, this isn't a threat or anything, this is me trying to clarify things for myself and other users. |
V2 and X3 are NOT unsupported, they are technically supported, as we are accepting security patches. They are just "not actively maintained" and "feature-frozen". If you can't cope with Spirit's policy, which existed from the very beginning of its introduction, just go ahead and make your own decisions on your own application. There's nothing, absolutely nothing, that we can do about your personal work. |
Compiler warnings are not bugs. They are messages that imply the existence of potential mistakes. If that mistake carries the risk of uninitialized memory access, for instance, then it is technically a security bug. We accept patches that resolve the security bug, which might resolve the warnings as a consequence.
Define "misbehavior". We define the "behavior" as the status quo of the Spirit library. That is exactly the "existing semantics" I mentioned. If we even try to change a single meaning, that would break someone else's assumptions, which requires the full maintenance work to deploy. We don't accept such changes, because V2 and X3 are feature-frozen. |
|
Thank you both for a fast response.
This looks attractive to me, but if I create an account and post through the forum, will my private email be exposed?
Yes, I agree with this point. I don't want to be a maintainer in such case and don't want to fragment the community.
I have read a significant part of the Boost.Parser documentation and it looks almost exactly like X3. Nearly every function has the same name and even the meanings of operator overloads are the same. It looks very much like someone forked X3 or wants to support X3 users.
I did have some bad experiences with Spirit X3, exactly the points Boost.Parser mentions as problematic or "deal-breaker". X3 has complicated attribute rules and tuple/container matching, with lots of surprises like requiring intermediate grammars in some cases to break dependencies/complexity or disallowing tuples of size 1. I reported multiple issues (most were real bugs) but personally I still found the documentation lacking in this area; attributes were the hardest part to get right from my (user's) perspective. C++23 is too big a requirement for me, and Boost.Parser (which requires C++17, optionally C++20) looks very similar: a potentially viable alternative.
What ways? You mean the attribute design or something else? The library's interface looks extremely similar and I predict I could change the namespace and ~80-90% of my code would already be compilable. If Boost is going to have 2 parsing libraries (X4 and Parser) I think it is in the interest of Boost to inform Boost users about library differences. Spirit (any version) is a complex idea and I think it is a great value to users if library authors explain fundamental design decisions and key differences from other projects. There might be more users of X3 who are in the same position as me and C++23 can also be too much of a requirement for them. I think X3 should inform about Boost.Parser as another potential alternative library to X3. I hope you don't view Boost.Parser as hostile competition.
That's not a definition I had in mind and @Lastique seems to also be confused. I think lots of users understand "security patch" only as something that fixes a crash or security exploit, not ordinary "implementation has a mistake" bug. I would edit Readme for clarification. |
I'm not sure, but probably yes, the email address that you use to subscribe to the mailing list will be present in the post, when received by recipients. |
|
I think there's a big misunderstanding here regarding maintenance and feature freeze. These are not mutually exclusive. While a library, e.g. X3, V2 and even Classic, along with the infrastructure libraries such as fusion, can be feature frozen, with no new features planned, there can always be some basic level of maintenance, like say, ensuring that the tests are running fine (basically github admin tasks) and that the code base is still keeping up with any potential breaking changes to the boost core (that could be very trivial like macro name changes, for example), clearing up or accepting (or not accepting) trivial non-breaking PRs such as documentation tweaks (seldom, but possible), or even keeping up with new compiler updates (e.g. even if we lock the code to say C++ 17, there may still be potential breaking changes that may cause some minor changes, again very seldom or not at all, but it can happen), etc. I'm with @Lastique here. V2 and X3 can be feature-frozen, but still maintained, with some basic level of maintenance—just enough to keep things running smoothly.
Sure, compiler warnings are not bugs, but these are still technically maintenance tasks that need to be dealt with. Even the mere act of disabling a compiler warning is maintenance. Compiler warnings need not necessarily imply the existence of potential mistakes. It can simply be, for example, a new compiler version adding a compiler warning that was not there before. It can happen. |
Speaking of maintenance, my initial reason for visiting this project's issues today was to report a potential regression. I will open a new issue before the end of the week. I'm not sure if it is a bug (it might be another case of misunderstanding parsers and attributes) but I would like someone to look into it, because the parser's output is very surprising to me and a full Boost update on my project caused some of my unit tests to fail. |
Not sure about this, but the mere act of handling such a github issue as "won't-fix" because we are in feature-freeze is maintenance. |
True, this may happen. I won't blame the maintainers, C++ is hard and it takes time to verify or debug code, especially libraries as complex as Spirit. I just want the hours I spent on reproducing it to not go to waste. Maybe I discovered something big again, something that could help X4's development or something that maintainers will manage to fix very fast. I will try to git bisect Spirit's code and check if my issue is indeed a regression. |
We’ll see. But my point is that handling a GitHub issue is, in and of itself, a maintenance task. |
|
Fixed in 07c58b4 |
This PR defines the concept of "parser" in X3. This is essentially the starting point for a complete C++23-style overhaul of X3, enabling proper constraints and significantly improved compile times in derived parsers.
I have experimented with this "modernization" attempt for 4 months and I strongly believe this PR is a requirement for updating X3 to a C++23-ish codebase. I have already modernized most of the X3 codebase in my private fork, and it consists of 145 changed files, 15,000 additions, and 13,000 deletions.
This PR is intentionally limited to include only the fundamental parts. Almost all core components need overhaul based on this PR, and I am going to submit dozens of follow-up PRs later.
Defining the "parser" concept
The most fundamental definitions are described below. A more detailed description can be found in the code comments.
This definition is carefully designed based on my experience implementing the bleeding-edge (C++23 and later) features of the STL itself. (I have expert knowledge of how the standard library is implemented, and I occasionally submit LWG issues.)
`x3::X3Subject<T>`: `T` models an iterator-agnostic parser type, meaning that `T` is castable (by `x3::as_parser(T&&)`) to some well-defined parser class that inherits from `x3::detail::parser_base`.
`x3::X3Parser<Parser, It, Se>`: `Parser` models `X3Subject` and the member function `.parse(...)` is callable for the given `It` and `Se`, with the remaining parameters being `unused_type`.

The above two concepts enable X3 to properly constrain many functions. For instance, the sequence operator (`>>`) was never constrained correctly, exposing unconstrained `typename Left` and `typename Right` on the public API. Such unconstrained features are considered bad design in the C++23+ era and cause issues for library users.

Refine `x3::as_parser`
`x3::as_parser` is now a C++23-style CPO. It prevents ADL in the first place, and provides the same semantics.
The ADL customization point `as_spirit_parser` is removed because (1) based on a GitHub full-text search I couldn't find any use case except for the blog post by the person who initially implemented it in Spirit, and (2) it unnecessarily complicates the implementation details.
More background info is available in `parser.hpp`.

Modernization on the entry points
`x3::parse(...)` and `x3::phrase_parse(...)` have been modernized accordingly. This includes the range support.

Overhaul on `x3::any_parser`
This PR includes an overhaul of `any_parser`, as it operates on iterator/sentinel now. I can't split this change into a separate PR because `any_parser` is tightly coupled with iterators and becomes completely broken without these changes.

Deprecation of `x3::any_parser`
Furthermore, this PR deprecates `x3::any_parser`.
I strongly believe that the system of "type-erased parser" in Spirit has historically been abused and has led people to create overly generic parser definitions. Academically speaking, PEGs are unambiguous, deterministic and always produce a static grammar definition which operates on ordered choices (left to right). There's no need to dynamically create a RHS parser that depends on the left-hand side of the runtime input.
During my 10+ years of experience in using Spirit, I myself have also used `any_parser` in the past. But now I admit that 100% of my usage was due to a misunderstanding of how PEG is formally defined, and I indeed relied on `any_parser` to bypass the statically-typed nature of PEG.
A GitHub full-text search also supports my opinion, as the use case is limited to only a few repositories. I've manually confirmed that every existing use case involved the misunderstanding described above, and I strongly believe they can be rewritten without requiring complete type-erasure.
Another notable example is #838, which initially demanded adding support for "lazy". However, after a long discussion, the thread concluded that the solution requires neither "lazy" nor "any_parser" functionality.
The reasoning is also described in `any_parser.hpp`.