From d23df152fd62c32b398acf543165c41f3460a5bc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Tue, 17 Sep 2024 19:06:40 +0200 Subject: [PATCH 01/13] Add 'const_char_encode_utf8' RFC. --- text/0000-const-char-encode-utf8.md | 74 +++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 text/0000-const-char-encode-utf8.md diff --git a/text/0000-const-char-encode-utf8.md b/text/0000-const-char-encode-utf8.md new file mode 100644 index 00000000000..04df410d806 --- /dev/null +++ b/text/0000-const-char-encode-utf8.md @@ -0,0 +1,74 @@ +- Feature Name: `const_char_encode_utf8` +- Start Date: 2024-09-17 +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +`char::encode_utf8` should be marked const to allow for compile-time conversions. + +# Motivation +[motivation]: #motivation + +The `encode_utf8` method (in `char`) is currently **not** marked as "const" and is therefore rendered unusable in scenarios that require const-compatibility. + +With the recent stabilisation of [`const_mut_refs`](https://github.com/rust-lang/rust/issues/57349/), implementing `encode_utf8` with the same signature would yield no inconsistencies with other parts of the standard library. + +I expect that implementing this RFC will support compile-time string handling in the future. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +Currently, the `encode_utf8` method has the following prototype: + +```rust +pub fn encode_utf8(self, dst: &mut [u8]) -> &mut str; +``` + +This is to simply be marked as const: + +```rust +pub const fn encode_utf8(self, dst: &mut [u8]) -> &mut str; +``` + +This is not a breaking change. 
+ +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +Other than just adding the `const` qualifier to the function signature, the function body would have to be changed due to some constructs currently not being supported in constant expressions. + +A working implementation can be found at [`bjoernager/rust:const-char-encode-utf8`](https://github.com/bjoernager/rust/tree/const-char-encode-utf8). + +# Drawbacks +[drawbacks]: #drawbacks + +Implementing this RFC at the current moment could degenerate diagnostics as the `assert` call in `encode_utf8_raw` relies on formatters that are non-const. + +The reference implementation resolves this by instead using a generic message, although this may not be desired. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +If the initial diagnostics are deemed to be worth more than const-compatibility then an `encode_utf8_unchecked` method could be considered instead: + +```rust +pub const unsafe fn encode_utf8_unchecked(self, dst: &mut [u8]) -> &mut str; +``` + +This function would perform the same operation but without testing the length of `dst`. +This would in turn allow const conversions – if very needed – without changing diagonstic messages. + +# Prior art +[prior-art]: #prior-art + +Currently none that I know of. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +None at the moment. + +# Future possibilities +[future-possibilities]: #future-possibilities From 328759bf1ccd07efe136f79267ba381bc71284c4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Tue, 17 Sep 2024 19:12:01 +0200 Subject: [PATCH 02/13] Elaborate motivation. 
--- text/0000-const-char-encode-utf8.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-const-char-encode-utf8.md b/text/0000-const-char-encode-utf8.md index 04df410d806..0dd87352474 100644 --- a/text/0000-const-char-encode-utf8.md +++ b/text/0000-const-char-encode-utf8.md @@ -13,9 +13,9 @@ The `encode_utf8` method (in `char`) is currently **not** marked as "const" and is therefore rendered unusable in scenarios that require const-compatibility. -With the recent stabilisation of [`const_mut_refs`](https://github.com/rust-lang/rust/issues/57349/), implementing `encode_utf8` with the same signature would yield no inconsistencies with other parts of the standard library. +With the recent stabilisation of [`const_mut_refs`](https://github.com/rust-lang/rust/issues/57349/), implementing `encode_utf8` with the same parameters is trivial would yield no incompatibilities with existing code. -I expect that implementing this RFC will support compile-time string handling in the future. +I expect that implementing this RFC – despite its limited scope – will however prove useful in supporting compile-time string handling in the future. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation From 21b61aaef7346df9270c7fcd702b18a2bd75db9b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Tue, 17 Sep 2024 19:14:55 +0200 Subject: [PATCH 03/13] Elaborate motivation and future possibilities. --- text/0000-const-char-encode-utf8.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-const-char-encode-utf8.md b/text/0000-const-char-encode-utf8.md index 0dd87352474..93d699f71bf 100644 --- a/text/0000-const-char-encode-utf8.md +++ b/text/0000-const-char-encode-utf8.md @@ -13,7 +13,7 @@ The `encode_utf8` method (in `char`) is currently **not** marked as "const" and is therefore rendered unusable in scenarios that require const-compatibility. 
-With the recent stabilisation of [`const_mut_refs`](https://github.com/rust-lang/rust/issues/57349/), implementing `encode_utf8` with the same parameters is trivial would yield no incompatibilities with existing code. +With the recent stabilisation of [`const_mut_refs`](https://github.com/rust-lang/rust/issues/57349/), implementing `encode_utf8` with the same parameters is trivial and would yield no incompatibilities with existing code. I expect that implementing this RFC – despite its limited scope – will however prove useful in supporting compile-time string handling in the future. @@ -72,3 +72,5 @@ None at the moment. # Future possibilities [future-possibilities]: #future-possibilities + +I suspect that having a similar `decode_utf8` may be desired. From 492bdbdebab0d1058a41623060cfe63a453e6235 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Tue, 17 Sep 2024 19:18:34 +0200 Subject: [PATCH 04/13] Elaborate future possibilities. --- text/0000-const-char-encode-utf8.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-const-char-encode-utf8.md b/text/0000-const-char-encode-utf8.md index 93d699f71bf..bbda14f8ad2 100644 --- a/text/0000-const-char-encode-utf8.md +++ b/text/0000-const-char-encode-utf8.md @@ -73,4 +73,4 @@ None at the moment. # Future possibilities [future-possibilities]: #future-possibilities -I suspect that having a similar `decode_utf8` may be desired. +I suspect that having a similar `decode_utf8` method may be desired. From bdd95989be263fb0912fecbafa667c639c86a8dc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Tue, 17 Sep 2024 19:19:34 +0200 Subject: [PATCH 05/13] Elaborate motivation. 
--- text/0000-const-char-encode-utf8.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-const-char-encode-utf8.md b/text/0000-const-char-encode-utf8.md index bbda14f8ad2..9a6543d94ac 100644 --- a/text/0000-const-char-encode-utf8.md +++ b/text/0000-const-char-encode-utf8.md @@ -13,7 +13,7 @@ The `encode_utf8` method (in `char`) is currently **not** marked as "const" and is therefore rendered unusable in scenarios that require const-compatibility. -With the recent stabilisation of [`const_mut_refs`](https://github.com/rust-lang/rust/issues/57349/), implementing `encode_utf8` with the same parameters is trivial and would yield no incompatibilities with existing code. +With the recent stabilisation of [`const_mut_refs`](https://github.com/rust-lang/rust/issues/57349/), implementing `encode_utf8` with the same parameters is trivial and would (in practice) yield no incompatibilities with existing code. I expect that implementing this RFC – despite its limited scope – will however prove useful in supporting compile-time string handling in the future. From 60d78827e1c8c7ea68ed83c6a1b2f37c5be17fe0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Wed, 18 Sep 2024 10:46:49 +0200 Subject: [PATCH 06/13] Elaborate motivation, reference-level explanation, drawbacks, and rationale and alternatives. --- text/0000-const-char-encode-utf8.md | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/text/0000-const-char-encode-utf8.md b/text/0000-const-char-encode-utf8.md index 9a6543d94ac..6cb971d8a76 100644 --- a/text/0000-const-char-encode-utf8.md +++ b/text/0000-const-char-encode-utf8.md @@ -13,7 +13,7 @@ The `encode_utf8` method (in `char`) is currently **not** marked as "const" and is therefore rendered unusable in scenarios that require const-compatibility. 
-With the recent stabilisation of [`const_mut_refs`](https://github.com/rust-lang/rust/issues/57349/), implementing `encode_utf8` with the same parameters is trivial and would (in practice) yield no incompatibilities with existing code. +With the recent stabilisation of [`const_mut_refs`](https://github.com/rust-lang/rust/issues/57349/), implementing `encode_utf8` with the current signature is trivial and would (in practice) yield no incompatibilities with existing code. I expect that implementing this RFC – despite its limited scope – will however prove useful in supporting compile-time string handling in the future. @@ -37,7 +37,7 @@ This is not a breaking change. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -Other than just adding the `const` qualifier to the function signature, the function body would have to be changed due to some constructs currently not being supported in constant expressions. +Other than just adding the `const` qualifier to the function prototype, the function body would have to be changed due to some constructs currently not being supported in constant expressions. A working implementation can be found at [`bjoernager/rust:const-char-encode-utf8`](https://github.com/bjoernager/rust/tree/const-char-encode-utf8). @@ -46,7 +46,13 @@ A working implementation can be found at [`bjoernager/rust:const-char-encode-utf Implementing this RFC at the current moment could degenerate diagnostics as the `assert` call in `encode_utf8_raw` relies on formatters that are non-const. -The reference implementation resolves this by instead using a generic message, although this may not be desired. +The reference implementation resolves this by instead using a generic message, although this may not be desired: + +``` +"encode_utf8: buffer does not have enough bytes to encode code point" +``` + +This *could* be changed to have the number of bytes required hard-coded, but doing so may instead sacrifice code readability. 
# Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -55,6 +61,10 @@ If the initial diagnostics are deemed to be worth more than const-compatibility ```rust pub const unsafe fn encode_utf8_unchecked(self, dst: &mut [u8]) -> &mut str; + +// ... or... + +pub const unsafe fn encode_utf8_unchecked(self, dst: *mut u8) -> *mut str; ``` This function would perform the same operation but without testing the length of `dst`. From 0d99d43f0f0adfb7fbfd3d14b5b15ade9ef7e86e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Wed, 18 Sep 2024 10:53:22 +0200 Subject: [PATCH 07/13] Elaborate summary, drawbacks, and unresolved questions. --- text/0000-const-char-encode-utf8.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/text/0000-const-char-encode-utf8.md b/text/0000-const-char-encode-utf8.md index 6cb971d8a76..0bac1cdfaeb 100644 --- a/text/0000-const-char-encode-utf8.md +++ b/text/0000-const-char-encode-utf8.md @@ -7,6 +7,7 @@ [summary]: #summary `char::encode_utf8` should be marked const to allow for compile-time conversions. +Considering mutable references now being stable in const environments, this implementation would be trivial even without compiler magic. # Motivation [motivation]: #motivation @@ -44,7 +45,7 @@ A working implementation can be found at [`bjoernager/rust:const-char-encode-utf # Drawbacks [drawbacks]: #drawbacks -Implementing this RFC at the current moment could degenerate diagnostics as the `assert` call in `encode_utf8_raw` relies on formatters that are non-const. +Implementing this RFC at the current moment could degenerate diagnostics as the `assert` call in the `encode_utf8_raw` function relies on formatters that are non-const. The reference implementation resolves this by instead using a generic message, although this may not be desired: @@ -78,7 +79,8 @@ Currently none that I know of. 
# Unresolved questions [unresolved-questions]: #unresolved-questions -None at the moment. +The problem with diagnostic degeneration could be solved by allowing the used formatters in const environments. +I do not know if there already exists such as feature for use by the standard library. # Future possibilities [future-possibilities]: #future-possibilities From d79772894803ee886ed4b8289769548051191185 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Wed, 18 Sep 2024 10:55:42 +0200 Subject: [PATCH 08/13] Elaborate rationale and alternatives. --- text/0000-const-char-encode-utf8.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/text/0000-const-char-encode-utf8.md b/text/0000-const-char-encode-utf8.md index 0bac1cdfaeb..06bce4ad4f2 100644 --- a/text/0000-const-char-encode-utf8.md +++ b/text/0000-const-char-encode-utf8.md @@ -68,8 +68,7 @@ pub const unsafe fn encode_utf8_unchecked(self, dst: &mut [u8]) -> &mut str; pub const unsafe fn encode_utf8_unchecked(self, dst: *mut u8) -> *mut str; ``` -This function would perform the same operation but without testing the length of `dst`. -This would in turn allow const conversions – if very needed – without changing diagonstic messages. +This function would perform the same operation but without testing the length of `dst`, allowing for const conversions at least in the short-term (until formatters are stabilised). # Prior art [prior-art]: #prior-art From a39be8b5ee4eade3f605786677c29e005fe2b7de Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Wed, 18 Sep 2024 11:03:26 +0200 Subject: [PATCH 09/13] Elaborate reference-level explanation. 
--- text/0000-const-char-encode-utf8.md | 1 + 1 file changed, 1 insertion(+) diff --git a/text/0000-const-char-encode-utf8.md b/text/0000-const-char-encode-utf8.md index 06bce4ad4f2..a5beafd136c 100644 --- a/text/0000-const-char-encode-utf8.md +++ b/text/0000-const-char-encode-utf8.md @@ -41,6 +41,7 @@ This is not a breaking change. Other than just adding the `const` qualifier to the function prototype, the function body would have to be changed due to some constructs currently not being supported in constant expressions. A working implementation can be found at [`bjoernager/rust:const-char-encode-utf8`](https://github.com/bjoernager/rust/tree/const-char-encode-utf8). +Note that this implementation assumes [`const_slice_from_raw_parts_mut`](https://github.com/rust-lang/rust/issues/67456/). # Drawbacks [drawbacks]: #drawbacks From 16504dc2653c4c6a3baad95a66ba5a46a1c04f6b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Wed, 18 Sep 2024 11:14:01 +0200 Subject: [PATCH 10/13] Elaborate drawbacks. --- text/0000-const-char-encode-utf8.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-const-char-encode-utf8.md b/text/0000-const-char-encode-utf8.md index a5beafd136c..3104d2387bb 100644 --- a/text/0000-const-char-encode-utf8.md +++ b/text/0000-const-char-encode-utf8.md @@ -51,7 +51,7 @@ Implementing this RFC at the current moment could degenerate diagnostics as the The reference implementation resolves this by instead using a generic message, although this may not be desired: ``` -"encode_utf8: buffer does not have enough bytes to encode code point" +encode_utf8: buffer does not have enough bytes to encode code point ``` This *could* be changed to have the number of bytes required hard-coded, but doing so may instead sacrifice code readability. 
From 5950b776b5ec2005e33f9a717cbdd194a54cf971 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Wed, 18 Sep 2024 11:15:20 +0200 Subject: [PATCH 11/13] Elaborate unresolved questions. --- text/0000-const-char-encode-utf8.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-const-char-encode-utf8.md b/text/0000-const-char-encode-utf8.md index 3104d2387bb..d7c3ac0da43 100644 --- a/text/0000-const-char-encode-utf8.md +++ b/text/0000-const-char-encode-utf8.md @@ -80,7 +80,7 @@ Currently none that I know of. [unresolved-questions]: #unresolved-questions The problem with diagnostic degeneration could be solved by allowing the used formatters in const environments. -I do not know if there already exists such as feature for use by the standard library. +I do not know if there already exists such a feature for use by the standard library. # Future possibilities [future-possibilities]: #future-possibilities From bf43378670ed730c6550a63179ff44192efcdb68 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Wed, 18 Sep 2024 11:39:30 +0200 Subject: [PATCH 12/13] Specify pull request identifier. 
--- ...nst-char-encode-utf8.md => 3696-const-char-encode-utf8.md} | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) rename text/{0000-const-char-encode-utf8.md => 3696-const-char-encode-utf8.md} (93%) diff --git a/text/0000-const-char-encode-utf8.md b/text/3696-const-char-encode-utf8.md similarity index 93% rename from text/0000-const-char-encode-utf8.md rename to text/3696-const-char-encode-utf8.md index d7c3ac0da43..6a573b900a6 100644 --- a/text/0000-const-char-encode-utf8.md +++ b/text/3696-const-char-encode-utf8.md @@ -1,6 +1,6 @@ - Feature Name: `const_char_encode_utf8` - Start Date: 2024-09-17 -- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- RFC PR: [rust-lang/rfcs#3696](https://github.com/rust-lang/rfcs/pull/3696) - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) # Summary @@ -41,6 +41,8 @@ This is not a breaking change. Other than just adding the `const` qualifier to the function prototype, the function body would have to be changed due to some constructs currently not being supported in constant expressions. A working implementation can be found at [`bjoernager/rust:const-char-encode-utf8`](https://github.com/bjoernager/rust/tree/const-char-encode-utf8). +Required changes are in [`/library/core/src/char/methods.rs`](https://github.com/bjoernager/rust/blob/const-char-encode-utf8/library/core/src/char/methods.rs/). + Note that this implementation assumes [`const_slice_from_raw_parts_mut`](https://github.com/rust-lang/rust/issues/67456/). # Drawbacks From 72a10f9e10b0734a271b71f1f02cc51f94402c74 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gabriel=20Bj=C3=B8rnager=20Jensen?= Date: Wed, 18 Sep 2024 16:06:32 +0200 Subject: [PATCH 13/13] Specify tracking issue. 
--- text/3696-const-char-encode-utf8.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3696-const-char-encode-utf8.md b/text/3696-const-char-encode-utf8.md index 6a573b900a6..335255f0b94 100644 --- a/text/3696-const-char-encode-utf8.md +++ b/text/3696-const-char-encode-utf8.md @@ -1,7 +1,7 @@ - Feature Name: `const_char_encode_utf8` - Start Date: 2024-09-17 - RFC PR: [rust-lang/rfcs#3696](https://github.com/rust-lang/rfcs/pull/3696) -- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) +- Rust Issue: [rust-lang/rust#130512](https://github.com/rust-lang/rust/issues/130512) # Summary [summary]: #summary
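
Note on the series as a whole: the RFC text added by these patches motivates `const` UTF-8 encoding but never shows a compile-time conversion. The following is a minimal sketch of the idea, not the reference implementation from `bjoernager/rust:const-char-encode-utf8`. The helper `encode_utf8_sketch` and the constant `EURO` are hypothetical names introduced only for illustration. To stay independent of `const_mut_refs`, the sketch returns a fixed array by value instead of writing through `dst: &mut [u8]`, so it has no buffer-length check and no panic message.

```rust
/// Hypothetical helper (an assumption for illustration, not part of the RFC
/// or of `core`): encodes `c` as UTF-8 into a 4-byte array and returns the
/// array together with the number of bytes actually used.
const fn encode_utf8_sketch(c: char) -> ([u8; 4], usize) {
    let code = c as u32;
    if code < 0x80 {
        // One byte: 0xxxxxxx
        ([code as u8, 0, 0, 0], 1)
    } else if code < 0x800 {
        // Two bytes: 110xxxxx 10xxxxxx
        ([0xC0 | (code >> 6) as u8, 0x80 | (code & 0x3F) as u8, 0, 0], 2)
    } else if code < 0x1_0000 {
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        (
            [
                0xE0 | (code >> 12) as u8,
                0x80 | ((code >> 6) & 0x3F) as u8,
                0x80 | (code & 0x3F) as u8,
                0,
            ],
            3,
        )
    } else {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        (
            [
                0xF0 | (code >> 18) as u8,
                0x80 | ((code >> 12) & 0x3F) as u8,
                0x80 | ((code >> 6) & 0x3F) as u8,
                0x80 | (code & 0x3F) as u8,
            ],
            4,
        )
    }
}

/// Evaluated entirely at compile time; this is the kind of conversion the
/// RFC wants `char::encode_utf8` itself to support.
const EURO: ([u8; 4], usize) = encode_utf8_sketch('€');
```

With the signature proposed in the RFC, the same conversion would instead write into a caller-supplied `&mut [u8]` and return `&mut str`, which is where the length assertion and its diagnostic message come into play.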