Injective: Promoting Safety in Encodings
As detailed in the Lossy Operation Protection section, is_encode_injective and is_decode_injective help the library understand when a conversion you are doing cannot be guaranteed at compile time to be lossless. Injectivity is a high-brow mathematical term:
In mathematics, an injective function (also known as injection, or one-to-one function) is a function that maps distinct elements of its domain to distinct elements of its codomain.
This is very fancy speak for the fact that for every complete, well-formed input value, there is a well-formed, distinct output value. It does not have to cover all of the potential output values: so long as there is a one-to-one mapping that is unambiguous for all the input values, it is injective. For practical purposes, it means that all of the code unit sequences that are valid can produce a unique code point sequence (“the decode operation is injective”). And, in the reverse case, it means that all the code point sequences that are valid can produce a unique code unit sequence (“the encode operation is injective”).
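To make the definition concrete, here is a minimal sketch of an injective and a non-injective mapping for an ASCII-like encoding. The names toy_code_point, toy_code_unit, toy_ascii_decode, and toy_ascii_encode_lossy are hypothetical and are not part of ztd.text. Replacing unrepresentable code points with '?' is an assumption made only for illustration.

```cpp
#include <optional>

// Hypothetical toy types for illustration only; these are not ztd.text names.
using toy_code_point = char32_t;
using toy_code_unit  = unsigned char;

// Decoding: every well-formed ASCII code unit (0x00 to 0x7F) maps to its own
// distinct code point, so distinct valid inputs never collide. Ill-formed
// input (anything above 0x7F) is outside the domain and yields no value.
constexpr std::optional<toy_code_point> toy_ascii_decode(toy_code_unit unit) noexcept {
    if (unit > 0x7F) {
        return std::nullopt;
    }
    return static_cast<toy_code_point>(unit);
}

// Encoding with replacement (an assumption of this sketch): every code point
// above U+007F collapses to '?', so distinct valid inputs can produce the same
// output. That collision is exactly what makes the mapping non-injective.
constexpr toy_code_unit toy_ascii_encode_lossy(toy_code_point point) noexcept {
    return point <= 0x7F ? static_cast<toy_code_unit>(point)
                         : static_cast<toy_code_unit>('?');
}
```

With these definitions, two different code points such as U+00E9 and U+4E2D both encode to '?', while no two valid code units ever decode to the same code point.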
These two properties appear on the type itself, and are a way to opt in to saying that a conversion is not lossy (i.e., it preserves information perfectly if the input is well-formed). You can define them by placing them on your Encoding Object Type’s definition:
struct any_unicode_byte_encoding {
    using is_decode_injective = std::true_type;
    using is_encode_injective = std::true_type;
    using code_unit = std::byte;
    using code_point = ztd::text::unicode_scalar_value;
    // …
};
This signals that the encode_one and decode_one functions, if they are given well-formed input, will never be lossy between their code_point and code_unit types when performing the desired operation. If only one half of that equation is lossy, then you can mark only one, or the other. For example, ztd::text::ascii is lossy only for the encode_one operation, so it has is_decode_injective = std::true_type; for decode operations, but is_encode_injective = std::false_type; for encode operations:
    /// because one type suffices for both.
    using state = __txt_detail::__empty_state;
    //////
    /// @brief A range of code units representing the values to use when a replacement happens. For ASCII, this
    /// must be '?' instead of the usual Unicode Replacement Character U'�'.
    static constexpr ::ztd::span<const code_unit, 1> replacement_code_units() noexcept {
        return __txt_detail::__question_mark_replacement_units<code_unit>;
    }

    //////
    /// @brief Decodes a single complete unit of information as code points and produces a result with the
    /// input and output ranges moved past what was successfully read and written; or, produces an error and
    /// returns the input and output ranges untouched.
    ///
    /// @param[in] __input The input view to read code units from.
    /// @param[in] __output The output view to write code points into.
#include <ztd/epilogue.hpp>
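The asymmetric marking described for ztd::text::ascii can be sketched with a stand-in type. The toy_ascii struct below is hypothetical and is not the library's definition. It shows only the two member aliases and assumes char and char32_t for the code unit and code point types. The encode_one and decode_one functions are left out.

```cpp
#include <type_traits>

// Hypothetical stand-in for an ASCII-like encoding object; NOT the real
// ztd::text::ascii. Only the injectivity markers are shown here.
struct toy_ascii {
    using code_unit  = char;     // assumption for this sketch
    using code_point = char32_t; // assumption for this sketch

    // Every well-formed ASCII code unit decodes to a distinct code point.
    using is_decode_injective = std::true_type;
    // Most code points have no ASCII representation, so encoding can lose
    // information and is marked as not injective.
    using is_encode_injective = std::false_type;
};

// The markers are ordinary types, so they can be checked at compile time.
static_assert(toy_ascii::is_decode_injective::value, "decoding is marked lossless");
static_assert(!toy_ascii::is_encode_injective::value, "encoding is marked lossy");
```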
If the type definition is not present, or is present but is not std::true_type, then the implementation assumes that this is false for a given encoding. See ztd::text::is_decode_injective and ztd::text::is_encode_injective for more information.
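The default-to-false rule can be approximated with a small detection trait. This is a sketch of one reading of that rule and not the library's actual implementation. The names toy_is_decode_injective, toy_is_encode_injective, and the three sample types are hypothetical. Comparing against std::true_type with std::is_same is an assumption chosen to mirror the wording above.

```cpp
#include <type_traits>

// Hypothetical detection traits; these are NOT the ztd.text implementation.
// Primary templates: the member alias is absent, so the answer is false.
template <typename T, typename = void>
struct toy_is_decode_injective : std::false_type {};

template <typename T, typename = void>
struct toy_is_encode_injective : std::false_type {};

// Specializations: the member alias exists, so the answer is true only when
// the alias is exactly std::true_type (an assumption of this sketch).
template <typename T>
struct toy_is_decode_injective<T, std::void_t<typename T::is_decode_injective>>
    : std::is_same<typename T::is_decode_injective, std::true_type> {};

template <typename T>
struct toy_is_encode_injective<T, std::void_t<typename T::is_encode_injective>>
    : std::is_same<typename T::is_encode_injective, std::true_type> {};

// Sample encodings used to exercise the traits (all hypothetical).
struct toy_fully_marked {
    using is_decode_injective = std::true_type;
    using is_encode_injective = std::true_type;
};
struct toy_half_marked {
    using is_decode_injective = std::true_type;
    using is_encode_injective = std::false_type;
};
struct toy_unmarked {};
```

Under this sketch, toy_unmarked reports false for both operations, toy_half_marked reports true for decoding only, and toy_fully_marked reports true for both.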