Injective: Promoting Safety in Encodings

As detailed in the Lossy Operation Protection section, is_encode_injective and is_decode_injective help the library understand when a conversion you are doing cannot be guaranteed at compile time to be lossless. Injectivity is a high-brow mathematical term:

In mathematics, an injective function (also known as injection, or one-to-one function) is a function that maps distinct elements of its domain to distinct elements of its codomain.

This is very fancy speak for the fact that every complete, well-formed input value has a well-formed, distinct output value. It does not have to cover all of the potential output values: so long as the mapping is one-to-one and unambiguous for all the input values, it is injective. For practical purposes, it means that every valid code unit sequence produces a unique code point sequence (“the decode operation is injective”). And, in the reverse case, it means that every valid code point sequence produces a unique code unit sequence (“the encode operation is injective”).
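To make the distinction concrete, here is a minimal sketch using two toy functions. Both `toy_encode_ascii` and `toy_decode_ascii` are hypothetical names invented for this illustration and are not part of ztd.text; the `'?'` substitution is likewise an assumption chosen only to show how information gets lost.

```cpp
// Hypothetical toy "encode" step (not a ztd.text function): map a code point
// to one ASCII code unit, substituting '?' for anything outside 7 bits.
constexpr unsigned char toy_encode_ascii(char32_t code_point) noexcept {
	return code_point < 0x80 ? static_cast<unsigned char>(code_point)
	                         : static_cast<unsigned char>('?');
}

// Hypothetical toy "decode" step: widen a 7-bit ASCII code unit to a code point.
constexpr char32_t toy_decode_ascii(unsigned char code_unit) noexcept {
	return static_cast<char32_t>(code_unit);
}

// Not injective: two distinct, well-formed code points collapse to the same
// code unit, so the original input cannot be recovered.
static_assert(toy_encode_ascii(U'\u00E9') == toy_encode_ascii(U'\u00F1'));

// Injective: distinct valid code units always yield distinct code points.
static_assert(toy_decode_ascii('a') != toy_decode_ascii('b'));
```

The decode direction does not reach every possible code point, and it does not need to: injectivity only asks that no two valid inputs share an output.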

These two properties appear on the type itself, and are a way to opt in to saying that a conversion is not lossy (i.e., it preserves information perfectly if the input is well-formed). You can define them by placing them on your Encoding Object Type’s definition:

struct any_unicode_byte_encoding {
	using is_decode_injective = std::true_type;
	using is_encode_injective = std::true_type;
	using code_unit = std::byte;
	using code_point = ztd::text::unicode_scalar_value;
	// …
};

This signals that the encode_one and decode_one functions, if they are given well-formed input, will never be lossy between their code_point type and their code_unit type when performing the desired operation. If only one half of that equation is lossy, then you can mark only the half that is lossless. For example, ztd::text::ascii is lossy only for the encode_one operation, so it has is_decode_injective = std::true_type; for decode operations, but is_encode_injective = std::false_type; for encode operations:

	//////
	/// @brief Whether or not the decode operation can process all forms of input into code point values.
	/// @remarks ASCII can decode from its 7-bit (unpacked) code units to Unicode Code Points. Since the conversion
	/// is lossless, this property is true.
	//////
	using is_decode_injective = ::std::true_type;
	//////
	/// @brief Whether or not the encode operation can process all forms of input into code unit values. This is
	/// not true for ASCII, as many Unicode Code Point and Unicode Scalar Values cannot be represented in ASCII.
	/// Since the conversion is lossy, this property is false.
	//////
	using is_encode_injective = ::std::false_type;
	//////
	/// @brief The maximum code units a single complete operation of encoding can produce.
	//////
	inline static constexpr const ::std::size_t max_code_units = 1;

If the type definition is not present, or is present but is not std::true_type, then the implementation assumes that the operation is not injective for a given encoding. See ztd::text::is_decode_injective and ztd::text::is_encode_injective for more information.
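The "assume false unless marked true" rule can be sketched with the standard detection idiom. This is only an illustration of the behavior described above, not the library's actual implementation: `sketch_is_encode_injective` and the three `toy_*` encoding types are hypothetical names made up for this example.

```cpp
#include <type_traits>

// Hypothetical trait (not the library's implementation): defaults to false
// when the encoding has no is_encode_injective member type at all.
template <typename Encoding, typename = void>
struct sketch_is_encode_injective : std::false_type {};

// When the member type exists, the trait is true only if it is exactly
// std::true_type; anything else is treated as "not injective".
template <typename Encoding>
struct sketch_is_encode_injective<Encoding,
	std::void_t<typename Encoding::is_encode_injective>>
: std::is_same<typename Encoding::is_encode_injective, std::true_type> {};

template <typename Encoding>
inline constexpr bool sketch_is_encode_injective_v
	= sketch_is_encode_injective<Encoding>::value;

// Hypothetical encoding types covering the three cases.
struct toy_lossless { using is_encode_injective = std::true_type; };
struct toy_lossy    { using is_encode_injective = std::false_type; };
struct toy_unmarked {};

static_assert(sketch_is_encode_injective_v<toy_lossless>);
static_assert(!sketch_is_encode_injective_v<toy_lossy>);
static_assert(!sketch_is_encode_injective_v<toy_unmarked>);
```

Defaulting to false is the conservative choice: an encoding author has to make an explicit promise before a conversion is treated as lossless. A decode-side trait would follow the same shape with `is_decode_injective`.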