Injective: Promoting Safety in Encodings

As detailed in the Lossy Operation Protection section, is_encode_injective and is_decode_injective help the library understand when a conversion you are doing cannot be guaranteed at compile time to be lossless. Injectivity is a high-brow mathematical term:

In mathematics, an injective function (also known as injection, or one-to-one function) is a function that maps distinct elements of its domain to distinct elements of its codomain.

—Wikipedia, February 2nd, 2021

This is very fancy speak for the fact that every complete, well-formed input value maps to a well-formed output value, and no two distinct inputs map to the same output. The mapping does not have to cover all of the potential output values: so long as every input value has its own unambiguous output, it is injective. For practical purposes, it means that every valid code unit sequence produces a unique code point sequence (“the decode operation is injective”). And, in the reverse case, it means that every valid code point sequence produces a unique code unit sequence (“the encode operation is injective”).
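As a rough illustration of the difference, here is a minimal sketch using two toy functions. The names toy_ascii_decode and toy_ascii_encode are hypothetical and are not part of ztd.text. The sketch assumes an ASCII-like encoding in which unrepresentable code points are replaced with '?'.

```cpp
// Toy illustration only: these functions are hypothetical and not part of ztd.text.

// Decode direction: every valid ASCII code unit (0x00-0x7F) maps to its own,
// distinct code point, so no information is lost.
constexpr char32_t toy_ascii_decode(unsigned char unit) noexcept {
	return static_cast<char32_t>(unit);
}

// Encode direction: every code point outside of ASCII collapses to '?',
// so many distinct inputs share a single output.
constexpr unsigned char toy_ascii_encode(char32_t point) noexcept {
	return point <= 0x7F ? static_cast<unsigned char>(point) : static_cast<unsigned char>('?');
}

// Distinct valid code units give distinct code points: decoding is injective.
static_assert(toy_ascii_decode(0x41) != toy_ascii_decode(0x42),
     "distinct code units must decode to distinct code points");

// Two distinct code points give the same code unit: encoding is NOT injective.
static_assert(toy_ascii_encode(U'\u00E9') == toy_ascii_encode(U'\u00F1'),
     "both non-ASCII code points collapse to '?'");
```

Once the encode step has produced '?', there is no way to tell which code point it came from. That is the kind of loss these properties describe.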

These two properties appear on the type itself, and are a way to opt in to saying that a conversion is not lossy (i.e., it preserves information perfectly if the input is well-formed). You can define them by placing them on your Encoding Object Type’s definition:

struct any_unicode_byte_encoding {
        using is_decode_injective = std::true_type;
        using is_encode_injective = std::true_type;
        using code_unit = std::byte;
        using code_point = ztd::text::unicode_scalar_value;
        // …
};

This signals that the encode_one and decode_one functions, if they are given well-formed input, will never be lossy between their code_point type and their code_unit type when performing the desired operation. If only one half of that equation is lossy, then you can mark only one or the other. For example, ztd::text::ascii is lossy only for the encode_one operation, so it has is_decode_injective = std::true_type; for decode operations, but is_encode_injective = std::false_type; for encode operations:

		/// because one type suffices for both.
		using state = __txt_detail::__empty_state;
		//////
		/// @brief A range of code units representing the values to use when a replacement happens. For ASCII, this
		/// must be '?' instead of the usual Unicode Replacement Character U'�'.
		static constexpr ::ztd::span<const code_unit, 1> replacement_code_units() noexcept {
			return __txt_detail::__question_mark_replacement_units<code_unit>;
		}

		//////
		/// @brief Decodes a single complete unit of information as code points and produces a result with the
		/// input and output ranges moved past what was successfully read and written; or, produces an error and
		/// returns the input and output ranges untouched.
		///
		/// @param[in] __input The input view to read code units from.
		/// @param[in] __output The output view to write code points into.

If the type definition is not present, or is present but is not std::true_type, then the implementation assumes that this is false for a given encoding. See ztd::text::is_decode_injective and ztd::text::is_encode_injective for more information.
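The defaulting rule above can be sketched with the standard detection idiom. This is only an illustration under stated assumptions: sketch_is_decode_injective, sketch_is_encode_injective, toy_ascii_like and toy_unmarked are hypothetical names, and the real ztd::text::is_decode_injective and ztd::text::is_encode_injective may be implemented differently. The sketch requires C++17 for std::void_t.

```cpp
#include <type_traits>

// Hypothetical detection traits, written for illustration only.
// Primary templates: with no member type definition, assume "not injective".
template <typename T, typename = void>
struct sketch_is_decode_injective : std::false_type {};

template <typename T, typename = void>
struct sketch_is_encode_injective : std::false_type {};

// Specializations: chosen only when the member type definition exists.
// The result is true only if that definition is exactly std::true_type.
template <typename T>
struct sketch_is_decode_injective<T, std::void_t<typename T::is_decode_injective>>
: std::is_same<typename T::is_decode_injective, std::true_type> {};

template <typename T>
struct sketch_is_encode_injective<T, std::void_t<typename T::is_encode_injective>>
: std::is_same<typename T::is_encode_injective, std::true_type> {};

// A toy encoding marked the way the text describes ztd::text::ascii:
// lossless when decoding, potentially lossy when encoding.
struct toy_ascii_like {
	using is_decode_injective = std::true_type;
	using is_encode_injective = std::false_type;
};

// A toy encoding that declares neither property.
struct toy_unmarked {};

static_assert(sketch_is_decode_injective<toy_ascii_like>::value, "decode is marked injective");
static_assert(!sketch_is_encode_injective<toy_ascii_like>::value, "encode is marked as not injective");
static_assert(!sketch_is_decode_injective<toy_unmarked>::value, "a missing definition defaults to false");
static_assert(!sketch_is_encode_injective<toy_unmarked>::value, "a missing definition defaults to false");
```

Defaulting to false is the conservative choice: an encoding has to opt in explicitly before a conversion through it is treated as lossless.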