Injective: Promoting Safety in Encodings

As detailed in the Lossy Operation Protection section, is_encode_injective and is_decode_injective help the library understand when a conversion you are doing cannot be guaranteed at compile time to be lossless. Injectivity is a high-brow mathematical term:

In mathematics, an injective function (also known as injection, or one-to-one function) is a function that maps distinct elements of its domain to distinct elements of its codomain.

—Wikipedia, February 2nd, 2021

This is very fancy speak for the fact that every complete, well-formed input value has its own well-formed output value, and no two distinct inputs share the same output. The mapping does not have to cover all of the potential output values: so long as it is one-to-one and unambiguous for all the input values, it is injective. For practical purposes, it means that every valid code unit sequence produces a unique code point sequence (“the decode operation is injective”). In the reverse case, it means that every valid code point sequence produces a unique code unit sequence (“the encode operation is injective”).
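To make the definition concrete, here is a minimal sketch of a toy ASCII-like conversion. The names `toy_ascii_encode`, `toy_ascii_decode`, `toy_decode_is_injective`, and `toy_encode_collides` are hypothetical and are not part of the library, and replacing unrepresentable code points with `?` is an assumption made purely for illustration. Under those assumptions, decoding keeps all 128 inputs distinct, while encoding collapses different code points onto the same code unit:

```cpp
// Toy encoder (hypothetical): code points above 0x7F cannot be represented,
// so this sketch replaces them with '?'. That replacement is what loses information.
constexpr unsigned char toy_ascii_encode(char32_t code_point) {
	return code_point <= 0x7F ? static_cast<unsigned char>(code_point)
	                          : static_cast<unsigned char>('?');
}

// Toy decoder (hypothetical): every valid 7-bit code unit maps to its own code point.
constexpr char32_t toy_ascii_decode(unsigned char code_unit) {
	return static_cast<char32_t>(code_unit);
}

// Checks every pair of distinct valid code units: decode is injective
// if no two of them produce the same code point.
constexpr bool toy_decode_is_injective() {
	for (unsigned int a = 0; a < 128; ++a) {
		for (unsigned int b = a + 1; b < 128; ++b) {
			if (toy_ascii_decode(static_cast<unsigned char>(a))
			     == toy_ascii_decode(static_cast<unsigned char>(b))) {
				return false;
			}
		}
	}
	return true;
}

// True when two distinct code points encode to the same code unit,
// which is a counterexample to injectivity of the encode direction.
constexpr bool toy_encode_collides(char32_t a, char32_t b) {
	return a != b && toy_ascii_encode(a) == toy_ascii_encode(b);
}

static_assert(toy_decode_is_injective());
static_assert(toy_encode_collides(U'\u00E9', U'\u00FC')); // both become '?'
static_assert(!toy_encode_collides(U'a', U'b'));
```

A single colliding pair is enough to show that the encode direction of this toy conversion is not injective, even though the decode direction is.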

These two properties appear on the type itself, and are a way to opt in to saying that a conversion is not lossy (i.e., it preserves information perfectly if the input is well-formed). You can define them by placing them on your Encoding Object Type’s definition:

struct any_unicode_byte_encoding {
        using is_decode_injective = std::true_type;
        using is_encode_injective = std::true_type;
        using code_unit = std::byte;
        using code_point = ztd::text::unicode_scalar_value;
        // …
};

This signals that the encode_one and decode_one functions, if they are given well-formed input, will never be lossy between their code_point type and their code_unit type when performing the desired operation. If only one half of that equation is lossy, then you can mark only one or the other. For example, ztd::text::ascii is lossy only for the encode_one operation, so it has is_decode_injective = std::true_type; for decode operations, but is_encode_injective = std::false_type; for encode operations:

	//////
	/// @brief Whether or not the decode operation can process all forms of input into code point values.
	///
	/// @remarks ASCII can decode from its 7-bit (unpacked) code units to Unicode Code Points. Since the conversion
	/// is lossless, this property is true.
	//////
	using is_decode_injective = ::std::true_type;
	//////
	/// @brief Whether or not the encode operation can process all forms of input into code unit values. This is
	/// not true for ASCII, as many Unicode Code Point and Unicode Scalar Values cannot be represented in ASCII.
	/// Since the conversion is lossy, this property is false.
	//////
	using is_encode_injective = ::std::false_type;
	//////
	/// @brief The maximum code units a single complete operation of encoding can produce.
	inline static constexpr const ::std::size_t max_code_units = 1;

If the type definition is not present, or is present but is not std::true_type, then the implementation assumes that the property is false for a given encoding. See ztd::text::is_decode_injective and ztd::text::is_encode_injective for more information.
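The "absent or not std::true_type means false" rule can be sketched with a standard detection idiom. This is an illustrative assumption about how such a check might be written, not the library's actual implementation: `sketch_is_decode_injective`, `sketch_is_decode_injective_v`, and the three `toy_*` encodings are hypothetical names, and the sketch assumes C++17.

```cpp
#include <type_traits>

// Hypothetical detection trait, not the library's implementation.
// Primary template: used when the encoding has no is_decode_injective member type.
template <typename Encoding, typename = void>
struct sketch_is_decode_injective : std::false_type {};

// Selected only when Encoding::is_decode_injective names a type.
// The result is true only if that type is exactly std::true_type.
template <typename Encoding>
struct sketch_is_decode_injective<Encoding,
     std::void_t<typename Encoding::is_decode_injective>>
: std::is_same<typename Encoding::is_decode_injective, std::true_type> {};

template <typename Encoding>
inline constexpr bool sketch_is_decode_injective_v
     = sketch_is_decode_injective<Encoding>::value;

// Toy encodings (hypothetical) covering the three cases.
struct toy_marked_encoding {
	using is_decode_injective = std::true_type;
};
struct toy_unmarked_encoding {};
struct toy_lossy_encoding {
	using is_decode_injective = std::false_type;
};

static_assert(sketch_is_decode_injective_v<toy_marked_encoding>);
static_assert(!sketch_is_decode_injective_v<toy_unmarked_encoding>);
static_assert(!sketch_is_decode_injective_v<toy_lossy_encoding>);
```

Defaulting to false keeps the check conservative: an encoding that says nothing about injectivity is treated as potentially lossy until its author opts in. The same pattern would apply to is_encode_injective.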