Encoding Scheme¶

The encoding_scheme template turns any encoding into a byte-based encoding that reads and writes its code units as bytes, using ranges whose value_type is a byte type. This avoids duplicating the work of reading encodings as little endian or big endian, and lets any encoding be composed to interface with, e.g., a UTF-16 Big Endian blob of data coming over a network or shared pipe.
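To give a sense of the byte shuffling this template takes off your hands, here is a minimal hand-written sketch that reassembles UTF-16 code units from a byte buffer in either byte order. It does not use ztd.text at all: `byte_order`, `read_utf16_units`, and `read_utf16_units_example` are hypothetical names made up for this illustration, and the library's actual implementation may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an endianness tag; not the library's type.
enum class byte_order { little, big };

// Reassemble 16-bit code units from raw bytes in the given byte order.
// This sketch requires the input to hold a whole number of code units.
inline std::vector<char16_t> read_utf16_units(const std::vector<std::byte>& bytes, byte_order order) {
    assert(bytes.size() % 2 == 0 && "input must hold whole 16-bit code units");
    std::vector<char16_t> units;
    units.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const auto first  = std::to_integer<unsigned>(bytes[i]);
        const auto second = std::to_integer<unsigned>(bytes[i + 1]);
        const unsigned unit = (order == byte_order::big)
            ? ((first << 8) | second)   // most significant byte comes first
            : ((second << 8) | first);  // least significant byte comes first
        units.push_back(static_cast<char16_t>(unit));
    }
    return units;
}

// Usage: the same two bytes produce different code units per byte order.
inline void read_utf16_units_example() {
    const std::vector<std::byte> blob{ std::byte{0x00}, std::byte{0x41} };
    assert(read_utf16_units(blob, byte_order::big)[0] == u'A');
    assert(read_utf16_units(blob, byte_order::little)[0] == char16_t(0x4100));
}
```

Every encoding that has to cross a byte-oriented boundary would need a loop like this for each byte order, which is the duplication the template is meant to remove.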

Aliases¶

using ztd::text::basic_utf16_le = encoding_scheme<utf16_t, endian::little, _Byte>¶

A UTF-16 encoding, in Little Endian format, with inputs as a sequence of bytes.

Template Parameters: _Byte – The byte type to use. Typically, this is std::byte or unsigned char.

using ztd::text::utf16_le_t = basic_utf16_le<::std::byte>¶: A UTF-16 encoding, in Little Endian format, with inputs as a sequence of bytes.

using ztd::text::basic_utf16_be = encoding_scheme<utf16_t, endian::big, _Byte>¶

A UTF-16 encoding, in Big Endian format, with inputs as a sequence of bytes.

Template Parameters: _Byte – The byte type to use. Typically, this is std::byte or unsigned char.

using ztd::text::utf16_be_t = basic_utf16_be<::std::byte>¶: A UTF-16 encoding, in Big Endian format, with inputs as a sequence of bytes.

using ztd::text::basic_utf16_ne = encoding_scheme<utf16_t, endian::native, _Byte>¶

A UTF-16 encoding, in Native Endian format, with inputs as a sequence of bytes.

Template Parameters: _Byte – The byte type to use. Typically, this is std::byte or unsigned char.

using ztd::text::utf16_ne_t = basic_utf16_ne<::std::byte>¶: A UTF-16 encoding, in Native Endian format, with inputs as a sequence of bytes.

using ztd::text::basic_utf32_le = encoding_scheme<utf32_t, endian::little, _Byte>¶

A UTF-32 encoding, in Little Endian format, with inputs as a sequence of bytes.

Template Parameters: _Byte – The byte type to use. Typically, this is std::byte or unsigned char.

using ztd::text::utf32_le_t = basic_utf32_le<::std::byte>¶: A UTF-32 encoding, in Little Endian format, with inputs as a sequence of bytes.

using ztd::text::basic_utf32_be = encoding_scheme<utf32_t, endian::big, _Byte>¶

A UTF-32 encoding, in Big Endian format, with inputs as a sequence of bytes.

Template Parameters: _Byte – The byte type to use. Typically, this is std::byte or unsigned char.

using ztd::text::utf32_be_t = basic_utf32_be<::std::byte>¶: A UTF-32 encoding, in Big Endian format, with inputs as a sequence of bytes.

using ztd::text::basic_utf32_ne = encoding_scheme<utf32_t, endian::native, _Byte>¶

A UTF-32 encoding, in Native Endian format, with inputs as a sequence of bytes.

Template Parameters: _Byte – The byte type to use. Typically, this is std::byte or unsigned char.

using ztd::text::utf32_ne_t = basic_utf32_ne<::std::byte>¶: A UTF-32 encoding, in Native Endian format, with inputs as a sequence of bytes.

Base Template¶

template<typename _Encoding, endian _Endian = endian::native, typename _Byte = ::std::byte> class ztd::text::encoding_scheme : public __is_unicode_encoding_es<encoding_scheme<_Encoding, _Endian, _Byte>, remove_cvref_t<unwrap_t<_Encoding>>>, private ebco<_Encoding>¶

Decomposes the provided Encoding type into a specific endianness (big, little, or native) to allow for a single encoding type to be viewed in different ways.

Remark: For example, this can be used to construct a Big Endian UTF-16 by using encoding_scheme<ztd::text::utf16_t, ztd::endian::big>. It can be made interoperable with unsigned char buffers rather than std::byte buffers by using ztd::text::encoding_scheme<ztd::text::utf32_t, ztd::endian::native, unsigned char>.

Template Parameters

_Encoding – The encoding type.
_Endian – The endianness to use. Defaults to ztd::endian::native.
_Byte – The byte type to use. Defaults to std::byte.
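The roles of the endianness and byte-type parameters can be sketched with a small standalone function template. This is only an illustration of the idea under stated assumptions: `byte_order`, `split_unit`, and `split_unit_example` are hypothetical names, not part of ztd.text, and the sketch handles a single 32-bit code unit rather than a full encoding.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an endianness tag; not the library's type.
enum class byte_order { little, big };

// Split one 32-bit code unit into four bytes of type Byte, in the requested order.
template <byte_order Order, typename Byte = std::byte>
constexpr std::array<Byte, 4> split_unit(char32_t unit) {
    static_assert(sizeof(Byte) == 1, "Byte must be a one-byte type");
    const auto value = static_cast<std::uint32_t>(unit);
    std::array<Byte, 4> out{};
    for (std::size_t i = 0; i < 4; ++i) {
        // Big endian emits the most significant byte first.
        const std::size_t shift = (Order == byte_order::big) ? (3 - i) * 8 : i * 8;
        out[i] = static_cast<Byte>((value >> shift) & 0xFFu);
    }
    return out;
}

// Usage: the byte type changes the element type, not the byte values.
inline void split_unit_example() {
    const auto as_uchar = split_unit<byte_order::big, unsigned char>(U'\u00E9');
    assert(as_uchar[0] == 0x00 && as_uchar[3] == 0xE9);
    const auto as_byte = split_unit<byte_order::little>(U'\u00E9');
    assert(as_byte[0] == std::byte{0xE9} && as_byte[3] == std::byte{0x00});
}
```

Choosing `unsigned char` over `std::byte` only changes what kind of buffer the result can be copied into; the byte order is decided independently by the other parameter.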

Public Types

using encoding_type = _Encoding ¶: The encoding type that this scheme wraps.

using code_point = code_point_t<_UBaseEncoding>¶: The individual units that result from a decode operation or are used as input to an encode operation. For most encodings, this is going to be a Unicode Code Point or a Unicode Scalar Value.

using code_unit = _Byte ¶

The individual units that result from an encode operation or are used as input to a decode operation.

Remark: Typically, this is some kind of byte type (unsigned char, std::byte, or another type for which sizeof(obj) == 1).

using decode_state = decode_state_t<_UBaseEncoding>¶

The state that can be used between calls to the decode function.

Remark: Even if the underlying encoding only has a single state type, we need to separate the two out in order to generically handle all encodings. Therefore, the encoding_scheme will always have both encode_state and decode_state.

using encode_state = encode_state_t<_UBaseEncoding>¶

The state that can be used between calls to the encode function.

Remark: Even if the underlying encoding only has a single state type, we need to separate the two out in order to generically handle all encodings. Therefore, the encoding_scheme will always have both encode_state and decode_state.

using is_encode_injective = ::std::integral_constant<bool, is_encode_injective_v<_UBaseEncoding>>¶

Whether or not the encode operation can process all forms of input into code unit values.

Remark: Defers to what the underlying encoding_type does.

using is_decode_injective = ::std::integral_constant<bool, is_decode_injective_v<_UBaseEncoding>>¶

Whether or not the decode operation can process all forms of input into code point values.

Remark: Defers to what the underlying encoding_type does.

Public Functions

inline constexpr encoding_type &base() & noexcept¶

Retrieves the underlying encoding object.

Returns: An l-value reference to the encoding object.

inline constexpr const encoding_type &base() const & noexcept¶

Retrieves the underlying encoding object.

Returns: A const l-value reference to the encoding object.

inline constexpr encoding_type &&base() && noexcept¶

Retrieves the underlying encoding object.

Returns: An r-value reference to the encoding object.
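The three base() overloads differ only in their reference qualifiers. The following standalone sketch shows how such a set of overloads behaves. `scheme_sketch`, `named_encoding`, and `base_example` are hypothetical names for illustration, and the sketch stores the encoding as a plain member, whereas the real class inherits privately from ebco<_Encoding>.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an encoding object that carries some data.
struct named_encoding {
    std::string name;
};

template <typename Encoding>
class scheme_sketch {
public:
    explicit scheme_sketch(Encoding encoding) : encoding_(std::move(encoding)) {}

    // Chosen for non-const l-values: allows modifying the wrapped object.
    Encoding& base() & noexcept { return encoding_; }
    // Chosen for const l-values: read-only access.
    const Encoding& base() const & noexcept { return encoding_; }
    // Chosen for r-values: lets the caller move the wrapped object out.
    Encoding&& base() && noexcept { return std::move(encoding_); }

private:
    Encoding encoding_;
};

inline void base_example() {
    scheme_sketch<named_encoding> scheme{ named_encoding{ "utf16" } };
    scheme.base().name = "utf16-be";

    const auto& view = scheme;
    static_assert(std::is_same_v<decltype(view.base()), const named_encoding&>);
    assert(view.base().name == "utf16-be");

    static_assert(std::is_same_v<decltype(std::move(scheme).base()), named_encoding&&>);
    named_encoding taken = std::move(scheme).base();
    assert(taken.name == "utf16-be");
}
```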

template<typename _Unused = encoding_type, ::std::enable_if_t<is_code_units_replaceable_v<_Unused>>* = nullptr> inline decltype(auto) constexpr replacement_code_units() const noexcept¶

Returns the desired replacement code units to use.

Remark: This is only callable if the function call exists on the wrapped encoding. It is broken down into a contiguous view type formulated from bytes if the wrapped code unit types do not match.
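As an illustration of what "broken down into bytes" means here, the sketch below views a single 16-bit replacement code unit as two bytes. It assumes the wrapped encoding's replacement is U+FFFD, which is a common choice for Unicode encodings but is an assumption of this example. `byte_order`, `replacement_bytes`, and `replacement_bytes_example` are hypothetical names; the library returns its own view type rather than a std::array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an endianness tag; not the library's type.
enum class byte_order { little, big };

// View the 16-bit replacement code unit U+FFFD as two bytes in the given order.
constexpr std::array<std::byte, 2> replacement_bytes(byte_order order) {
    constexpr char16_t replacement = u'\uFFFD';
    const auto high = static_cast<std::byte>(replacement >> 8);
    const auto low  = static_cast<std::byte>(replacement & 0xFF);
    return order == byte_order::big ? std::array<std::byte, 2>{ high, low }
                                    : std::array<std::byte, 2>{ low, high };
}

inline void replacement_bytes_example() {
    const auto be = replacement_bytes(byte_order::big);
    assert(be[0] == std::byte{0xFF} && be[1] == std::byte{0xFD});
    const auto le = replacement_bytes(byte_order::little);
    assert(le[0] == std::byte{0xFD} && le[1] == std::byte{0xFF});
}
```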

template<typename _Unused = encoding_type, ::std::enable_if_t<is_code_points_replaceable_v<_Unused>>* = nullptr> inline decltype(auto) constexpr replacement_code_points() const noexcept¶

Returns the desired replacement code points to use.

Remark: This is only callable if the function call exists on the wrapped encoding.

template<typename _Unused = encoding_type, ::std::enable_if_t<is_code_units_maybe_replaceable_v<_Unused>>* = nullptr> inline decltype(auto) constexpr maybe_replacement_code_units() const noexcept¶

Returns the desired replacement code units to use, or an empty optional-like type if there is nothing present.

Remark: This is only callable if the function call exists on the wrapped encoding. It is broken down into a contiguous view type formulated from bytes if the wrapped code unit types do not match.

template<typename _Unused = encoding_type, ::std::enable_if_t<is_code_points_maybe_replaceable_v<_Unused>>* = nullptr> inline decltype(auto) constexpr maybe_replacement_code_points() const noexcept¶

Returns the desired replacement code points to use, or an empty optional-like type if there is nothing present.

Remark: This is only callable if the function call exists on the wrapped encoding.

inline constexpr bool contains_unicode_encoding() const noexcept¶: Whether or not this encoding is some form of Unicode encoding.

template<typename _InputRange, typename _OutputRange, typename _ErrorHandler> inline constexpr auto decode_one(_InputRange &&__input, _OutputRange &&__output, _ErrorHandler &&__error_handler, decode_state &__s) const¶

Decodes a single complete unit of information as code points and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.

Remark: To the best of the implementation's ability, the iterators are returned untouched on error (e.g., when the input models at least a view and a forward_range). When that is not possible, the returned ranges may have been advanced even if an error occurs, due to the semantics of any view that models only an input_range.

Parameters

__input – [in] The input view to read code units from.
__output – [in] The output view to write code points into.
__error_handler – [in] The error handler to invoke if decoding fails.
__s – [inout] The necessary state information. Its type is the decode_state of the wrapped encoding, so whether it carries meaningful data depends on that encoding.

Returns

A ztd::text::decode_result object that contains the reconstructed input range, reconstructed output range, error handler, and a reference to the passed-in state.
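The "advance on success, untouched on error" contract described above can be sketched for one concrete case: decoding a single code point from Big Endian UTF-16 bytes. This is a simplified standalone illustration, not the library's code. It uses raw pointers instead of ranges, reports failure with a flag instead of invoking an error handler, and has no state parameter; `decode_sketch_result`, `decode_one_utf16_be`, and `decode_one_example` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: where reading and writing stopped, plus a success flag.
struct decode_sketch_result {
    const std::byte* input;  // first unread byte
    char32_t* output;        // first unwritten code point slot
    bool ok;
};

// Decode one code point from big-endian UTF-16 bytes in [first, last).
// On failure, the returned pointers are exactly the ones passed in.
inline decode_sketch_result decode_one_utf16_be(
    const std::byte* first, const std::byte* last, char32_t* out, char32_t* out_last) {
    const decode_sketch_result failure{ first, out, false };
    auto read_unit = [](const std::byte* p) -> std::uint32_t {
        return (std::to_integer<std::uint32_t>(p[0]) << 8) | std::to_integer<std::uint32_t>(p[1]);
    };
    if (last - first < 2 || out == out_last) return failure;
    const std::uint32_t lead = read_unit(first);
    if (lead < 0xD800u || lead > 0xDFFFu) {
        *out = static_cast<char32_t>(lead);  // one unit encodes the code point directly
        return { first + 2, out + 1, true };
    }
    // Surrogate range: a lead surrogate must be followed by a trail surrogate.
    if (lead > 0xDBFFu || last - first < 4) return failure;
    const std::uint32_t trail = read_unit(first + 2);
    if (trail < 0xDC00u || trail > 0xDFFFu) return failure;
    *out = static_cast<char32_t>(0x10000u + ((lead - 0xD800u) << 10) + (trail - 0xDC00u));
    return { first + 4, out + 1, true };
}

inline void decode_one_example() {
    const std::byte pair[] = { std::byte{0xD8}, std::byte{0x3D}, std::byte{0xDE}, std::byte{0x00} };
    char32_t out[1] = {};
    const auto result = decode_one_utf16_be(pair, pair + 4, out, out + 1);
    assert(result.ok && result.input == pair + 4 && result.output == out + 1);
    assert(out[0] == U'\U0001F600');
}
```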

template<typename _InputRange, typename _OutputRange, typename _ErrorHandler> inline constexpr auto encode_one(_InputRange &&__input, _OutputRange &&__output, _ErrorHandler &&__error_handler, encode_state &__s) const¶

Encodes a single complete unit of information as code units and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.

Remark: To the best of the implementation's ability, the iterators are returned untouched on error (e.g., when the input models at least a view and a forward_range). When that is not possible, the returned ranges may have been advanced even if an error occurs, due to the semantics of any view that models only an input_range.

Parameters

__input – [in] The input view to read code points from.
__output – [in] The output view to write code units into.
__error_handler – [in] The error handler to invoke if encoding fails.
__s – [inout] The necessary state information. Its type is the encode_state of the wrapped encoding, so whether it carries meaningful data depends on that encoding.

Returns

A ztd::text::encode_result object that contains the reconstructed input range, reconstructed output range, error handler, and a reference to the passed-in state.
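The encode direction follows the same contract. Below is a simplified standalone sketch that encodes one code point into Little Endian UTF-16 bytes held in an unsigned char buffer. As with any illustration of this kind, it makes assumptions: raw pointers stand in for ranges, a flag stands in for the error handler, there is no state parameter, and `encode_sketch_result`, `encode_one_utf16_le`, and `encode_one_example` are hypothetical names rather than library API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: where reading and writing stopped, plus a success flag.
struct encode_sketch_result {
    const char32_t* input;  // first unread code point
    unsigned char* output;  // first unwritten byte
    bool ok;
};

// Encode one code point from [first, last) as little-endian UTF-16 bytes.
// On failure, the returned pointers are exactly the ones passed in.
inline encode_sketch_result encode_one_utf16_le(
    const char32_t* first, const char32_t* last, unsigned char* out, unsigned char* out_last) {
    const encode_sketch_result failure{ first, out, false };
    if (first == last) return failure;
    const auto cp = static_cast<std::uint32_t>(*first);
    // Surrogates and values past U+10FFFF are not Unicode scalar values.
    if ((cp >= 0xD800u && cp <= 0xDFFFu) || cp > 0x10FFFFu) return failure;
    auto put_unit = [](unsigned char* p, std::uint32_t unit) {
        p[0] = static_cast<unsigned char>(unit & 0xFFu);         // low byte first
        p[1] = static_cast<unsigned char>((unit >> 8) & 0xFFu);  // then high byte
    };
    if (cp < 0x10000u) {
        if (out_last - out < 2) return failure;
        put_unit(out, cp);
        return { first + 1, out + 2, true };
    }
    if (out_last - out < 4) return failure;
    const std::uint32_t offset = cp - 0x10000u;
    put_unit(out, 0xD800u + (offset >> 10));         // lead surrogate
    put_unit(out + 2, 0xDC00u + (offset & 0x3FFu));  // trail surrogate
    return { first + 1, out + 4, true };
}

inline void encode_one_example() {
    const char32_t input[] = { U'\U0001F600' };
    unsigned char out[4] = {};
    const auto result = encode_one_utf16_le(input, input + 1, out, out + 4);
    assert(result.ok && result.input == input + 1 && result.output == out + 4);
    assert(out[0] == 0x3D && out[1] == 0xD8 && out[2] == 0x00 && out[3] == 0xDE);
}
```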

Public Static Attributes

static constexpr const ::std::size_t max_code_points = max_code_points_v<_UBaseEncoding>¶: The maximum number of code points a single complete operation of decoding can produce. This is 1 for all Unicode Transformation Format (UTF) encodings.

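static constexpr const ::std::size_t max_code_units = (max_code_units_v<_UBaseEncoding> * sizeof(_BaseCodeUnit)) / (sizeof(_Byte))¶: The maximum number of code units a single complete operation of encoding can produce. Because a code unit of the scheme is a _Byte, the wrapped encoding's maximum is scaled by the size of its own code unit type.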
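The arithmetic in the max_code_units formula can be checked with a small worked example. The sketch assumes the usual UTF limits (at most 2 code units per code point for UTF-16, and 1 for UTF-32) and a platform where char16_t and char32_t occupy 2 and 4 bytes. `scheme_max_code_units` and `max_code_units_example` are hypothetical helpers written for this illustration, not library names.

```cpp
#include <cassert>
#include <cstddef>

// Mirrors the formula above: scale the wrapped encoding's maximum by the
// ratio of its code unit size to the byte type's size.
template <typename BaseCodeUnit, typename Byte>
constexpr std::size_t scheme_max_code_units(std::size_t base_max_code_units) noexcept {
    return (base_max_code_units * sizeof(BaseCodeUnit)) / sizeof(Byte);
}

static_assert(sizeof(char16_t) == 2 && sizeof(char32_t) == 4,
    "this sketch assumes 16-bit and 32-bit code units made of 8-bit bytes");

// UTF-16: 2 code units of 2 bytes each become 4 byte-sized code units.
static_assert(scheme_max_code_units<char16_t, std::byte>(2) == 4, "UTF-16 scheme");
// UTF-32: 1 code unit of 4 bytes becomes 4 byte-sized code units.
static_assert(scheme_max_code_units<char32_t, unsigned char>(1) == 4, "UTF-32 scheme");

inline void max_code_units_example() {
    // A one-byte base code unit is left unchanged by the scaling.
    assert((scheme_max_code_units<char, std::byte>(4) == 4));
}
```

Both UTF-16 and UTF-32 schemes therefore need room for up to four bytes of output per encoded code point.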