UTF-32

Aliases

constexpr utf32_t ztd::text::utf32 = {}: An instance of the UTF-32 encoding for ease of use.

typedef basic_utf32<char32_t, unicode_code_point> ztd::text::utf32_t: A UTF-32 Encoding that traffics in ztd_char32_t. See ztd::text::basic_utf32 for more details.

constexpr wide_utf32_t ztd::text::wide_utf32 = {}: An instance of the UTF-32 encoding that traffics in wchar_t, for ease of use.

using ztd::text::wide_utf32_t = basic_utf32<wchar_t>: A UTF-32 Encoding that traffics in wchar_t. See ztd::text::basic_utf32 for more details.

Base Template

template<typename _CodeUnit, typename _CodePoint = unicode_code_point> class basic_utf32 : public ztd::text::__txt_impl::__utf32_with<basic_utf32<_CodeUnit, unicode_code_point>, _CodeUnit, unicode_code_point>

A UTF-32 Encoding that traffics in, specifically, the desired code unit type provided as a template argument.

Remark

This is a strict UTF-32 implementation that does not allow lone, unpaired surrogates either in or out.
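The strictness described above amounts to accepting only Unicode scalar values. The following is a minimal sketch of that check, not the library's implementation; the `sketch` namespace and `is_unicode_scalar_value` are hypothetical names introduced purely for illustration.

```cpp
namespace sketch {
    // Hypothetical helper (not a ztd.text function): true when `value` is a
    // Unicode scalar value, i.e. a code point that is not a surrogate.
    constexpr bool is_unicode_scalar_value(char32_t value) noexcept {
        // U+D800..U+DFFF are surrogates, reserved for UTF-16 pairs.
        const bool is_surrogate = value >= 0xD800 && value <= 0xDFFF;
        // U+10FFFF is the last code point in the Unicode code space.
        const bool in_code_space = value <= 0x10FFFF;
        return in_code_space && !is_surrogate;
    }
}

// Compile-time checks at the boundaries of the valid range.
static_assert(sketch::is_unicode_scalar_value(U'A'), "ASCII is valid");
static_assert(sketch::is_unicode_scalar_value(static_cast<char32_t>(0x10FFFF)), "last code point is valid");
static_assert(!sketch::is_unicode_scalar_value(static_cast<char32_t>(0xD800)), "lone surrogate is rejected");
static_assert(!sketch::is_unicode_scalar_value(static_cast<char32_t>(0x110000)), "value past the code space is rejected");
```

A strict encoder or decoder would apply a test of this kind to every value it reads, and hand anything that fails it to the error handler.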

Template Parameters:

_CodeUnit – The code unit type to use.
_CodePoint – The code point type to use.

Public Types

using is_unicode_encoding = ::std::true_type: Whether or not this encoding can encode all of Unicode.

using self_synchronizing_code = ::std::true_type

The start of a sequence can be found unambiguously when dropped into the middle of a sequence, or after an error in reading has occurred, for encoded text.

Remark

Unicode has definitive bit patterns which resemble start and end sequences. For UTF-32, each encoded code point occupies exactly one code unit, so every code unit boundary is also the start of a sequence.

using state = __txt_detail::__empty_state: The state that can be used between calls to the encoder and decoder. It is an empty struct because there is no shift state to preserve between complete units of encoded information.

using code_unit = _CodeUnit: The individual units that result from an encode operation or are used as input to a decode operation. For UTF-32 formats, this is usually ztd_char32_t, but this can change (see ztd::text::basic_utf32).

using code_point = _CodePoint: The individual units that result from a decode operation or are used as input to an encode operation. For most encodings, this is going to be a Unicode Code Point or a Unicode Scalar Value.

using is_decode_injective = ::std::true_type: Whether or not the decode operation can process all forms of input into code point values. This is true for all Unicode Transformation Formats (UTFs), which can encode and decode without loss of information from a valid collection of code units.

using is_encode_injective = ::std::true_type: Whether or not the encode operation can process all forms of input into code unit values. This is true for all Unicode Transformation Formats (UTFs), which can encode and decode without loss of information from a valid input code point.

Public Static Functions

template<typename _Result, typename _InputProgress, typename _OutputProgress> static inline constexpr auto skip_input_error(_Result &&__result, const _InputProgress &__input_progress, const _OutputProgress &__output_progress) noexcept

Allows an encoding to discard input characters if an error occurs, taking in both the state and the input sequence (by reference) to modify.

Remark

This will skip every input value until a proper UTF-32 Unicode scalar value (or code point) is found.
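The skipping behaviour in this remark can be pictured as a scan that stops at the first code unit holding a valid scalar value. The sketch below works on a raw pointer and index rather than on the result and progress objects in the real signature; `skip_invalid_units`, `is_scalar_value` and `sample` are hypothetical names, and this is an illustration of the idea rather than the library's code.

```cpp
#include <cstddef>

namespace sketch {
    // True when `unit` is a Unicode scalar value (in range and not a surrogate).
    constexpr bool is_scalar_value(char32_t unit) noexcept {
        return unit <= 0x10FFFF && !(unit >= 0xD800 && unit <= 0xDFFF);
    }

    // Hypothetical helper: starting at index `from`, returns the index of the
    // first code unit that is a valid scalar value, or `size` if none remains.
    constexpr std::size_t skip_invalid_units(const char32_t* data, std::size_t size, std::size_t from) noexcept {
        std::size_t index = from;
        while (index < size && !is_scalar_value(data[index])) {
            ++index;
        }
        return index;
    }

    // 'a', a lone surrogate, a value past U+10FFFF, 'b'
    constexpr char32_t sample[] = { U'a', 0xD800, 0x110000, U'b' };
}

// Starting at the lone surrogate, both bad units are skipped and 'b' is found.
static_assert(sketch::skip_invalid_units(sketch::sample, 4, 1) == 3, "skips both invalid units");
// Starting at a valid unit, nothing is skipped.
static_assert(sketch::skip_invalid_units(sketch::sample, 4, 0) == 0, "valid unit is kept");
```

Because UTF-32 uses one code unit per code point, the scan never needs to look at more than one unit to decide whether it has resynchronized.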

template<typename _Input, typename _Output, typename _ErrorHandler> static inline constexpr auto decode_one(_Input &&__input, _Output &&__output, _ErrorHandler &&__error_handler, state &__s)

Decodes a single complete unit of information as code points and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.

Remark

To the best ability of the implementation, the iterators will be returned untouched (for example, when the input models at least a view and a forward_range). If that is not possible, the returned ranges may be advanced even if an error occurs, due to the semantics of any view that models an input_range.

Parameters:

__input – [in] The input view to read code units from.
__output – [in] The output view to write code points into.
__error_handler – [in] The error handler to invoke if decoding fails.
__s – [inout] The necessary state information. For this encoding, the state is empty and means very little.

Returns:

A ztd::text::decode_result object that contains the reconstructed input range, reconstructed output range, error handler, and a reference to the passed-in state.
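To make the "advance on success, leave untouched on error" contract concrete, here is a minimal sketch over raw pointers. It is not the library's `decode_one`: the `sketch` namespace, the `decode_one_result` struct and its `ok` flag are assumptions made for illustration, and the error handler and state parameters are omitted.

```cpp
namespace sketch {
    // Hypothetical result type: where input and output stand after the call,
    // plus a success flag. The result described above also carries the error
    // handler and a reference to the state, which this sketch leaves out.
    struct decode_one_result {
        const char32_t* input;
        char32_t* output;
        bool ok;
    };

    // Decodes at most one code unit into one code point.
    constexpr decode_one_result decode_one(const char32_t* input, const char32_t* input_end,
        char32_t* output, char32_t* output_end) noexcept {
        // Nothing to read, or nowhere to write: report failure, move nothing.
        if (input == input_end || output == output_end) {
            return { input, output, false };
        }
        const char32_t unit = *input;
        const bool valid = unit <= 0x10FFFF && !(unit >= 0xD800 && unit <= 0xDFFF);
        if (!valid) {
            // Error: both positions are returned untouched.
            return { input, output, false };
        }
        *output = unit;
        // Success: both positions move past what was read and written.
        return { input + 1, output + 1, true };
    }

    // Decodes 'h' successfully, then fails on a lone surrogate.
    constexpr bool decode_demo() {
        const char32_t input[] = { U'h', 0xDC00 };
        char32_t output[2] = {};
        const decode_one_result first = decode_one(input, input + 2, output, output + 2);
        const decode_one_result second = decode_one(first.input, input + 2, first.output, output + 2);
        return first.ok && output[0] == U'h' && first.input == input + 1
            && !second.ok && second.input == first.input && second.output == first.output;
    }
}

static_assert(sketch::decode_demo(), "one unit decoded, then the error leaves the ranges untouched");
```

Pointers behave like a forward range, so the untouched-on-error guarantee is easy to honor here; with a single-pass input range the read itself would already consume the value.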

template<typename _Input, typename _Output, typename _ErrorHandler> static inline constexpr auto encode_one(_Input &&__input, _Output &&__output, _ErrorHandler &&__error_handler, state &__s)

Encodes a single complete unit of information as code units and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.

Remark

To the best ability of the implementation, the iterators will be returned untouched (for example, when the input models at least a view and a forward_range). If that is not possible, the returned ranges may be advanced even if an error occurs, due to the semantics of any view that models an input_range.

Parameters:

__input – [in] The input view to read code points from.
__output – [in] The output view to write code units into.
__error_handler – [in] The error handler to invoke if encoding fails.
__s – [inout] The necessary state information. For this encoding, the state is empty and means very little.

Returns:

A ztd::text::encode_result object that contains the reconstructed input range, reconstructed output range, error handler, and a reference to the passed-in state.
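The role of the error handler in an encode step can be sketched as follows: the encoder validates one code point and, on failure, delegates the outcome to whatever handler was passed in. Everything in the `sketch` namespace is hypothetical (including the two handler types and the simplified handler signature) and is not taken from the library; treating a substituted U+FFFD as a success is a design choice of this sketch.

```cpp
namespace sketch {
    // Hypothetical result type: positions after the call plus a success flag.
    struct encode_one_result {
        const char32_t* input;
        char32_t* output;
        bool ok;
    };

    // Hypothetical handler: reports the failure and leaves both ranges untouched.
    struct report_failure {
        constexpr encode_one_result operator()(const char32_t* input, char32_t* output) const noexcept {
            return { input, output, false };
        }
    };

    // Hypothetical handler: writes U+FFFD and consumes the bad code point.
    struct substitute_replacement {
        constexpr encode_one_result operator()(const char32_t* input, char32_t* output) const noexcept {
            *output = U'\uFFFD';
            return { input + 1, output + 1, true };
        }
    };

    // Encodes at most one code point into one code unit.
    template <typename ErrorHandler>
    constexpr encode_one_result encode_one(const char32_t* input, const char32_t* input_end,
        char32_t* output, char32_t* output_end, const ErrorHandler& handler) {
        if (input == input_end || output == output_end) {
            return { input, output, false };
        }
        const char32_t point = *input;
        const bool valid = point <= 0x10FFFF && !(point >= 0xD800 && point <= 0xDFFF);
        if (!valid) {
            // The handler decides what the caller sees for invalid input.
            return handler(input, output);
        }
        *output = point;
        return { input + 1, output + 1, true };
    }

    // Encodes a lone surrogate with each handler and compares the outcomes.
    constexpr bool encode_demo() {
        const char32_t input[] = { 0xD800 };
        char32_t output[1] = {};
        const encode_one_result strict = encode_one(input, input + 1, output, output + 1, report_failure{});
        const encode_one_result lenient = encode_one(input, input + 1, output, output + 1, substitute_replacement{});
        return !strict.ok && strict.input == input
            && lenient.ok && lenient.input == input + 1 && output[0] == U'\uFFFD';
    }
}

static_assert(sketch::encode_demo(), "the handler determines the result for invalid input");
```

Keeping the policy in the handler lets the same encode step serve both strict callers, which want to stop at the first error, and lenient ones, which prefer a replacement character.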

Public Static Attributes

static constexpr ::std::size_t max_code_points = 1: The maximum number of code points a single complete operation of decoding can produce. This is 1 for all Unicode Transformation Format (UTF) encodings.

static constexpr ::std::size_t max_code_units = 1: The maximum number of code units a single complete operation of encoding can produce.
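One common use for constants like these is sizing a fixed scratch buffer that is guaranteed to hold the output of a single encode or decode step. The sketch below assumes a stand-in type, `utf32_like`, that merely mirrors the two constants; it and the alias names are hypothetical and not part of the library.

```cpp
#include <array>
#include <cstddef>

namespace sketch {
    // Stand-in for an encoding type; only the two constants matter here.
    struct utf32_like {
        static constexpr std::size_t max_code_points = 1;
        static constexpr std::size_t max_code_units = 1;
    };

    // Scratch buffer large enough for the output of one encode step.
    template <typename Encoding>
    using encode_scratch = std::array<char32_t, Encoding::max_code_units>;

    // Scratch buffer large enough for the output of one decode step.
    template <typename Encoding>
    using decode_scratch = std::array<char32_t, Encoding::max_code_points>;
}

// For a UTF-32-like encoding, one element suffices in both directions.
static_assert(sketch::encode_scratch<sketch::utf32_like>{}.size() == 1, "one code unit per encode step");
static_assert(sketch::decode_scratch<sketch::utf32_like>{}.size() == 1, "one code point per decode step");
```

Generic code written against the constants rather than a literal 1 keeps working when the same template is instantiated with an encoding whose steps can produce more than one unit.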

static constexpr ::ztd::text_encoding_id encoded_id = __surrogates_allowed ? ::ztd::text_encoding_id::ucs4 : ::ztd::text_encoding_id::utf32: The encoding ID for this type. Used for optimization purposes.

static constexpr ::ztd::text_encoding_id decoded_id = __surrogates_allowed ? ::ztd::text_encoding_id::ucs4 : ::ztd::text_encoding_id::utf32: The encoding ID for this type. Used for optimization purposes.