Execution

This is the locale-based, runtime encoding. It uses a number of compile-time and runtime heuristics to resolve to an implementation-defined encoding. It is not required to work in constant expressions: for that, use ztd::text::literal, which represents the encoding of compile-time strings (e.g. "my string").

Currently, the hierarchy of behaviors is as follows:

  • If the platform is MacOS, then it assumes this is UTF-8;

  • Otherwise, if libiconv is available, then it attempts to use iconv configured to the "char"-identified encoding;

  • Otherwise, if the headers <cuchar> or <uchar.h> are available, then it attempts to use a gnarly, lossy, and dangerous encoding that potentially traffics through the C Standard Library and Locale APIs;

  • Otherwise, it produces a compile-time error.
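The cascade above can be pictured as a chain of preprocessor checks. The following is only a minimal sketch of that idea: `EXAMPLE_USE_ICONV`, `EXAMPLE_BACKEND`, and `example_execution_backend` are hypothetical names invented for illustration and are not the library's actual configuration macros or functions. `__APPLE__` and `__has_include` are the usual compiler-provided facilities.

```cpp
#include <string_view>

// EXAMPLE_USE_ICONV is a hypothetical opt-in switch for this sketch only.
#if defined(__APPLE__)
	// Step 1: Apple platforms are assumed to be UTF-8.
	#define EXAMPLE_BACKEND "utf8"
#elif defined(EXAMPLE_USE_ICONV)
	// Step 2: prefer iconv when the build has enabled it.
	#define EXAMPLE_BACKEND "iconv"
#elif defined(__has_include)
	// Step 3: fall back to the C Standard Library conversion functions.
	#if __has_include(<cuchar>) || __has_include(<uchar.h>)
		#define EXAMPLE_BACKEND "cuchar"
	#endif
#endif

// Step 4: nothing usable was found, so refuse to compile.
#if !defined(EXAMPLE_BACKEND)
	#error "no usable execution encoding backend on this platform"
#endif

// Reports which branch of the cascade was taken at compile time.
constexpr std::string_view example_execution_backend() noexcept {
	return EXAMPLE_BACKEND;
}
```

Whichever branch is selected, the choice is made entirely at compile time. The encoding that the selected backend ends up using can still depend on the runtime locale.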

Warning

The C Standard Library has many design defects in its production of code points, which may make it unsuitable even if your C Standard Library recognizes certain locales (e.g., Big5-HKSCS). The runtime will always attempt to load iconv if the definition is turned on, since it may do a better job than the C Standard Library’s interfaces until C23.

Even if, on a given platform, it can be assumed to be a static encoding (e.g., Apple/MacOS where it always returns the “C” Locale but processes text as UTF-8), ztd::text::execution will always present itself as a runtime and unknowable encoding. This is to prevent portability issues from relying on, e.g., ztd::text::is_decode_injective_v<ztd::text::execution> being true during development and working with that assumption, only to have it break when ported to a platform where that assumption no longer holds.
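To illustrate why the trait is pinned to false, here is a minimal sketch of generic code that branches on injectivity. All names in it (`fixed_utf8_like`, `runtime_execution_like`, `example_is_decode_injective_v`, `example_needs_lossy_handling`) are hypothetical stand-ins written for this example and are not the ztd.text definitions.

```cpp
#include <type_traits>

// Hypothetical stand-in for an encoding whose properties are known statically.
struct fixed_utf8_like {
	using is_decode_injective = std::true_type;
};

// Hypothetical stand-in for a locale-based encoding. It reports false on every
// platform, even those where the locale encoding happens to be UTF-8.
struct runtime_execution_like {
	using is_decode_injective = std::false_type;
};

// Simplified trait for this sketch: reads the member typedef directly.
template <typename Encoding>
inline constexpr bool example_is_decode_injective_v = Encoding::is_decode_injective::value;

// Generic code picks its strategy from the trait. Because the answer for the
// locale-based encoding never changes, the same branch is compiled everywhere.
template <typename Encoding>
constexpr bool example_needs_lossy_handling() noexcept {
	if constexpr (example_is_decode_injective_v<Encoding>) {
		return false; // safe to assume nothing is lost when decoding
	}
	else {
		return true; // must be prepared for lossy decoding
	}
}
```

Code written against the locale-based stand-in always takes the conservative branch, so moving it to a platform with a different locale encoding does not change which path was compiled.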

constexpr execution_t ztd::text::execution = {}

An instance of the execution_t type for ease of use.

typedef no_encoding<char, unicode_code_point> ztd::text::execution_t

The Encoding that represents the “Execution” (narrow locale-based) encoding. The encoding is typically associated with the locale, which is tied to the C standard library’s setlocale function.
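The dependence on setlocale can be observed with the C Standard Library alone. The sketch below does not use ztd.text at all; `example_count_narrow_characters` is a hypothetical helper that counts characters in a narrow string as the current LC_CTYPE locale interprets them, using std::mbrlen.

```cpp
#include <cassert>
#include <clocale>
#include <cstring>
#include <cwchar>

// Counts the characters in `text` as interpreted by the current LC_CTYPE locale.
// Returns static_cast<std::size_t>(-1) if the bytes are invalid or incomplete
// in that locale.
std::size_t example_count_narrow_characters(const char* text) {
	assert(text != nullptr);
	std::mbstate_t state {};
	std::size_t remaining = std::strlen(text);
	std::size_t count = 0;
	while (remaining > 0) {
		const std::size_t used = std::mbrlen(text, remaining, &state);
		if (used == static_cast<std::size_t>(-1) || used == static_cast<std::size_t>(-2)) {
			return static_cast<std::size_t>(-1); // invalid or truncated sequence
		}
		if (used == 0) {
			break; // defensive: an embedded null character ends the scan
		}
		text += used;
		remaining -= used;
		++count;
	}
	return count;
}
```

Calling std::setlocale(LC_ALL, "") before this function switches to the user's environment locale, and the same bytes may then be counted differently or rejected. That is the behavior the execution encoding inherits.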

Remark

Use of this type is subject to the C Standard Library or platform defaults. Some locales (such as the Big5 Hong Kong Supplementary Character Set (Big5-HKSCS)) are broken when accessed without ZTD_TEXT_USE_CUNEICODE being defined, due to fundamental design issues in the C Standard Library and bugs in glibc/musl libc’s current locale encoding support. On Apple, this is currently assumed to be UTF-8 since they do not support the <cuchar> or <uchar.h> headers.

Internal Types

Warning

⚠️ Names with double underscores, and within the __*detail and __*impl namespaces are reserved for the implementation. Referencing this entity directly is bad, and the name/functionality can be changed at any point in the future. Relying on anything not guaranteed by the documentation is ☢️☢️Undefined Behavior☢️☢️.

MacOS-based

class ztd::text::__txt_impl::__execution_mac_os : private __utf8_with<__execution_mac_os, char, char32_t>

The default (“locale”) encoding for Mac OS.

Remark

Note that for all intents and purposes, Mac OS demands that all text is in UTF-8. However, on Big Sur, Catalina, and a few other platforms, locale functionality and data have either been forgotten and left behind or intentionally kept in place. With very dedicated hacks, it may still be possible to change the default encoding from UTF-8 to something else for the majority of Apple text. Their documentation states that all text “should” be UTF-8, but very explicitly goes out of its way to not make that a hard guarantee. Since it is a BSD-like system and plenty of that data was left behind from C libraries, this may break in extremely obscure cases. Please be careful on Apple machines!

Public Types

using code_point = code_point_t<__base_t>

The code point type that is decoded to, and encoded from.

using code_unit = code_unit_t<__base_t>

The code unit type that is decoded from, and encoded to.

using decode_state = decode_state_t<__base_t>

The associated state for decode operations.

using encode_state = encode_state_t<__base_t>

The associated state for encode operations.

using is_unicode_encoding = ::std::integral_constant<bool, is_unicode_encoding_v<__base_t>>

Whether or not this encoding is a Unicode encoding.

using is_decode_injective = ::std::false_type

Whether or not this encoding’s decode_one step is injective.

using is_encode_injective = ::std::false_type

Whether or not this encoding’s encode_one step is injective.

Public Static Functions

template<typename _InputRange, typename _OutputRange, typename _ErrorHandler>
static inline constexpr auto decode_one(_InputRange &&__input, _OutputRange &&__output, _ErrorHandler &&__error_handler, decode_state &__s)

Decodes a single complete unit of information as code points and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.

Parameters
  • __input[in] The input view to read code units from.

  • __output[in] The output view to write code points into.

  • __error_handler[in] The error handler to invoke if decoding fails.

  • __s[inout] The necessary state information. Most encodings have no state, but because this is effectively a runtime encoding, it is important to preserve and manage this state.

Returns

A ztd::text::decode_result object that contains the reconstructed input range, reconstructed output range, error handler, and a reference to the passed-in state.
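The "advance on success, leave untouched on failure" contract described above can be shown with a toy decoder. This is a minimal sketch under stated assumptions: it handles ASCII only, takes no error handler or state, and every name in it (`example_decode_result`, `example_ascii_decode_one`, `example_decode_all`) is hypothetical. It is not the library's implementation or its result type.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical result type: the remaining input, the remaining output, and a
// success flag.
struct example_decode_result {
	std::string_view input;  // code units not yet read
	char32_t* output;        // next free output slot
	std::size_t output_size; // remaining output capacity
	bool ok;                 // whether one code point was produced
};

// Decodes exactly one ASCII code unit into one code point. On failure the
// input and output are returned exactly as they were passed in.
constexpr example_decode_result example_ascii_decode_one(
     std::string_view input, char32_t* output, std::size_t output_size) noexcept {
	if (input.empty() || output_size == 0) {
		return { input, output, output_size, false };
	}
	const unsigned char unit = static_cast<unsigned char>(input.front());
	if (unit > 0x7F) {
		return { input, output, output_size, false }; // not ASCII: nothing consumed
	}
	*output = static_cast<char32_t>(unit);
	return { input.substr(1), output + 1, output_size - 1, true };
}

// Repeats the single step until the input is exhausted or a step fails.
// Returns the number of code points written.
std::size_t example_decode_all(std::string_view input, char32_t* output, std::size_t output_size) {
	assert(output != nullptr || output_size == 0);
	std::size_t written = 0;
	for (;;) {
		const example_decode_result result = example_ascii_decode_one(input, output, output_size);
		if (!result.ok) {
			return written;
		}
		input = result.input;
		output = result.output;
		output_size = result.output_size;
		++written;
	}
}
```

Because each step hands back the updated ranges, a caller can loop over single steps, as `example_decode_all` does, and resume or report an error at the exact position where a step failed.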

template<typename _InputRange, typename _OutputRange, typename _ErrorHandler>
static inline constexpr auto encode_one(_InputRange &&__input, _OutputRange &&__output, _ErrorHandler &&__error_handler, encode_state &__s)

Encodes a single complete unit of information as code units and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.

Parameters
  • __input[in] The input view to read code points from.

  • __output[in] The output view to write code units into.

  • __error_handler[in] The error handler to invoke if encoding fails.

  • __s[inout] The necessary state information. Most encodings have no state, but because this is effectively a runtime encoding, it is important to preserve and manage this state.

Returns

A ztd::text::encode_result object that contains the reconstructed input range, reconstructed output range, error handler, and a reference to the passed-in state.

Public Static Attributes

static constexpr ::std::size_t max_code_points = 8

The maximum number of code points a single complete operation of decoding can produce.

Remark

There are encodings for which one input can produce 3 code points (some Tamil encodings) and there are rumours of an encoding that can produce 7 code points from a handful of input. We use a protective/conservative 8, here, to make sure ABI isn’t broken later.

static constexpr ::std::size_t max_code_units = MB_LEN_MAX

The maximum number of code units a single complete operation of encoding can produce.

Remark

This is bounded by the platform’s MB_LEN_MAX macro, which is an integral constant expression giving the maximum number of bytes a single multibyte character can occupy in any supported locale.
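As a small illustration of that bound, the sketch below sizes a buffer with MB_LEN_MAX and encodes one wide character into it with std::wcrtomb from the C Standard Library. `example_encode_wide` is a hypothetical helper written for this example; it is not part of ztd.text.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cwchar>

// Encodes one wide character into `buffer` using the current LC_CTYPE locale.
// Returns the number of bytes written, or 0 if the character cannot be
// represented in that locale.
std::size_t example_encode_wide(wchar_t value, char (&buffer)[MB_LEN_MAX]) {
	std::mbstate_t state {};
	const std::size_t written = std::wcrtomb(buffer, value, &state);
	if (written == static_cast<std::size_t>(-1)) {
		return 0; // not representable in the current locale
	}
	// A single character never needs more than MB_LEN_MAX bytes.
	assert(written <= static_cast<std::size_t>(MB_LEN_MAX));
	return written;
}
```

Sizing the buffer from MB_LEN_MAX rather than from the current locale's MB_CUR_MAX means the same fixed-size storage remains large enough after the locale changes at runtime.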

Private Static Functions

static inline constexpr auto encode_one(_InputRange &&__input, _OutputRange &&__output, _ErrorHandler &&__error_handler, encode_state &__s)

Encodes a single complete unit of information as code units and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.

Remark

To the best of the implementation’s ability, the iterators will be returned untouched on error (e.g., when the input models at least a view and a forward_range). If that is not possible, the returned ranges may have been advanced even though an error occurred, due to the semantics of any view that only models an input_range.

Parameters
  • __input[in] The input view to read code points from.

  • __output[in] The output view to write code units into.

  • __error_handler[in] The error handler to invoke if encoding fails.

  • __s[inout] The necessary state information. For this encoding, the state is empty and means very little.

Returns

A ztd::text::encode_result object that contains the reconstructed input range, reconstructed output range, error handler, and a reference to the passed-in state.

static inline constexpr auto decode_one(_InputRange &&__input, _OutputRange &&__output, _ErrorHandler &&__error_handler, decode_state &__s)

Decodes a single complete unit of information as code points and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.

Remark

To the best of the implementation’s ability, the iterators will be returned untouched on error (e.g., when the input models at least a view and a forward_range). If that is not possible, the returned ranges may have been advanced even though an error occurred, due to the semantics of any view that only models an input_range.

Parameters
  • __input[in] The input view to read code units from.

  • __output[in] The output view to write code points into.

  • __error_handler[in] The error handler to invoke if decoding fails.

  • __s[inout] The necessary state information. For this encoding, the state is empty and means very little.

Returns

A ztd::text::decode_result object that contains the reconstructed input range, reconstructed output range, error handler, and a reference to the passed-in state.