Encoding Scheme
The encoding_scheme template turns any encoding into a byte-based encoding that reads and writes its code units as bytes, into and out of ranges whose value_type is a byte type. This avoids duplicating the effort of reading an encoding as little endian or big endian, and lets any encoding be composed to interface with, for example, a UTF-16 Big Endian blob of data coming over a network or a shared pipe.
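As a rough illustration of the byte-level view described above, the sketch below assembles one UTF-16 code unit from two bytes in either byte order. It does not use ztd.text: byte_order and read_utf16_unit are hypothetical names invented for this example, and the real template works on whole ranges and arbitrary encodings rather than on a single pair of bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an endianness enumeration (not the library's type).
enum class byte_order { little, big };

// Assembles one 16-bit code unit from two bytes, in the order they were read.
constexpr char16_t read_utf16_unit(std::byte first, std::byte second, byte_order order) noexcept {
	const auto a = std::to_integer<std::uint16_t>(first);
	const auto b = std::to_integer<std::uint16_t>(second);
	// Big endian: the first byte is the high byte. Little endian: the reverse.
	return static_cast<char16_t>(order == byte_order::big ? ((a << 8) | b) : ((b << 8) | a));
}

// U+20AC (EURO SIGN) is the single code unit 0x20AC in UTF-16.
static_assert(read_utf16_unit(std::byte { 0x20 }, std::byte { 0xAC }, byte_order::big) == u'\u20AC');
static_assert(read_utf16_unit(std::byte { 0xAC }, std::byte { 0x20 }, byte_order::little) == u'\u20AC');
```

The same two bytes yield different code units depending on the chosen order, which is why the endianness is fixed as part of the scheme's type instead of being guessed at each call site.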
Aliases
-
using ztd::text::basic_utf16_le = encoding_scheme<utf16_t, endian::little, _Byte>
A UTF-16 encoding, in Little Endian format, with inputs as a sequence of bytes.
- Template Parameters
_Byte – The byte type to use. Typically, this is std::byte or unsigned char.
-
using ztd::text::utf16_le_t = basic_utf16_le<::std::byte>
A UTF-16 encoding, in Little Endian format, with inputs as a sequence of bytes.
-
using ztd::text::basic_utf16_be = encoding_scheme<utf16_t, endian::big, _Byte>
A UTF-16 encoding, in Big Endian format, with inputs as a sequence of bytes.
- Template Parameters
_Byte – The byte type to use. Typically, this is std::byte or unsigned char.
-
using ztd::text::utf16_be_t = basic_utf16_be<::std::byte>
A UTF-16 encoding, in Big Endian format, with inputs as a sequence of bytes.
-
using ztd::text::basic_utf16_ne = encoding_scheme<utf16_t, endian::native, _Byte>
A UTF-16 encoding, in Native Endian format, with inputs as a sequence of bytes.
- Template Parameters
_Byte – The byte type to use. Typically, this is std::byte or unsigned char.
-
using ztd::text::utf16_ne_t = basic_utf16_ne<::std::byte>
A UTF-16 encoding, in Native Endian format, with inputs as a sequence of bytes.
-
using ztd::text::basic_utf32_le = encoding_scheme<utf32_t, endian::little, _Byte>
A UTF-32 encoding, in Little Endian format, with inputs as a sequence of bytes.
- Template Parameters
_Byte – The byte type to use. Typically, this is std::byte or unsigned char.
-
using ztd::text::utf32_le_t = basic_utf32_le<::std::byte>
A UTF-32 encoding, in Little Endian format, with inputs as a sequence of bytes.
-
using ztd::text::basic_utf32_be = encoding_scheme<utf32_t, endian::big, _Byte>
A UTF-32 encoding, in Big Endian format, with inputs as a sequence of bytes.
- Template Parameters
_Byte – The byte type to use. Typically, this is std::byte or unsigned char.
-
using ztd::text::utf32_be_t = basic_utf32_be<::std::byte>
A UTF-32 encoding, in Big Endian format, with inputs as a sequence of bytes.
-
using ztd::text::basic_utf32_ne = encoding_scheme<utf32_t, endian::native, _Byte>
A UTF-32 encoding, in Native Endian format, with inputs as a sequence of bytes.
- Template Parameters
_Byte – The byte type to use. Typically, this is std::byte or unsigned char.
-
using ztd::text::utf32_ne_t = basic_utf32_ne<::std::byte>
A UTF-32 encoding, in Native Endian format, with inputs as a sequence of bytes.
Base Template
-
template<typename _Encoding, endian _Endian = endian::native, typename _Byte = ::std::byte>
class encoding_scheme : public __txt_detail::__is_unicode_encoding_es<encoding_scheme<_Encoding, _Endian, _Byte>, unwrap_remove_cvref_t<_Encoding>>, private ebco<_Encoding>
Decomposes the provided Encoding type into a specific endianness (big, little, or native) to allow for a single encoding type to be viewed in different ways.
Remark
For example, this can be used to construct a Big Endian UTF-16 by using encoding_scheme<ztd::text::utf16_t, ztd::endian::big>. It can be made interoperable with unsigned char buffers rather than std::byte buffers by using ztd::text::encoding_scheme<ztd::text::utf32_t, ztd::endian::native, unsigned char>.
- Template Parameters
_Encoding – The encoding type.
_Endian – The endianness to use. Defaults to ztd::endian::native.
_Byte – The byte type to use. Defaults to std::byte.
Public Types
-
using encoding_type = _Encoding
The encoding type that this scheme wraps.
-
using code_point = code_point_t<_UBaseEncoding>
The individual units that result from a decode operation or are used as input to an encode operation. For most encodings, this is going to be a Unicode Code Point or a Unicode Scalar Value.
-
using code_unit = _Byte
The individual units that result from an encode operation or are used as input to a decode operation.
Remark
Typically, this is some kind of byte type (unsigned char, std::byte, or another type for which sizeof(obj) == 1).
-
using decode_state = decode_state_t<_UBaseEncoding>
The state that can be used between calls to the decode function.
Remark
Even if the underlying encoding only has a single state type, we need to separate the two out in order to generically handle all encodings. Therefore, the encoding_scheme will always have both encode_state and decode_state.
-
using encode_state = encode_state_t<_UBaseEncoding>
The state that can be used between calls to the encode function.
Remark
Even if the underlying encoding only has a single state type, we need to separate the two out in order to generically handle all encodings. Therefore, the encoding_scheme will always have both encode_state and decode_state.
-
using is_encode_injective = ::std::integral_constant<bool, is_encode_injective_v<_UBaseEncoding>>
Whether or not the encode operation can process all forms of input into code unit values.
Remark
Defers to what the underlying encoding_type does.
-
using is_decode_injective = ::std::integral_constant<bool, is_decode_injective_v<_UBaseEncoding>>
Whether or not the decode operation can process all forms of input into code point values.
Remark
Defers to what the underlying encoding_type does.
Public Functions
-
encoding_scheme() = default
Default constructs a ztd::text::encoding_scheme.
-
template<typename _Arg0, typename ..._Args, ::std::enable_if_t<!::std::is_same_v<remove_cvref_t<_Arg0>, encoding_scheme> && !::std::is_same_v<remove_cvref_t<_Arg0>, ::std::in_place_t>>* = nullptr>
inline constexpr encoding_scheme(_Arg0 &&__arg0, _Args&&... __args) noexcept(::std::is_nothrow_constructible_v<_UBaseEncoding, _Arg0, _Args...>)
Constructs a ztd::text::encoding_scheme with the encoding object and any additional arguments.
- Parameters
__arg0 – [in] The first argument used to construct the stored encoding.
__args – [in] Any additional arguments used to construct the stored encoding.
-
template<typename ..._Args>
inline constexpr encoding_scheme(::std::in_place_t, _Args&&... __args) noexcept(::std::is_nothrow_constructible_v<_UBaseEncoding, _Args...>)
Constructs a ztd::text::encoding_scheme with the encoding object and any additional arguments.
Remark
If the provided encoding does not have a byte code_unit type, it is wrapped in a ztd::text::encoding_scheme first.
- Parameters
__args – [in] Any additional arguments used to construct the encoding in the erased storage.
-
encoding_scheme(const encoding_scheme&) = default
Copy-constructs a ztd::text::encoding_scheme object (defaulted).
-
encoding_scheme &operator=(const encoding_scheme&) = default
Copy-assigns a ztd::text::encoding_scheme object (defaulted).
-
encoding_scheme(encoding_scheme&&) = default
Move-constructs a ztd::text::encoding_scheme from the provided r-value reference.
Remark
This leaves the passed-in r-value reference without an encoding object. Calling any function on a moved-from ztd::text::encoding_scheme, except for destruction, is a violation and invokes Undefined Behavior (generally, a crash).
-
encoding_scheme &operator=(encoding_scheme&&) = default
Move-assigns a ztd::text::encoding_scheme from the provided r-value reference.
Remark
This leaves the passed-in r-value reference without an encoding object. Calling any function on a moved-from ztd::text::encoding_scheme, except for destruction, is a violation and may invoke Undefined Behavior (generally, a crash).
-
inline constexpr encoding_type &base() & noexcept
Retrieves the underlying encoding object.
- Returns
An l-value reference to the encoding object.
-
inline constexpr const encoding_type &base() const & noexcept
Retrieves the underlying encoding object.
- Returns
A const l-value reference to the encoding object.
-
inline constexpr encoding_type &&base() && noexcept
Retrieves the underlying encoding object.
- Returns
An r-value reference to the encoding object.
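The three base() overloads differ only in the value category of the object they are called on. A minimal sketch of that pattern follows. scheme_sketch and dummy_encoding are hypothetical names for this example, and the wrapper holds its encoding as a plain member, whereas the real class stores it through ebco.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an encoding type.
struct dummy_encoding {};

// Hypothetical wrapper showing ref-qualified accessors; not the library's implementation.
template <typename Encoding>
class scheme_sketch {
public:
	// Called on a non-const l-value: grants mutable access.
	constexpr Encoding& base() & noexcept { return encoding_; }
	// Called on a const object: grants read-only access.
	constexpr const Encoding& base() const& noexcept { return encoding_; }
	// Called on a non-const r-value: lets the caller move the encoding out.
	constexpr Encoding&& base() && noexcept { return std::move(encoding_); }

private:
	Encoding encoding_ {};
};

using sketch = scheme_sketch<dummy_encoding>;
static_assert(std::is_same_v<decltype(std::declval<sketch&>().base()), dummy_encoding&>);
static_assert(std::is_same_v<decltype(std::declval<sketch>().base()), dummy_encoding&&>);
```

With this shape, calling base() on a temporary scheme hands back an r-value reference, so the wrapped encoding can be moved from without an extra copy.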
-
template<typename _Unused = encoding_type, ::std::enable_if_t<is_code_units_replaceable_v<_Unused>>* = nullptr>
inline decltype(auto) constexpr replacement_code_units() const noexcept
Returns the desired replacement code units to use.
Remark
This is only callable if the function call exists on the wrapped encoding. It is broken down into a contiguous view type formulated from bytes if the wrapped code unit types do not match.
-
template<typename _Unused = encoding_type, ::std::enable_if_t<is_code_points_replaceable_v<_Unused>>* = nullptr>
inline decltype(auto) constexpr replacement_code_points() const noexcept
Returns the desired replacement code points to use.
Remark
This is only callable if the function call exists on the wrapped encoding.
-
template<typename _Unused = encoding_type, ::std::enable_if_t<is_code_units_maybe_replaceable_v<_Unused>>* = nullptr>
inline decltype(auto) constexpr maybe_replacement_code_units() const noexcept
Returns the desired replacement code units to use, or an empty optional-like type if there is nothing present.
Remark
This is only callable if the function call exists on the wrapped encoding. It is broken down into a contiguous view type formulated from bytes if the wrapped code unit types do not match.
-
template<typename _Unused = encoding_type, ::std::enable_if_t<is_code_points_maybe_replaceable_v<_Unused>>* = nullptr>
inline decltype(auto) constexpr maybe_replacement_code_points() const noexcept
Returns the desired replacement code points to use, or an empty optional-like type if there is nothing present.
Remark
This is only callable if the function call exists on the wrapped encoding.
-
inline constexpr bool contains_unicode_encoding() const noexcept
Whether or not this encoding is some form of Unicode encoding.
-
template<typename _Result, typename _InputProgress, typename _OutputProgress, ::std::enable_if_t<is_input_error_skippable_v<const encoding_type&, _Result, const _InputProgress&, const _OutputProgress&>>* = nullptr>
inline decltype(auto) constexpr skip_input_error(_Result &&__result) const noexcept(::ztd::text::is_nothrow_skip_input_error_v<const encoding_type&, _Result, const _InputProgress&, const _OutputProgress&>)
Skips any consecutive input errors in the encoded input, where possible.
Remark
This is only callable if the function call exists on the wrapped encoding.
-
template<typename _Input, typename _Output, typename _ErrorHandler>
inline constexpr auto decode_one(_Input &&__input, _Output &&__output, _ErrorHandler &&__error_handler, decode_state &__s) const
Decodes a single complete unit of information as code points and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.
Remark
To the best ability of the implementation, the iterators will be returned untouched (e.g., the input models at least a view and a forward_range). If it is not possible, returned ranges may be incremented even if an error occurs due to the semantics of any view that models an input_range.
- Parameters
__input – [in] The input view to read code units from.
__output – [in] The output view to write code points into.
__error_handler – [in] The error handler to invoke if decoding fails.
__s – [inout] The necessary state information. For this encoding, the state is empty and means very little.
- Returns
A ztd::text::decode_result object that contains the input range, output range, error handler, and a reference to the passed-in state.
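To make the "moved past what was read, or untouched on error" contract concrete, here is a standalone sketch that decodes one code point from UTF-16 Big Endian bytes. It is not the library's decode_one: decode_one_sketch_result, read_be16, and decode_one_utf16_be are hypothetical names, the input is a pointer and length rather than a range, and there is no error handler or state object.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type: bytes consumed, the decoded code point, and a
// success flag. On failure nothing is consumed, so the input stays untouched.
struct decode_one_sketch_result {
	std::size_t consumed;
	char32_t code_point;
	bool ok;
};

// Reads one big-endian 16-bit code unit starting at p.
constexpr std::uint32_t read_be16(const unsigned char* p) noexcept {
	return (static_cast<std::uint32_t>(p[0]) << 8) | static_cast<std::uint32_t>(p[1]);
}

// Decodes a single code point from UTF-16 Big Endian bytes.
constexpr decode_one_sketch_result decode_one_utf16_be(const unsigned char* bytes, std::size_t size) noexcept {
	if (size < 2) {
		return { 0, 0, false }; // incomplete code unit
	}
	const std::uint32_t lead = read_be16(bytes);
	if (lead < 0xD800 || lead > 0xDFFF) {
		return { 2, static_cast<char32_t>(lead), true }; // single code unit
	}
	if (lead > 0xDBFF || size < 4) {
		return { 0, 0, false }; // lone trail surrogate, or truncated pair
	}
	const std::uint32_t trail = read_be16(bytes + 2);
	if (trail < 0xDC00 || trail > 0xDFFF) {
		return { 0, 0, false }; // lead surrogate not followed by a trail
	}
	const std::uint32_t cp = 0x10000 + ((lead - 0xD800) << 10) + (trail - 0xDC00);
	return { 4, static_cast<char32_t>(cp), true };
}

// U+20AC is the two bytes 20 AC in UTF-16 Big Endian.
constexpr unsigned char euro_be[] = { 0x20, 0xAC };
static_assert(decode_one_utf16_be(euro_be, 2).code_point == U'\u20AC');
static_assert(decode_one_utf16_be(euro_be, 2).consumed == 2);
```

A caller would advance its input by consumed after each successful call and stop, or hand control to its own error policy, when ok is false.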
-
template<typename _Input, typename _Output, typename _ErrorHandler>
inline constexpr auto encode_one(_Input &&__input, _Output &&__output, _ErrorHandler &&__error_handler, encode_state &__s) const
Encodes a single complete unit of information as code units and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.
Remark
To the best ability of the implementation, the iterators will be returned untouched (e.g., the input models at least a view and a forward_range). If it is not possible, returned ranges may be incremented even if an error occurs due to the semantics of any view that models an input_range.
- Parameters
__input – [in] The input view to read code points from.
__output – [in] The output view to write code units into.
__error_handler – [in] The error handler to invoke if encoding fails.
__s – [inout] The necessary state information. For this encoding, the state is empty and means very little.
- Returns
A ztd::text::encode_result object that contains the input range, output range, error handler, and a reference to the passed-in state.
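The encode direction can be sketched the same way: one code point in, a handful of bytes out. The example below writes UTF-16 Little Endian bytes and is only an illustration. encode_one_sketch_result and encode_one_utf16_le are hypothetical names, and the real encode_one writes into an output range and reports failures through the error handler rather than through a zero size.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type: the bytes written and how many of them are valid.
// A size of 0 signals that the code point could not be encoded.
struct encode_one_sketch_result {
	unsigned char bytes[4];
	std::size_t size;
};

// Encodes a single code point as UTF-16 Little Endian bytes.
constexpr encode_one_sketch_result encode_one_utf16_le(char32_t code_point) noexcept {
	encode_one_sketch_result result = { { 0, 0, 0, 0 }, 0 };
	const std::uint32_t cp = static_cast<std::uint32_t>(code_point);
	if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
		return result; // not a Unicode scalar value
	}
	if (cp < 0x10000) {
		// One 16-bit code unit, low byte first.
		result.bytes[0] = static_cast<unsigned char>(cp & 0xFF);
		result.bytes[1] = static_cast<unsigned char>(cp >> 8);
		result.size = 2;
		return result;
	}
	// Surrogate pair: lead unit, then trail unit, each low byte first.
	const std::uint32_t offset = cp - 0x10000;
	const std::uint32_t lead = 0xD800 + (offset >> 10);
	const std::uint32_t trail = 0xDC00 + (offset & 0x3FF);
	result.bytes[0] = static_cast<unsigned char>(lead & 0xFF);
	result.bytes[1] = static_cast<unsigned char>(lead >> 8);
	result.bytes[2] = static_cast<unsigned char>(trail & 0xFF);
	result.bytes[3] = static_cast<unsigned char>(trail >> 8);
	result.size = 4;
	return result;
}

static_assert(encode_one_utf16_le(U'A').size == 2);
static_assert(encode_one_utf16_le(U'A').bytes[0] == 0x41);
```

The four-byte buffer matches the largest output a single UTF-16 code point can need once it is expressed in bytes.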
Public Static Attributes
-
static constexpr const ::std::size_t max_code_points = max_code_points_v<_UBaseEncoding>
The maximum number of code points a single complete operation of decoding can produce. This is 1 for all Unicode Transformation Format (UTF) encodings.
-
static constexpr const ::std::size_t max_code_units = (max_code_units_v<_UBaseEncoding> * sizeof(_BaseCodeUnit)) / (sizeof(_Byte))
The maximum code units a single complete operation of encoding can produce.
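The formula rescales the wrapped encoding's limit from its own code units to bytes. A small worked sketch, where scheme_max_code_units is a hypothetical helper that mirrors the expression above, and where the inputs 2 (UTF-16) and 1 (UTF-32) are assumptions taken from the definitions of those encodings rather than read from the library, on a platform where char16_t is 2 bytes and char32_t is 4 bytes:

```cpp
#include <cstddef>

// Hypothetical helper mirroring the documented formula:
// (max code units of the wrapped encoding * sizeof(its code unit)) / sizeof(byte type).
template <typename BaseCodeUnit, typename Byte>
constexpr std::size_t scheme_max_code_units(std::size_t base_max_code_units) noexcept {
	return (base_max_code_units * sizeof(BaseCodeUnit)) / sizeof(Byte);
}

// UTF-16: up to 2 code units (a surrogate pair) per code point, 2 bytes each.
static_assert(scheme_max_code_units<char16_t, std::byte>(2) == 4);
// UTF-32: exactly 1 code unit per code point, 4 bytes.
static_assert(scheme_max_code_units<char32_t, unsigned char>(1) == 4);
```

Under those assumptions both byte schemes need at most 4 bytes of output per encode operation.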
-
static constexpr const ::ztd::text_encoding_id decoded_id = decoded_id_v<_UBaseEncoding>
The id representing the decoded text.
-
static constexpr const ::ztd::text_encoding_id encoded_id = ::ztd::to_byte_text_encoding_id(encoded_id_v<_UBaseEncoding>, _Endian, sizeof(code_unit_t<_UBaseEncoding>))
The id representing the encoded text.