any_encoding
any_encoding is a class type whose sole purpose is to provide a type-generic, byte-based, runtime-deferred way of handling encodings.
Aliases
-
using ztd::text::any_encoding = any_byte_encoding<::std::byte>
The canonical erased encoding type which uses std::byte as its code unit type and unicode_code_point as its code point type, with spans for input and output operations.
Remark
If the input encoding's code unit type is not std::byte, it is first wrapped in a ztd::text::encoding_scheme.
-
using ztd::text::compat_any_encoding = any_byte_encoding<char>
The canonical erased encoding type which uses char as its code unit type and unicode_code_point as its code point type, with spans for input and output operations.
Remark
If the input encoding's code unit type is not char, it is first wrapped in a ztd::text::encoding_scheme. Use this type when dealing with what are effectively byte stream inputs that are handled in a legacy manner, such as old std::string or <iostream>-based work.
-
using ztd::text::ucompat_any_encoding = any_byte_encoding<unsigned char>
The canonical erased encoding type which uses unsigned char as its code unit type and unicode_code_point as its code point type, with spans for input and output operations.
Remark
If the input encoding's code unit type is not unsigned char, it is first wrapped in a ztd::text::encoding_scheme. Use this type when dealing with what are effectively byte stream inputs that follow the slightly more modern approach of handling unsigned data with unsigned char.
Base Template
-
template<typename _Byte, typename _CodePoint = unicode_code_point>
class any_byte_encoding : public ztd::text::any_encoding_with<_Byte, const unicode_code_point, const _Byte, unicode_code_point>
An encoding type that wraps up other encodings to specifically traffic in the given _Byte type, which is typically set to std::byte.
Remark
This type traffics solely in std::spans, which is fine for most users. Others may want to interface with different iterator types (e.g., from a custom rope implementation). For those, one must first create ranges that can operate with those iterators, then use those ranges directly. (It's not an ideal process at the moment, and we are looking to make this experience better.) It is recommended to use the provided ztd::text::any_encoding type definition instead of accessing this directly, unless you have a reason for using a different byte type (e.g., interfacing with legacy APIs).
- Template Parameters:
_Byte - The byte type to use. Typically, this is either unsigned char or std::byte.
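The idea of wrapping arbitrary encodings behind one fixed, byte-oriented interface can be illustrated with a small type-erasure sketch. This is not how ztd.text implements any_byte_encoding; it uses only the C++17 standard library, and toy_any_byte_encoding, toy_ascii, toy_latin1, and pick_toy_encoding are hypothetical names invented for illustration. The sketch decodes one byte at a time instead of working on spans.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <utility>

// Hypothetical toy encodings: each decodes a single byte into one code point.
struct toy_ascii {
    std::optional<char32_t> decode_one(std::byte unit) const {
        unsigned value = std::to_integer<unsigned>(unit);
        if (value > 0x7F) return std::nullopt; // not a valid ASCII byte
        return static_cast<char32_t>(value);
    }
};

struct toy_latin1 {
    std::optional<char32_t> decode_one(std::byte unit) const {
        return static_cast<char32_t>(std::to_integer<unsigned>(unit));
    }
};

// Hypothetical erased wrapper: the concrete encoding is chosen at runtime
// and hidden behind a virtual interface that only talks in std::byte.
class toy_any_byte_encoding {
    struct interface_t {
        virtual ~interface_t() = default;
        virtual std::optional<char32_t> decode_one(std::byte) const = 0;
    };
    template <typename Encoding>
    struct model_t final : interface_t {
        Encoding encoding;
        explicit model_t(Encoding e) : encoding(std::move(e)) {}
        std::optional<char32_t> decode_one(std::byte unit) const override {
            return encoding.decode_one(unit);
        }
    };
    std::unique_ptr<interface_t> stored_;

public:
    toy_any_byte_encoding() = delete; // mirrors the deleted default constructor
    toy_any_byte_encoding(toy_any_byte_encoding&&) = default;
    template <typename Encoding>
    explicit toy_any_byte_encoding(Encoding encoding)
    : stored_(std::make_unique<model_t<Encoding>>(std::move(encoding))) {}

    std::optional<char32_t> decode_one(std::byte unit) const {
        return stored_->decode_one(unit);
    }
};

// The decision about which encoding to use is deferred to runtime.
inline toy_any_byte_encoding pick_toy_encoding(bool legacy) {
    if (legacy) return toy_any_byte_encoding(toy_latin1{});
    return toy_any_byte_encoding(toy_ascii{});
}
```

Callers of pick_toy_encoding see a single static type regardless of which encoding was stored, which is the property the erased types on this page are designed to provide.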
Public Types
-
using decode_state = any_decode_state
The state that can be used between calls to decode.
Remark
This is an opaque struct with no members. It follows the "encoding-dependent state" model, which means it has a constructor that takes a ztd::text::any_encoding_with so it can properly initialize its state.
-
using encode_state = any_encode_state
The state that can be used between calls to encode.
Remark
This is an opaque struct with no members. It follows the "encoding-dependent state" model, which means it has a constructor that takes a ztd::text::any_encoding_with so it can properly initialize its state.
-
using code_unit = ranges::range_value_type_t<_EncodeCodeUnits>
The individual units that result from an encode operation or are used as input to a decode operation.
-
using code_point = ranges::range_value_type_t<_DecodeCodePoints>
The individual units that result from a decode operation or are used as input to an encode operation.
-
using is_encode_injective = ::std::false_type
Whether or not the encode operation can process all forms of input into code unit values.
Remark
This is always false because this is a type-erased encoding; the actual value is determined by a runtime decision, which means that the most conservative and truthful answer is selected for this property.
-
using is_decode_injective = ::std::false_type
Whether or not the decode operation can process all forms of input into code point values.
Remark
This is always going to be false because this is a type-erased encoding; this value is determined by a runtime decision, which means that the most conservative and truthful answer is selected for this property.
Public Functions
-
any_byte_encoding() = delete
Cannot default-construct a ztd::text::any_byte_encoding object.
-
template<typename _EncodingArg, typename ..._Args, ::std::enable_if_t<!::std::is_same_v<remove_cvref_t<_EncodingArg>, any_byte_encoding> && !::std::is_same_v<__txt_detail::__code_unit_or_void_t<remove_cvref_t<_EncodingArg>>, _Byte> && !is_specialization_of_v<remove_cvref_t<_EncodingArg>, ::ztd::text::any_byte_encoding> && !::std::is_same_v<remove_cvref_t<_EncodingArg>, __base_t> && !is_specialization_of_v<remove_cvref_t<_EncodingArg>, ::std::in_place_type_t>>* = nullptr>
inline any_byte_encoding(_EncodingArg &&__encoding, _Args&&... __args)
Constructs a ztd::text::any_byte_encoding with the encoding object and any additional arguments.
Remark
If the provided encoding does not have a byte code_unit type, it is wrapped in a ztd::text::encoding_scheme first.
- Parameters:
__encoding - [in] The encoding object that informs the ztd::text::any_byte_encoding what encoding object to store.
__args - [in] Any additional arguments used to construct the encoding in the erased storage.
-
template<typename _EncodingArg, typename ..._Args, ::std::enable_if_t<!::std::is_same_v<_Byte, code_unit_t<remove_cvref_t<_EncodingArg>>>>* = nullptr>
inline any_byte_encoding(::std::in_place_type_t<_EncodingArg>, _Args&&... __args)
Constructs a ztd::text::any_byte_encoding with the encoding object and any additional arguments.
Remark
If the provided encoding does not have a byte code_unit type, it is wrapped in a ztd::text::encoding_scheme first.
- Template Parameters:
_EncodingArg - The encoding specified by the std::in_place_type<...> argument.
- Parameters:
__args - [in] Any additional arguments used to construct the encoding in the erased storage.
-
template<typename _EncodingArg, typename ..._Args, ::std::enable_if_t<::std::is_same_v<_Byte, code_unit_t<remove_cvref_t<_EncodingArg>>>>* = nullptr>
inline any_byte_encoding(::std::in_place_type_t<_EncodingArg> __tag, _Args&&... __args)
Constructs a ztd::text::any_byte_encoding with the encoding object and any additional arguments.
Remark
If the provided encoding does not have a byte code_unit type, it is wrapped in a ztd::text::encoding_scheme first.
- Template Parameters:
_EncodingArg - The encoding specified by the std::in_place_type<...> argument.
- Parameters:
__tag - [in] A tag containing the encoding type.
__args - [in] Any additional arguments used to construct the encoding in the erased storage.
-
any_byte_encoding(const any_byte_encoding&) = delete
Cannot copy-construct a ztd::text::any_byte_encoding object.
-
any_byte_encoding &operator=(const any_byte_encoding&) = delete
Cannot copy-assign a ztd::text::any_byte_encoding object.
-
any_byte_encoding(any_byte_encoding&&) = default
Move-constructs a ztd::text::any_byte_encoding from the provided r-value reference.
Remark
This leaves the passed-in r-value reference without an encoding object. Calling any function on a moved-from ztd::text::any_byte_encoding, except for destruction, is a violation and invokes Undefined Behavior (generally, a crash).
-
any_byte_encoding &operator=(any_byte_encoding&&) = default
Move-assigns a ztd::text::any_byte_encoding from the provided r-value reference.
Remark
This leaves the passed-in r-value reference without an encoding object. Calling any function on a moved-from ztd::text::any_byte_encoding, except for destruction, is a violation and invokes Undefined Behavior (generally, a crash).
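The move-only contract above can be illustrated with a standard-library stand-in. In this sketch toy_handle is a hypothetical class that owns its payload through std::unique_ptr, which is one common way to get the same "moved-from object holds nothing" behavior. It is not the library's implementation, and holds_object() is a diagnostic added only for this sketch; the documented class offers no such query.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a move-only, type-erased handle.
class toy_handle {
    std::unique_ptr<int> stored_;

public:
    toy_handle() = delete;
    explicit toy_handle(int value) : stored_(std::make_unique<int>(value)) {}

    toy_handle(const toy_handle&) = delete;
    toy_handle& operator=(const toy_handle&) = delete;
    toy_handle(toy_handle&&) = default;
    toy_handle& operator=(toy_handle&&) = default;

    // Dereferences the stored object. Calling this on a moved-from handle
    // dereferences a null pointer, which is undefined behavior.
    int value() const { return *stored_; }

    // Sketch-only diagnostic so the moved-from state can be observed safely.
    bool holds_object() const noexcept { return stored_ != nullptr; }
};

static_assert(!std::is_copy_constructible_v<toy_handle>, "copying is disabled");
static_assert(std::is_move_constructible_v<toy_handle>, "moving is allowed");
```

After `toy_handle b(std::move(a));` only `b` may be used; `a` may only be destroyed or assigned a new value.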
-
inline ::std::optional<::ztd::span<const code_point>> maybe_replacement_code_points() const noexcept
Retrieves the replacement code points for when conversions fail and ztd::text::replacement_handler_t (or equivalent) needs to make a substitution.
- Returns:
A std::optional of ztd::span of const code_points. The returned std::optional value is engaged (has a value) if the stored encoding has a valid replacement_code_points function and it can be called. If it does not, then the library checks to see if the maybe_replacement_code_points function exists, and returns the std::optional from that function directly. If neither is present, an unengaged std::optional is returned.
-
inline ::std::optional<::ztd::span<const code_unit>> maybe_replacement_code_units() const noexcept
Retrieves the replacement code units for when conversions fail and ztd::text::replacement_handler_t (or equivalent) needs to make a substitution.
- Returns:
A std::optional of ztd::span of const code_units. The returned std::optional value is engaged (has a value) if the stored encoding has a valid replacement_code_units function and it can be called. If it does not, then the library checks to see if the maybe_replacement_code_units function exists, and returns the std::optional from that function directly. If neither is present, an unengaged std::optional is returned.
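The three-step lookup described for the two maybe_replacement_* functions (use the plain replacement function if present, else the maybe_ variant, else return an unengaged optional) can be sketched with the C++17 detection idiom. This is an illustration under assumptions, not the library's code: lookup_replacement, the has_* traits, and the three toy encodings are hypothetical, and std::string_view stands in for ztd::span.

```cpp
#include <cassert>
#include <optional>
#include <string_view>
#include <type_traits>
#include <utility>

// Detects a callable member replacement_code_units().
template <typename E, typename = void>
struct has_replacement : std::false_type {};
template <typename E>
struct has_replacement<E,
    std::void_t<decltype(std::declval<const E&>().replacement_code_units())>>
: std::true_type {};

// Detects a callable member maybe_replacement_code_units().
template <typename E, typename = void>
struct has_maybe_replacement : std::false_type {};
template <typename E>
struct has_maybe_replacement<E,
    std::void_t<decltype(std::declval<const E&>().maybe_replacement_code_units())>>
: std::true_type {};

// Hypothetical helper following the documented order of checks.
template <typename E>
std::optional<std::string_view> lookup_replacement(const E& encoding) {
    if constexpr (has_replacement<E>::value) {
        return encoding.replacement_code_units(); // always engaged
    } else if constexpr (has_maybe_replacement<E>::value) {
        return encoding.maybe_replacement_code_units(); // forwarded as-is
    } else {
        static_cast<void>(encoding);
        return std::nullopt; // neither function exists
    }
}

// Hypothetical encodings covering the three cases.
struct always_replaces {
    std::string_view replacement_code_units() const { return "?"; }
};
struct sometimes_replaces {
    bool enabled;
    std::optional<std::string_view> maybe_replacement_code_units() const {
        if (enabled) return std::string_view("?");
        return std::nullopt;
    }
};
struct never_replaces {};
```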
-
inline bool contains_unicode_encoding() const noexcept
Returns whether or not the encoding stored in this ztd::text::any_encoding_with is a Unicode encoding.
Remark
This can be useful to know, in advance, whether or not there is a chance for lossy behavior. Even if, at compile time, various functions will demand you use an error handler, this runtime property can help you get a decent idea of just how bad and lossy this conversion might be compared to normal UTF conversion formats.
-
inline __decode_result decode_one(_DecodeCodeUnits __input, _DecodeCodePoints __output, __decode_error_handler __error_handler, decode_state &__state) const
Decodes a single complete unit of information as code points and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.
Remark
To the best ability of the implementation, the iterators will be returned untouched (e.g., the input models at least a view and a forward_range). If it is not possible, returned ranges may be incremented even if an error occurs due to the semantics of any view that models an input_range.
- Parameters:
__input - [in] The input view to read code units from.
__output - [in] The output view to write code points into.
__error_handler - [in] The error handler to invoke if decoding fails.
__state - [inout] The necessary state information. For this encoding, the state is empty and means very little.
- Returns:
A ztd::text::decode_result object that contains the input range, output range, error handler, and a reference to the passed-in state.
-
inline __encode_result encode_one(_EncodeCodePoints __input, _EncodeCodeUnits __output, __encode_error_handler __error_handler, encode_state &__state) const
Encodes a single complete unit of information as code units and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.
Remark
To the best ability of the implementation, the iterators will be returned untouched (e.g., the input models at least a view and a forward_range). If it is not possible, returned ranges may be incremented even if an error occurs due to the semantics of any view that models an input_range.
- Parameters:
__input - [in] The input view to read code points from.
__output - [in] The output view to write code units into.
__error_handler - [in] The error handler to invoke if encoding fails.
__state - [inout] The necessary state information. For this encoding, the state is empty and means very little.
- Returns:
A ztd::text::encode_result object that contains the input range, output range, error handler, and a reference to the passed-in state.
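The contract shared by decode_one and encode_one (advance both ranges on success, hand them back untouched on failure) can be sketched for a trivial ASCII decoder. Everything here is an assumption made for illustration: toy_span, toy_decode_result, and toy_ascii_decode_one are hypothetical, the error handler and state parameters are left out, and a plain bool stands in for the library's error reporting.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Minimal writable view, standing in for a span in this C++17 sketch.
template <typename T>
struct toy_span {
    T* data;
    std::size_t size;
};

// Hypothetical result type: the ranges that remain after the operation.
struct toy_decode_result {
    std::string_view input;    // code units not yet read
    toy_span<char32_t> output; // space not yet written
    bool ok;                   // false if nothing was consumed
};

// Decodes exactly one ASCII code unit into one code point.
inline toy_decode_result toy_ascii_decode_one(
    std::string_view input, toy_span<char32_t> output) {
    // Failure cases: return both ranges exactly as they were passed in.
    if (input.empty() || output.size == 0) return {input, output, false};
    unsigned char unit = static_cast<unsigned char>(input.front());
    if (unit > 0x7F) return {input, output, false};

    // Success: write one code point, then move both ranges past it.
    output.data[0] = static_cast<char32_t>(unit);
    return {input.substr(1), {output.data + 1, output.size - 1}, true};
}
```

Because the returned ranges always describe what is left to process, a caller can feed them straight back into the next call, or inspect them after a failure to see exactly where decoding stopped.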
Public Static Attributes
-
static constexpr ::std::size_t max_code_points = _MaxCodePoints
The maximum number of code points a single complete operation of decoding can produce. This is 1 for all Unicode Transformation Format (UTF) encodings.
-
static constexpr ::std::size_t max_code_units = _MaxCodeUnits
The maximum number of code units a single complete operation of encoding can produce.
-
static constexpr ::ztd::text_encoding_id decoded_id = ::ztd::text_encoding_id::unknown
The decoded id. Because this is a type-erased encoding, anything can come out: therefore, it is set to "unknown" at all times.
-
static constexpr ::ztd::text_encoding_id encoded_id = ::ztd::text_encoding_id::unknown
The encoded id. Because this is a type-erased encoding, anything can come out: therefore, it is set to "unknown" at all times.