Windows Code Page
WideCharToMultiByte and MultiByteToWideChar on Windows platforms support a multitude of encodings, each with its own set of quirks. On Windows platforms, it may therefore be advantageous to use this type rather than the regular execution_encoding_t type or execution_encoding object. The constructor takes a raw uint32_t whose sole purpose is to hold one of the known Code Page Identifiers that Microsoft supplies on its platforms.
Warning
This type is inefficient due to the poor design of the WideCharToMultiByte/MultiByteToWideChar functions. While it allows users to achieve parity with the behavior of those functions, a user who knows their encoding ahead of time and can provide either their own encoding object or one of the encoding types provided by this library should strongly prefer that over this type.
Warning
For certain distributions, specific language packs may need to be installed in order to use certain conversions tied to specific Code Page Identifiers.
Alias
-
using ztd::text::windows_code_page = basic_windows_code_page<char, unicode_code_point>
Base Templates
-
template<typename _CodeUnit, typename _CodePoint = unicode_code_point>
class basic_windows_code_page
The encoding representing a Windows conversion using WideCharToMultiByte (encode) and MultiByteToWideChar (decode) conversion sequences.
Remark
This is slow due to the bad design of WideCharToMultiByte/MultiByteToWideChar, but it does guarantee access to the wide variety of legacy encodings Windows supports. If possible, a different named encoding type should be used to avoid the performance penalties that come from the API limitations of this encoding object. For well-formed text, these API limitations can sometimes be mitigated by using the bulk APIs; for the single-conversion APIs that do not perform bulk conversion, the performance penalty is inherent.
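The bulk path mentioned in the remark corresponds to the usual two-call pattern of the underlying Win32 function: one call to measure the output, one call to convert. The sketch below calls MultiByteToWideChar directly and is not a description of how this library is implemented. The helper name decode_with_code_page is hypothetical, and the non-Windows branch only reports failure so that the snippet still compiles elsewhere.

```cpp
#include <climits>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of ztd.text): decodes `bytes` from the given
// Windows code page into UTF-16 using one bulk conversion.
// Returns std::nullopt on failure, or on platforms without the Win32 API.
inline std::optional<std::wstring> decode_with_code_page(
    unsigned int code_page, std::string_view bytes) {
    if (bytes.empty()) {
        // MultiByteToWideChar rejects a zero-length input, so handle it here.
        return std::wstring();
    }
#ifdef _WIN32
    if (bytes.size() > static_cast<std::size_t>(INT_MAX)) {
        return std::nullopt; // the Win32 API takes an int length
    }
    const int input_size = static_cast<int>(bytes.size());
    // Note: some code pages (for example 65000, UTF-7) require the flags
    // argument to be 0; a fuller version would account for that.
    const DWORD flags = MB_ERR_INVALID_CHARS;
    // First call: ask how many UTF-16 code units the output needs.
    const int needed = ::MultiByteToWideChar(
        code_page, flags, bytes.data(), input_size, nullptr, 0);
    if (needed <= 0) {
        return std::nullopt;
    }
    std::wstring output(static_cast<std::size_t>(needed), L'\0');
    // Second call: convert into the correctly sized buffer.
    const int written = ::MultiByteToWideChar(
        code_page, flags, bytes.data(), input_size, output.data(), needed);
    if (written != needed) {
        return std::nullopt;
    }
    return output;
#else
    (void)code_page;
    return std::nullopt;
#endif
}
```

Converting a whole buffer this way pays the cost of the API calls once per buffer rather than once per character, which is the mitigation the remark alludes to.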
Public Types
-
using code_point = _CodePoint
The code point type that is decoded to, and encoded from.
Public Functions
-
inline constexpr ::std::optional<::ztd::span<const code_point>> maybe_replacement_code_points(const decode_state &__state) noexcept
Returns a potential view into the code points to use when an error happens and a replacement strategy is being used.
- Parameters:
__state – [in] The decode state of this particular conversion operation.
- Returns:
A potential view of code points to do replacements with. This is determined by what the state for the conversion was initialized with, and it defaults to what was stored in the actual encoding object.
-
inline constexpr ::std::optional<::ztd::span<const code_unit>> maybe_replacement_code_units(const encode_state &__state) noexcept
Returns a potential view into the code units to use when an error happens and a replacement strategy is being used.
- Parameters:
__state – [in] The encode state of this particular conversion operation.
- Returns:
A potential view of code units to do replacements with. This is determined by what the state for the conversion was initialized with, and it defaults to what was stored in the actual encoding object.
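The fallback described for both replacement functions can be pictured with a small stand-alone sketch. The names toy_code_page, toy_state, and replacement_code_unit_for are hypothetical and are not part of ztd.text. The sketch only assumes that a per-conversion state may carry its own replacement, and that the value stored in the encoding object is used otherwise.

```cpp
#include <optional>

// Hypothetical stand-in for an encoding object that stores a default replacement.
struct toy_code_page {
    char stored_replacement_code_unit = '?';
};

// Hypothetical stand-in for a conversion state that may override the default.
struct toy_state {
    std::optional<char> replacement_code_unit;
};

// Prefer the replacement held by the state; otherwise fall back to the one
// stored in the encoding object.
constexpr char replacement_code_unit_for(
    const toy_code_page& encoding, const toy_state& state) noexcept {
    return state.replacement_code_unit.value_or(
        encoding.stored_replacement_code_unit);
}
```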
-
inline constexpr basic_windows_code_page() noexcept
Default constructor: assumes the code page is the default (thread) code page, with the value CP_THREAD_ACP.
-
inline constexpr basic_windows_code_page(uint32_t __win32_code_page) noexcept
Constructor that applies the given code page to the underlying conversions.
- Parameters:
__win32_code_page – The code page. It must be one of the allowed code page values (https://learn.microsoft.com/en-us/windows/win32/intl/code-page-identifiers) that are accepted by WideCharToMultiByte and MultiByteToWideChar.
-
inline constexpr basic_windows_code_page(uint32_t __win32_code_page, uint16_t __replacement_code_point, int8_t __replacement_code_unit) noexcept
Constructor that applies the given code page to the conversions, and sets the default replacement code point and replacement code unit for the underlying algorithm where possible.
- Parameters:
__win32_code_page – The code page. It must be one of the allowed code page values (https://learn.microsoft.com/en-us/windows/win32/intl/code-page-identifiers) that are accepted by WideCharToMultiByte and MultiByteToWideChar.
__replacement_code_point – [in] The replacement code point (a single 2-byte UTF-16 code unit) to be used with the underlying algorithm, where permissible.
__replacement_code_unit – [in] The replacement code unit (a single 1-byte __win32_code_page-encoded code unit) to be used with the underlying algorithm, where permissible.
-
template<typename _Input, typename _Output, typename _ErrorHandler>
inline auto decode_one(_Input &&__input, _Output &&__output, _ErrorHandler &&__error_handler, decode_state &__s) const
Decodes a single complete unit of information as code points and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.
- Parameters:
__input – [in] The input view to read code units from.
__output – [in] The output view to write code points into.
__error_handler – [in] The error handler to invoke if decoding fails.
__s – [inout] The necessary state information. Most encodings have no state, but because this is effectively a runtime encoding, it is important to preserve and manage this state.
- Returns:
A ztd::text::decode_result object that contains the reconstructed input range, reconstructed output range, error handler, and a reference to the passed-in state.
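The contract described for decode_one (advance both ranges on success, return them untouched on error) can be illustrated with a toy ASCII-only decoder. Everything in the sketch is hypothetical: toy_error, toy_decode_result, and toy_decode_one are not ztd.text names, and the real function additionally takes an error handler and a state object.

```cpp
#include <cstddef>
#include <string_view>

enum class toy_error { ok, insufficient_output, invalid_sequence };

// Hypothetical result type: what remains of the input and output, plus a status.
struct toy_decode_result {
    std::string_view input;  // input not yet consumed
    char32_t* output;        // next position to write to
    std::size_t output_size; // code points of room left in the output
    toy_error error;
};

// Decodes exactly one ASCII code unit into one code point.
inline toy_decode_result toy_decode_one(
    std::string_view input, char32_t* output, std::size_t output_size) {
    if (input.empty()) {
        // Nothing to read: report success with both ranges unchanged.
        return { input, output, output_size, toy_error::ok };
    }
    if (output_size == 0) {
        return { input, output, output_size, toy_error::insufficient_output };
    }
    const unsigned char unit = static_cast<unsigned char>(input.front());
    if (unit > 0x7F) {
        // Error: hand back both ranges exactly as they were passed in.
        return { input, output, output_size, toy_error::invalid_sequence };
    }
    *output = static_cast<char32_t>(unit);
    // Success: both ranges are moved past what was read and written.
    return { input.substr(1), output + 1, output_size - 1, toy_error::ok };
}
```

A caller typically loops on such a function, feeding the returned ranges back in until the input is empty or an error is reported.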
-
template<typename _Input, typename _Output, typename _ErrorHandler>
inline auto encode_one(_Input &&__input, _Output &&__output, _ErrorHandler &&__error_handler, encode_state &__s) const
Encodes a single complete unit of information as code units and produces a result with the input and output ranges moved past what was successfully read and written; or, produces an error and returns the input and output ranges untouched.
- Parameters:
__input – [in] The input view to read code points from.
__output – [in] The output view to write code units into.
__error_handler – [in] The error handler to invoke if encoding fails.
__s – [inout] The necessary state information. Most encodings have no state, but because this is effectively a runtime encoding, it is important to preserve and manage this state.
- Returns:
A ztd::text::encode_result object that contains the reconstructed input range, reconstructed output range, error handler, and a reference to the passed-in state.
Public Static Attributes
-
static constexpr const ::std::size_t max_code_units = 8
The maximum code units a single complete operation of encoding can produce.
-
static constexpr const ::std::size_t max_code_points = 8
The maximum number of code points a single complete operation of decoding can produce.
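Limits such as max_code_units and max_code_points are commonly used to size a fixed scratch buffer for one conversion step. The sketch below shows that idea with a hypothetical toy_ascii_encoder, which is not a ztd.text type. It assumes only that a single step never writes more than max_code_units code units.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical encoder: ASCII passes through, everything else becomes '?'.
struct toy_ascii_encoder {
    // Upper bound on the code units one encode step may write.
    static constexpr std::size_t max_code_units = 8;

    // Writes the encoding of `code_point` to `out` and returns the number of
    // code units written (always 1 here, never more than max_code_units).
    std::size_t encode_one(char32_t code_point, char* out) const noexcept {
        out[0] = code_point <= 0x7F ? static_cast<char>(code_point) : '?';
        return 1;
    }
};

// Encodes a whole string one step at a time through a fixed-size buffer.
inline std::string encode_all(
    const toy_ascii_encoder& encoder, std::u32string_view input) {
    std::string result;
    // Sized from the static limit, so a single step can never overflow it.
    std::array<char, toy_ascii_encoder::max_code_units> scratch{};
    for (const char32_t code_point : input) {
        const std::size_t written = encoder.encode_one(code_point, scratch.data());
        result.append(scratch.data(), written);
    }
    return result;
}
```

Because the bound is a compile-time constant, the scratch buffer can live on the stack and needs no allocation per step.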
-
struct decode_state
The associated state for decode operations.
-
struct encode_state
The associated state for encode operations.
-
using code_point = _CodePoint