punycode

Punycode is an encoding that turns Unicode text into a sequence of ASCII code units. This makes the text easy for simple systems and human readers to inspect, since differences between two pieces of text remain visible even when the Unicode text uses confusable characters.

It is most frequently used in the Internationalized Domain Names for Applications (IDNA) system, specifically for DNS purposes. Despite its unfriendliness to stream-based, one-at-a-time interfaces, keeping sufficient heap-based state allows encoding and decoding in a useful manner.
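To illustrate why the encoding is awkward for one-at-a-time interfaces, here is a minimal sketch of the encoding procedure from RFC 3492 using only the standard library. It is not the ztd.text implementation; the namespace `encode_sketch` and the function `punycode_encode` are hypothetical names chosen for this example. Note how the whole input is scanned repeatedly, which is why some state has to be kept around. The sketch does not validate its input (for example, it does not reject surrogate values).

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <string>
#include <string_view>

namespace encode_sketch { // hypothetical namespace, not part of ztd.text
    // Parameter values from RFC 3492, section 5.
    constexpr std::uint32_t base = 36, tmin = 1, tmax = 26, skew = 38, damp = 700;
    constexpr std::uint32_t initial_bias = 72, initial_n = 0x80;
    constexpr std::uint32_t maxint = std::numeric_limits<std::uint32_t>::max();

    // Maps a digit value in [0, 35] to 'a'..'z' followed by '0'..'9'.
    constexpr char digit_to_char(std::uint32_t d) {
        return static_cast<char>(d < 26 ? 'a' + d : '0' + (d - 26));
    }

    // Bias adaptation function from RFC 3492, section 6.1.
    constexpr std::uint32_t adapt(std::uint32_t delta, std::uint32_t num_points, bool first) {
        delta = first ? delta / damp : delta / 2;
        delta += delta / num_points;
        std::uint32_t k = 0;
        while (delta > ((base - tmin) * tmax) / 2) {
            delta /= base - tmin;
            k += base;
        }
        return k + ((base - tmin + 1) * delta) / (delta + skew);
    }

    // Encodes code points as punycode; returns std::nullopt on arithmetic overflow.
    inline std::optional<std::string> punycode_encode(std::u32string_view input) {
        std::string out;
        // First pass: basic (ASCII) code points are copied through unchanged.
        for (char32_t c : input)
            if (c < 0x80) out.push_back(static_cast<char>(c));
        const std::uint32_t b = static_cast<std::uint32_t>(out.size());
        std::uint32_t h = b; // number of code points handled so far
        if (b > 0) out.push_back('-');
        std::uint32_t n = initial_n, delta = 0, bias = initial_bias;
        while (h < input.size()) {
            // Scan the entire input for the smallest code point not yet handled.
            std::uint32_t m = maxint;
            for (char32_t c : input)
                if (c >= n && c < m) m = c;
            if (m - n > (maxint - delta) / (h + 1)) return std::nullopt;
            delta += (m - n) * (h + 1);
            n = m;
            // Scan the entire input again to emit a delta for each occurrence of n.
            for (char32_t c : input) {
                if (c < n && ++delta == 0) return std::nullopt;
                if (c != n) continue;
                std::uint32_t q = delta;
                for (std::uint32_t k = base;; k += base) {
                    const std::uint32_t t = k <= bias ? tmin : (k >= bias + tmax ? tmax : k - bias);
                    if (q < t) break;
                    out.push_back(digit_to_char(t + (q - t) % (base - t)));
                    q = (q - t) / (base - t);
                }
                out.push_back(digit_to_char(q));
                bias = adapt(delta, h + 1, h == b);
                delta = 0;
                ++h;
            }
            ++delta;
            ++n;
        }
        return out;
    }
} // namespace encode_sketch
```

With this sketch, `encode_sketch::punycode_encode(U"b\u00FCcher")` yields `bcher-kva`, the form seen in the domain label `xn--bcher-kva`.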

It is also Rust’s encoding of choice for its symbol names when they are stored in binary form for use with the linker and loader of any given platform’s libraries and executables.

Aliases

constexpr punycode_t ztd::text::punycode = punycode_t{}

A preexisting object of type ztd::text::punycode_t for ease-of-use.

constexpr punycode_idna_t ztd::text::punycode_idna = punycode_idna_t{}

A preexisting object of type ztd::text::punycode_idna_t for ease-of-use.

using ztd::text::punycode_t = basic_punycode<char, unicode_code_point>

A ztd::text::basic_any_punycode type using char and ztd::text::unicode_code_point, without IDNA handling.

using ztd::text::punycode_idna_t = basic_punycode_idna<char, unicode_code_point>

A ztd::text::basic_any_punycode type using char and ztd::text::unicode_code_point, with IDNA handling enabled.

template<typename _CodeUnit = char, typename _CodePoint = unicode_code_point>
using ztd::text::basic_punycode = basic_any_punycode<idna::no, _CodeUnit, _CodePoint>

A convenience typedef for ztd::text::basic_any_punycode with the ztd::text::idna::no provided.

Template Parameters:
  • _CodeUnit – The code unit type for encoded text.

  • _CodePoint – The code point type for decoded text.

template<typename _CodeUnit, typename _CodePoint>
using ztd::text::basic_punycode_idna = basic_any_punycode<idna::yes, _CodeUnit, _CodePoint>

A convenience typedef for ztd::text::basic_any_punycode with the ztd::text::idna::yes provided.

Template Parameters:
  • _CodeUnit – The code unit type for encoded text.

  • _CodePoint – The code point type for decoded text.

Base Template

template<idna _IsIdna, typename _CodeUnit = char, typename _CodePoint = unicode_code_point>
class basic_any_punycode : public __fixed_cuneicode<basic_any_punycode<_IsIdna, char, unicode_code_point>, char, unicode_code_point, 1, 1, decltype(_IsIdna == idna::yes ? &::cnc_mcnrtoc32n_punycode_idna : &::cnc_mcnrtoc32n_punycode), (_IsIdna == idna::yes ? &::cnc_mcnrtoc32n_punycode_idna : &::cnc_mcnrtoc32n_punycode), decltype(_IsIdna == idna::yes ? &::cnc_c32nrtomcn_punycode_idna : &::cnc_c32nrtomcn_punycode), (_IsIdna == idna::yes ? &::cnc_c32nrtomcn_punycode_idna : &::cnc_c32nrtomcn_punycode), cnc_pny_decode_state_t, decltype(&::cnc_pny_decode_state_is_complete), &::cnc_pny_decode_state_is_complete, cnc_pny_encode_state_t, decltype(&::cnc_pny_encode_state_is_complete), &::cnc_pny_encode_state_is_complete, true, true, char, ztd_char32_t>

The punycode encoding, as envisioned by RFC 3492 and (potentially) influenced by IDNA (RFC 5890).

Template Parameters:
  • _IsIdna – Whether or not this punycode type looks for IDNA prefixes and obeys IDNA’s encoding rules.

  • _CodeUnit – The code unit type for the encoded ASCII text.

  • _CodePoint – The code point type for the Unicode Code Point decoded text.
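As a rough illustration of what "looking for IDNA prefixes" can mean: IDNA marks a punycode-encoded domain label with the ASCII-compatible encoding prefix `xn--`. The sketch below detects and strips that prefix. The helper `strip_ace_prefix` and the namespace `idna_sketch` are hypothetical names for this example and are not part of ztd.text; exactly how the `idna::yes` variant treats the prefix is not described in this section, so take this only as a picture of the general idea.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

namespace idna_sketch { // hypothetical namespace, not part of ztd.text
    // The ASCII-compatible encoding (ACE) prefix used by IDNA.
    inline constexpr std::string_view ace_prefix = "xn--";

    // Lowercases a single ASCII letter and leaves everything else untouched.
    constexpr char ascii_lower(char c) {
        return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
    }

    // Returns the label without its "xn--" prefix (matched case-insensitively),
    // or std::nullopt if the label does not carry the prefix.
    inline std::optional<std::string_view> strip_ace_prefix(std::string_view label) {
        if (label.size() < ace_prefix.size()) return std::nullopt;
        for (std::size_t i = 0; i < ace_prefix.size(); ++i) {
            if (ascii_lower(label[i]) != ace_prefix[i]) return std::nullopt;
        }
        return label.substr(ace_prefix.size());
    }
} // namespace idna_sketch
```

For example, `strip_ace_prefix("xn--bcher-kva")` gives back `bcher-kva`, which is the part a plain punycode decoder would then process.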

Public Types

using code_unit = char

The code unit type for encoded text.

using code_point = unicode_code_point

The code point type for decoded text.

using is_decode_injective = ::std::integral_constant<bool, _IsDecodeInjective>

Whether or not the decode portion of this encoding is injective.

using is_encode_injective = ::std::integral_constant<bool, _IsEncodeInjective>

Whether or not the encode portion of this encoding is injective.
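Because both members are `std::integral_constant` types, generic code can inspect them at compile time. The following sketch shows one way to do that. The two encoding structs and the `round_trips_v` variable template are hypothetical stand-ins invented for this example, and it assumes that "injective" here means distinct inputs always produce distinct outputs, so that no information is lost in that direction.

```cpp
#include <type_traits>

namespace traits_sketch { // hypothetical namespace, not part of ztd.text
    // Stand-in for an encoding that loses nothing in either direction.
    struct lossless_encoding {
        using is_decode_injective = std::integral_constant<bool, true>;
        using is_encode_injective = std::integral_constant<bool, true>;
    };

    // Stand-in for an encoding that cannot represent every code point.
    struct lossy_encoding {
        using is_decode_injective = std::integral_constant<bool, true>;
        using is_encode_injective = std::integral_constant<bool, false>;
    };

    // True only when both directions are marked injective.
    template <typename Encoding>
    inline constexpr bool round_trips_v
        = Encoding::is_decode_injective::value && Encoding::is_encode_injective::value;

    static_assert(round_trips_v<lossless_encoding>, "both directions are injective");
    static_assert(!round_trips_v<lossy_encoding>, "encoding may lose information");
} // namespace traits_sketch
```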

Public Static Functions

static inline auto decode_one(_Input &&__input, _Output &&__output, _ErrorHandler &&__error_handler, decode_state &__state) noexcept

Decodes a provided range of punycode-encoded code units and writes the resulting code points to the output.

Remark

This function may need to be called with empty input data repeatedly to fully drain any stored information in the provided __state. The entire input may be consumed before any information is output.

Template Parameters:
  • _Input – The input range type.

  • _Output – The output range type.

  • _ErrorHandler – The error handler type.

Parameters:
  • __input – The input range.

  • __output – The output range.

  • __error_handler – The error handler; this will be called whenever an error occurs during decoding.

  • __state – A reference to the decode state, which contains most of the running information about a punycode decoding operation. Discarding this between calls is not advised for this encoding type.

Returns:

A ztd::text::decode_result structure with the appropriate input and output types reconstructed, possibly filtered through an error handler if necessary.
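The remark about consuming the entire input before producing output follows from how punycode decoding works: each decoded delta inserts a code point at an arbitrary position in the result, so the final order is only known once all input has been read. The following is a minimal standard-library sketch of the decoding procedure from RFC 3492. It is not the ztd.text implementation and does not use decode_one or its state type; `decode_sketch` and `punycode_decode` are hypothetical names for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <string>
#include <string_view>

namespace decode_sketch { // hypothetical namespace, not part of ztd.text
    // Parameter values from RFC 3492, section 5.
    constexpr std::uint32_t base = 36, tmin = 1, tmax = 26, skew = 38, damp = 700;
    constexpr std::uint32_t initial_bias = 72, initial_n = 0x80;
    constexpr std::uint32_t maxint = std::numeric_limits<std::uint32_t>::max();

    // Maps a punycode digit character to its value, or returns base if invalid.
    constexpr std::uint32_t char_to_digit(char c) {
        if (c >= 'a' && c <= 'z') return static_cast<std::uint32_t>(c - 'a');
        if (c >= 'A' && c <= 'Z') return static_cast<std::uint32_t>(c - 'A');
        if (c >= '0' && c <= '9') return static_cast<std::uint32_t>(c - '0') + 26;
        return base;
    }

    // Bias adaptation function from RFC 3492, section 6.1.
    constexpr std::uint32_t adapt(std::uint32_t delta, std::uint32_t num_points, bool first) {
        delta = first ? delta / damp : delta / 2;
        delta += delta / num_points;
        std::uint32_t k = 0;
        while (delta > ((base - tmin) * tmax) / 2) {
            delta /= base - tmin;
            k += base;
        }
        return k + ((base - tmin + 1) * delta) / (delta + skew);
    }

    // Decodes punycode into code points; returns std::nullopt on malformed input.
    inline std::optional<std::u32string> punycode_decode(std::string_view input) {
        std::u32string out;
        std::size_t pos = 0;
        // Everything before the last '-' is a run of basic (ASCII) code points.
        const std::size_t last_delim = input.rfind('-');
        if (last_delim != std::string_view::npos && last_delim > 0) {
            for (char c : input.substr(0, last_delim)) {
                if (static_cast<unsigned char>(c) >= 0x80) return std::nullopt;
                out.push_back(static_cast<char32_t>(c));
            }
            pos = last_delim + 1;
        }
        std::uint32_t n = initial_n, i = 0, bias = initial_bias;
        while (pos < input.size()) {
            // Decode one variable-length integer into i.
            const std::uint32_t old_i = i;
            std::uint32_t w = 1;
            for (std::uint32_t k = base;; k += base) {
                if (pos >= input.size()) return std::nullopt;
                const std::uint32_t digit = char_to_digit(input[pos++]);
                if (digit >= base || digit > (maxint - i) / w) return std::nullopt;
                i += digit * w;
                const std::uint32_t t = k <= bias ? tmin : (k >= bias + tmax ? tmax : k - bias);
                if (digit < t) break;
                if (w > maxint / (base - t)) return std::nullopt;
                w *= base - t;
            }
            const std::uint32_t len = static_cast<std::uint32_t>(out.size()) + 1;
            bias = adapt(i - old_i, len, old_i == 0);
            if (i / len > maxint - n) return std::nullopt;
            n += i / len;
            i %= len;
            if (n > 0x10FFFF || (n >= 0xD800 && n <= 0xDFFF)) return std::nullopt;
            // The new code point may land anywhere in the output built so far.
            out.insert(out.begin() + i, static_cast<char32_t>(n));
            ++i;
        }
        return out;
    }
} // namespace decode_sketch
```

Here the insertion into the middle of `out` is the step that a strictly one-code-point-at-a-time interface cannot express directly, which is presumably why the state object has to buffer information between calls.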

static inline auto encode_one(_Input &&__input, _Output &&__output, _ErrorHandler &&__error_handler, encode_state &__state)

Encodes a provided range of code points as punycode and writes the resulting code units to the output.

Remark

This function may need to be called with empty input data repeatedly to fully drain any stored information in the provided __state. The entire input may be consumed before any information is output.

Template Parameters:
  • _Input – The input range type.

  • _Output – The output range type.

  • _ErrorHandler – The error handler type.

Parameters:
  • __input – The input range.

  • __output – The output range.

  • __error_handler – The error handler; this will be called whenever an error occurs during encoding.

  • __state – A reference to the encode state, which contains most of the running information about a punycode encoding operation. Discarding this between calls is not advised for this encoding type.

Returns:

A ztd::text::encode_result structure with the appropriate input and output types reconstructed, possibly filtered through an error handler if necessary.

Public Static Attributes

static constexpr ::std::size_t max_code_points

The maximum number of code points output by a decode operation. In this case, only 1 code point is output at a time.

static constexpr ::std::size_t max_code_units

The maximum number of code units output by an encode operation. In this case, only 1 code unit is output at a time.