⚠️ INTERNAL ⚠️ Transcoding - Unicode Encodings

Warning

The benchmarks in this section exist solely to track the behavior and runtime of niche or internal functionality, such as single conversions built from the bulk functionality in ztd.cuneicode’s registry, or the cost of running a basic loop.

Please keep this in mind as you browse.

The purpose of these graphs is to show how expensive various styles of transcoding are. They include all of the other benchmarks, plus a few more for ztd.cuneicode and ztd.text. The most important labels are:

  • “single from bulk loop”

  • “basic”

These mirror two techniques documented elsewhere in this documentation: using a limiting technique to recreate a one-at-a-time, indivisible unit of work from a bulk conversion loop (“single from bulk loop”), and processing an entire input by repeatedly calling the one-by-one functions (“basic”). The latter technique is the foundation of both ztd.text and ztd.cuneicode, so it is illuminating to see how poorly the former technique (“single from bulk”) performs against the latter technique (“basic”, the bulk-from-single functionality).
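The two techniques can be sketched as follows. This is a minimal illustration in Python, not the ztd.cuneicode or ztd.text API: `bulk_decode`, `single_from_bulk`, `decode_one`, and `convert_all` are hypothetical names, Python's built-in UTF-8 codec stands in for a bulk-only converter, and the input is assumed to be well-formed UTF-8. The limiting technique shown (growing the input window until the bulk converter accepts it) is only one possible way to limit a bulk converter and is an assumption made for illustration.

```python
def bulk_decode(data: bytes) -> list[int]:
    """Stand-in for a bulk-only converter: decodes all of `data` or raises."""
    return [ord(ch) for ch in data.decode("utf-8")]


def single_from_bulk(data: bytes, pos: int) -> tuple[int, int]:
    """One indivisible unit of work, recreated by limiting the bulk input.

    Assumes pos < len(data). Tries windows of 1 to 4 bytes (the possible
    lengths of a UTF-8 sequence) until the bulk converter succeeds.
    """
    for length in range(1, 5):
        try:
            (code_point,) = bulk_decode(data[pos : pos + length])
        except UnicodeDecodeError:
            continue  # window too small: widen it and try again
        return code_point, pos + length
    raise ValueError("no complete UTF-8 sequence at this position")


def decode_one(data: bytes, pos: int) -> tuple[int, int]:
    """A true single conversion: decode exactly one code point at `pos`.

    Assumes well-formed UTF-8, so no validation is performed.
    """
    lead = data[pos]
    if lead < 0x80:
        return lead, pos + 1
    if lead < 0xE0:
        length, code_point = 2, lead & 0x1F
    elif lead < 0xF0:
        length, code_point = 3, lead & 0x0F
    else:
        length, code_point = 4, lead & 0x07
    for byte in data[pos + 1 : pos + length]:
        code_point = (code_point << 6) | (byte & 0x3F)
    return code_point, pos + length


def convert_all(step, data: bytes) -> list[int]:
    """Process a whole input by calling a one-at-a-time `step` in a loop."""
    out, pos = [], 0
    while pos < len(data):
        code_point, pos = step(data, pos)
        out.append(code_point)
    return out


sample = "a\u00e9\u20ac\U0001F600".encode("utf-8")
basic = convert_all(decode_one, sample)  # bulk built from single
single_from_bulk_loop = convert_all(single_from_bulk, sample)
```

Both loops produce the same code points. Under these assumptions, `single_from_bulk` may invoke the bulk converter up to four times per code point, which hints at why that style tends to be costly compared to the loop over `decode_one`.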

All Unicode Code Points

The following benchmarks test the conversion of all Unicode code points, going from one UTF encoding to another UTF encoding. They help test speed under larger conversions.

../_images/utf8_to_utf16_well_formed.png
../_images/utf16_to_utf8_well_formed.png
../_images/utf32_to_utf8_well_formed.png
../_images/utf8_to_utf32_well_formed.png
../_images/utf16_to_utf32_well_formed.png
../_images/utf8_to_utf8_well_formed.png
../_images/utf16_to_utf16_well_formed.png
../_images/utf32_to_utf32_well_formed.png
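To give a sense of the input size for conversions like these, the sketch below builds one possible "all code points" input in Python. It assumes the input is every Unicode scalar value in ascending order (surrogates U+D800 to U+DFFF are skipped, since they cannot appear in well-formed UTF text); the actual benchmark data may be constructed differently, and `all_scalar_values` is a hypothetical helper.

```python
def all_scalar_values() -> str:
    """Every Unicode scalar value, in order, as one string (assumed input)."""
    return "".join(
        chr(cp) for cp in range(0x110000) if not 0xD800 <= cp <= 0xDFFF
    )


text = all_scalar_values()
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")  # explicit endianness, so no BOM is added
utf32 = text.encode("utf-32-le")

# Code points, then code units per encoding.
print(len(text), len(utf8), len(utf16) // 2, len(utf32) // 4)
```

Under these assumptions the input holds 1,112,064 code points, which take 4,382,592 UTF-8 code units, 2,160,640 UTF-16 code units, or 1,112,064 UTF-32 code units.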

All Unicode Code Points, With Initialization

The following benchmarks measure the same conversions, but include the creation of any initialization object in the measured time (e.g., to model a library that cannot store a registry or conversion descriptor somewhere).

../_images/utf8_to_utf16_well_formed_init.png
../_images/utf16_to_utf8_well_formed_init.png
../_images/utf32_to_utf8_well_formed_init.png
../_images/utf8_to_utf32_well_formed_init.png
../_images/utf16_to_utf32_well_formed_init.png
../_images/utf8_to_utf8_well_formed_init.png
../_images/utf16_to_utf16_well_formed_init.png
../_images/utf32_to_utf32_well_formed_init.png
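The difference between the two measurement styles can be sketched as below. This is an illustration only: Python's codec registry lookup (`codecs.lookup`) stands in for creating a registry or conversion descriptor, and `time_call`, `convert_only`, and `init_and_convert` are hypothetical names, not part of the benchmark suite.

```python
import codecs
import time

data = "hello, world".encode("utf-8")


def time_call(fn, repeats: int = 1000) -> float:
    """Average wall-clock seconds per call of `fn` over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


# Style 1: the descriptor is created once, outside the measured function.
stored_info = codecs.lookup("utf-8")


def convert_only() -> str:
    return stored_info.decode(data)[0]


# Style 2: the descriptor is created inside the measured function,
# modeling a library that cannot keep it around between calls.
def init_and_convert() -> str:
    local_info = codecs.lookup("utf-8")
    return local_info.decode(data)[0]


without_init = time_call(convert_only)
with_init = time_call(init_and_convert)
```

Python caches codec lookups, so the gap here is small; the point is only where the initialization sits relative to the timed region.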

C Basic Source Character Set

The following benchmarks test a much smaller conversion over the basic source character set, checking not only ASCII but also how well the conversion performs for much smaller inputs.

../_images/utf8_to_utf16_well_formed1.png
../_images/utf16_to_utf8_well_formed1.png
../_images/utf32_to_utf8_well_formed1.png
../_images/utf8_to_utf32_well_formed1.png
../_images/utf16_to_utf32_well_formed1.png
../_images/utf8_to_utf8_well_formed1.png
../_images/utf16_to_utf16_well_formed1.png
../_images/utf32_to_utf32_well_formed1.png
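For a sense of scale, the sketch below builds the basic source character set in Python. It assumes the pre-C23 definition (letters, digits, 29 graphic characters, space, and four control characters) and that each character appears once; the actual benchmark input may differ in order or repetition, and `basic_source_characters` is a hypothetical name.

```python
import string

# The 29 graphic characters of the (pre-C23) basic source character set.
graphic = "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'"

# Space, horizontal tab, vertical tab, form feed, and new-line.
whitespace = " \t\v\f\n"

basic_source_characters = (
    string.ascii_letters + string.digits + graphic + whitespace
)

# Every character is ASCII, so each takes exactly one code unit in
# UTF-8, UTF-16, and UTF-32.
utf8 = basic_source_characters.encode("utf-8")
utf16 = basic_source_characters.encode("utf-16-le")
utf32 = basic_source_characters.encode("utf-32-le")

print(len(basic_source_characters), len(utf8), len(utf16) // 2, len(utf32) // 4)
```

With only 96 single-code-unit characters, fixed per-call overhead is likely to dominate the measurement far more than it does for the full code point range.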

C Basic Source Character Set, With Initialization

The following benchmarks measure the same conversions, but include the creation of any initialization object in the measured time (e.g., to model a library that cannot store a registry or conversion descriptor somewhere).

../_images/utf8_to_utf16_well_formed_init1.png
../_images/utf16_to_utf8_well_formed_init1.png
../_images/utf32_to_utf8_well_formed_init1.png
../_images/utf8_to_utf32_well_formed_init1.png
../_images/utf16_to_utf32_well_formed_init1.png
../_images/utf8_to_utf8_well_formed_init1.png
../_images/utf16_to_utf16_well_formed_init1.png
../_images/utf32_to_utf32_well_formed_init1.png