Transcoding - Unicode Encodings

Conversion speed is one of the most important aspects of a transcoding API. The benchmarks here measure how fast a conversion can potentially be, taking into account any affordances the API offers for speed (such as skipping checks on output writes, skipping input validation, or converting one indivisible unit at a time in a loop versus converting a whole range at once). All of the input text is currently valid and well-formed; some of the benchmarks take advantage of this and assume the input is valid. Such benchmarks are presented as separate graphs, and only for the APIs that allow it.
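To make the speed affordances above concrete, here is a minimal sketch of a UTF-32 to UTF-8 conversion written two ways: one validates every code point and grows the output safely, the other assumes valid input and writes into a pre-sized buffer without per-write checks. All names (`encode_one`, `utf32_to_utf8_checked`, `utf32_to_utf8_assume_valid`) are hypothetical and are not taken from any benchmarked library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: the "indivisible unit" of work. Writes one code point
// as 1 to 4 UTF-8 bytes at `out` and returns the count. No checks at all:
// the caller guarantees a valid scalar value and enough room.
inline std::size_t encode_one(char32_t cp, char* out) {
    if (cp < 0x80) {
        out[0] = static_cast<char>(cp);
        return 1;
    }
    if (cp < 0x800) {
        out[0] = static_cast<char>(0xC0 | (cp >> 6));
        out[1] = static_cast<char>(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = static_cast<char>(0xE0 | (cp >> 12));
        out[1] = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out[2] = static_cast<char>(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = static_cast<char>(0xF0 | (cp >> 18));
    out[1] = static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out[2] = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out[3] = static_cast<char>(0x80 | (cp & 0x3F));
    return 4;
}

// Checked variant: validates each input code point and lets std::string
// manage (and check) every output write. Returns false on invalid input.
inline bool utf32_to_utf8_checked(const std::u32string& in, std::string& out) {
    out.clear();
    for (char32_t cp : in) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false;  // out of range or a surrogate
        }
        char buf[4];
        std::size_t n = encode_one(cp, buf);
        out.append(buf, n);
    }
    return true;
}

// Unchecked variant: assumes the input is valid and sizes the output for the
// worst case (4 bytes per code point), so no write needs a bounds check.
inline std::string utf32_to_utf8_assume_valid(const std::u32string& in) {
    std::string out(in.size() * 4, '\0');
    char* first = &out[0];
    char* p = first;
    for (char32_t cp : in) {
        p += encode_one(cp, p);
    }
    out.resize(static_cast<std::size_t>(p - first));
    return out;
}
```

On well-formed input both variants produce identical bytes; the second simply does less work per code point, which is the kind of difference the separate "assume valid" graphs are meant to surface.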

All Unicode Code Points

The following benchmarks test the conversion of all Unicode code points, going from one UTF encoding to another UTF encoding. This helps test speed under larger conversions.
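As a rough illustration of how large such an input is, the sketch below builds a UTF-32 string containing every code point that well-formed text can carry (that is, every code point except the UTF-16 surrogates U+D800 through U+DFFF) and computes how many bytes it would occupy in UTF-8. The function names are hypothetical, and whether the benchmarks construct their input this way is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical input builder: every Unicode scalar value in ascending order.
// Surrogates are skipped because well-formed UTF text cannot contain them.
inline std::u32string all_unicode_scalar_values() {
    std::u32string text;
    text.reserve(0x110000 - 0x800);
    for (char32_t cp = 0; cp <= 0x10FFFF; ++cp) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            continue;
        }
        text.push_back(cp);
    }
    return text;
}

// Number of UTF-8 bytes needed to encode `text`, assuming it is valid.
inline std::size_t utf8_byte_count(const std::u32string& text) {
    std::size_t bytes = 0;
    for (char32_t cp : text) {
        bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    }
    return bytes;
}
```

The result holds 1,112,064 code points, which take a little over 4 MB in either UTF-8 or UTF-32, so a run over this input exercises every sequence length rather than just the ASCII fast path.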

../_images/utf8_to_utf16_well_formed2.png
../_images/utf16_to_utf8_well_formed2.png
../_images/utf32_to_utf8_well_formed2.png
../_images/utf8_to_utf32_well_formed2.png
../_images/utf16_to_utf32_well_formed2.png
../_images/utf8_to_utf8_well_formed2.png
../_images/utf16_to_utf16_well_formed2.png
../_images/utf32_to_utf32_well_formed2.png

All Unicode Code Points, With Initialization

The following benchmarks measure the same conversion, but with the creation of any initialization object included in the measured conversion (e.g., to model a library that cannot store a registry or conversion descriptor somewhere for reuse).
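The difference between the two setups can be sketched as follows. `utf8_descriptor` is a hypothetical stand-in for whatever registry or conversion descriptor a library needs before it can convert; here it merely precomputes a sequence-length table. The decoder assumes valid UTF-8, as the benchmark input is well-formed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical descriptor: construction is the "initialization" cost.
struct utf8_descriptor {
    std::array<unsigned char, 256> length;  // sequence length by lead byte

    utf8_descriptor() {
        for (std::size_t b = 0; b < 256; ++b) {
            // Entries for bytes that can never lead a valid sequence are
            // never consulted, because the input is assumed valid.
            length[b] = static_cast<unsigned char>(
                b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4);
        }
    }
};

// Conversion proper: reuses a descriptor that already exists.
inline std::u32string utf8_to_utf32(const utf8_descriptor& d, const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        std::size_t n = d.length[lead];
        // Keep only the payload bits of the lead byte.
        char32_t cp = n == 1 ? static_cast<char32_t>(lead)
                             : static_cast<char32_t>(lead & (0xFF >> (n + 1)));
        for (std::size_t k = 1; k < n; ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        }
        out.push_back(cp);
        i += n;
    }
    return out;
}

// "With initialization" shape: the descriptor is built inside the call, so
// its construction would be counted on every timed iteration.
inline std::u32string utf8_to_utf32_with_init(const std::string& in) {
    utf8_descriptor d;
    return utf8_to_utf32(d, in);
}
```

A benchmark without initialization would construct `utf8_descriptor` once outside the timed loop and call `utf8_to_utf32`; one with initialization would time `utf8_to_utf32_with_init`. The output is the same either way, only the measured work differs.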

../_images/utf8_to_utf16_well_formed_init2.png
../_images/utf16_to_utf8_well_formed_init2.png
../_images/utf32_to_utf8_well_formed_init2.png
../_images/utf8_to_utf32_well_formed_init2.png
../_images/utf16_to_utf32_well_formed_init2.png
../_images/utf8_to_utf8_well_formed_init2.png
../_images/utf16_to_utf16_well_formed_init2.png
../_images/utf32_to_utf32_well_formed_init2.png

C Basic Source Character Set

The following benchmarks test a much smaller conversion over the basic source character set, checking not only ASCII handling but also how well the conversion performs on much smaller inputs.
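For a sense of scale, the sketch below spells out the basic source character set as the C and C++ standards have traditionally listed it (letters, digits, 29 punctuation characters, space, and four control characters, counting new-line). The exact input the benchmarks use is an assumption, and the function names are hypothetical. Because every character is ASCII, a UTF-8 to UTF-16 conversion of this text reduces to widening each byte.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical input: the 96 characters of the basic source character set.
inline std::string basic_source_character_set() {
    return "abcdefghijklmnopqrstuvwxyz"
           "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
           "0123456789"
           "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~"
           " \t\v\f\n";
}

// Widens each byte to one UTF-16 code unit. Only correct when every byte is
// below 0x80, which holds for the input above but not for UTF-8 in general.
inline std::u16string widen_ascii_to_utf16(const std::string& in) {
    std::u16string out;
    out.reserve(in.size());
    for (char c : in) {
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(c)));
    }
    return out;
}
```

With fewer than a hundred code units of input, fixed per-call overhead makes up a much larger share of the measured time than it does in the all-code-points runs.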

../_images/utf8_to_utf16_well_formed3.png
../_images/utf16_to_utf8_well_formed3.png
../_images/utf32_to_utf8_well_formed3.png
../_images/utf8_to_utf32_well_formed3.png
../_images/utf16_to_utf32_well_formed3.png
../_images/utf8_to_utf8_well_formed3.png
../_images/utf16_to_utf16_well_formed3.png
../_images/utf32_to_utf32_well_formed3.png

C Basic Source Character Set, With Initialization

The following benchmarks measure the same conversion, but with the creation of any initialization object included in the measured conversion (e.g., to model a library that cannot store a registry or conversion descriptor somewhere for reuse).

../_images/utf8_to_utf16_well_formed_init3.png
../_images/utf16_to_utf8_well_formed_init3.png
../_images/utf32_to_utf8_well_formed_init3.png
../_images/utf8_to_utf32_well_formed_init3.png
../_images/utf16_to_utf32_well_formed_init3.png
../_images/utf8_to_utf8_well_formed_init3.png
../_images/utf16_to_utf16_well_formed_init3.png
../_images/utf32_to_utf32_well_formed_init3.png