Count as Transcoded

This operation counts how many code units will result from a transcode operation. Essentially, we run the transcoding algorithm loop, but instead of giving the end user the re-encoded values, we simply provide the count that results from running that bulk operation.

We use the algorithm below to do the work. Given an input of code_units with its encoding, an output encoding, an initial count set at 0, and any necessary additional state, we can generically predict how many code units will result from a transcoding operation by running the following loop:

⏩ Is the input value empty? If so, is the state finished, with nothing left to output? If both are true, return the current results with the empty input, current count, and state; everything is okay ✅!
⏩ Otherwise,
1. Set up an intermediate storage location of code_points (of the input encoding), using the max_code_points of the input encoding; and, set up an intermediate_output storage location of code_units (of the output encoding), using the max_code_units of the output encoding, for the next operations.
2. Do the decode_one step from input (using its begin() and end()) into the intermediate code_point storage location, saving the returned output (the written code_points) from the decode_one call.
  - 🛑 If it failed, return with the current input (unmodified from before this iteration, if possible), current count, and states.
3. Do the encode_one step from intermediate (using its begin() and end()) into the intermediate_output code_unit storage location, saving the returned intermediate_output from the encode_one call.
  - 🛑 If it failed, return with the current input (unmodified from before this iteration, if possible), current count, and states.
4. Compute the difference between the begin(intermediate_output) set up in step 1 and the begin(result.output) returned by encode_one; add that difference to the current count.
⏩ Update input’s begin() value to point to after what was read by the decode_one step.
⤴️ Go back to the start.
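
The loop above can be sketched in a self-contained way, without ztd.text itself. The example below is only an illustration under stated assumptions: it hard-codes UTF-8 as the input encoding and UTF-16 as the output encoding, both of which are stateless, so the state handling is omitted. The names count_result, decode_one_utf8, encode_one_utf16, and count_utf8_as_utf16 are hypothetical and are not part of the library; the decoder is also simplified and does not reject overlong forms.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-ins for this sketch; these are not ztd.text types.
enum class encoding_error { ok, invalid_sequence, incomplete_sequence };

struct count_result {
    std::string_view input;    // unread remainder of the input
    std::size_t count;         // output code units counted so far
    encoding_error error_code; // ok, or the reason the loop stopped
};

// decode_one for UTF-8: reads one code point from the front of a non-empty `in`.
// Returns the number of code units read, or 0 on failure (and sets `err`).
inline std::size_t decode_one_utf8(std::string_view in, char32_t& cp, encoding_error& err) {
    unsigned char b0 = static_cast<unsigned char>(in[0]);
    std::size_t len = b0 < 0x80 ? 1 : (b0 >> 5) == 0x6 ? 2 : (b0 >> 4) == 0xE ? 3 : (b0 >> 3) == 0x1E ? 4 : 0;
    if (len == 0) { err = encoding_error::invalid_sequence; return 0; }
    if (in.size() < len) { err = encoding_error::incomplete_sequence; return 0; }
    // Keep only the payload bits of the lead byte.
    cp = (len == 1) ? b0 : (b0 & (0xFF >> (len + 1)));
    for (std::size_t i = 1; i < len; ++i) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        if ((b & 0xC0) != 0x80) { err = encoding_error::invalid_sequence; return 0; }
        cp = (cp << 6) | (b & 0x3F);
    }
    err = encoding_error::ok;
    return len;
}

// encode_one for UTF-16: writes one code point into `out`.
// Returns the number of code units written, or 0 on failure (and sets `err`).
inline std::size_t encode_one_utf16(char32_t cp, char16_t (&out)[2], encoding_error& err) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        err = encoding_error::invalid_sequence;
        return 0;
    }
    err = encoding_error::ok;
    if (cp < 0x10000) { out[0] = static_cast<char16_t>(cp); return 1; }
    cp -= 0x10000;
    out[0] = static_cast<char16_t>(0xD800 + (cp >> 10));
    out[1] = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
    return 2;
}

inline count_result count_utf8_as_utf16(std::string_view input) {
    std::size_t count = 0;
    while (!input.empty()) {
        // Step 1: intermediate storage (one code point, two UTF-16 code units).
        char32_t intermediate = 0;
        char16_t intermediate_output[2];
        encoding_error err = encoding_error::ok;
        // Step 2: decode_one; on failure, return with the input left unmodified.
        std::size_t read = decode_one_utf8(input, intermediate, err);
        if (err != encoding_error::ok) return { input, count, err };
        assert(read > 0); // a successful decode must make progress
        // Step 3: encode_one into the scratch output.
        std::size_t written = encode_one_utf16(intermediate, intermediate_output, err);
        if (err != encoding_error::ok) return { input, count, err };
        // Step 4: add the number of code units written to the count.
        count += written;
        // Advance the input past what decode_one read, then loop.
        input.remove_prefix(read);
    }
    return { input, count, encoding_error::ok };
}
```

Under these assumptions, calling count_utf8_as_utf16 on the UTF-8 bytes for "hé😀" yields a count of 4 (one code unit each for "h" and "é", two for the emoji), and an invalid byte stops the loop with the remaining input left in the result.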

A transcode involves two encodings: code_units of the input encoding are decoded to code_points, which are then encoded to code_units of the output encoding, and only the output code_units are counted. Note that this algorithm doesn’t show what the error handler does; if the error handler “erases” the failure by setting the result type’s .error_code == ztd::text::encoding_error::ok, then the algorithm will keep going. This is useful to, for example, determine the maximum size of an operation even if it errors and would result in replacement characters being inserted (e.g., from ztd::text::replacement_handler_t).
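
To illustrate how an error handler can "erase" a failure so that counting continues, here is a toy model. It is a sketch only: the handler signature, decode_step, pass_handler, replacement_handler, and count_ascii_as_utf8 are hypothetical names invented for this example, and the real ztd.text handler interface is richer than this. The sketch assumes ASCII as the input encoding (any byte of 0x80 or above is an error) and UTF-8 as the output encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical names for this sketch; not the ztd.text API.
enum class encoding_error { ok, invalid_sequence };

struct decode_step {
    char32_t code_point;       // decoded (or substituted) code point
    std::size_t read;          // input code units consumed
    encoding_error error_code; // ok, or the failure to report
};

struct count_result {
    std::string_view input;
    std::size_t count;
    encoding_error error_code;
};

// Strict handler: passes the failure through untouched.
inline decode_step pass_handler(decode_step failed) { return failed; }

// Replacement-style handler: substitutes U+FFFD, skips one code unit,
// and "erases" the failure by reporting ok.
inline decode_step replacement_handler(decode_step /*failed*/) {
    return { U'\uFFFD', 1, encoding_error::ok };
}

// decode_one for ASCII, with failures routed through `handler`.
template <typename Handler>
decode_step decode_one_ascii(std::string_view in, Handler handler) {
    unsigned char b = static_cast<unsigned char>(in[0]);
    if (b < 0x80) return { b, 1, encoding_error::ok };
    return handler(decode_step{ 0, 0, encoding_error::invalid_sequence });
}

// Number of UTF-8 code units for a code point (valid scalar values assumed).
inline std::size_t utf8_length(char32_t cp) {
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

template <typename Handler>
count_result count_ascii_as_utf8(std::string_view input, Handler handler) {
    std::size_t count = 0;
    while (!input.empty()) {
        decode_step step = decode_one_ascii(input, handler);
        // If the handler did not erase the failure, stop with the input unmodified.
        if (step.error_code != encoding_error::ok) return { input, count, step.error_code };
        assert(step.read > 0); // an ok step must make progress
        count += utf8_length(step.code_point);
        input.remove_prefix(step.read);
    }
    return { input, count, encoding_error::ok };
}
```

With the strict handler, the three-byte input 'a', 0xFF, 'b' stops after counting 1 code unit. With the replacement-style handler, the same input counts 5: one for each ASCII letter plus three for the UTF-8 form of U+FFFD.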

Check out the API documentation for ztd::text::count_as_transcoded to learn more.