ToyGine2 26.2.0
Game Engine for retro consoles
Common text processing functions

String encoding conversion, length and reversal.

Functions

wchar_t * toy::utf8toWChar (wchar_t *dest, size_t destSize, const char *src, size_t count) noexcept
 Converts a UTF-8 C string to a wide-character string with a character count limit.
constexpr wchar_t * toy::utf8toWChar (wchar_t *dest, size_t destSize, const char *src) noexcept
 Converts a UTF-8 C string to a wide-character string (full source length).
template<StringLike T>
constexpr wchar_t * toy::utf8toWChar (wchar_t *dest, size_t destSize, const T &src) noexcept
 Converts a UTF-8 toy::StringLike object to a wide-character string.
char * toy::wcharToUtf8 (char *dest, size_t destSize, const wchar_t *src) noexcept
 Converts a wide-character C string to UTF-8.
size_t toy::utf8Len (const char *string) noexcept
 Returns the number of Unicode code points in a UTF-8 encoded C string.
constexpr char * toy::reverseString (char *str, size_t count=0) noexcept
 Reverses a C string in-place.

Variables

constexpr size_t toy::WCHAR_IN_UTF8_MAX_SIZE = 3
 Maximum UTF-8 byte length for one BMP character.

Detailed Description

String encoding conversion, length and reversal.

  • utf8toWChar: UTF-8 C string or toy::StringLike to wide-character string (with optional character limit).
  • wcharToUtf8: Wide-character C string to UTF-8.
  • utf8Len: Unicode code point count in a UTF-8 C string.
  • reverseString: In-place reversal of a C string.

Related: toy::StringLike, toy::FixedString; constant WCHAR_IN_UTF8_MAX_SIZE.

Key Features

  • UTF-8 / wide conversion: BMP-only; invalid UTF-8 skipped; destination null-terminated on success.
  • No allocation: All functions write into caller-provided buffers.
  • Constexpr: utf8toWChar overloads and reverseString where applicable.
  • Exception safety: All operations are noexcept.

Function Documentation

◆ reverseString()

char * toy::reverseString ( char * str,
size_t count = 0 )
constexpr noexcept

Reverses a C string in-place.

Swaps characters from both ends toward the center. If count is 0, the length is obtained via strlen(str).

Parameters
str: C string to reverse (must be writable).
count: Number of characters to reverse (default: 0 for null-terminated length).
Returns
Pointer to str (the reversed string).
Precondition
If count > 0, str has at least count valid characters. If count is 0, str is null-terminated.
Postcondition
Characters in the reversed range are swapped in place; str is still null-terminated when count was 0.
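
The two-ended swap described above can be sketched as follows. This is a minimal stand-in written for illustration, not ToyGine2's implementation: the sketch namespace is hypothetical, and the function is not constexpr here because it relies on std::strlen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace sketch {  // hypothetical namespace, not part of ToyGine2

// Mirrors the documented signature; reverses `count` chars of `str` in place.
inline char* reverseString(char* str, std::size_t count = 0) noexcept {
    assert(str != nullptr);  // assumption: the caller passes a valid pointer
    // count == 0 selects the null-terminated length, as documented.
    if (count == 0) {
        count = std::strlen(str);
    }
    if (count < 2) {
        return str;  // nothing to swap
    }
    // Swap characters from both ends toward the center.
    for (std::size_t i = 0, j = count - 1; i < j; ++i, --j) {
        const char tmp = str[i];
        str[i] = str[j];
        str[j] = tmp;
    }
    return str;
}

}  // namespace sketch
```

With this sketch, reversing "abcd" yields "dcba", and reversing only the first 3 characters of "abcdef" yields "cbadef". A reversal that works on individual char values, as this one does, scrambles multi-byte UTF-8 sequences, so it is best suited to ASCII content.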

◆ utf8Len()

size_t toy::utf8Len ( const char * string)
nodiscard noexcept

Returns the number of Unicode code points in a UTF-8 encoded C string.

Parses UTF-8 sequences in string and counts code points. Stops at the first null byte. Invalid sequences cause the function to return 0.

Parameters
string: Source UTF-8 encoded C string (may be null).
Returns
Number of Unicode code points, or 0 if string is nullptr or contains invalid UTF-8.
Note
Each multi-byte sequence (2–3 bytes) counts as one code point. Only BMP code points (≤ 0xFFFF) are supported.
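
The counting rule above can be illustrated with a small stand-in. It is a sketch under stated assumptions, not ToyGine2's code: the sketch namespace is hypothetical, 4-byte lead bytes (outside the BMP) are treated as invalid, and overlong encodings are not detected.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {  // hypothetical namespace, not part of ToyGine2

// Counts code points encoded as 1-3 byte UTF-8 sequences.
// Returns 0 for nullptr or for any malformed sequence.
inline std::size_t utf8Len(const char* string) noexcept {
    if (string == nullptr) {
        return 0;
    }
    const unsigned char* p = reinterpret_cast<const unsigned char*>(string);
    std::size_t codePoints = 0;
    while (*p != 0) {
        std::size_t continuation = 0;  // continuation bytes expected after the lead
        if (*p < 0x80) {
            continuation = 0;          // 0xxxxxxx
        } else if ((*p & 0xE0) == 0xC0) {
            continuation = 1;          // 110xxxxx
        } else if ((*p & 0xF0) == 0xE0) {
            continuation = 2;          // 1110xxxx
        } else {
            return 0;  // stray continuation byte or 4-byte lead (assumed invalid)
        }
        ++p;
        for (; continuation > 0; --continuation, ++p) {
            if ((*p & 0xC0) != 0x80) {
                return 0;  // also catches a terminator inside a sequence
            }
        }
        ++codePoints;
    }
    assert(*p == 0);  // the loop only exits at the terminator
    return codePoints;
}

}  // namespace sketch
```

Because 0 is returned both for an empty string and for invalid input, a caller that needs to tell the two apart has to check for an empty source separately.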

◆ utf8toWChar() [1/3]

wchar_t * toy::utf8toWChar ( wchar_t * dest,
size_t destSize,
const char * src )
constexpr noexcept

Converts a UTF-8 C string to a wide-character string (full source length).

Same as utf8toWChar(dest, destSize, src, count) with count set to the length of src. Stops when the source ends or dest is full. BMP-only; invalid UTF-8 sequences are skipped.

Parameters
dest: Destination buffer for the wide-character string.
destSize: Size of dest in wide characters (not bytes).
src: Source UTF-8 encoded C string.
Returns
Pointer to dest, or nullptr if dest or destSize is invalid.
Precondition
dest points to a valid buffer with capacity destSize; src is valid UTF-8.
Postcondition
On success, dest is null-terminated. On overflow or invalid input, returns nullptr.
See also
utf8toWChar(dest, destSize, src, count)

◆ utf8toWChar() [2/3]

wchar_t * toy::utf8toWChar ( wchar_t * dest,
size_t destSize,
const char * src,
size_t count )
noexcept

Converts a UTF-8 C string to a wide-character string with a character count limit.

Writes the converted wide-character string into dest. Stops when count source bytes have been processed, dest is full, or the source ends. Only BMP (≤ 0xFFFF) is supported.

Parameters
dest: Destination buffer for the wide-character string.
destSize: Size of dest in wide characters (not bytes).
src: Source UTF-8 encoded C string.
count: Maximum number of source bytes to process.
Returns
Pointer to dest, or nullptr if dest or destSize is invalid.
Precondition
dest points to a valid buffer with capacity destSize; src is valid UTF-8.
count is at most the number of code points in src (if bounded).
Postcondition
On success, dest is null-terminated. On overflow or invalid input, returns nullptr.
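
A BMP-only decoder with this signature might look like the sketch below. It is not ToyGine2's implementation, and it makes choices where this page leaves room for interpretation: count is treated as a byte limit (the parameter description says bytes, the brief says characters), malformed or non-BMP bytes are skipped one at a time, and output is truncated but still null-terminated when dest fills up. The sketch namespace is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {  // hypothetical namespace, not part of ToyGine2

inline wchar_t* utf8toWChar(wchar_t* dest, std::size_t destSize,
                            const char* src, std::size_t count) noexcept {
    if (dest == nullptr || destSize == 0) {
        return nullptr;  // assumption: this is what "invalid" means here
    }
    std::size_t out = 0;
    if (src != nullptr) {
        const unsigned char* s = reinterpret_cast<const unsigned char*>(src);
        std::size_t i = 0;
        // Keep one slot free in dest for the terminator.
        while (i < count && s[i] != 0 && out + 1 < destSize) {
            const unsigned char b = s[i];
            if (b < 0x80) {  // 1 byte: 0xxxxxxx
                dest[out++] = static_cast<wchar_t>(b);
                i += 1;
            } else if ((b & 0xE0) == 0xC0 && i + 1 < count &&
                       (s[i + 1] & 0xC0) == 0x80) {  // 2 bytes: 110xxxxx 10xxxxxx
                dest[out++] = static_cast<wchar_t>(((b & 0x1F) << 6) |
                                                   (s[i + 1] & 0x3F));
                i += 2;
            } else if ((b & 0xF0) == 0xE0 && i + 2 < count &&
                       (s[i + 1] & 0xC0) == 0x80 &&
                       (s[i + 2] & 0xC0) == 0x80) {  // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
                dest[out++] = static_cast<wchar_t>(((b & 0x0F) << 12) |
                                                   ((s[i + 1] & 0x3F) << 6) |
                                                   (s[i + 2] & 0x3F));
                i += 3;
            } else {
                i += 1;  // malformed, cut off by count, or outside the BMP: skip
            }
        }
    }
    assert(out < destSize);  // room for the terminator is guaranteed by the loop
    dest[out] = L'\0';
    return dest;
}

}  // namespace sketch
```

With these assumptions, converting the 6 bytes of "hé€" into an 8-element buffer produces the wide characters 'h', U+00E9, U+20AC and a terminator.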

◆ utf8toWChar() [3/3]

template<StringLike T>
wchar_t * toy::utf8toWChar ( wchar_t * dest,
size_t destSize,
const T & src )
constexpr noexcept

Converts a UTF-8 toy::StringLike object to a wide-character string.

Converts the UTF-8 content of src (via c_str()) into dest. Stops when the source ends or dest is full. BMP-only; invalid UTF-8 sequences are skipped.

Template Parameters
T: Type satisfying toy::StringLike (e.g. toy::FixedString, std::string).
Parameters
dest: Destination buffer for the wide-character string.
destSize: Size of dest in wide characters (not bytes).
src: Source object with UTF-8 encoded content.
Returns
Pointer to dest, or nullptr if dest or destSize is invalid.
Precondition
dest points to a valid buffer with capacity destSize; src.c_str() returns valid UTF-8.
Postcondition
On success, dest is null-terminated. On overflow or invalid input, returns nullptr.
See also
utf8toWChar (C string overloads)

◆ wcharToUtf8()

char * toy::wcharToUtf8 ( char * dest,
size_t destSize,
const wchar_t * src )
noexcept

Converts a wide-character C string to UTF-8.

Writes the UTF-8 encoding of src into dest. Stops when the source ends or dest is full. Each wide character may produce 1–3 UTF-8 bytes.

Parameters
dest: Destination buffer for the UTF-8 string.
destSize: Size of dest in bytes (not wide characters).
src: Source wide-character C string, or nullptr to write an empty UTF-8 string.
Returns
Pointer to dest, or nullptr only if dest is nullptr or destSize is 0.
Precondition
dest points to a valid writable buffer of at least destSize bytes (the function always writes at least a null terminator when it returns a non-null pointer).
If src is not nullptr, it must point to a null-terminated wide-character string.
When src is not nullptr, destSize must account for possible UTF-8 expansion (e.g. 3× the wide length, plus one byte for the null terminator, for BMP text).
Postcondition
If src is nullptr, writes '\0' to *dest (empty UTF-8) and returns dest.
If src is not nullptr, dest is null-terminated UTF-8; output may be truncated if the buffer fills before the wide string ends.
See also
utf8toWChar
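
The encoding direction can be sketched in the same spirit. This stand-in is not ToyGine2's code and rests on assumptions: the sketch namespace is hypothetical, values above 0xFFFF are skipped, surrogate values are not rejected, and truncation never splits a multi-byte sequence.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {  // hypothetical namespace, not part of ToyGine2

inline char* wcharToUtf8(char* dest, std::size_t destSize,
                         const wchar_t* src) noexcept {
    if (dest == nullptr || destSize == 0) {
        return nullptr;
    }
    std::size_t out = 0;
    if (src != nullptr) {  // nullptr source yields an empty string, as documented
        for (; *src != L'\0'; ++src) {
            const char32_t cp = static_cast<char32_t>(*src);
            if (cp > 0xFFFF) {
                continue;  // outside the BMP: skipped in this sketch
            }
            const std::size_t need = cp < 0x80 ? 1 : (cp < 0x800 ? 2 : 3);
            if (out + need >= destSize) {
                break;  // keep room for the terminator; never split a sequence
            }
            if (need == 1) {
                dest[out++] = static_cast<char>(cp);
            } else if (need == 2) {
                dest[out++] = static_cast<char>(0xC0 | (cp >> 6));
                dest[out++] = static_cast<char>(0x80 | (cp & 0x3F));
            } else {
                dest[out++] = static_cast<char>(0xE0 | (cp >> 12));
                dest[out++] = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
                dest[out++] = static_cast<char>(0x80 | (cp & 0x3F));
            }
        }
    }
    assert(out < destSize);  // the loop always leaves room for the terminator
    dest[out] = '\0';
    return dest;
}

}  // namespace sketch
```

Stopping before a sequence that does not fit keeps the truncated output valid UTF-8; whether the library truncates at the same boundary is not stated on this page.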

Variable Documentation

◆ WCHAR_IN_UTF8_MAX_SIZE

size_t toy::WCHAR_IN_UTF8_MAX_SIZE = 3
constexpr

Maximum UTF-8 byte length for one BMP character.

One wide character in the BMP (≤ 0xFFFF) encodes to at most 3 UTF-8 bytes.
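
The 3-byte bound follows from the UTF-8 length ranges, and it gives a simple worst-case buffer size. The sketch below restates the constant locally so that it compiles on its own; the helper names utf8BytesFor and utf8BufferSizeFor and the sketch namespace are hypothetical, not part of ToyGine2.

```cpp
#include <cstddef>

namespace sketch {  // hypothetical namespace, not part of ToyGine2

// Local copy of the documented value, so the example is self-contained.
constexpr std::size_t WCHAR_IN_UTF8_MAX_SIZE = 3;

// UTF-8 byte count for one BMP code point (cp <= 0xFFFF assumed).
constexpr std::size_t utf8BytesFor(char32_t cp) noexcept {
    return cp < 0x80 ? 1 : (cp < 0x800 ? 2 : 3);
}

// Worst-case byte size for a BMP wide string of `wideLength` characters,
// including one byte for the null terminator.
constexpr std::size_t utf8BufferSizeFor(std::size_t wideLength) noexcept {
    return wideLength * WCHAR_IN_UTF8_MAX_SIZE + 1;
}

// The largest BMP code point hits the bound exactly.
static_assert(utf8BytesFor(0xFFFF) == WCHAR_IN_UTF8_MAX_SIZE,
              "BMP code points need at most 3 UTF-8 bytes");

}  // namespace sketch
```

For example, a 4-character wide string needs at most 13 bytes: 4 × 3 for the encoded text plus 1 for the terminator.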