String encoding conversion, length and reversal.
|
| wchar_t * | toy::utf8toWChar (wchar_t *dest, size_t destSize, const char *src, size_t count) noexcept |
| | Converts a UTF-8 C string to a wide-character string with a character count limit.
|
| constexpr wchar_t * | toy::utf8toWChar (wchar_t *dest, size_t destSize, const char *src) noexcept |
| | Converts a UTF-8 C string to a wide-character string (full source length).
|
| template<StringLike T> |
| constexpr wchar_t * | toy::utf8toWChar (wchar_t *dest, size_t destSize, const T &src) noexcept |
| | Converts a UTF-8 toy::StringLike object to a wide-character string.
|
| char * | toy::wcharToUtf8 (char *dest, size_t destSize, const wchar_t *src) noexcept |
| | Converts a wide-character C string to UTF-8.
|
| size_t | toy::utf8Len (const char *string) noexcept |
| | Returns the number of Unicode code points in a UTF-8 encoded C string.
|
| constexpr char * | toy::reverseString (char *str, size_t count=0) noexcept |
| | Reverses a C string in-place.
|
- utf8toWChar: UTF-8 C string or toy::StringLike to wide-character string (with optional character limit).
- wcharToUtf8: Wide-character C string to UTF-8.
- utf8Len: Unicode code point count in a UTF-8 C string.
- reverseString: In-place reversal of a C string.
Related: toy::StringLike, toy::FixedString; constant WCHAR_IN_UTF8_MAX_SIZE.
Key Features
- UTF-8 / wide conversion: BMP-only; invalid UTF-8 skipped; destination null-terminated on success.
- No allocation: All functions write into caller-provided buffers.
- Constexpr: utf8toWChar overloads and reverseString where applicable.
- Exception safety: All operations are noexcept.
◆ reverseString()
| char * toy::reverseString (char *str, size_t count = 0) | constexpr noexcept |
Reverses a C string in-place.
Swaps characters from both ends toward the center. If count is 0, the length is obtained via strlen(str).
- Parameters
-
| str | C string to reverse (must be writable). |
| count | Number of characters to reverse (default: 0 for null-terminated length). |
- Returns
- Pointer to str (the reversed string).
- Precondition
- If count > 0, str has at least count valid characters. If count is 0, str is null-terminated.
- Postcondition
- Characters in the reversed range are swapped in place; str is still null-terminated when count was 0.
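The two-ended swap described above can be sketched as follows. This is a minimal illustration and not the library's source: the `sketch` namespace, the `reverseExample` helper, the manual length scan (used in place of `strlen` so the function stays usable in constant expressions) and the no-op on a null pointer are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace sketch {

// Hypothetical stand-in for toy::reverseString; not the library's source.
// count == 0 means "use the null-terminated length of str".
constexpr char* reverseString(char* str, std::size_t count = 0) noexcept {
    if (str == nullptr) return str;  // assumption: null input is a no-op
    if (count == 0) {
        // Manual scan instead of strlen keeps this constexpr-friendly.
        while (str[count] != '\0') ++count;
    }
    if (count < 2) return str;       // nothing to swap
    // Swap from both ends toward the center.
    for (std::size_t i = 0, j = count - 1; i < j; ++i, --j) {
        const char tmp = str[i];
        str[i] = str[j];
        str[j] = tmp;
    }
    return str;
}

}  // namespace sketch

// Reverses the whole buffer, then only its first three characters.
inline void reverseExample() {
    char text[] = "abcdef";
    sketch::reverseString(text);
    assert(std::strcmp(text, "fedcba") == 0);
    sketch::reverseString(text, 3);
    assert(std::strcmp(text, "defcba") == 0);
}
```

Passing a nonzero count reverses only a prefix and leaves the rest of the buffer, including the terminator, untouched.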
◆ utf8Len()
| size_t toy::utf8Len (const char *string) | nodiscard noexcept |
Returns the number of Unicode code points in a UTF-8 encoded C string.
Parses UTF-8 sequences in string and counts code points. Stops at the first null byte. Invalid sequences cause the function to return 0.
- Parameters
-
| string | Source UTF-8 encoded C string (may be null). |
- Returns
- Number of Unicode code points, or 0 if string is nullptr or contains invalid UTF-8.
- Note
- Each multi-byte sequence (2–3 bytes) counts as one code point. Only BMP code points are supported.
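To make the counting rule concrete, here is a rough sketch of a BMP-only code point counter that follows the contract above (0 for nullptr or invalid input). The `sketch` namespace and the `utf8LenExample` helper are hypothetical, and the sketch does not reject overlong encodings; the real function may validate more strictly.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical stand-in for toy::utf8Len; not the library's source.
// BMP only: 1-3 byte sequences. Returns 0 for nullptr or invalid UTF-8.
inline std::size_t utf8Len(const char* string) noexcept {
    if (string == nullptr) return 0;
    std::size_t codePoints = 0;
    const auto* p = reinterpret_cast<const unsigned char*>(string);
    while (*p != 0) {
        std::size_t extra = 0;                      // continuation bytes expected
        if (*p < 0x80)                extra = 0;    // 0xxxxxxx
        else if ((*p & 0xE0) == 0xC0) extra = 1;    // 110xxxxx
        else if ((*p & 0xF0) == 0xE0) extra = 2;    // 1110xxxx
        else return 0;                              // stray continuation or 4-byte lead
        ++p;
        for (std::size_t i = 0; i < extra; ++i, ++p) {
            // A premature '\0' also fails this check, so we never read past the end.
            if ((*p & 0xC0) != 0x80) return 0;
        }
        ++codePoints;
    }
    return codePoints;
}

}  // namespace sketch

inline void utf8LenExample() {
    // "h" (1 byte) + U+00E9 (2 bytes) + U+20AC (3 bytes): 3 code points in 6 bytes.
    assert(sketch::utf8Len("h\xC3\xA9\xE2\x82\xAC") == 3);
    assert(sketch::utf8Len(nullptr) == 0);
    assert(sketch::utf8Len("\x80") == 0);  // lone continuation byte is invalid
}
```

Because 0 is returned both for an empty string and for invalid input, a caller that needs to tell the two apart has to check for emptiness separately.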
◆ utf8toWChar() [1/3]
| wchar_t * toy::utf8toWChar (wchar_t *dest, size_t destSize, const char *src) | constexpr noexcept |
Converts a UTF-8 C string to a wide-character string (full source length).
Same as utf8toWChar(dest, destSize, src, count) with count set to the length of src. Stops when the source ends or dest is full. BMP-only; invalid UTF-8 sequences are skipped.
- Parameters
-
| dest | Destination buffer for the wide-character string. |
| destSize | Size of dest in wide characters (not bytes). |
| src | Source UTF-8 encoded C string. |
- Returns
- Pointer to dest, or nullptr if dest or destSize is invalid.
- Precondition
- dest points to a valid buffer with capacity destSize; src is valid UTF-8.
- Postcondition
- On success, dest is null-terminated. On overflow or invalid input, returns nullptr.
- See also
- utf8toWChar(dest, destSize, src, count)
◆ utf8toWChar() [2/3]
| wchar_t * toy::utf8toWChar (wchar_t *dest, size_t destSize, const char *src, size_t count) | noexcept |
Converts a UTF-8 C string to a wide-character string with a character count limit.
Writes the converted wide-character string into dest. Stops when count source bytes have been processed, dest is full, or the source ends. Only BMP (≤ 0xFFFF) is supported.
- Parameters
-
| dest | Destination buffer for the wide-character string. |
| destSize | Size of dest in wide characters (not bytes). |
| src | Source UTF-8 encoded C string. |
| count | Maximum number of source bytes to process. |
- Returns
- Pointer to dest, or nullptr if dest or destSize is invalid.
- Precondition
- dest points to a valid buffer with capacity destSize; src is valid UTF-8.
-
count does not exceed the length of src in bytes (if bounded).
- Postcondition
- On success, dest is null-terminated. On overflow or invalid input, returns nullptr.
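The decoding loop behind this overload might look roughly like the sketch below. It is not the library's implementation. The description leaves some behavior open, so the sketch makes these assumptions explicit: count limits source bytes, malformed or unsupported bytes are skipped one at a time, and output is truncated (but still null-terminated) when dest fills up. The `sketch` namespace and `utf8toWCharExample` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace sketch {

// Hypothetical stand-in for the four-argument toy::utf8toWChar.
// Assumptions: count is a byte limit, bad bytes are skipped, overflow truncates.
inline wchar_t* utf8toWChar(wchar_t* dest, std::size_t destSize,
                            const char* src, std::size_t count) noexcept {
    if (dest == nullptr || destSize == 0) return nullptr;
    const auto* s = reinterpret_cast<const unsigned char*>(src);
    std::size_t out = 0;  // wide characters written so far
    std::size_t i = 0;    // source bytes consumed so far
    // "out + 1 < destSize" reserves one slot for the terminator.
    while (s != nullptr && i < count && s[i] != 0 && out + 1 < destSize) {
        const unsigned char lead = s[i];
        if (lead < 0x80) {                                      // 1-byte sequence
            dest[out++] = static_cast<wchar_t>(lead);
            i += 1;
        } else if ((lead & 0xE0) == 0xC0 && i + 1 < count &&
                   (s[i + 1] & 0xC0) == 0x80) {                 // 2-byte sequence
            dest[out++] = static_cast<wchar_t>(((lead & 0x1F) << 6) |
                                               (s[i + 1] & 0x3F));
            i += 2;
        } else if ((lead & 0xF0) == 0xE0 && i + 2 < count &&
                   (s[i + 1] & 0xC0) == 0x80 &&
                   (s[i + 2] & 0xC0) == 0x80) {                 // 3-byte sequence
            dest[out++] = static_cast<wchar_t>(((lead & 0x0F) << 12) |
                                               ((s[i + 1] & 0x3F) << 6) |
                                               (s[i + 2] & 0x3F));
            i += 3;
        } else {
            i += 1;  // malformed byte or non-BMP lead: skip it
        }
    }
    dest[out] = L'\0';
    return dest;
}

}  // namespace sketch

inline void utf8toWCharExample() {
    const char* src = "h\xC3\xA9\xE2\x82\xAC";  // "h", U+00E9, U+20AC
    wchar_t wide[8];
    assert(sketch::utf8toWChar(wide, 8, src, std::strlen(src)) == wide);
    assert(wide[0] == L'h' && wide[1] == static_cast<wchar_t>(0xE9));
    assert(wide[2] == static_cast<wchar_t>(0x20AC) && wide[3] == L'\0');
}
```

Since each BMP code point becomes exactly one wide character, a destination with one slot per source byte plus one for the terminator is always large enough.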
◆ utf8toWChar() [3/3]
template<StringLike T>
| wchar_t * toy::utf8toWChar (wchar_t *dest, size_t destSize, const T &src) | constexpr noexcept |
Converts a UTF-8 toy::StringLike object to a wide-character string.
Converts the UTF-8 content of src (via c_str()) into dest. Stops when the source ends or dest is full. BMP-only; invalid UTF-8 sequences are skipped.
- Template Parameters
-
| T | Type satisfying the toy::StringLike concept. |
- Parameters
-
| dest | Destination buffer for the wide-character string. |
| destSize | Size of dest in wide characters (not bytes). |
| src | Source object with UTF-8 encoded content. |
- Returns
- Pointer to dest, or nullptr if dest or destSize is invalid.
- Precondition
- dest points to a valid buffer with capacity destSize; src.c_str() returns valid UTF-8.
- Postcondition
- On success, dest is null-terminated. On overflow or invalid input, returns nullptr.
- See also
- utf8toWChar (C string overloads)
◆ wcharToUtf8()
| char * toy::wcharToUtf8 (char *dest, size_t destSize, const wchar_t *src) | noexcept |
Converts a wide-character C string to UTF-8.
Writes the UTF-8 encoding of src into dest. Stops when the source ends or dest is full. Each wide character may produce 1–3 UTF-8 bytes.
- Parameters
-
| dest | Destination buffer for the UTF-8 string. |
| destSize | Size of dest in bytes (not wide characters). |
| src | Source wide-character C string, or nullptr to write an empty UTF-8 string. |
- Returns
- Pointer to dest, or nullptr only if dest is nullptr or destSize is 0.
- Precondition
- dest points to a valid writable buffer of at least destSize bytes (the function always writes at least a null terminator when it returns a non-null pointer).
-
If src is not nullptr, it must point to a null-terminated wide-character string.
-
When src is not nullptr, destSize must account for possible UTF-8 expansion (e.g. 3× wide length for BMP).
- Postcondition
- If src is nullptr, writes '\0' to *dest (empty UTF-8) and returns dest.
-
If src is not nullptr, dest is null-terminated UTF-8; output may be truncated if the buffer fills before the wide string ends.
- See also
- utf8toWChar
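The encoding direction can be sketched in the same spirit. The version below follows the documented contract (nullptr only for a null dest or zero destSize, an empty result for a null src, truncation when the buffer fills). Two details are assumptions that the text does not state: truncation stops before a character that would not fit whole, and values above 0xFFFF are skipped. The `sketch` namespace and `wcharToUtf8Example` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace sketch {

// Hypothetical stand-in for toy::wcharToUtf8; not the library's source.
inline char* wcharToUtf8(char* dest, std::size_t destSize,
                         const wchar_t* src) noexcept {
    if (dest == nullptr || destSize == 0) return nullptr;
    std::size_t out = 0;  // bytes written so far
    for (; src != nullptr && *src != L'\0'; ++src) {
        const auto cp = static_cast<unsigned long>(*src);
        if (cp > 0xFFFF) continue;  // assumption: non-BMP values are skipped
        const std::size_t need = cp < 0x80 ? 1 : (cp < 0x800 ? 2 : 3);
        // Assumption: never split a sequence; keep one byte for '\0'.
        if (out + need >= destSize) break;
        if (need == 1) {
            dest[out++] = static_cast<char>(cp);
        } else if (need == 2) {
            dest[out++] = static_cast<char>(0xC0 | (cp >> 6));
            dest[out++] = static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            dest[out++] = static_cast<char>(0xE0 | (cp >> 12));
            dest[out++] = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            dest[out++] = static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    dest[out] = '\0';
    return dest;
}

}  // namespace sketch

inline void wcharToUtf8Example() {
    char utf8[16];
    sketch::wcharToUtf8(utf8, sizeof utf8, L"h\u00E9\u20AC");
    assert(std::strcmp(utf8, "h\xC3\xA9\xE2\x82\xAC") == 0);  // 1 + 2 + 3 bytes

    char small[4];
    sketch::wcharToUtf8(small, sizeof small, L"h\u00E9\u20AC");
    assert(std::strcmp(small, "h\xC3\xA9") == 0);  // U+20AC did not fit

    sketch::wcharToUtf8(utf8, sizeof utf8, nullptr);
    assert(utf8[0] == '\0');  // null source yields an empty string
}
```

Stopping before a partial sequence keeps a truncated result valid UTF-8, at the cost of leaving up to two bytes of the buffer unused.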
◆ WCHAR_IN_UTF8_MAX_SIZE
| size_t toy::WCHAR_IN_UTF8_MAX_SIZE = 3 | constexpr |
Maximum UTF-8 byte length for one BMP character.
One wide character in the BMP (≤ 0xFFFF) encodes to at most 3 UTF-8 bytes.
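One plausible use of this constant is sizing a UTF-8 buffer for the worst case before calling wcharToUtf8. In the sketch below, the `sketch` namespace holds a local copy of the documented value, and `utf8BufferSize` is a hypothetical helper that is not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

namespace sketch {

// Local stand-in mirroring the documented value of toy::WCHAR_IN_UTF8_MAX_SIZE.
constexpr std::size_t WCHAR_IN_UTF8_MAX_SIZE = 3;

// Hypothetical helper: worst-case UTF-8 buffer size in bytes, including the
// null terminator, for a BMP-only wide string.
inline std::size_t utf8BufferSize(const wchar_t* src) noexcept {
    const std::size_t wideLen = (src != nullptr) ? std::wcslen(src) : 0;
    return wideLen * WCHAR_IN_UTF8_MAX_SIZE + 1;
}

}  // namespace sketch

inline void bufferSizeExample() {
    // Three wide characters need at most 3 * 3 bytes plus one for '\0'.
    assert(sketch::utf8BufferSize(L"abc") == 10);
    // A null source still needs room for the terminator.
    assert(sketch::utf8BufferSize(nullptr) == 1);
}
```

A buffer of this size cannot be truncated for BMP input, so the result can be used without checking for a shortened string.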