Names specified here
Name Description Notes Source Availability
__STDC_MB_MIGHT_NEQ_WC__ Indicator for lack of guaranteed equivalence of multibyte and wide characters in the basic character set L ? M Predefined C11
btowc() Convert multibyte character to wide character (·) <wchar.h> C95 C99 C11
c16rtomb() Convert 16-bit Unicode character to multibyte character (·) <uchar.h> C11
c32rtomb() Convert 32-bit Unicode character to multibyte character (·) <uchar.h> C11
EILSEQ Indicator of illegal multibyte sequence M <errno.h> C95 C99 C11
MB_CUR_MAX Maximum multibyte character length in current locale M <stdlib.h> C89 C90 C95 C99 C11
MB_LEN_MAX Maximum multibyte character length L M <limits.h> C89 C90 C95 C99 C11
mblen() Compute multibyte character length (·) <stdlib.h> C89 C90 C95 C99 C11
mbrlen() Compute multibyte character length (·) <wchar.h> C95 C99 C11
mbrtoc16() Convert multibyte character to 16-bit Unicode character (·) <uchar.h> C11
mbrtoc32() Convert multibyte character to 32-bit Unicode character (·) <uchar.h> C11
mbrtowc() Convert multibyte character to wide character (·) <wchar.h> C95 C99 C11
mbsinit() Test for initial state (·) <wchar.h> C95 C99 C11
mbsrtowcs() Convert multibyte string to wide string (·) <wchar.h> C95 C99 C11
mbsrtowcs_s() Convert multibyte string to wide string (·) <wchar.h> C11
mbstate_t Multibyte conversion state T Headers C95 C99 C11
mbstowcs() Convert multibyte string to wide string (·) <stdlib.h> C89 C90 C95 C99 C11
mbstowcs_s() Convert multibyte string to wide string (·) <stdlib.h> C11
mbtowc() Convert multibyte character to wide character (·) <stdlib.h> C89 C90 C95 C99 C11
wcrtomb() Convert wide character to multibyte character (·) <wchar.h> C95 C99 C11
wcrtomb_s() Convert wide character to multibyte character ? (·) <wchar.h> C11
wcsrtombs() Convert wide string to multibyte string (·) <wchar.h> C95 C99 C11
wcsrtombs_s() Convert wide string to multibyte string ? (·) <wchar.h> C11
wcstombs() Convert wide string to multibyte string (·) <stdlib.h> C89 C90 C95 C99 C11
wcstombs_s() Convert wide string to multibyte string ? (·) <stdlib.h> C11
wctob() Convert wide character to multibyte character (·) <wchar.h> C95 C99 C11
wctomb() Convert wide character to multibyte character (·) <stdlib.h> C89 C90 C95 C99 C11
wctomb_s() Convert wide character to multibyte character ? (·) <stdlib.h> C11

Arrays of type char are normally used to hold strings using a multibyte encoding, in which:

Arrays of char can also hold strings encoded in UTF-8. The other element types used to hold strings are:

The encodings of char and wchar_t are locale-dependent, according to the locale category LC_CTYPE. However, guarantees are made about characters in the basic execution character set:

Several functions are provided to convert between the multibyte encoding and other types. MB_CUR_MAX is a size_t expression yielding the maximum number of bytes required to represent any supported character and a preceding shift sequence, in the current locale. It is never bigger than the constant expression MB_LEN_MAX in any supported locale.
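As a rough illustration of how the two limits are typically used: MB_LEN_MAX is a constant expression, so it can size a fixed array, while MB_CUR_MAX is only known at run time, so it can size a dynamic allocation. The helper names encode_one and alloc_char_buffer below are hypothetical, chosen for this sketch only.

```c
#include <limits.h>   /* MB_LEN_MAX */
#include <stdlib.h>   /* MB_CUR_MAX, wctomb, malloc, wchar_t */

/* Hypothetical helper: encode one wide character into a buffer of
   MB_LEN_MAX bytes, which is large enough in every supported locale.
   Returns the number of bytes written, or -1 if wc cannot be encoded. */
int encode_one(char buf[MB_LEN_MAX], wchar_t wc)
{
    return wctomb(buf, wc);
}

/* Hypothetical helper: allocate a buffer just large enough for one
   character in the current locale.  MB_CUR_MAX is a run-time value,
   so it cannot be used to size a static array. */
char *alloc_char_buffer(void)
{
    return malloc(MB_CUR_MAX);
}
```

A caller would normally declare char buf[MB_LEN_MAX] once and reuse it for each character.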

Some byte sequences might be invalid, not corresponding to any multibyte character or shift sequence. A number of functions indicate this error with EILSEQ, either as a return value or by setting errno.

Multibyte conversion state

Conversion state may have to be retained after each conversion of a character, especially if shift sequences exist. Every string literal begins and ends in an initial conversion state.

The type mbstate_t records conversion state while converting between multibyte sequences and wide characters in either direction. You should initialize one like this:

mbstate_t mbs;
memset(&mbs, 0, sizeof mbs);

…or like this:

mbstate_t mbs = { 0 };

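…or, only on implementations where mbstate_t is a scalar type (it is often a structure, so this form is not portable), like this: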

mbstate_t mbs = 0;

That puts it into the initial conversion state. You can test for this state with this:

#include <wchar.h>
int mbsinit(const mbstate_t *mbs);

mbsinit returns non-zero if the referenced object is in the initial state. mbsinit(NULL) also returns non-zero.
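A minimal sketch of initializing a state and checking it, assuming the memset form of initialization. The helper names reset_state and fresh_state_is_initial are hypothetical.

```c
#include <string.h>   /* memset */
#include <wchar.h>    /* mbstate_t, mbsinit */

/* Hypothetical helper: put *mbs into the initial conversion state. */
void reset_state(mbstate_t *mbs)
{
    memset(mbs, 0, sizeof *mbs);
}

/* Hypothetical helper: returns 1 if a freshly zeroed state is
   reported by mbsinit as being the initial state. */
int fresh_state_is_initial(void)
{
    mbstate_t mbs;

    reset_state(&mbs);
    return mbsinit(&mbs) != 0;
}
```

Resetting a state this way is also a simple way to abandon a conversion that failed part-way through a string.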

Conversion from multibyte to wide characters

#include <wchar.h>
wint_t btowc(int src);

btowc converts the 1-byte character in src to a wide character, and returns it. It returns WEOF if src is EOF or cannot be converted.
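For instance, btowc can be used to ask whether a single byte is a complete character by itself. This is only a sketch; the helper name is_single_byte_char is hypothetical.

```c
#include <stdio.h>    /* EOF */
#include <wchar.h>    /* btowc, wint_t, WEOF */

/* Hypothetical helper: returns 1 if the byte c is, on its own, a
   complete character in the initial conversion state of the current
   locale, or 0 otherwise. */
int is_single_byte_char(unsigned char c)
{
    return btowc(c) != WEOF;
}
```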

#include <stdlib.h>
int mblen(const char *src, size_t n);
#include <wchar.h>
size_t mbrlen(const char *src,
              size_t n,
              mbstate_t *mbs);

mblen(NULL, anything) returns non-zero if the multibyte encoding is state-dependent (i.e., it uses shift sequences), or zero otherwise. When src points to a character array of at least n bytes, it returns the length of the first multibyte character in bytes, according to the current locale, assuming an initial conversion state. The returned length is never greater than n or MB_CUR_MAX. If n is zero, or src[0] is a null character, the function returns zero. If the initial bytes of the array do not form a valid character, -1 is returned.

mbrlen works like mblen, except that it uses an external conversion state *mbs, and returns a size_t, so an invalid sequence is reported as (size_t) -1, and an incomplete one as (size_t) -2. It also has its own internal state, which is used if mbs is null.
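A typical use of mbrlen is stepping through a string one character at a time. The sketch below counts characters that way; the helper name count_chars is hypothetical, and the (size_t) -1 and (size_t) -2 results it checks for are the error returns the C standard specifies for mbrlen.

```c
#include <string.h>   /* memset, strlen */
#include <wchar.h>    /* mbrlen, mbstate_t */

/* Hypothetical helper: count the multibyte characters in the
   null-terminated string s, according to the current locale.
   Returns (size_t) -1 if s holds an invalid or truncated sequence. */
size_t count_chars(const char *s)
{
    mbstate_t mbs;
    size_t remaining = strlen(s);
    size_t count = 0;

    memset(&mbs, 0, sizeof mbs);          /* initial conversion state */
    while (remaining > 0) {
        size_t len = mbrlen(s, remaining, &mbs);

        if (len == (size_t) -1 || len == (size_t) -2)
            return (size_t) -1;           /* invalid or incomplete */
        if (len == 0)
            break;                        /* null character reached */
        s += len;
        remaining -= len;
        count++;
    }
    return count;
}
```

Using an explicit mbstate_t rather than mblen's hidden state keeps the function safe to call while another conversion is in progress elsewhere.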

#include <stdlib.h>
int mbtowc(wchar_t *out,
           const char *src,
           size_t n);

mbtowc does the same as mblen, but additionally writes the converted character to *out, if out is not null. It returns the number of bytes consumed, which is never greater than n or MB_CUR_MAX. The function keeps its own internal conversion state, which can be reset with mbtowc(NULL, NULL, 0).
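A small sketch of decoding just the first character of a string with mbtowc, resetting the internal state first as described above. The helper name first_char is hypothetical.

```c
#include <stdlib.h>   /* mbtowc, MB_CUR_MAX, wchar_t */

/* Hypothetical helper: decode the first character of the
   null-terminated string s into *out.  Returns the number of bytes
   consumed, 0 if s is empty, or -1 if the leading bytes do not form
   a valid character. */
int first_char(const char *s, wchar_t *out)
{
    mbtowc(NULL, NULL, 0);                /* reset the internal state */
    return mbtowc(out, s, MB_CUR_MAX);    /* no character is longer */
}
```

Passing MB_CUR_MAX as the limit is safe for a null-terminated string, because conversion stops at the null byte if it is reached first.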

#include <wchar.h>
size_t mbrtowc(wchar_t *out,
               const char *src,
               size_t n,
               mbstate_t *mbs);

mbrtowc is almost identical to mbtowc, except that it uses an external conversion state *mbs, or its own internal state if mbs is null. It returns zero if the input is the null character, optionally prefixed by a shift sequence that resets the conversion state. If the first n bytes of src form only the beginning of a multibyte character, it returns (size_t) -2. If they cannot form a valid multibyte character, it sets errno to EILSEQ, and returns (size_t) -1.
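The return values described above suggest a decoding loop along the following lines. This is a sketch, not a replacement for mbsrtowcs; the helper name decode_string and its error convention are assumptions made for illustration.

```c
#include <string.h>   /* memset, strlen */
#include <wchar.h>    /* mbrtowc, mbstate_t */

/* Hypothetical helper: decode the null-terminated multibyte string src
   into dst, which has room for dstlen wide characters including the
   terminator.  Returns the number of wide characters written, not
   counting the terminator, or (size_t) -1 if the input is invalid or
   truncated, or if dst is too small. */
size_t decode_string(wchar_t *dst, size_t dstlen, const char *src)
{
    mbstate_t mbs;
    size_t avail = strlen(src) + 1;       /* include the null byte */
    size_t written = 0;

    memset(&mbs, 0, sizeof mbs);          /* initial conversion state */
    while (written < dstlen) {
        size_t rc = mbrtowc(&dst[written], src, avail, &mbs);

        if (rc == (size_t) -1)
            return (size_t) -1;           /* invalid: errno is EILSEQ */
        if (rc == (size_t) -2)
            return (size_t) -1;           /* truncated final character */
        if (rc == 0)
            return written;               /* terminator stored in dst */
        src += rc;
        avail -= rc;
        written++;
    }
    return (size_t) -1;                   /* dst too small */
}
```

A caller that reads input in chunks would treat (size_t) -2 differently: it would keep the state, fetch more bytes and call mbrtowc again, rather than giving up.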

#include <stdlib.h>
size_t mbstowcs(wchar_t *dst,
                const char *src,
                size_t dstlen);
errno_t mbstowcs_s(size_t *retval,
                   wchar_t *dst,
                   rsize_t dstmax,
                   const char *src,
                   rsize_t dstlen);
#include <wchar.h>
size_t mbsrtowcs(wchar_t *dst,
                 const char **srcp,
                 size_t dstlen,
                 mbstate_t *mbs);
errno_t mbsrtowcs_s(size_t *retval,
                    wchar_t *dst,
                    rsize_t dstmax,
                    const char **srcp,
                    rsize_t dstlen,
                    mbstate_t *mbs);
[ Work in progress ]

Conversion from wide to multibyte characters

#include <wchar.h>
int wctob(wint_t src);

wctob attempts to convert the wide character src into a 1-byte character, and return it. It returns EOF if src is WEOF or cannot be converted.
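As a brief illustration, wctob can test whether a wide character has a single-byte form in the current locale. The helper name fits_in_one_byte is hypothetical.

```c
#include <stdio.h>    /* EOF */
#include <wchar.h>    /* wctob, wint_t, WEOF */

/* Hypothetical helper: returns 1 if wc is represented by exactly one
   byte in the initial conversion state of the current locale, or 0
   otherwise. */
int fits_in_one_byte(wchar_t wc)
{
    return wctob((wint_t) wc) != EOF;
}
```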

#include <stdlib.h>
int wctomb(char *dst, wchar_t src);
#include <wchar.h>
size_t wcrtomb(char *dst,
               wchar_t src,
               mbstate_t *mbs);
errno_t wcrtomb_s(size_t *retval,
                  char *dst,
                  rsize_t dstlen,
                  wchar_t src,
                  mbstate_t *mbs);
#include <stdlib.h>
size_t wcstombs(char *dst,
                const wchar_t *src,
                size_t n);
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdlib.h>
errno_t wctomb_s(int *retval,
                 char *dst,
                 rsize_t dstlen,
                 wchar_t src);

These functions convert the wide character src into a multibyte character, according to the current locale and a conversion state. wcrtomb and wcrtomb_s use an external conversion state *mbs. wctomb and wctomb_s each use their own internal states. wcrtomb also has an internal state, which is used if mbs is null. Whichever state is used, if src is a null character, the state is reset to the initial state.

If dst is null, wctomb and wctomb_s indicate whether multibyte encodings in the current locale are state-dependent with a non-zero result. wctomb returns that result, while wctomb_s writes it into *retval.

If dst is null, wcrtomb and wcrtomb_s reset their conversion states. wcrtomb then returns the number of bytes required for a shift sequence that would reset the conversion state, plus one for the terminating null byte, while wcrtomb_s writes the number into *retval.

If dst is not null, and the wide character is valid and has a multibyte equivalent, all functions write the equivalent bytes to dst, including any shift sequence corresponding to a change in conversion state. wctomb and wcrtomb return the number of bytes written, while wctomb_s and wcrtomb_s write the number into *retval. No function writes more than MB_CUR_MAX bytes to dst for the current locale, and no more than MB_LEN_MAX for any supported locale.

Additionally, wctomb_s and wcrtomb_s interpret non-null dst as an array of dstlen bytes. If that is not sufficient to represent the character and any necessary preceding shift sequences, wctomb_s returns non-zero, while wcrtomb_s sets *retval to (size_t) -1 and returns non-zero.

The null wide character is converted to any shift sequences required to return to the initial state, plus a null byte. This final null byte is included in the count of bytes written or returned.

If src is not a valid wide character, or does not correspond to a valid multibyte character, wctomb returns -1, wcrtomb sets errno to EILSEQ and returns (size_t) -1, wctomb_s sets *retval to -1 and returns non-zero, and wcrtomb_s sets *retval to (size_t) -1 and returns non-zero.

| | wctomb | wcrtomb | wctomb_s | wcrtomb_s |
| --- | --- | --- | --- | --- |
| Conversion state | Internal | *mbs, or internal if mbs is null | Internal | *mbs |
| Meaning of byte count when dst is null | Non-zero indicates state-dependence | Number of bytes required to return to initial conversion state | Non-zero indicates state-dependence | Number of bytes required to return to initial conversion state |
| Provision of byte count | Return value | Return value | *retval | *retval |
| Other effects if dst is null | None | Conversion state reset | None | Conversion state reset |
| Invalid wide character | Return -1 | Set errno to EILSEQ and return (size_t) -1 | Set *retval to -1 and return non-zero | Set *retval to (size_t) -1 and return non-zero |
| Buffer overflow | Not detected | Not detected | Return non-zero | Set dst[0] to zero if dstlen greater than 0 and less than RSIZE_MAX, set *retval to (size_t) -1 if retval not null, and return non-zero |
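Since wcrtomb does not detect buffer overflow itself, one way to use it safely is to convert each character into a scratch array of MB_LEN_MAX bytes and copy it out only if it fits. The sketch below does that; the helper name encode_string and its error convention are hypothetical, and it relies on wcrtomb counting the terminating null byte in its return value, as the C standard specifies.

```c
#include <limits.h>   /* MB_LEN_MAX */
#include <string.h>   /* memcpy, memset */
#include <wchar.h>    /* wcrtomb, mbstate_t */

/* Hypothetical helper: encode the null-terminated wide string src into
   dst, which holds dstlen bytes.  Returns the number of bytes written,
   not counting the terminating null byte, or (size_t) -1 if a character
   cannot be converted or dst is too small. */
size_t encode_string(char *dst, size_t dstlen, const wchar_t *src)
{
    mbstate_t mbs;
    size_t used = 0;

    memset(&mbs, 0, sizeof mbs);          /* initial conversion state */
    for (;;) {
        char tmp[MB_LEN_MAX];             /* enough for one character */
        size_t rc = wcrtomb(tmp, *src, &mbs);

        if (rc == (size_t) -1)
            return (size_t) -1;           /* errno is EILSEQ */
        if (rc > dstlen - used)
            return (size_t) -1;           /* would overflow dst */
        memcpy(dst + used, tmp, rc);
        if (*src == L'\0')
            return used + rc - 1;         /* drop the null byte */
        used += rc;
        src++;
    }
}
```

Where the bounds-checked interfaces are available, wcrtomb_s performs the length check itself, at the cost of a more awkward calling convention.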

Conversion from multibyte characters to UTF-16 and UTF-32

#include <uchar.h>
size_t mbrtoc16(char16_t *pc,
                const char *s,
                size_t n,
                mbstate_t *mbs);
size_t mbrtoc32(char32_t *pc,
                const char *s,
                size_t n,
                mbstate_t *mbs);
[ Work in progress ]

Conversion from UTF-16 and UTF-32 to multibyte characters

#include <uchar.h>
size_t c16rtomb(char *s,
                char16_t c,
                mbstate_t *mbs);
size_t c32rtomb(char *s,
                char32_t c,
                mbstate_t *mbs);
[ Work in progress ]

Missing

  • wcsrtombs_s
  • wcsrtombs
  • wcstombs_s

Page updated 2022-06-17T21:43:05.000+0000