A Java variable of type char can hold any 16-bit Unicode character. In C, the char type can represent any character in a character set that depends on the system or platform for which the program is compiled. This is usually a variation of US ASCII, but it doesn't have to be, so beware. In particular, it could be a multibyte encoding such as UTF-8, where a larger set of characters is represented by sequences of several char objects. A basic set of characters, however, is always represented as single chars.
Java strings are objects of class String or StringBuilder, and represent sequences of char. Strings in C are just arrays of, or pointers to, char, and don't exist as a formal type. Functions that handle strings typically assume that the string is terminated with a null character '\0', rather than being passed a length parameter. A character array can be initialised like other arrays:
char word[] = { 'H', 'e', 'l', 'l', 'o', '!', '\0' };
char another[] = "Hello!";
Note that the second initialiser is a shorter form of the first, including the terminating null character. Such a string literal can also appear in an expression. It evaluates to a pointer to the first character.
const char *ptr;
ptr = "Hello!";
ptr now points to an anonymous, statically allocated array of characters. Attempting to write to a string literal like this has undefined behaviour, so the use of const ensures that such attempts are detected while compiling.
Utilities for handling character strings are declared in <string.h>. For example, the function to copy a string from one place to another is declared as:
char *strcpy(char *to, const char *from);
…and may be used like this:
#include <string.h>

char words[100];
strcpy(words, "Madam, I'm Adam.");
Like many of the other <string.h> functions, strcpy assumes that you have already allocated sufficient space to store the string.