NAME
ctype - character classification and mapping functions
LIBRARY
library “libc”
SYNOPSIS
#include <ctype.h>
int isalpha(int c);
int isupper(int c);
int islower(int c);
int isdigit(int c);
int isxdigit(int c);
int isalnum(int c);
int isspace(int c);
int ispunct(int c);
int isprint(int c);
int isgraph(int c);
int iscntrl(int c);
int isblank(int c);
int toupper(int c);
int tolower(int c);
DESCRIPTION
The above functions perform character tests and conversions on the integer c. See the specific manual pages for information about the test or conversion performed by each function.
EXAMPLES
To print an upper-case version of a string to stdout, the following code can be used:
    const char *s = "xyz";

    while (*s != '\0') {
        putchar(toupper((unsigned char)*s));
        s++;
    }
SEE ALSO
isalnum(3), isalpha(3), isblank(3), iscntrl(3), isdigit(3), isgraph(3), islower(3), isprint(3), ispunct(3), isspace(3), isupper(3), isxdigit(3), tolower(3), toupper(3), ascii(7)
STANDARDS
These functions, with the exception of isblank(), conform to ANSI
X3.159-1989 (“ANSI C89”). All described functions, including
isblank(), also conform to IEEE Std 1003.1-2001 (“POSIX.1”).
CAVEATS
The argument of these functions is of type int, but only a very
restricted subset of values is actually valid. The argument must
either be the value of the macro EOF (which has a negative value), or
must be a non-negative value within the range representable as
unsigned char. Passing invalid values leads to undefined behavior.
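The range rule above can be written down as a predicate. The following
is only a sketch: the name is_valid_ctype_arg is hypothetical and is
not part of libc or of any standard.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of libc: report whether c may be
 * passed to a ctype function without undefined behavior.
 */
bool
is_valid_ctype_arg(int c)
{

	/* EOF is explicitly allowed; otherwise c must fit unsigned char. */
	return c == EOF || (c >= 0 && c <= UCHAR_MAX);
}
```

Such a check is mainly useful in assertions or tests; ordinary code
should establish the range by conversion rather than by checking it.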
Values of type int that were returned by getc(3), fgetc(3), and
similar functions or macros are already in the correct range, and may
be safely passed to these ctype functions without any casts.
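As an illustration of the stream case, the following sketch counts the
alphabetic characters remaining in a stream. The function name
count_alpha is hypothetical; the point is that the result of getc(3)
is stored in an int and passed on unchanged.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: count alphabetic characters remaining in fp. */
size_t
count_alpha(FILE *fp)
{
	size_t n = 0;
	int c;	/* int, not char, so that EOF stays distinguishable */

	while ((c = getc(fp)) != EOF) {
		/* c is already in the unsigned char range: no cast. */
		if (isalpha(c))
			n++;
	}
	return n;
}
```

Storing the result of getc(3) in a char instead would reintroduce the
problem, since the value would then need the cast described below.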
Values of type char or signed char must first be cast to unsigned
char, to ensure that the values are within the correct range. Casting
a negative-valued char or signed char directly to int will produce a
negative-valued int, which will be outside the range of allowed values
(unless it happens to be equal to EOF, but even that would not give
the desired result).
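For strings, the cast is applied to each element before the call. The
following is a minimal sketch; count_upper is a hypothetical name
chosen for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count upper-case characters in a C string. */
size_t
count_upper(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++) {
		/* Cast first: *s may be negative where char is signed. */
		if (isupper((unsigned char)*s))
			n++;
	}
	return n;
}
```

Writing (int)*s instead of (unsigned char)*s would not help, because
that conversion preserves the negative value.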
Because such bugs may manifest as silent misbehavior or as crashes
only when the program is fed input outside the US-ASCII range, the
NetBSD implementation of the ctype functions is designed to elicit a
compiler warning for code that passes arguments of type char. The
warning flags code that may pass negative values at runtime, which
would lead to undefined behavior:
    #include <ctype.h>
    #include <locale.h>
    #include <stdio.h>

    int
    main(int argc, char **argv)
    {

        if (argc < 2)
            return 1;
        setlocale(LC_ALL, "");
        printf("%d %d\n", *argv[1], isprint(*argv[1]));
        printf("%d %d\n", (int)(unsigned char)*argv[1],
            isprint((unsigned char)*argv[1]));
        return 0;
    }
When compiling this program, GCC reports a warning for the line that passes char. At runtime, the call without the cast may give nonsense answers for some inputs, and that is if you are lucky and it does not crash:
    % gcc -Wall -o test test.c
    test.c: In function 'main':
    test.c:12:2: warning: array subscript has type 'char'
    % LC_CTYPE=C ./test $(printf '\270')
    -72 5
    184 0
    % LC_CTYPE=C ./test $(printf '\377')
    -1 0
    255 0
    % LC_CTYPE=fr_FR.ISO8859-1 ./test $(printf '\377')
    -1 0
    255 2
Some implementations of libc, such as glibc as of 2018, attempt to
avoid the worst of the undefined behavior by defining the functions to
work for all integer inputs representable by either unsigned char or
char, and suppress the warning. However, this is not an excuse for
avoiding conversion to unsigned char: if EOF coincides with any such
value, as it does when it is -1 on platforms with signed char,
programs that pass char will still necessarily confuse the
classification and mapping of EOF with the classification and mapping
of some non-EOF inputs.