title: C23 implications for C libraries
author: Jens Gustedt
self_contained: yes
date: "${DATE}"
output:
html_document:
toc_depth: 4
documentclass: scrartcl
header-includes: |
\def\thesection{}
\def\thesubsection{\arabic{subsection}}
\usepackage{listings-modernC}
\usepackage{listings-C}
\lstloadlanguages{C11,C99}
\lstset{
language=[errnoPOSIX]{C},
language=[tgmath]{C},
language=[threads]{C},
language=[stdatomic]{C},
language=[boundschecking]{C},
language=[99]{C},
language={C11},
style=modernC,
basicstyle=\footnotesize\ttfamily,
}
classoption: 10pt
papersize: a4
Introduction
The upcoming standard C23 has a lot of additions to the C library clause. Most of them are small; some of them are big but optional.
This document only marginally covers the latter, namely changes to floating point types and their implications for the C library, simply because that is not my domain of expertise.
The last publicly available working draft is https://open-std.org/JTC1/SC22/WG14/www/docs/n3096.pdf and section numbers below refer to that document. But this will not be the final version of C23. Against that document 239 national body comments had been raised to which WG14 answered in their June 2023 session https://open-std.org/JTC1/SC22/WG14/www/docs/n3148.doc. Most of the changes have no normative impact on the C library, those that we are aware of are marked below.
Unfortunately, the versions thereafter and in particular the final document fall under the weird ISO laws and can't be made available.
The document will now be updated by the editors to the latest changes
and an editorial review will follow. After that the document will be
submitted for final approval by the national bodies. If there are no
serious changes the final document will be submitted to ISO for
publication in January or February of 2024. Nevertheless, WG14 decided to
stick to C23 as a name for this version, as this reflects the new
version number 202311L in __STDC_VERSION__ (and others), which they
will not be able to change in the procedure.
Impacted headers
Many headers are impacted by the changes:
| Header | Changes |
|---|---|
| <assert.h> | minor changes |
| <complex.h> | minor changes |
| <fenv.h> | major changes |
| <inttypes.h> | minor changes |
| <limits.h> | major changes |
| <math.h> | major changes |
| <setjmp.h> | minor changes |
| <stdalign.h> | now empty |
| <stdarg.h> | minor changes |
| <stdatomic.h> | medium changes |
| <stdbit.h> | new header |
| <stdbool.h> | mostly useless, now |
| <stdckdint.h> | new header |
| <stddef.h> | minor changes |
| <stdint.h> | minor changes plus some optional ones |
| <stdio.h> | major changes |
| <stdlib.h> | new interfaces |
| <string.h> | new interfaces |
| <tgmath.h> | major changes |
| <time.h> | some new interfaces |
| <uchar.h> | major changes |
| <wchar.h> | |
For C23, all these headers have to provide feature test macros of the
form __STDC_VERSION_NAME_H__ where NAME is the capitalized form of the
header name without the .h suffix. The value of these macros has to be
202311L and should only be set to this once the transition is complete
and the header is conforming.
Nevertheless, even before that, such a macro can already be used as an
include guard; it just has to be set to be empty or to a smaller
numerical value, e.g. 0.
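User code can query such a macro to find out whether a header already claims C23 conformance. The following is a minimal sketch for <assert.h>; the function name assert_h_claims_c23 is hypothetical and only serves the illustration. Adding 0 in the condition keeps the #if expression valid in all three situations described above: the macro is undefined, defined to be empty, or defined to a number.

```c
#include <assert.h>

/* Hypothetical helper for this sketch: reports whether <assert.h>
   claims to be conforming to C23. The "+ 0" makes the expression well
   formed even if the macro is undefined or expands to nothing. */
static inline int assert_h_claims_c23(void) {
#if (__STDC_VERSION_ASSERT_H__ + 0) >= 202311L
  return 1;
#else
  return 0;
#endif
}
```

The same pattern applies to any of the other headers by exchanging the NAME part of the macro.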
Unicode support (mandatory change)
C23 improves Unicode support. char8_t, char16_t, char32_t and
strings and characters prefixed with u8, u and U are encoded
with the proper UTF encoding. See <uchar.h>, below.
Unfortunately, there are still some escape hatches for legacy
implementations that are controlled by the feature test macros
__STDC_ISO_10646__ and __STDC_MB_MIGHT_NEQ_WC__.
Unicode identifier support has been updated to UAX #31 (Unicode Identifier and Pattern Syntax). This means that the validity of identifiers is determined as if the encoding were Unicode.
- The character that starts an identifier must correspond to a Unicode codepoint that has the XID_Start property (including the characters of the Latin alphabet) or be the character _. The possible use of the character $ as identifier start character is implementation-defined.
- The set of continuation characters extends this to all characters that correspond to the XID_Continue property (including decimal digits).
This means in particular that identifiers may not contain characters
that have no Unicode equivalent or where the codepoint does not have
the required properties. So any identifier ID can be stringified to
a multi-byte encoded string mbs and then be transformed by mbrtoc32
to a UTF-32 encoded string s32 where the characters have the above
properties. Transforming with mbrtoc16 or mbrtoc8 results in a valid
UTF-16 or UTF-8 encoded string s16 and s8, respectively.
#include <stdbit.h>
#include <stddef.h> // for unreachable
#include <stdlib.h>
#include <uchar.h>
#define STRINGIFY_(X) #X
#define STRINGIFY(X) STRINGIFY_(X)
// For any accepted identifier or pp-number PP this is a valid mb-string
char const mbs[] = STRINGIFY(PP);
// For any accepted identifier or pp-number PP this will be a valid UTF-32 string
// where s32[0] has the XID_Start property or is an underscore character (for identifiers) or
// a decimal digit or a period character (for pp-numbers), and all
// subsequent characters have the XID_Continue property.
char32_t s32[sizeof mbs];
char32_t* p32 = s32;
size_t len = sizeof mbs;
mbstate_t state32 = { };
for (char const* p = mbs; *p;) {
// No error return possible, all mb characters must be
// complete and have UTF-32 codepoints, os is never 0
register size_t const os = mbrtoc32(p32, p, len, &state32);
if (os-1 > len-1) unreachable();
if (p32[0] > 0x10'FF'FF) unreachable();
p32 += 1;
p += os;
len -= os;
}
*p32 = 0;
// For any accepted identifier or pp-number PP this will be a valid UTF-16 string
char16_t s16[2 * sizeof mbs];
char16_t* p16 = s16;
size_t len16 = sizeof mbs;
mbstate_t state16 = { };
for (char const* p = mbs; *p;) {
// No error return possible, all mb characters must be
// complete and have UTF-32 codepoints, os is never 0
register size_t const os = mbrtoc16(p16, p, len16, &state16);
if (os-1 > len16-1) unreachable();
// A surrogate character has been stored
if ((0xD8'00 <= p16[0]) && (p16[0] < 0xE0'00)) {
p16 += 1;
// No error return possible, this provides the second surrogate.
if (mbrtoc16(p16, p, len16, &state16) != (size_t)-3) unreachable();
// The second surrogate must lie in the range 0xDC00 to 0xDFFF.
if ((p16[0] < 0xDC'00) || (0xE0'00 <= p16[0])) unreachable();
}
p16 += 1;
p += os;
len16 -= os;
}
*p16 = 0;
/* currently untested, use with care! */
// For any accepted identifier or pp-number PP this will be a valid UTF-8 string
char8_t s8[4 * sizeof mbs];
char8_t* p8 = s8;
size_t len8 = sizeof mbs;
mbstate_t state8 = { };
for (char const* p = mbs; *p;) {
// No error return possible, all mb characters must be
// complete and have UTF-32 codepoints, os is never 0
register size_t const os = mbrtoc8(p8, p, len8, &state8);
if (os-1 > len8-1) unreachable();
// Determine the number of continuation bytes, if any.
// A lead byte starts with as many 1 bits as the sequence has bytes;
// a single-byte character starts with a 0 bit. The conversion to
// unsigned char ensures that the count is taken over 8 bits only.
unsigned const ones = stdc_leading_ones((unsigned char)p8[0]);
size_t const plus = ones ? ones-1 : 0;
for (size_t i = 0; i < plus; i++) {
p8 += 1;
// No error return possible, this provides the next continuation character.
if (mbrtoc8(p8, p, len8, &state8) != -3) unreachable();
}
p8 += 1;
p += os;
len8 -= os;
}
*p8 = 0;
Similarly, preprocessing numbers have to admit all the above continuation characters as possible continuations after the initial digits or decimal point.
These changes may marginally impact C library implementations where they use preprocessor concatenation or stringification if their multi-byte or wide character encodings are not UTF-8 or UTF-32, respectively.
Thread safety of the C library (mostly optional changes) {#threads}
In several places C23 now explicitly grants permission to C library
implementations to use thread local state for stateful functions.
This implies permission to change behavior of setlocale, multibyte
character conversion functions, strtok, strerror, and time
conversion functions. Some of the changes can be done silently. For
others, the choice needs to be documented as they are now
implementation-defined; so even if an implementation sticks to static
state (as it was previously) this has to be documented, now.
Previously, the C standard was ambiguous about this; but for the parts
that make this an implementation-defined property (and thus force
implementations to document) the possible changes are not normative
but for quality of the implementation, only.
This is now implementation-defined for all functions that have
character conversion state held in an internal object of type
mbstate_t. Here now a static or a thread local object may be used,
if necessary. A change to thread local could be of service for
threaded applications, although this has to be taken with a grain of
salt. Querying for thread local state can be about as expensive as the
whole conversion operation. Therefore applications that are
performance critical should never use the variants of the interfaces
that use an internal buffer, anyhow. As this has been a serious safety
risk prior to C23, hopefully not many applications use this feature,
anyhow.
The following functions each have their own internal state object that has either static or thread local storage duration:
c16rtomb
c8rtomb
mbrtoc16
mbrtoc32
mbrtoc8
mbrtowc
The functions mbsrtowcs and mbrlen are impacted implicitly because
they use the state of mbrtowc.
The functions need to use this state because they are "restartable":
- They can be called with partial encodings for the input and a whole wide character can be collected over several calls. Here the state variable collects the input state.
- They may produce several output characters. For UTF-8 (up to 4 output characters) and UTF-16 (up to two output characters) the state is also used for the output. The functions that write multi-byte encodings expect sufficient storage to store a whole sequence for any such encoding at once; so these functions don't need the state variable to manage their output.
Thus, if they are called with a null pointer for the state, all of them have to use a per-function internal state.
The following functions only need to use state if the C library implementation supports encodings that have so-called shift states, that is, where different types of encodings can be switched during parsing. Otherwise they should just ignore a possible state parameter (if any) and should not use any internal state.
c32rtomb
mbtowc
wcrtomb
wcsrtombs
wctomb
A call mbtowc(nullptr, nullptr, 0) can be used to determine if the
current locale setting is such that multi-byte encodings have or don't
have shift states.
Note that C library implementations that only use UTF-8 as multi-byte encoding will never need shift state.
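Applications can sidestep the question of static versus thread local state altogether by always passing their own state object. The following is a minimal sketch of that usage; the function name mb_count is hypothetical, and the result depends on the multi-byte encoding of the current locale.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper for this sketch: counts the characters of the
   multi-byte string s in the current locale, or returns (size_t)-1 if
   s contains an invalid or truncated sequence. The conversion state
   lives in a local object, so no internal state of the C library is
   involved and the function can be used from several threads. */
static inline size_t mb_count(char const* s) {
  assert(s);
  mbstate_t state;
  memset(&state, 0, sizeof state);  /* initial conversion state */
  size_t len = strlen(s);
  size_t count = 0;
  while (len) {
    wchar_t wc;
    size_t const os = mbrtowc(&wc, s, len, &state);
    if (os == (size_t)-1 || os == (size_t)-2) return (size_t)-1;
    if (!os) break;                 /* null character reached */
    s += os;
    len -= os;
    ++count;
  }
  return count;
}
```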
Rationale and wording for this change can be found in:
Const-contract of the C library (mandatory change)
In C17 there are still interfaces that violate a const-contract: in
some cases, pointer values to const-qualified objects that are
passed as parameters to a function call can be returned as pointers to
unqualified objects. C23 closes a lot of these loopholes per its
default interfaces. This concerns the identifiers
bsearch_s
bsearch
memchr
strchr
strpbrk
strrchr
strstr
wcschr
wmemchr
wcspbrk
wcsrchr
wcsstr
which are now type-generic macros, see below. The function
interfaces that may violate the const-contract still remain; they
are kept for backwards compatibility, but unless user code forces the
use of the functions, it sees the macros.
Code compiled with the C23 library interfaces is possibly rejected or
diagnosed by compilers if it does not respect the const-contract.
This is intentional.
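To illustrate the idea, here is a minimal sketch of how such a const-preserving macro could be built with _Generic. All names starting with my_ are hypothetical; a real C library would use reserved identifiers and would also have to cover cases such as void pointers, which this sketch ignores.

```c
#include <assert.h>
#include <string.h>

/* Two thin wrappers that differ only in their const-qualification. */
static inline char* my_strchr_mut(char* s, int c) {
  assert(s);
  return strchr(s, c);
}
static inline char const* my_strchr_const(char const* s, int c) {
  assert(s);
  return strchr(s, c);
}

/* Select the wrapper according to the type of the first argument, such
   that a pointer to const never silently loses its qualifier. */
#define my_strchr(S, C)                          \
  _Generic((S),                                  \
           char*: my_strchr_mut,                 \
           char const*: my_strchr_const)((S), (C))
```

With this, assigning the result of my_strchr for a char const* argument to a char* variable is diagnosed, whereas the traditional function interface accepts it silently.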
Changes to integer types
One major change in the language is that two's complement is now mandatory as a representation for signed types.
New feature test macros (mandatory change) {#WIDTH}
This makes it easier to characterize integer types. For all of them
there are now ..._WIDTH macros that, together with the sizeof
operator, completely describe the types. Annex E gives a good overview
of the macros that are required and informs about minimal values for
these.
Note that in the future the ..._MAX values are not necessarily
always suitable for comparison in #if conditionals. Usage of these
should be changed to comparing the ..._WIDTH values, which are much
smaller and usually more comprehensible numbers.
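As an illustration, the following sketch selects a type of at least 64 bits by comparing widths. The names MY_ULONG_WIDTH and u64_candidate are hypothetical, and the fallback for headers that do not yet provide ULONG_WIDTH assumes that unsigned long has either 32 or 64 value bits.

```c
#include <assert.h>
#include <limits.h>

/* MY_ULONG_WIDTH is a hypothetical name for this sketch. With C23
   headers it is just ULONG_WIDTH; the fallback for older headers
   assumes that unsigned long has either 32 or 64 value bits. */
#ifdef ULONG_WIDTH
# define MY_ULONG_WIDTH ULONG_WIDTH
#elif ULONG_MAX == 0xFFFFFFFFUL
# define MY_ULONG_WIDTH 32
#else
# define MY_ULONG_WIDTH 64
#endif

/* Select the type by comparing small width values instead of huge
   ..._MAX constants. */
#if MY_ULONG_WIDTH >= 64
typedef unsigned long u64_candidate;
#else
typedef unsigned long long u64_candidate;
#endif

static_assert(sizeof(u64_candidate) * CHAR_BIT >= 64,
              "need at least 64 bits");
```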
New valid token sequences for integer constants
All macros that deal with integer constants have to be capable of dealing with new formats for these literals.
- Binary constants with a 0b or 0B prefix.
- Digit separators. The ' becomes a digit separator that can be placed anywhere between consecutive digits such as 1'000 or 0.333'333.
This concerns most importantly the ..._C macros in <stdint.h>.
Exact-width integer types (mandatory change)
This holds for all widths for which the platform has integer types without padding.
Extended integer types may be wider than intmax_t (optional change)
This concerns types that could be used as exact-width types, in
particular [u]int128_t and [u]int256_t where many implementations
already have extensions, but which cannot be "extended integer types"
in the sense of C17.
On platforms with for example gcc (64 bit) that already have an
extension __int128 (with predefined macro __SIZEOF_INT128__) the
necessary addition to <stdint.h> could look as follows:
// No language version macro needed
#ifdef __SIZEOF_INT128__
typedef signed __int128 int128_t;
typedef unsigned __int128 uint128_t;
typedef signed __int128 int_fast128_t;
typedef unsigned __int128 uint_fast128_t;
typedef signed __int128 int_least128_t;
typedef unsigned __int128 uint_least128_t;
# define UINT128_MAX ((uint128_t)-1)
# define INT128_MAX ((int128_t)+(UINT128_MAX/2))
# define INT128_MIN (-INT128_MAX-1)
# define UINT_LEAST128_MAX UINT128_MAX
# define INT_LEAST128_MAX INT128_MAX
# define INT_LEAST128_MIN INT128_MIN
# define UINT_FAST128_MAX UINT128_MAX
# define INT_FAST128_MAX INT128_MAX
# define INT_FAST128_MIN INT128_MIN
# define INT128_WIDTH 128
# define UINT128_WIDTH 128
# define INT_LEAST128_WIDTH 128
# define UINT_LEAST128_WIDTH 128
# define INT_FAST128_WIDTH 128
# define UINT_FAST128_WIDTH 128
# if UINT128_WIDTH > ULLONG_WIDTH
# define INT128_C(N) ((int_least128_t)+N ## WB)
# define UINT128_C(N) ((uint_least128_t)+N ## WBU)
# else
# define INT128_C(N) ((int_least128_t)+N ## LL)
# define UINT128_C(N) ((uint_least128_t)+N ## LLU)
# endif
#endif
This might use the new mandatory bit-precise integer constants with
suffix WB and WBU, respectively, by supposing that these will be
implemented on these platforms to support at least 128 bit.
Note that the names for these types have been reserved since several C
versions, so they are immediately available to the implementation and
need not be #ifdef'ed for the C version of the
compilation. Nevertheless, integer literals and format specifiers for
these types might then not be available.
#if UINT128_WIDTH > UINTMAX_WIDTH
# if __STDC_VERSION__ < 202311L
# warning "extended integer type for 128 type is wider than intmax_t"
# endif
#endif
After implementing the mandatory changes to printf and scanf with
length specifiers %w and %wf the corresponding macros should also
be added to <inttypes.h>, see %w and %wf below.
Bit-precise integer types (mandatory change, compiler dependent) {#BitInt}
There is a whole new series of integer types called bit-precise
integer types. They are coded by the token sequence _BitInt(N)
where N is an integer constant expression, and which can be combined
with signed (default) and unsigned. The minimum value for N is
1 for unsigned types and 2 for signed. The maximum value is
provided by a new macro BITINT_MAXWIDTH which has to be added to
<limits.h>.
Clang already implements these types in their recent compilers and
exports a predefined macro __BITINT_MAXWIDTH__. Hopefully gcc will
choose the same. The name for the new macro had not previously been
reserved, so its definition should be protected by a version macro.
#if __STDC_VERSION__ > 202300L
# ifdef __BITINT_MAXWIDTH__
# define BITINT_MAXWIDTH __BITINT_MAXWIDTH__
# endif
#endif
The constants for these types have the suffixes WB, wb, WBU,
wbU, WBu, or wbu, such that a suffixed constant has the type
_BitInt(N) where N is minimal for the constant to fit. This
can be useful (see above) to express integer constants in headers of a
perhaps wider range than otherwise would be supported.
Otherwise these types have no direct library interfaces by themselves.
Nevertheless, these types may occur as arguments to generic interfaces
- all N for atomic_ functions in <stdatomic.h>, see below;
- certain N for the new <stdbit.h> interfaces, see below.
String to integer conversion
See strtol
ptrdiff_t
May now be 16 bit wide, instead of previously 17 bit. Only implementations that previously were not conforming because of this (some compilers for 16 bit hardware) should be concerned about this. If that was their only miss, they become conforming to C23 where they weren't for C17.
char8_t added (mandatory change)
See <uchar.h>, below.
New header <stdbit.h> (mandatory change, independent) {#stdbit}
This adds 14 × 5 functions and 14 type-generic macros for bit-manipulation; so in total 84 new interfaces.
On many architectures the functions themselves probably have builtins that just have to be interfaced.
The type-generic interfaces are a bit more tricky, since they have to work for
- standard unsigned integer types (bool excluded)
- extended unsigned integer types (don't forget additions such as uint128_t)
- bit-precise types that have the same width as a standard or extended unsigned integer type (bool excluded).
On architectures where these integer types don't have padding bits all of these should probably just be switched by the size of the argument and the function arguments should just be cast to these.
ABI choice: There is one ABI choice to be made for this, the type
described as generic_return_type for the type-generic functions. The
question that has to be answered here is if an ABI community estimates
that one day they will have extended integer types that have more than
UINT_MAX bits. I personally don't think that this is likely.
If we assume that unsigned int as a return type is just good enough,
something along the lines of the following would probably work:
inline unsigned int stdc_leading_zeros_u8(uint8_t __x) [[__unsequenced__]] { /* do your thing */ }
...
inline unsigned int stdc_leading_zeros_u128(uint128_t __x) [[__unsequenced__]] { /* do your thing */ }
// Dispatch on the size of the argument in bytes (assuming that CHAR_BIT
// is 8 and no padding bits): uint8_t has size 1 and uint128_t has size 16.
#define stdc_leading_zeros(X) \
_Generic((char(*)[sizeof(X)]){ 0 }, \
char(*)[1]: stdc_leading_zeros_u8((uint8_t)(X)), \
... \
char(*)[16]: stdc_leading_zeros_u128((uint128_t)(X)))
Here, the cast in the chosen branch would never lose bits; the cast
in the other branches is well-defined and should not issue
diagnostics. Also, the inline functions can be static since these
are implementation details that do not have to be exported as linker
interfaces on which user code may rely.
As an extension, this would also work for
- bool because the conversion will just have a 0 or 1 of the appropriate size;
- signed integer types because conversion to the unsigned type is always well-defined.
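A self-contained variant of this size-based dispatch can be tried with current compilers. The following is only a sketch: the my_ names are hypothetical, the bit loop stands in for whatever builtin a platform offers, and it assumes that CHAR_BIT is 8 and that the exact-width types up to 64 bit exist.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Portable bit loop; a real library would forward to a builtin. */
static inline unsigned int my_lz_generic(uintmax_t x, unsigned int width) {
  assert(width && width <= sizeof(uintmax_t) * CHAR_BIT);
  unsigned int n = 0;
  for (uintmax_t bit = (uintmax_t)1 << (width - 1); bit && !(x & bit); bit >>= 1) {
    ++n;
  }
  return n;
}

static inline unsigned int my_lz_u8(uint8_t x)   { return my_lz_generic(x, 8); }
static inline unsigned int my_lz_u16(uint16_t x) { return my_lz_generic(x, 16); }
static inline unsigned int my_lz_u32(uint32_t x) { return my_lz_generic(x, 32); }
static inline unsigned int my_lz_u64(uint64_t x) { return my_lz_generic(x, 64); }

/* Dispatch on the size of the argument in bytes. */
#define my_leading_zeros(X)                          \
  _Generic((char(*)[sizeof(X)]){ 0 },                \
           char(*)[1]: my_lz_u8((uint8_t)(X)),       \
           char(*)[2]: my_lz_u16((uint16_t)(X)),     \
           char(*)[4]: my_lz_u32((uint32_t)(X)),     \
           char(*)[8]: my_lz_u64((uint64_t)(X)))
```

Only the branch that matches the size of the argument is evaluated; the casts in the other branches merely have to be valid expressions.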
New header <stdckdint.h> (mandatory change, independent) {#stdckdint}
The specification of the interfaces in C23 is type-generic:
#include <stdckdint.h>
bool ckd_add(type1 *result, type2 a, type3 b);
bool ckd_sub(type1 *result, type2 a, type3 b);
bool ckd_mul(type1 *result, type2 a, type3 b);
This adds overflow and wrap-around safe interfaces for integer
addition, subtraction and multiplication. The result target is filled
with the truncated result of the mathematical operation, the return
value holds false if that result is correct, true otherwise.
Admissible types for these operations are all standard integer types
with the exception of bool and char.
The intent of these interfaces is that they use the underlying hardware as efficiently as possible. Most modern processors have overflow flags in hardware and these functions should in general just query these flags.
On most architectures there are probably builtins that just have to be interfaced, but since they are type-generic there might be some more to do than just functions. On the other hand, compilers such as gcc and clang already have similar type-generic interfaces:
bool __builtin_add_overflow (type1 a, type2 b, type3 *res);
bool __builtin_sub_overflow (type1 a, type2 b, type3 *res);
bool __builtin_mul_overflow (type1 a, type2 b, type3 *res);
When relying on such an extension the contents of the header could just read
#ifndef __STDC_VERSION_STDCKDINT_H__
#define __STDC_VERSION_STDCKDINT_H__ 202311L
#ifdef __GNUC__
# define ckd_add(R, A, B) __builtin_add_overflow ((A), (B), (R))
# define ckd_sub(R, A, B) __builtin_sub_overflow ((A), (B), (R))
# define ckd_mul(R, A, B) __builtin_mul_overflow ((A), (B), (R))
#else
# error "we need a compiler extension for this"
#endif
#endif
Because of the type-generic nature, this is a header-only addition to the standard. No support for linker symbols for any variant of these functions is required. Effectively, this is an extension to the language and not a library feature.
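On the user side, a typical application is the computation of allocation sizes. The following is a minimal sketch; the function name array_bytes is hypothetical, and the fallback definition of ckd_mul assumes a gcc or clang compatible compiler for the case that the header is not yet available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Use the C23 header where it exists. */
#ifdef __has_include
# if __has_include(<stdckdint.h>)
#  include <stdckdint.h>
# endif
#endif
#ifndef ckd_mul
/* Fallback, assuming a gcc or clang compatible compiler. */
# define ckd_mul(R, A, B) __builtin_mul_overflow((A), (B), (R))
#endif

/* Hypothetical helper for this sketch: stores n*size in *bytes and
   returns true if that product is representable, false otherwise. */
static inline bool array_bytes(size_t* bytes, size_t n, size_t size) {
  assert(bytes);
  return !ckd_mul(bytes, n, size);
}
```

Note the inversion of the return value: ckd_mul returns true when the stored result is not the mathematical one.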
The addition that has been integrated into C23 is the core proposal and some amendments. The history of this looks a bit confusing because later editions of the paper have removed parts that had already been adopted.
Alignment to IEC 60559 for standard floating point
Unfortunately not my field of expertise. There might be subtle changes to the FP model.
The following functions are added to <math.h>, with the usual three
versions for double (without suffix), float (with f suffix) and
long double (with l suffix), and as type-generic macros to
<tgmath.h>:
acospi, asinpi, atan2pi, atanpi, compoundn, cospi, fmaximum_mag_num, fmaximum_mag, fmaximum_num, fmaximum, fminimum_mag_num, fminimum_mag, fminimum_num, fminimum, nextup, pown, powr, rootn, roundeven, sinpi, tanpi, fadd, dadd, fsub, dsub, fmul, dmul, fdiv, ddiv.
Similarly, the following functions and their variants are added to
<math.h>, but there are no type-generic macros in <tgmath.h>:
canonicalize, fromfp, fromfpx, ufromfp, ufromfpx.
Optional IEC 60559 support for decimal floating point
Unfortunately not my field of expertise, but see some discussion for
<stdarg.h> and <stdio.h>, below.
The main chunk of work here goes into the huge number of functions
that this requires for <math.h>, <stdlib.h> and maybe some other
headers. This effort should probably not be repeated for every C
library out there, but should be concentrated into supplementary
libraries that can be added on demand, independently of the base C
library.
What a genuine C library probably should provide to make this possible is basic language support for these types:
- Implementation of the feature macros in <float.h> such as described in section 5.2.4.2.3 p2 to p6 of C23.
- Implementation of the classification macros in <math.h>.
- Implementation of the functions frexpd32, frexpd64, frexpd128 in <math.h>.
- Implementation of the formatted IO in <stdio.h>.
This can be based on feature macros and builtins that e.g. gcc has had for a long time, namely for N one of 32, 64 or 128:
- __DECN_EPSILON__, __DECN_MIN__, __DECN_MAX__, __DECN_MIN_EXP__, __DECN_SUBNORMAL_MIN__, __DECN_MAX_EXP__, __DECN_MANT_DIG__
- __builtin_signbitdN, __builtin_infdN, __builtin_nandN
Attributes
The introduction of attributes in C23 only implies a few changes in the library.
_Noreturn becomes [[noreturn]] (mandatory change)
Adapting to that will be a bit tedious because _Noreturn lives on
as a keyword for some time. The easiest would probably be to replace
occurrences of _Noreturn in the C library headers <setjmp.h>,
<threads.h> and <stdlib.h> by something like __noreturn and then
add the following in a general preamble or to the preambles of these
headers:
#ifdef __has_c_attribute
# if __has_c_attribute(__noreturn__)
# ifndef __noreturn
# define __noreturn [[__noreturn__]]
# endif
# endif
#endif
#ifndef __noreturn
# define __noreturn _Noreturn
#endif
The macros __STDC_VERSION_SETJMP_H__, __STDC_VERSION_STDLIB_H__
and __STDC_VERSION_THREADS_H__ should only be set to 202311L if
this change is implemented.
Changes to implementation-defined occurrences of _Noreturn are not
necessary, but it would be a good occasion to do so.
The header <stdnoreturn.h> becomes obsolescent, but does not change.
Deprecating certain functions with [[deprecated]] (mandatory change)
Two functions in <time.h> receive such an attribute, see below.
Annotating C library functions with [[unsequenced]]
This new function type attribute models a generalization of what usually is called a pure function in CS. Because of the rules how function attributes accumulate in conditional operators, it is not possible to add this kind of attribute to existing C library functions; otherwise in rare cases this could change semantics of programs on architectures that implement the attribute.
The only C library functions with this attribute are the ones in
<stdbit.h> because these are new and do not present
problems with backwards compatibility.
In contrast to that user code that checks that the planets are aligned could augment C library interfaces with this attribute to help optimizers. This can even be done locally restricted to the body of a function, for example, where it is known that all calls to the function in question have the right properties:
typeof(sqrt) [[unsequenced]] sqrt; // We guarantee that all arguments are positive
There are a lot of functions that are unsequenced per definition of
the standard and may thus be annotated in user code or in the internal
definitions of functions of a C library implementation. All interfaces
in 7.12.3, 7.18, 7.20 and the following families of functions do not
access any global state. In particular they may never encounter
exceptional conditions that would let them read the rounding mode
or write to the floating point flags or to errno.
abs
atomic_init
canonicalize
ceil
copysign
decodecdN
difftime
div
encodecdN
fabs
floor
fmax
fmaximum[_mag][_num]
fmin
fminimum[_mag][_num]
labs
ldiv
llabs
lldiv
memalignment
memccpy
memchr
memcmp
memcpy
memset
modf
nan
quantumdN
round
roundeven
samequantumdN
strchr
strcmp
strcpy
strcspn
strlen
strncmp
strncpy
strpbrk
strrchr
strspn
strstr
trunc
This also holds for most functions in <math.h> and <complex.h> if
it can be guaranteed that the function is never called with arguments
that are invalid or result in out-of-range values, and if the rounding
mode is previously fixed to a specific value for all uses.
Predefined macro names (mandatory change)
__STDC_UTF_16__ and __STDC_UTF_32__ are mandatory
They are now forced to the value 1.
Remove trigraphs
Trigraphs are removed from the language. Any library headers that use them must be updated.
Generally, they are probably never used within strings or character literals in C library headers.
A more common use may be the trigraph ??= in preprocessing as a
replacement for #. This can easily be replaced by the digraphs
%: and %:%: which have the following advantages
- They are proper tokens and will not be substituted in early compilation phases. So there is no messing with the contents of user strings anymore.
- The characters % and : are present in all source file encodings that are currently still in use.
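To illustrate, the following small sketch uses the digraphs in macro definitions where a header for an unusual source encoding might previously have used trigraphs. The names STRINGIFY, CONCAT and answer_text are only chosen for this example.

```c
#include <assert.h>

/* The digraph %: plays the role of # and %:%: the role of ##. */
%:define STRINGIFY(X) %:X
%:define CONCAT(A, B) A %:%: B

/* Token pasting of 4 and 2 gives the integer constant 42. */
static_assert(CONCAT(4, 2) == 42, "token pasting via digraph");

/* Hypothetical helper for this sketch: returns the string "forty two". */
static inline char const* answer_text(void) {
  return STRINGIFY(forty two);
}
```

Because digraphs are tokens, the character sequence %: inside a string literal stays untouched, in contrast to what happened with ??= in the trigraph case.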
Changes to <assert.h> {#assert}
__STDC_VERSION_ASSERT_H__ (mandatory change)
This macro is mandatory and should be set to 202311L once the header
is compliant to C23.
assert (mandatory change)
Before this change, the assert macro commonly suffered from the
one-macro-parameter problem, namely that expressions that contained a
top-level comma such as (mystruct){ .a = 0, .b = x }.b would not be
seen as one argument but several.
There is a simple cure to this, namely to define assert with
... as its parameter list. If we suppose that there is a builtin function
[[noreturn]] void __builtin_assert(
const char * __expr, // textual version of the expression
const char* __file, // the name of the source
unsigned long __line, // the current line number, may be larger than INT_MAX
const char* __func); // the current function
Then the macro can be defined using a static function
__assert_expression as
// Ensure that this version only accepts arguments that resolve to one
// C expression after preprocessing.
#ifndef __ASSERT_EXPRESSION
#define __ASSERT_EXPRESSION(...) __assert_expression(__VA_ARGS__)
static inline _Bool __assert_expression(_Bool __x) { return __x; }
#endif
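#ifdef NDEBUG
#define assert(...) ((void)0)
#else
#if __STDC_VERSION__ >= 199901L
// C99 and later: route the argument through __ASSERT_EXPRESSION so that
// anything that is not exactly one expression is diagnosed.
# define assert(...) ((void)(__ASSERT_EXPRESSION(__VA_ARGS__) || (__builtin_assert(#__VA_ARGS__, __FILE__, __LINE__, __func__),0)))
#else
// Fallback without the helper function, which needs _Bool and inline.
# define assert(...) ((__VA_ARGS__) ? (void)0 : __builtin_assert(#__VA_ARGS__, __FILE__, __LINE__, __func__))
#endif
#endif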
This ensures that an expression that is given as an argument and that happens to be several macro arguments, is still accepted and works as expected. On the other hand, an empty parameter or one that expands to more than one expression is diagnosed.
This change is conforming to all C versions after C99 where variable argument macros were introduced.
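The technique can be tried in user code without touching the C library. The following is a minimal sketch under other names: my_assert, my_assert_expression, my_assert_fail, struct pair and second_of are all hypothetical, and the failure function simply prints a message and aborts.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Accepts exactly one expression; anything else is a constraint violation. */
static inline bool my_assert_expression(bool x) { return x; }

/* Stand-in for the builtin that reports the failure. */
static inline void my_assert_fail(char const* expr, char const* file,
                                  unsigned long line, char const* func) {
  fprintf(stderr, "%s:%lu: %s: assertion '%s' failed\n", file, line, func, expr);
  abort();
}

#define my_assert(...)                                                  \
  ((void)(my_assert_expression(__VA_ARGS__)                             \
          || (my_assert_fail(#__VA_ARGS__, __FILE__, __LINE__, __func__), 0)))

struct pair { int a; int b; };

/* The compound literal contains a top-level comma, so the preprocessor
   sees two macro arguments; __VA_ARGS__ glues them together again. */
static inline int second_of(int x) {
  my_assert((struct pair){ .a = 0, .b = x }.b == x);
  return x;
}
```

A call such as my_assert(1, 2) is then rejected at compile time, because my_assert_expression receives two arguments instead of one.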
Changes to <float.h> {#float}
__STDC_VERSION_FLOAT_H__
This macro is mandatory and should be set to 202311L once the header
is compliant to C23.
Add INFINITY and NAN to <float.h>.
Missing macros
Not my field of expertise.
Changes to <inttypes.h> {#inttypes}
__STDC_VERSION_INTTYPES_H__ (mandatory change)
This macro is mandatory and should be set to 202311L once the header
is compliant to C23. (This macro is still missing from the latest
draft, but should be added during CD2 ballot.)
PRI macros for narrow types
In a last minute change (GB-108 in CD2 ballot) it is
now imposed that the macros for printf and similar functions convert
values for narrow types (such as uint16_t) back to the narrow type
before printing. This has an effect on implementations and on user
code that used format specifiers for narrow types to print arbitrary
signed or unsigned values.
PRI and SCN macros for binary numbers
Macros for the new binary printf and scanf formats become
mandatory; analogously to x there is now b.
The printf specifier %B is optional. The macros with B (such as
PRIB128) act as feature test macros.
Changes to <limits.h> {#limits}
__STDC_VERSION_LIMITS_H__ (mandatory change)
This macro is mandatory and should be set to 202311L once the header
is compliant to C23.
Missing macros (mandatory change)
- BITINT_MAXWIDTH for the maximal supported width of a bit-precise integer type
- _WIDTH macros for all standard integer types
Changes to <stdarg.h> {#stdarg}
This header provides a liaison between C language and C library. For
implementations that have no good compiler support for va_...
macros, support for C23 appears to be relatively challenging.
For implementations that just forward these macros to compiler builtins, there is no particular difficulty to be expected.
__STDC_VERSION_STDARG_H__ (mandatory change)
This macro is mandatory and should be set to 202311L once the header
is compliant to C23.
Additional arguments to va_start can be omitted (mandatory change)
void va_start(va_list ap, ...);
- Only the first argument shall be used and evaluated.
- Functions may now be declared with ... as their only parameter, and va_start must be able to deal with that situation.
The formulation as given above is only valid if the preprocessor is
conforming to C23, namely that it works well without a trailing
comma. A good way to take care of that is by using a
feature test for the __VA_OPT__ feature:
#define __has_VA_OPT(X, ...) X ## __VA_OPT__(_OK)
#define __C23__VA_OPT__(...) 0
#define __C23_OK 1
#if __has_VA_OPT(__C23, __C23)
#define va_start(v, ...) __builtin_va_start(v __VA_OPT__(,) __VA_ARGS__)
#else
#define va_start(v,l) __builtin_va_start(v,l)
#endif
Implementations that set __STDC_VERSION_STDARG_H__
have to verify
that they conform to that reformulation.
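From the user side, code that has to compile against both old and new headers could select the call form with the version macro of the header. This is only a sketch; sum_ints is a hypothetical function.

```c
#include <assert.h>
#include <stdarg.h>

/* Sum `count` arguments of type int. */
int sum_ints(int count, ...) {
    assert(count >= 0);
    va_list ap;
#if defined(__STDC_VERSION_STDARG_H__) && __STDC_VERSION_STDARG_H__ >= 202311L
    va_start(ap);             /* C23: the second argument may be omitted */
#else
    va_start(ap, count);      /* earlier versions: last named parameter */
#endif
    int sum = 0;
    for (int i = 0; i < count; i++)
        sum += va_arg(ap, int);
    va_end(ap);
    return sum;
}
```

The two-argument form remains valid in C23, so existing code does not have to change.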
Support for new types, standard or extended (mandatory change)
Users will expect va_arg
to work for all types that a C23 compiler
provides.
Compiler dependency: Implementations that set
__STDC_VERSION_STDARG_H__
have to verify that they at least support
the new standard types. Users would be quite surprised if a compiler
implements optional types or certain common extensions and va_arg
fails with them. This would make it in particular impossible for third
party libraries to add support for these types.
va_arg
can handle function arguments with type nullptr_t
(mandatory change)
Functions that are called with a nullptr
argument in the variable
argument list can deal with such an argument by calls
va_arg(ap, nullptr_t)
va_arg(ap, char*)
or any other pointer type that has the same representation as char*
,
such as void*
. In particular, on implementations that have exactly
one pointer model (such as POSIX) any pointer type can be used as type
name argument.
The result of the macro call is then a null pointer of the indicated type.
Compiler dependency: To support the variant that uses nullptr_t
compiler support is needed.
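A typical situation where this matters is a variable argument list that is terminated by a null pointer. The following sketch reads all arguments as char*, which per the above also covers a nullptr terminator on a C23 platform; count_strings and count_strings_demo are hypothetical names.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Count the strings in an argument list that ends with a null pointer. */
size_t count_strings(const char *first, ...) {
    size_t n = 0;
    va_list ap;
    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}

void count_strings_demo(void) {
    /* Portable to all C versions: an explicitly typed null pointer. */
    assert(count_strings("a", "b", (char *)NULL) == 2);
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
    /* C23: nullptr may be passed and read back as char*. */
    assert(count_strings("a", nullptr) == 1);
#endif
}
```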
va_arg
may be called with _BitInt(
N)
types (mandatory change)
This is a bit tricky for narrow bit-precise types, because they do not
promote to int
as would other narrow types.
va_arg
and decimal floating types (optional change)
Full support for decimal floating point is indicated by the feature
test macro __STDC_IEC_60559_DFP__
. Setting this macro implies not
only compiler support (arithmetic, function calls) but also the
implementation of about 600 C library interfaces (5.5 pages in the
library summary Annex B). So implementations might be hesitant to
support this. Hopefully the open source implementations will join
forces to supply an add-on external library that completes support for
these types.
Compiler dependency: Decimal floating point types are optional,
but they are already implemented on a major compiler (gcc) and
expectations will be high that any C library supports these types on
such platforms in <stdarg.h>
, even if they do not intend to set
__STDC_IEC_60559_DFP__
themselves.
va_arg
and extended integer types (optional change)
With relaxation for wide integer types that exceed the width of
intmax_t
, compiler platforms may start to provide support for such
types. This may for example be the case of __int128
types on
x86_64, powerpc64 or aarch64, and which are already supported by
compilers.
For any width N such that there is an extended integer type that has no
padding there has to be support for types int
N_t
and
uint
N_t
. So in particular ..._C
macros and w
N length
modifiers for printf
are mandatory. So C library implementations
also have to support these extended integer types for va_arg
.
Changes to <stdatomic.h> {#stdatomic}
__STDC_VERSION_STDATOMIC_H__
(mandatory change)
This macro is mandatory and should be set to 202311L
once the header
is compliant to C23.
Remove ATOMIC_VAR_INIT? (optional change)
This macro is removed from the standard because it is not deemed necessary for implementing atomics; normal initialization syntax should be sufficient.
Implementations may keep this macro, though, because the ATOMIC_... prefix is still reserved and so they do not intrude on the user's name space for identifiers.
New _BitInt(N) and decimal floating point types must be supported by type-generic interfaces (mandatory change)
For implementations that mostly rely on _Generic
or similar features
to provide operations such as atomic_fetch_add
, for example, this
addition to the interface might be quite challenging.
ABI choice: ABI decisions also have to be taken as to which of these new
types, if any, are to be lock-free. There are no particular feature
test macros for these types concerning this property (so no
preprocessor conditionals can be used) but the generic function
atomic_is_lock_free
has to cope with them. This function could, for
example, return true
for all types where there is a supporting basic
type with the same size that is lock-free.
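The size-based strategy mentioned above could be sketched as follows. This is an illustration under the assumption that only the standard ATOMIC_..._LOCK_FREE macros are available; size_is_lock_free is a hypothetical internal helper, and it deliberately ignores the value 1 ("sometimes lock-free").

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* True if some basic integer type of this size is always lock-free,
   that is, if its ATOMIC_..._LOCK_FREE macro expands to 2. */
bool size_is_lock_free(size_t size) {
    return (size == sizeof(char)      && ATOMIC_CHAR_LOCK_FREE  == 2)
        || (size == sizeof(short)     && ATOMIC_SHORT_LOCK_FREE == 2)
        || (size == sizeof(int)       && ATOMIC_INT_LOCK_FREE   == 2)
        || (size == sizeof(long)      && ATOMIC_LONG_LOCK_FREE  == 2)
        || (size == sizeof(long long) && ATOMIC_LLONG_LOCK_FREE == 2);
}

void lock_free_demo(void) {
    /* No object has size 0, so this can never be reported as lock-free. */
    assert(!size_is_lock_free(0));
}
```

An implementation of atomic_is_lock_free for a _BitInt(N) or decimal floating type could then forward the size of the object to such a helper.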
New macro ATOMIC_CHAR8_T_LOCK_FREE and type atomic_char8_t (mandatory change)
This is necessary because of the addition of char8_t
to
<uchar.h>
. This can be done by inserting the following two lines at
the appropriate places of <stdatomic.h>
.
#define ATOMIC_CHAR8_T_LOCK_FREE ATOMIC_CHAR_LOCK_FREE
typedef _Atomic(unsigned char) atomic_char8_t;
Because the ATOMIC_
and atomic_
prefixes are reserved, no
preprocessor conditional is needed.
Changes to <stdbool.h> {#stdbool}
__STDC_VERSION_BOOL_H__
?
This macro is mandatory and should be set to 202311L
once the header
is compliant to C23.
Remove/protect bool, false and true
The following could be the complete contents of this header:
#ifndef __STDC_VERSION_BOOL_H__
#define __STDC_VERSION_BOOL_H__ 202311L
#if (__STDC_VERSION__ < 202300L) && !defined(__cplusplus)
#define true 1
#define false 0
#define bool _Bool
#else
#define true true
#define false false
#define bool bool
#endif
#define __bool_true_false_are_defined 1
#endif
Note that this unconditionally defines the macros, because some application code might do preprocessor conditionals on these.
Changes to <stddef.h> {#stddef}
__STDC_VERSION_STDDEF_H__
This macro is mandatory and should be set to 202311L
once the header
is compliant to C23.
unreachable
macro (mandatory change)
Defined in 7.21.1.
- https://open-std.org/JTC1/SC22/WG14/www/docs/n2826.pdf, the macro option has been chosen by WG14.
Compiler dependency: Quality implementations need compiler support for this.
gcc and clang already have __builtin_unreachable
so in
general here the "implementation" for the interface is only
#ifdef unreachable
# error unreachable() is a standard macro for C23
#endif
#if __STDC_VERSION__ >= 202311L
# define unreachable() __builtin_unreachable()
#else
# define unreachable _Static_assert(0, "unreachable becomes a standard macro for C23")
#endif
This also detects usage of the now-reserved identifier unreachable
in code that is not yet ready for C23.
Because the behavior is simply undefined, in the primary sense of the word, if a call to that feature is reached, low quality implementations could theoretically do nonsense such as the following:
#define unreachable() ((void)0)
#define unreachable() ((void)(1/0))
#define unreachable() ((void)puts("reached the unreachable"))
#define unreachable() abort()
#define unreachable() give_me_your_credit_card_number()
All of this would result in suboptimal code for their users, because the purpose of this feature is to allow whole branches of the control flow graph to be pruned from the executable.
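To illustrate the intended use from the application side, here is a sketch of a function whose switch covers all valid values. The wrapper demo_unreachable is a hypothetical name; it picks the standard macro if the header provides it, the gcc/clang builtin otherwise, and abort as a last resort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#if defined(unreachable)
# define demo_unreachable() unreachable()            /* C23 <stddef.h> */
#elif defined(__GNUC__)
# define demo_unreachable() __builtin_unreachable()  /* gcc and clang */
#else
# define demo_unreachable() abort()                  /* safe fallback */
#endif

enum color { RED, GREEN, BLUE };

const char *color_name(enum color c) {
    /* Debug builds check the precondition, release builds prune. */
    assert(c == RED || c == GREEN || c == BLUE);
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    demo_unreachable();
}
```

With a quality implementation the compiler can drop any code for the path after the switch.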
nullptr_t
C23 has a new keyword and constant nullptr
and provides access to
the underlying type via <stddef.h>
as
#if __STDC_VERSION__ >= 202311L
typedef typeof(nullptr) nullptr_t;
#endif
On general implementations, a preprocessor conditional is needed,
because the identifiers nullptr
and nullptr_t
had not been
reserved before C23. For the type alone, such a conditional is not
needed on POSIX systems, since their types with suffix _t
are
already reserved.
NULL
Because of the ambiguity of its definition, this macro has some
portability and safety problems and so C23 has integrated nullptr
(introduced to C++ in 2011) to replace it in the long run.
There is no requirement yet to have this macro expand to nullptr, but on non-POSIX implementations it might be a good idea to move to something like the following:
#if (__cplusplus >= 201103L) || (__STDC_VERSION__ >= 202311L)
#define NULL nullptr
#elif defined(__cplusplus)
#define NULL 0L /* Any of 0, 0L, 0LL as wide as a void* */
#else
#define NULL ((void*)0)
#endif
POSIX implementations should for the moment stay with ((void*)0) since this is a requirement there, until this constraint is lifted. Since nullptr values have the same representation as void*, this should not result in many difficulties anyhow.
A preprocessor conditional is needed, because nullptr
is new for
C23.
Changes to <stdio.h> {#stdio}
__STDC_VERSION_STDIO_H__
This macro is mandatory and should be set to 202311L
once the header
is compliant to C23.
Changes to formatted IO
This implies changes for printf
, scanf
and friends.
Printing of narrow types
See <inttypes.h>
"%lc"
conversion of L'\0'
In a last minute change (GB-141 in CD2 ballot) the strategy for printing wide characters has changed. The only effective change is for a nul wide character, which previously had no output and now results in NUL. This is consistent with other printing of nul characters, but is a normative change nevertheless.
printing NaN
There is a new _PRINTF_NAN_LEN_MAX
macro that holds the maximum
length of output corresponding to a NaN
value that would be issued
by printf
and friends.
the b
conversion specifier
This specifies binary input and output analogous to hexadecimal, only
that the character b
plays the role of x
.
For scanf, this is a semantic change, because input may be accepted or rejected according to the version of the C library that is linked to the executable, see strtol below. Depending on the solutions that WG14 might still find for that problem, it is possible that a whole second set of scanf interfaces is needed.
Introduction of a similar B conversion specifier (comparable to X) for printf is optional and not imposed because implementations could already have occupied this space. So if an implementation currently does nothing particular for B, it is expected that it also implements B analogously to X. The macros in <inttypes.h> with B (such as PRIB128) act as feature test macros.
- https://open-std.org/JTC1/SC22/WG14/www/docs/n2630.pdf
- https://open-std.org/JTC1/SC22/WG14/www/docs/n3072.htm
w
N length modifiers {#wlength}
Where N is any of the values (decimal without leading 0
) for all
supported minimum-width integer types provided by 7.22.1.2. The exact
width integer types (7.22.1.1) now are a subset of these, so in
particular these must all be supported. To implement these, it must be
known to which size a given width corresponds, so in particular which
widths are natively supported by the architecture. For 8
, 16
, 32
and 64
the minimum-width integer types are mandatory, so at least
these values must be supported.
Implementations should try to support N and exact-width types for values where there might not even be compiler support, yet. Now that we know that signed integers are two's complement, once the byte interpretation (endianness) of int128_t, for example, is known, I/O on these represented values should not be difficult. Adding such values for which there might be no full support for the type otherwise is allowed, it just has to be documented. In any case, applications may easily check with #ifdef.
Note that it is not expected that these macros work for the new
_BitInt(
N)
types even if N is one of the standard values. This
is because the argument-passing convention for these types may be
different from the standard integer types.
For versions of the C library that support these formats (and thus set
__STDC_VERSION_STDIO_H__
) the macros PRIx
N etc should use these
new length modifiers. This could look something like
#define _PRI(F, N) "w" #N #F
#define PRIX128 _PRI(X, 128)
#define PRIX16 _PRI(X, 16)
#define PRIX256 _PRI(X, 256)
#define PRIX32 _PRI(X, 32)
#define PRIX64 _PRI(X, 64)
#define PRIX8 _PRI(X, 8)
#define PRId128 _PRI(d, 128)
#define PRId16 _PRI(d, 16)
#define PRId256 _PRI(d, 256)
#define PRId32 _PRI(d, 32)
#define PRId64 _PRI(d, 64)
#define PRId8 _PRI(d, 8)
#define PRIi128 _PRI(i, 128)
#define PRIi16 _PRI(i, 16)
#define PRIi256 _PRI(i, 256)
#define PRIi32 _PRI(i, 32)
#define PRIi64 _PRI(i, 64)
#define PRIi8 _PRI(i, 8)
#define PRIo128 _PRI(o, 128)
#define PRIo16 _PRI(o, 16)
#define PRIo256 _PRI(o, 256)
#define PRIo32 _PRI(o, 32)
#define PRIo64 _PRI(o, 64)
#define PRIo8 _PRI(o, 8)
#define PRIu128 _PRI(u, 128)
#define PRIu16 _PRI(u, 16)
#define PRIu256 _PRI(u, 256)
#define PRIu32 _PRI(u, 32)
#define PRIu64 _PRI(u, 64)
#define PRIu8 _PRI(u, 8)
#define PRIx128 _PRI(x, 128)
#define PRIx16 _PRI(x, 16)
#define PRIx256 _PRI(x, 256)
#define PRIx32 _PRI(x, 32)
#define PRIx64 _PRI(x, 64)
#define PRIx8 _PRI(x, 8)
#define PRIXLEAST128 PRIX128
#define PRIXLEAST16 PRIX16
#define PRIXLEAST256 PRIX256
#define PRIXLEAST32 PRIX32
#define PRIXLEAST64 PRIX64
#define PRIXLEAST8 PRIX8
#define PRIdLEAST128 PRId128
#define PRIdLEAST16 PRId16
#define PRIdLEAST256 PRId256
#define PRIdLEAST32 PRId32
#define PRIdLEAST64 PRId64
#define PRIdLEAST8 PRId8
#define PRIiLEAST128 PRIi128
#define PRIiLEAST16 PRIi16
#define PRIiLEAST256 PRIi256
#define PRIiLEAST32 PRIi32
#define PRIiLEAST64 PRIi64
#define PRIiLEAST8 PRIi8
#define PRIoLEAST128 PRIo128
#define PRIoLEAST16 PRIo16
#define PRIoLEAST256 PRIo256
#define PRIoLEAST32 PRIo32
#define PRIoLEAST64 PRIo64
#define PRIoLEAST8 PRIo8
#define PRIuLEAST128 PRIu128
#define PRIuLEAST16 PRIu16
#define PRIuLEAST256 PRIu256
#define PRIuLEAST32 PRIu32
#define PRIuLEAST64 PRIu64
#define PRIuLEAST8 PRIu8
#define PRIxLEAST128 PRIx128
#define PRIxLEAST16 PRIx16
#define PRIxLEAST256 PRIx256
#define PRIxLEAST32 PRIx32
#define PRIxLEAST64 PRIx64
#define PRIxLEAST8 PRIx8
Similar for scanf
:
#define _SCN(F, N) "w" #N #F
#define SCNX128 _SCN(X, 128)
#define SCNX16 _SCN(X, 16)
#define SCNX256 _SCN(X, 256)
#define SCNX32 _SCN(X, 32)
#define SCNX64 _SCN(X, 64)
#define SCNX8 _SCN(X, 8)
#define SCNd128 _SCN(d, 128)
#define SCNd16 _SCN(d, 16)
#define SCNd256 _SCN(d, 256)
#define SCNd32 _SCN(d, 32)
#define SCNd64 _SCN(d, 64)
#define SCNd8 _SCN(d, 8)
#define SCNi128 _SCN(i, 128)
#define SCNi16 _SCN(i, 16)
#define SCNi256 _SCN(i, 256)
#define SCNi32 _SCN(i, 32)
#define SCNi64 _SCN(i, 64)
#define SCNi8 _SCN(i, 8)
#define SCNo128 _SCN(o, 128)
#define SCNo16 _SCN(o, 16)
#define SCNo256 _SCN(o, 256)
#define SCNo32 _SCN(o, 32)
#define SCNo64 _SCN(o, 64)
#define SCNo8 _SCN(o, 8)
#define SCNu128 _SCN(u, 128)
#define SCNu16 _SCN(u, 16)
#define SCNu256 _SCN(u, 256)
#define SCNu32 _SCN(u, 32)
#define SCNu64 _SCN(u, 64)
#define SCNu8 _SCN(u, 8)
#define SCNx128 _SCN(x, 128)
#define SCNx16 _SCN(x, 16)
#define SCNx256 _SCN(x, 256)
#define SCNx32 _SCN(x, 32)
#define SCNx64 _SCN(x, 64)
#define SCNx8 _SCN(x, 8)
#define SCNXLEAST128 SCNX128
#define SCNXLEAST16 SCNX16
#define SCNXLEAST256 SCNX256
#define SCNXLEAST32 SCNX32
#define SCNXLEAST64 SCNX64
#define SCNXLEAST8 SCNX8
#define SCNdLEAST128 SCNd128
#define SCNdLEAST16 SCNd16
#define SCNdLEAST256 SCNd256
#define SCNdLEAST32 SCNd32
#define SCNdLEAST64 SCNd64
#define SCNdLEAST8 SCNd8
#define SCNiLEAST128 SCNi128
#define SCNiLEAST16 SCNi16
#define SCNiLEAST256 SCNi256
#define SCNiLEAST32 SCNi32
#define SCNiLEAST64 SCNi64
#define SCNiLEAST8 SCNi8
#define SCNoLEAST128 SCNo128
#define SCNoLEAST16 SCNo16
#define SCNoLEAST256 SCNo256
#define SCNoLEAST32 SCNo32
#define SCNoLEAST64 SCNo64
#define SCNoLEAST8 SCNo8
#define SCNuLEAST128 SCNu128
#define SCNuLEAST16 SCNu16
#define SCNuLEAST256 SCNu256
#define SCNuLEAST32 SCNu32
#define SCNuLEAST64 SCNu64
#define SCNuLEAST8 SCNu8
#define SCNxLEAST128 SCNx128
#define SCNxLEAST16 SCNx16
#define SCNxLEAST256 SCNx256
#define SCNxLEAST32 SCNx32
#define SCNxLEAST64 SCNx64
#define SCNxLEAST8 SCNx8
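User code that already goes through the macros does not have to change: whether PRId32 expands to a classic length modifier or to one of the new w forms is invisible at the call site. A minimal sketch, with format_i32 as a hypothetical helper:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format v into buf; returns the number of characters written, or -1
   on error or truncation. */
int format_i32(char *buf, size_t size, int32_t v) {
    assert(buf && size > 0);
    int n = snprintf(buf, size, "%" PRId32, v);
    return (n >= 0 && (size_t)n < size) ? n : -1;
}
```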
wf
N length modifiers {#wflength}
Where N is any of the values (decimal without leading 0
) for all
supported fastest minimum-width integer types provided by 7.22.1.3.
ABI choice: To implement these, it must be known to which size a
given width corresponds, this is an ABI decision. For 8
, 16
, 32
and 64
the fastest minimum-width integer types are mandatory, so at
least these values must be supported.
For versions of the C library that support these formats (and thus set
__STDC_VERSION_STDIO_H__
) the macros PRIxFASTN etc. should use these
new length modifiers. This could look something like
#define _PRIFAST(F, N) "wf" #N #F
#define PRIXFAST128 _PRIFAST(X, 128)
#define PRIXFAST16 _PRIFAST(X, 16)
#define PRIXFAST256 _PRIFAST(X, 256)
#define PRIXFAST32 _PRIFAST(X, 32)
#define PRIXFAST64 _PRIFAST(X, 64)
#define PRIXFAST8 _PRIFAST(X, 8)
#define PRIdFAST128 _PRIFAST(d, 128)
#define PRIdFAST16 _PRIFAST(d, 16)
#define PRIdFAST256 _PRIFAST(d, 256)
#define PRIdFAST32 _PRIFAST(d, 32)
#define PRIdFAST64 _PRIFAST(d, 64)
#define PRIdFAST8 _PRIFAST(d, 8)
#define PRIiFAST128 _PRIFAST(i, 128)
#define PRIiFAST16 _PRIFAST(i, 16)
#define PRIiFAST256 _PRIFAST(i, 256)
#define PRIiFAST32 _PRIFAST(i, 32)
#define PRIiFAST64 _PRIFAST(i, 64)
#define PRIiFAST8 _PRIFAST(i, 8)
#define PRIoFAST128 _PRIFAST(o, 128)
#define PRIoFAST16 _PRIFAST(o, 16)
#define PRIoFAST256 _PRIFAST(o, 256)
#define PRIoFAST32 _PRIFAST(o, 32)
#define PRIoFAST64 _PRIFAST(o, 64)
#define PRIoFAST8 _PRIFAST(o, 8)
#define PRIuFAST128 _PRIFAST(u, 128)
#define PRIuFAST16 _PRIFAST(u, 16)
#define PRIuFAST256 _PRIFAST(u, 256)
#define PRIuFAST32 _PRIFAST(u, 32)
#define PRIuFAST64 _PRIFAST(u, 64)
#define PRIuFAST8 _PRIFAST(u, 8)
#define PRIxFAST128 _PRIFAST(x, 128)
#define PRIxFAST16 _PRIFAST(x, 16)
#define PRIxFAST256 _PRIFAST(x, 256)
#define PRIxFAST32 _PRIFAST(x, 32)
#define PRIxFAST64 _PRIFAST(x, 64)
#define PRIxFAST8 _PRIFAST(x, 8)
Similar for scanf
:
#define _SCNFAST(F, N) "wf" #N #F
#define SCNXFAST128 _SCNFAST(X, 128)
#define SCNXFAST16 _SCNFAST(X, 16)
#define SCNXFAST256 _SCNFAST(X, 256)
#define SCNXFAST32 _SCNFAST(X, 32)
#define SCNXFAST64 _SCNFAST(X, 64)
#define SCNXFAST8 _SCNFAST(X, 8)
#define SCNdFAST128 _SCNFAST(d, 128)
#define SCNdFAST16 _SCNFAST(d, 16)
#define SCNdFAST256 _SCNFAST(d, 256)
#define SCNdFAST32 _SCNFAST(d, 32)
#define SCNdFAST64 _SCNFAST(d, 64)
#define SCNdFAST8 _SCNFAST(d, 8)
#define SCNiFAST128 _SCNFAST(i, 128)
#define SCNiFAST16 _SCNFAST(i, 16)
#define SCNiFAST256 _SCNFAST(i, 256)
#define SCNiFAST32 _SCNFAST(i, 32)
#define SCNiFAST64 _SCNFAST(i, 64)
#define SCNiFAST8 _SCNFAST(i, 8)
#define SCNoFAST128 _SCNFAST(o, 128)
#define SCNoFAST16 _SCNFAST(o, 16)
#define SCNoFAST256 _SCNFAST(o, 256)
#define SCNoFAST32 _SCNFAST(o, 32)
#define SCNoFAST64 _SCNFAST(o, 64)
#define SCNoFAST8 _SCNFAST(o, 8)
#define SCNuFAST128 _SCNFAST(u, 128)
#define SCNuFAST16 _SCNFAST(u, 16)
#define SCNuFAST256 _SCNFAST(u, 256)
#define SCNuFAST32 _SCNFAST(u, 32)
#define SCNuFAST64 _SCNFAST(u, 64)
#define SCNuFAST8 _SCNFAST(u, 8)
#define SCNxFAST128 _SCNFAST(x, 128)
#define SCNxFAST16 _SCNFAST(x, 16)
#define SCNxFAST256 _SCNFAST(x, 256)
#define SCNxFAST32 _SCNFAST(x, 32)
#define SCNxFAST64 _SCNFAST(x, 64)
#define SCNxFAST8 _SCNFAST(x, 8)
H
, D
and DD
length modifiers for decimal floating point
For _Decimal32
, _Decimal64
and _Decimal128
. This is optional
depending on support for decimal floating point. Implementation should
not be too difficult. In particular these new number types have
prescribed representation formats (2 possible choices and endianness),
so implementation should even be possible without complete support for
these types on the compilation platform of the C library. Some support
is needed from <math.h>
namely classification for infinite or NaN
values.
Compiler dependency: Support for this is important such that the rest of any library support for decimal floating point can be added by an independent library.
Changes to <stdlib.h> {#stdlib}
__STDC_VERSION_STDLIB_H__
This macro is mandatory and should be set to 202311L
once the header
is compliant to C23.
strtol
and friends {#strtol}
With base 0 or 2, these functions now accept the new binary integer constants. When used with the optional prefixes 0b or 0B, this is a semantic change: for example the following code
long res = strtol("0b1", 0, 0);
has res ≡ 0 for C17 and res ≡ 1 for C23. This is because in C17 the interpretation stops before the b.
C library implementations that want to support previous versions of C have to take care of that semantic change. Therefore they probably need two functions for each of the interfaces:
// don't recognize 0b or 0B for base 0 or 2
long int strtol_c17(const char *restrict nptr, char **restrict endptr, int base);
long long int strtoll_c17(const char *restrict nptr, char **restrict endptr, int base);
unsigned long int strtoul_c17(const char *restrict nptr, char **restrict endptr, int base);
unsigned long long int strtoull_c17(const char *restrict nptr, char **restrict endptr, int base);
// recognize 0b or 0B for base 0 or 2
long int strtol_c23(const char *restrict nptr, char **restrict endptr, int base);
long long int strtoll_c23(const char *restrict nptr, char **restrict endptr, int base);
unsigned long int strtoul_c23(const char *restrict nptr, char **restrict endptr, int base);
unsigned long long int strtoull_c23(const char *restrict nptr, char **restrict endptr, int base);
#if __STDC_VERSION__ >= 202311L
#define strtol strtol_c23
#define strtoll strtoll_c23
#define strtoul strtoul_c23
#define strtoull strtoull_c23
#else
#define strtol strtol_c17
#define strtoll strtoll_c17
#define strtoul strtoul_c17
#define strtoull strtoull_c17
#endif
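Application code that cannot know which version of the library it will be linked against might avoid the ambiguity by not handing binary input to strtol at all. The following sketch parses the digits by hand so that the result does not depend on the library; parse_binary is a hypothetical helper and does no overflow detection.

```c
#include <assert.h>
#include <stddef.h>

/* Parse an unsigned binary number with an optional 0b or 0B prefix.
   If end is not null, *end receives the first unparsed character. */
unsigned long parse_binary(const char *s, const char **end) {
    assert(s);
    /* Skip the prefix only if a binary digit follows it. */
    if (s[0] == '0' && (s[1] == 'b' || s[1] == 'B')
        && (s[2] == '0' || s[2] == '1'))
        s += 2;
    unsigned long v = 0;
    while (*s == '0' || *s == '1') {
        v = (v << 1) | (unsigned long)(*s - '0');
        s++;
    }
    if (end)
        *end = s;
    return v;
}
```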
Not yet stable? It is possible that there will be an NB comment for this and that this interface might still change, e.g. by adding [[deprecated]] to the existing interfaces and by creating new interfaces with prefix stdc_.
Alignment requirements for memory management functions
These were reformulated for C23. Implementations that set
__STDC_VERSION_STDLIB_H__
have to verify that they conform to that
reformulation, see
calloc
overflow
Implementations that set
__STDC_VERSION_STDLIB_H__
have to verify that they conform to
7.24.3.2 p3:
The calloc function returns either a pointer to the allocated space or a null pointer if the space cannot be allocated or if the product nmemb * size would wraparound size_t.
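The wraparound condition can be checked without performing the multiplication. The following is only a sketch of such a check in front of an allocator; product_wraps, checked_calloc and calloc_demo are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* True if nmemb * size is not representable in size_t. */
bool product_wraps(size_t nmemb, size_t size) {
    return size != 0 && nmemb > SIZE_MAX / size;
}

/* Refuse requests whose total size would wrap around. */
void *checked_calloc(size_t nmemb, size_t size) {
    if (product_wraps(nmemb, size))
        return NULL;
    return calloc(nmemb, size);
}

void calloc_demo(void) {
    assert(product_wraps(SIZE_MAX, 2));
    assert(!product_wraps(4, 8));
    assert(checked_calloc(SIZE_MAX, 2) == NULL);
}
```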
bsearch
becomes a const
-preserving tg macro {#bsearch}
This is specified as
QVoid *bsearch(const void *, QVoid*, size_t nmemb, size_t size, int (*)(const void*, const void*));
to emphasize that the return is exactly the same void
pointer type
as the argument. volatile
or restrict
types are not accepted.
An implementation of this type-generic macro could look as follows.
// The function itself stays exactly the same.
void* (bsearch)(const void*, const void*, size_t, size_t, int(*)(const void*, const void*));
#if __STDC_VERSION__ > 202300L
# define bsearch(K, B, N, S, C) \
_Generic( \
/* ensure conversion to a void pointer */ \
true ? (B) : (void*)1, \
void const*: (void const*)bsearch((K), (void const*)(B), (N), (S), (C)), \
/* volatile qualification of *B is an error for this call */ \
default: bsearch((K), (B), (N), (S), (C)) \
)
#endif
A preprocessor conditional is needed, because the type of call expressions potentially changes with this.
User code that misused these calls and stored the result for a call
with a const
qualified array in a pointer with unqualified target
type may see their code diagnosed or even rejected. This is
intentional.
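User code that is correct for both the old function and the new type-generic macro simply keeps the qualifier on the result. A sketch with hypothetical names cmp_int and find_int:

```c
#include <assert.h>
#include <stdlib.h>

static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Search a sorted, const-qualified table; the result stays const. */
const int *find_int(const int *table, size_t n, int key) {
    assert(table);
    return bsearch(&key, table, n, sizeof *table, cmp_int);
}
```

Declaring the return type as int* instead would still compile before C23, but is the kind of misuse that the macro is meant to diagnose.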
Functions free_sized and free_aligned_sized
Add once_flag and call_once also to <stdlib.h>
This type and function now become mandatory for C23. Similar to size_t, the type can appear in several headers. Something along the following lines should be added to <stdlib.h>:
#if (__STDC_VERSION__ >= 202311L) && !defined(ONCE_FLAG_INIT)
typedef int once_flag;
#define ONCE_FLAG_INIT 0
void call_once(once_flag*, void (*)(void));
#endif
Protection by a preprocessor conditional is not strictly necessary because these names are otherwise only potentially reserved, so adding them to a C library is always possible without intruding on the user identifier space.
A rationale for introducing this feature to the general C library and reference implementations for C libraries that might not have that feature, yet, because they don't support threads can be found in
Implementations that know how to avoid linking against the whole threads and atomic options could have an alternative version of this function:
typedef int volatile once_flag;
#define ONCE_FLAG_INIT 0
void call_once(once_flag* flag, void (*func)(void)) {
if (!*flag) {
func();
*flag = 1;
}
}
In this case, the necessary synchronization guarantees are given, because the call to the function and the assignment to flag are sequenced and cannot be optimized away. Note that this version is not guaranteed to be asynchronous signal safe: not even the type sig_atomic_t gives the guarantees that are needed for implementing this function. Only an implementation with lock-free acquire-release atomics (or similar) could be asynchronous signal safe.
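The typical use of this interface is lazy initialization. The sketch below repeats the single-threaded variant under hypothetical demo_ names, so that it does not clash with a library that already provides once_flag, and shows that the initializer runs only once.

```c
#include <assert.h>

typedef int volatile demo_once_flag;
#define DEMO_ONCE_FLAG_INIT 0

void demo_call_once(demo_once_flag *flag, void (*func)(void)) {
    assert(flag && func);
    if (!*flag) {
        func();
        *flag = 1;
    }
}

static demo_once_flag table_once = DEMO_ONCE_FLAG_INIT;
static int init_count = 0;

static void init_table(void) {
    init_count++;              /* stands in for some expensive setup */
}

/* Every caller triggers the setup, but it is executed only once. */
int get_init_count(void) {
    demo_call_once(&table_once, init_table);
    return init_count;
}
```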
New functions strfromd, strfromf and strfroml
New function memalignment
Changes to <time.h> {#time}
__STDC_VERSION_TIME_H__
This macro is mandatory and should be set to 202311L
once the header
is compliant to C23.
strftime
and wcsftime
Two new O
modified specifications are added to strftime
and wcsftime
:
%Ob
is replaced by the locale’s abbreviated alternative month name.
%OB
is replaced by the locale’s alternative appropriate full month name.
These have also been accepted for POSIX.
The C standard says nothing about erroneous specifications or some that would be implementation-defined extensions, so using new specifiers is just undefined for pre-C23. Therefore these can be added to any implementation without making it non-conforming for any C version.
On the other hand, __STDC_VERSION_TIME_H__ should only be set to 202311L once this addition is implemented.
Deprecation of asctime and ctime {#asctime}
C23 follows POSIX in deprecating these functions. Their interfaces now have the corresponding attribute:
#if __STDC_VERSION__ > 202300L
[[deprecated]] char *asctime(const struct tm *timeptr);
[[deprecated]] char *ctime(const time_t *timer);
#else
char *asctime(const struct tm *timeptr);
char *ctime(const time_t *timer);
#endif
A preprocessor conditional is needed, because the attribute syntax is new for C23.
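User code that wants to move away from the deprecated functions can usually switch to strftime. The following sketch reproduces the layout of asctime under the assumptions that the "C" locale is in effect and that the year has four digits; format_like_asctime is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Write the broken-down time in the layout of asctime into buf.
   Returns the number of characters written, or 0 if buf is too small. */
size_t format_like_asctime(const struct tm *tm, char *buf, size_t size) {
    assert(tm && buf);
    return strftime(buf, size, "%a %b %e %H:%M:%S %Y\n", tm);
}
```

In contrast to asctime, the caller provides the buffer, so there is no hidden static state.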
gmtime_r
and localtime_r
C23 inherits these two interfaces from POSIX. C libraries that implement POSIX can simply expose them to C23 aware code by something similar to
#if __STDC_VERSION__ > 202300L
struct tm *gmtime_r(const time_t*, struct tm*);
struct tm *localtime_r(const time_t*, struct tm*);
#endif
A preprocessor conditional is needed, because the names had not been reserved before C23.
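For users the benefit is a conversion without shared static state. A minimal sketch, assuming a POSIX system where the feature test macro below makes the declaration visible before C23; utc_year is a hypothetical helper.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* assumption: requests gmtime_r before C23 */
#endif
#include <assert.h>
#include <time.h>

/* Calendar year (UTC) of a time stamp, or 0 if the conversion fails. */
int utc_year(time_t t) {
    struct tm tm;
    if (!gmtime_r(&t, &tm))
        return 0;
    assert(tm.tm_mon >= 0 && tm.tm_mon < 12);
    return tm.tm_year + 1900;
}
```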
mktime
In a literal last minute change (GB-159 in CD2 ballot and N3147) the behavior for mktime has been restricted. If the time that is provided in the argument is not representable, the tm_wday member has to remain untouched.
timegm
The timegm
function from BSD is added. Specifications can be found here.
https://open-std.org/JTC1/SC22/WG14/www/docs/n2833.htm
with an amendment in
https://open-std.org/JTC1/SC22/WG14/www/docs/n3147.txt
A preprocessor conditional is needed, because the name had not been reserved before C23.
#if __STDC_VERSION__ > 202300L
time_t timegm(struct tm *);
#endif
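As a usage sketch, timegm converts a broken-down UTC time to a time stamp without consulting the local time zone. The feature test macro is an assumption for glibc-based systems before C23, and utc_midnight is a hypothetical helper.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1     /* assumption: exposes timegm on glibc before C23 */
#endif
#include <assert.h>
#include <time.h>

/* Time stamp of 00:00:00 UTC on the given calendar day. */
time_t utc_midnight(int year, int month, int day) {
    assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);
    struct tm tm = {
        .tm_year = year - 1900,
        .tm_mon  = month - 1,
        .tm_mday = day,
    };
    return timegm(&tm);
}
```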
timespec_getres
This function is meant to port POSIX' clock_getres
to C by
translating base
to the equivalent POSIX clock:
#if __STDC_VERSION__ > 202300L
int timespec_getres(struct timespec*, int);
#endif
A preprocessor conditional is needed, because the name had not been reserved before C23.
TIME_MONOTONIC
, TIME_ACTIVE
, TIME_THREAD_ACTIVE
C23 adds three optional time bases TIME_MONOTONIC
, TIME_ACTIVE
and TIME_THREAD_ACTIVE
which are modeled after the POSIX clocks
CLOCK_MONOTONIC
, CLOCK_PROCESS_CPUTIME_ID
and
CLOCK_THREAD_CPUTIME_ID
, respectively.
Having time bases for C other than TIME_UTC
is at the liberty of the
implementation, so any C library that runs on a POSIX system could
easily provide the equivalent to all POSIX clocks that it
interfaces. For example, for Linux currently these are
#define CLOCK_REALTIME 0
#define CLOCK_MONOTONIC 1
#define CLOCK_PROCESS_CPUTIME_ID 2
#define CLOCK_THREAD_CPUTIME_ID 3
#define CLOCK_MONOTONIC_RAW 4
#define CLOCK_REALTIME_COARSE 5
#define CLOCK_MONOTONIC_COARSE 6
#define CLOCK_BOOTTIME 7
#define CLOCK_REALTIME_ALARM 8
#define CLOCK_BOOTTIME_ALARM 9
#define CLOCK_SGI_CYCLE 10
#define CLOCK_TAI 11
This could easily be done by using something such as
#define TIME_UTC (CLOCK_REALTIME+1)
#define TIME_MONOTONIC (CLOCK_MONOTONIC+1)
#define TIME_ACTIVE (CLOCK_PROCESS_CPUTIME_ID+1)
#define TIME_THREAD_ACTIVE (CLOCK_THREAD_CPUTIME_ID+1)
#define TIME_MONOTONIC_RAW (CLOCK_MONOTONIC_RAW+1)
#define TIME_UTC_COARSE (CLOCK_REALTIME_COARSE+1)
#define TIME_MONOTONIC_COARSE (CLOCK_MONOTONIC_COARSE+1)
#define TIME_BOOTTIME (CLOCK_BOOTTIME+1)
#define TIME_UTC_ALARM (CLOCK_REALTIME_ALARM+1)
#define TIME_BOOTTIME_ALARM (CLOCK_BOOTTIME_ALARM+1)
#define TIME_SGI_CYCLE (CLOCK_SGI_CYCLE+1)
#define TIME_TAI (CLOCK_TAI+1)
and then adapting the corresponding implementation of timespec_get
a
bit. This would be conforming to current and future C, because the
TIME_
prefix is already reserved for that purpose.
ABI choice: Unfortunately the choice of the values is an ABI choice, so before doing so it has to be ensured that other C libraries on the same platform use the same values.
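The adaptation of timespec_get mentioned above could look roughly as follows on a POSIX system. This is a sketch under the assumption that the time base values are the POSIX clock ids shifted by one, presumably because a valid base has to be nonzero; all DEMO_ and demo_ names are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* assumption: POSIX clock_gettime */
#endif
#include <assert.h>
#include <time.h>

#define DEMO_TIME_UTC       (CLOCK_REALTIME + 1)
#define DEMO_TIME_MONOTONIC (CLOCK_MONOTONIC + 1)

/* Same contract as timespec_get: returns base on success, 0 on failure. */
int demo_timespec_get(struct timespec *ts, int base) {
    assert(ts);
    if (base <= 0)
        return 0;
    return clock_gettime((clockid_t)(base - 1), ts) == 0 ? base : 0;
}
```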
Changes to <string.h> {#string}
__STDC_VERSION_STRING_H__
This macro is mandatory and should be set to 202311L
once the header
is compliant to C23.
memccpy
, strdup
and strndup
C23 borrows these interfaces from POSIX
void* memccpy(void*, const void*, int, size_t);
char* strdup(const char*);
char* strndup(const char*, size_t);
A preprocessor conditional is not needed, because the names (with
prefixes mem
and str
) had been reserved before C23.
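One common use of memccpy is a bounded string copy that reports truncation. The sketch below assumes a POSIX system before C23, hence the feature test macro; copy_string is a hypothetical helper.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700     /* assumption: exposes memccpy before C23 */
#endif
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Copy src into dst, which has room for size > 0 bytes, and always
   nul-terminate. Returns true if the whole string fit. */
bool copy_string(char *dst, size_t size, const char *src) {
    assert(dst && src && size > 0);
    if (memccpy(dst, src, '\0', size))
        return true;           /* the terminator was found and copied */
    dst[size - 1] = '\0';      /* truncated: terminate by hand */
    return false;
}
```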
memset_explicit
The following function is added with the same normative specification
as memset
:
void* memset_explicit(void*, int, size_t);
The difference from memset lies in the expectation. The description says:
The purpose of this function is to make sensitive information stored in the object inaccessible^379)^.
And with the following text in the footnote.
The intention is that the memory store is always performed (i.e., never elided), regardless of optimizations.
There has been a long and heated debate in WG14 and WG21 about this, and this is the best that we came up with. The standard does not have the language to describe that memory that is, e.g., freed has to be zeroed out when given back to the system; such a feature is not observable from within the program.
So this makes the responsibility for the intended purpose (hiding information from one part of the execution to other parts and to the outer world) entirely a matter of "quality of implementation". Implementations should take care:
- A call to this function should never be optimized out. This can often be achieved by having it in a separate TU from all other functions and by disabling link-time optimization for this TU or at least for this function.
- No store by the function to any byte of the array should be optimized out.
- The return of the function should synchronize with all read and write operations. This could for example be achieved by issuing a call atomic_signal_fence(memory_order_seq_cst) or equivalent, such that even a signal that kicks in right after the call could not read the previous contents of the byte array.
- All caches for the byte array should have been invalidated on return.
- To avoid side-channel attacks, the implementation of the function should make no explicit or implicit reference to the contents of the byte array nor the value that has been chosen for the overwrite. Each write operation should use the same time (and other resources) per byte.
- Good performance is not expected, security first.
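As a very partial illustration, the following sketch uses stores through a volatile-qualified pointer, a technique that is commonly used to keep compilers from eliding the writes. It only addresses the items about stores not being optimized out and makes no attempt at fences, cache invalidation or constant timing; demo_memset_explicit is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Overwrite n bytes at s with c, one volatile store per byte. */
void *demo_memset_explicit(void *s, int c, size_t n) {
    assert(s || n == 0);
    volatile unsigned char *p = s;
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)c;
    return s;
}
```

A production quality implementation would need platform-specific support for the remaining items.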
A preprocessor conditional is not needed, because the name (with
prefix mem
) had been reserved before C23.
memchr
, strchr
, strpbrk
, strrchr
and strstr
become const
-preserving tg macros {#memchr}
For example for memchr
this is specified as
QVoid* memchr(QVoid*, int, size_t);
to emphasize that the return is exactly the same void
pointer type
as the argument. volatile
or restrict
types are not accepted.
An implementation of this type-generic macro could look as follows.
// The function itself stays exactly the same.
void* (memchr)(void const*, int, size_t);
#if __STDC_VERSION__ > 202300L
# define memchr(S, C, N) \
    _Generic( \
        /* ensure conversion to a void pointer */ \
        true ? (S) : (void*)1, \
        void const*: (void const*)memchr((void const*)(S), (C), (N)), \
        /* volatile qualification is an error for this call */ \
        default: memchr((S), (C), (N)) \
    )
#endif
A preprocessor conditional is needed, because the type of call expressions potentially changes with this.
User code that misused these calls, by storing the result of a call with a `const`-qualified array in a pointer with unqualified target type, may now be diagnosed or even rejected. This is intentional.
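From the user's side, the effect can be sketched as follows. The helper names `find_comma` and `find_comma_rw` are hypothetical; the code is meant to be valid both before and with C23, because each result is stored in a pointer whose target qualification matches the argument.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read-only buffer in, read-only pointer out. */
char const* find_comma(char const* s, size_t n) {
    /* Fine before C23 (void* converts to char const*) and with C23
       (the call then has type void const*). */
    char const* p = memchr(s, ',', n);
    return p;
}

/* Hypothetical helper: writable buffer in, writable pointer out. */
char* find_comma_rw(char* s, size_t n) {
    /* The argument is unqualified, so the call has type void*
       in both language versions. */
    char* p = memchr(s, ',', n);
    return p;
}

/* By contrast, with `char const* ro`, the initialization
       char* bad = memchr(ro, ',', n);
   silently dropped the qualifier before C23; with the
   const-preserving macro it discards const and is diagnosed. */
```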
Changes to `<uchar.h>` {#uchar}
The required Unicode support has been straightened out and complemented. The types `charN_t` are now designated for UTF-N encoding without exception. For N equal to 16 or 32 there should not be many changes to existing C libraries: WG14 found none for which the previously optional macros `__STDC_UTF_16__` and `__STDC_UTF_32__` had not been set.
Also, see the discussion about thread safety above.
`__STDC_VERSION_UCHAR_H__`
This macro is mandatory and should be set to `202311L` once the header is compliant with C23.
Conversion functions and type for UTF-8
The set of conversion functions from and to UTF encodings is completed by adding functions for UTF-8. For implementations that already use UTF-8 as their internal representation, these functions are almost no-ops; they just have to iterate through the sequence of bytes that composes the multi-byte character in UTF-8 encoding.
#if __STDC_VERSION__ >= 202300L
typedef unsigned char char8_t;
size_t mbrtoc8(char8_t * restrict pc8, const char * restrict s, size_t n, mbstate_t * restrict ps);
size_t c8rtomb(char * restrict s, char8_t c8, mbstate_t * restrict ps);
#endif
The preprocessor conditional is needed, because the names had not been previously reserved.
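For user code, a hedged sketch of how the new interface might be used is given below. The helper `utf8_units` and the local macro `HAVE_C23_UCHAR` are hypothetical. The sketch assumes that the presence of `mbrtoc8` can be detected through `__STDC_VERSION_UCHAR_H__`, and it falls back to reporting failure on libraries that do not yet provide the C23 header contents.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical local feature macro derived from the header version. */
#if defined(__STDC_VERSION_UCHAR_H__) && __STDC_VERSION_UCHAR_H__ >= 202311L
# define HAVE_C23_UCHAR 1
#else
# define HAVE_C23_UCHAR 0
#endif

/* Hypothetical helper: number of UTF-8 code units that the multibyte
   string s converts to, or (size_t)-1 on an encoding error or if the
   C library does not provide mbrtoc8. */
size_t utf8_units(char const* s) {
#if HAVE_C23_UCHAR
    mbstate_t st = { 0 };
    size_t left = strlen(s) + 1;   /* include the terminating null */
    size_t units = 0;
    for (;;) {
        char8_t c8;
        size_t r = mbrtoc8(&c8, s, left, &st);
        if (r == 0) return units;  /* null character reached */
        if (r == (size_t)-1 || r == (size_t)-2) return (size_t)-1;
        ++units;                   /* one code unit was stored in c8 */
        if (r != (size_t)-3) {     /* (size_t)-3: no input consumed */
            s += r;
            left -= r;
        }
    }
#else
    (void)s;
    return (size_t)-1;
#endif
}
```

The result depends on the current locale's multibyte encoding; for pure ASCII input it is simply the string length wherever the function is available.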
Changes to `<wchar.h>` {#wchar}
`__STDC_VERSION_WCHAR_H__`
This macro is mandatory and should be set to `202311L` once the header is compliant with C23.
`wcschr`, `wcspbrk`, `wcsrchr`, `wcsstr` and `wmemchr` become `const`-preserving tg macros
Similar to `memchr` and the related functions, above.
Change to `fputwc`
This function now also sets the error indicator of the stream if an encoding error occurs.
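For callers this means that a single `ferror` check after a sequence of writes is sufficient to detect both kinds of failure. A minimal sketch, with the hypothetical helper name `put_wide_string`:

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: write a wide string, return 0 on success and
   -1 if the stream's error indicator is set afterwards. */
int put_wide_string(wchar_t const* ws, FILE* stream) {
    for (; *ws; ++ws) {
        if (fputwc(*ws, stream) == WEOF) break;
    }
    /* With a C23 library the indicator covers write errors and
       encoding errors alike. Assumption to keep in mind: with an
       older library an encoding error may only be visible through
       the WEOF return and errno. */
    return ferror(stream) ? -1 : 0;
}
```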
Changes in Annex K
`bsearch_s`
Similar to `bsearch`, above.