Here we document details of how the preprocessor's implementation affects its user-visible behavior. You should try to avoid undue reliance on behavior described here, as it is possible that it will change subtly in future implementations.
Also documented here are obsolete features and changes from previous versions of GNU CPP.
11.1 Implementation-defined behavior
11.2 Implementation limits
11.3 Obsolete Features
11.4 Differences from previous versions
11.1 Implementation-defined behavior
This is how GNU CPP behaves in all the cases which the C standard describes as implementation-defined. This term means that the implementation is free to do what it likes, but must document its choice and stick to it.
The mapping of physical source file multi-byte characters to the execution character set.
Currently, GNU CPP only supports character sets that are strict supersets of ASCII, and performs no translation of characters.
Non-empty sequences of whitespace characters.
In textual output, each whitespace sequence is collapsed to a single space. For aesthetic reasons, the first token on each non-directive line of output is preceded with sufficient spaces that it appears in the same column as it did in the original source file.
The numeric value of character constants in preprocessor expressions.
The preprocessor and compiler interpret character constants in the same way; escape sequences such as `\a' are given the values they would have on the target machine.
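As a minimal sketch of what this agreement means, the fragment below evaluates the same character constant once in an `#if' expression and once in ordinary code. The names PP_BELL_IS_7 and compiler_bell_is_7 are hypothetical, and the value 7 for `\a' assumes an ASCII-compatible target.

```c
/* Preprocessor's view: the constant is evaluated in an #if expression. */
#if '\a' == 7
#define PP_BELL_IS_7 1
#else
#define PP_BELL_IS_7 0
#endif

/* Compiler's view: the same constant in an ordinary expression. */
static int compiler_bell_is_7(void)
{
    return '\a' == 7;
}
```

If the two views agree, as described above, PP_BELL_IS_7 and compiler_bell_is_7() yield the same answer.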
Multi-character character constants are interpreted a character at a time, shifting the previous result left by the number of bits per character on the host, and adding the new character. For example, 'ab' on an 8-bit host would be interpreted as 'a' * 256 + 'b'. If there are more characters in the constant than can fit in the widest native integer type on the host, usually a long, the excess characters are ignored and a diagnostic is given.
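The shift-and-add rule can be sketched as an ordinary function. This is only an illustration of the arithmetic described above, not the preprocessor's own code; fold_multichar is a hypothetical helper, and it does not model the handling of excess characters or the diagnostic.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: fold `count' characters into one value by
   shifting the running result left by the number of bits per
   character and adding the next character.  The caller is assumed
   to pass no more than sizeof (unsigned long) characters. */
static unsigned long fold_multichar(const char *chars, size_t count)
{
    unsigned long value = 0;
    size_t i;

    for (i = 0; i < count; i++)
        value = (value << CHAR_BIT) + (unsigned char) chars[i];
    return value;
}
```

With 8-bit characters, fold_multichar("ab", 2) computes 'a' * 256 + 'b', matching the example in the text.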
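Source file inclusion.
For a discussion of how the preprocessor locates header files, see section 2.2 Include Operation.
Interpretation of the filename resulting from a macro-expanded `#include' directive.
See section 2.5 Computed Includes.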
Treatment of a `#pragma' directive that after macro-expansion results in a standard pragma.
No macro expansion occurs on any `#pragma' directive line, so the question does not arise.
Note that GCC does not yet implement any of the standard pragmas.
11.2 Implementation limits
GNU CPP has a small number of internal limits. This section lists the limits which the C standard requires to be no lower than some minimum, and all the others we are aware of. We intend there to be as few limits as possible. If you encounter an undocumented or inconvenient limit, please report that to us as a bug. (See the section on reporting bugs in the GCC manual.)
Where we say something is limited only by available memory, that means that internal data structures impose no intrinsic limit, and space is allocated with malloc or equivalent. The actual limit will therefore depend on many things, such as the size of other things allocated by the compiler at the same time, the amount of memory consumed by other processes on the same computer, etc.
Nesting levels of `#include' files.
We impose an arbitrary limit of 200 levels, to avoid runaway recursion. The standard requires at least 15 levels.
Nesting levels of conditional inclusion.
The C standard mandates this be at least 63. GNU CPP is limited only by available memory.
Levels of parenthesized expressions within a full expression.
The C standard requires this to be at least 63. In preprocessor conditional expressions, it is limited only by available memory.
Significant initial characters in an identifier or macro name.
The preprocessor treats all characters as significant. The C standard requires only that the first 63 be significant.
Number of macros simultaneously defined in a single translation unit.
The standard requires at least 4095 be possible. GNU CPP is limited only by available memory.
Number of parameters in a macro definition and arguments in a macro call.
We allow USHRT_MAX, which is no smaller than 65,535. The minimum required by the standard is 127.
Number of characters on a logical source line.
The C standard requires a minimum of 4095 be permitted. GNU CPP places no limits on this, but you may get incorrect column numbers reported in diagnostics for lines longer than 65,535 characters.
Maximum size of a source file.
The standard does not specify any lower limit on the maximum size of a source file. GNU CPP maps files into memory, so it is limited by the available address space. This is generally at least two gigabytes. Depending on the operating system, the size of physical memory may or may not be a limitation.
11.3 Obsolete Features
GNU CPP has a number of features which are present mainly for compatibility with older programs. We discourage their use in new code. In some cases, we plan to remove the feature in a future version of GCC.
11.3.1 Assertions
11.3.2 Obsolete once-only headers
11.3.3 Miscellaneous obsolete features
11.3.1 Assertions
Assertions are a deprecated alternative to macros in writing conditionals to test what sort of computer or system the compiled program will run on. Assertions are usually predefined, but you can define them with preprocessing directives or command-line options.
Assertions were intended to provide a more systematic way to describe the compiler's target system. However, in practice they are just as unpredictable as the system-specific predefined macros. In addition, they are not part of any standard, and only a few compilers support them. Therefore, the use of assertions is less portable than the use of system-specific predefined macros. We recommend you do not use them at all.
An assertion looks like this:
#predicate (answer)
predicate must be a single identifier. answer can be any sequence of tokens; all characters are significant except for leading and trailing whitespace, and differences in internal whitespace sequences are ignored. (This is similar to the rules governing macro redefinition.) Thus, (x + y) is different from (x+y) but equivalent to ( x + y ). Parentheses do not nest inside an answer.
To test an assertion, you write it in an `#if'. For example, this conditional succeeds if either vax or ns16000 has been asserted as an answer for machine.
#if #machine (vax) || #machine (ns16000)
You can test whether any answer is asserted for a predicate by omitting the answer in the conditional:
#if #machine
Assertions are made with the `#assert' directive. Its sole argument is the assertion to make, without the leading `#' that identifies assertions in conditionals.
#assert predicate (answer)
You may make several assertions with the same predicate and different answers. Subsequent assertions do not override previous ones for the same predicate. All the answers for any given predicate are simultaneously true.
Assertions can be cancelled with the `#unassert' directive. It has the same syntax as `#assert'. In that form it cancels only the answer which was specified on the `#unassert' line; other answers for that predicate remain true. You can cancel an entire predicate by leaving out the answer:
#unassert predicate
In either form, if no such assertion has been made, `#unassert' has no effect.
You can also make or cancel assertions using command-line options. See section 12. Invocation.
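Since we recommend predefined macros over assertions, here is a rough sketch of the machine test from earlier in this section written with macros instead. The macro names __vax__ and __ns16000__ are assumptions made for illustration; check which macros your compiler actually predefines for a given target. TARGET_IS_VAX_OR_NS16000 and target_family are hypothetical names.

```c
/* Assumed system-specific predefined macros; the names are illustrative
   and may differ between compilers and targets. */
#if defined(__vax__) || defined(__ns16000__)
#define TARGET_IS_VAX_OR_NS16000 1
#else
#define TARGET_IS_VAX_OR_NS16000 0
#endif

/* Report which branch of the conditional was taken. */
static const char *target_family(void)
{
    return TARGET_IS_VAX_OR_NS16000 ? "vax or ns16000" : "something else";
}
```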
11.3.2 Obsolete once-only headers
GNU CPP supports two more ways of indicating that a header file should be read only once. Neither one is as portable as a wrapper `#ifndef', and we recommend you do not use them in new programs.
In the Objective-C language, there is a variant of `#include' called `#import' which includes a file, but does so at most once. If you use `#import' instead of `#include', then you don't need the conditionals inside the header file to prevent multiple inclusion of the contents. GCC permits the use of `#import' in C and C++ as well as Objective-C. However, it is not in standard C or C++ and should therefore not be used by portable programs.
`#import' is not a well designed feature. It requires the users of a header file to know that it should only be included once. It is much better for the header file's implementor to write the file so that users don't need to know this. Using a wrapper `#ifndef' accomplishes this goal.
In the present implementation, a single use of `#import' will prevent the file from ever being read again, by either `#import' or `#include'. You should not rely on this; do not use both `#import' and `#include' to refer to the same header file.
Another way to prevent a header file from being included more than once is with the `#pragma once' directive. If `#pragma once' is seen when scanning a header file, that file will never be read again, no matter what.
`#pragma once' does not have the problems that `#import' does, but it is not recognized by all preprocessors, so you cannot rely on it in a portable program.
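For comparison, the portable wrapper `#ifndef' can be sketched as follows. To keep the example in one file, the contents of a hypothetical header are written out twice, imitating two `#include' lines that name the same file; POINT_H, struct point and point_sum are names invented for this illustration.

```c
/* First inclusion: POINT_H is not yet defined, so the body is read. */
#ifndef POINT_H
#define POINT_H
struct point { int x, y; };
#endif /* POINT_H */

/* Second inclusion: POINT_H is now defined, so the body is skipped and
   struct point is not defined a second time. */
#ifndef POINT_H
#define POINT_H
struct point { int x, y; };
#endif /* POINT_H */

static int point_sum(int x, int y)
{
    struct point p = { x, y };
    return p.x + p.y;
}
```

The users of the header do not need to know that it must be read only once; the header protects itself.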
11.3.3 Miscellaneous obsolete features
Here are a few more obsolete features.
Attempting to paste two tokens which together do not form a valid preprocessing token
The preprocessor currently warns about this and outputs the two tokens adjacently, which is probably the behavior the programmer intends. It may not work in future, though.
Most of the time, when you get this warning, you will find that `##' is being used superstitiously, to guard against whitespace appearing between two tokens. It is almost always safe to delete the `##'.
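A small sketch of the kind of macro this warning usually points at. CALL, twice and call_twice are hypothetical names; the superstitious form is shown only in a comment, because pasting `f' and `(' does not produce a valid token.

```c
/* Superstitious form, not a valid paste:
       #define CALL(f, x) f ## (x)
   Deleting the `##' gives the intended expansion. */
#define CALL(f, x) f (x)

static int twice(int n)
{
    return 2 * n;
}

static int call_twice(int n)
{
    return CALL(twice, n);   /* expands to twice (n) */
}
```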
#pragma poison
This is the same as #pragma GCC poison. The version without the GCC prefix is deprecated. See section 7. Pragmas.
GCC currently allows a string constant to extend across multiple logical lines of the source file. This extension is deprecated and will be removed in a future version of GCC. Such string constants are already rejected in all directives apart from `#define'.
Instead, make use of ISO C concatenation of adjacent string literals, or use `\n' followed by a backslash-newline.
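Both replacements can be sketched briefly. The variable names are hypothetical; the two strings below have identical contents.

```c
#include <string.h>

/* Adjacent string literals are concatenated by the compiler. */
static const char *help_concat =
    "line one\n"
    "line two\n";

/* `\n' followed by a backslash-newline: the backslash-newline is
   removed, so the literal continues on the next source line.  Note
   that any leading whitespace on the continuation line would become
   part of the string. */
static const char *help_continued = "line one\n\
line two\n";

static int help_strings_match(void)
{
    return strcmp(help_concat, help_continued) == 0;
}
```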
11.4 Differences from previous versions
This section details behavior which has changed from previous versions of GNU CPP. We do not plan to change it again in the near future, but we do not promise not to, either.
The "previous versions" discussed here are 2.95 and before. The behavior of GCC 3.0 is mostly the same as the behavior of the widely used 2.96 and 2.97 development snapshots. Where there are differences, they generally represent bugs in the snapshots.
The standard does not specify the order of evaluation of a chain of `##' operators, nor whether `#' is evaluated before, after, or at the same time as `##'. You should therefore not write any code which depends on any specific ordering. It is possible to guarantee an ordering, if you need one, by suitable use of nested macros.
An example of where this might matter is pasting the arguments `1', `e' and `-2'. This would be fine for left-to-right pasting, but right-to-left pasting would produce an invalid token `e-2'.
GCC 3.0 evaluates `#' and `##' at the same time and strictly left to right. Older versions evaluated all `#' operators first, then all `##' operators, in an unreliable order.
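One way to guarantee an ordering with nested macros can be sketched like this. The macro names are hypothetical. Because the arguments of PASTE2_EXPANDED are not operands of `##' in its own body, they are fully macro-expanded first, so the inner paste is always finished before the outer one begins.

```c
/* Paste exactly two tokens. */
#define PASTE2(a, b) a ## b

/* Extra level of indirection: arguments are macro-expanded before
   being handed to PASTE2. */
#define PASTE2_EXPANDED(a, b) PASTE2(a, b)

/* Paste three tokens strictly left to right: (a b) first, then c. */
#define PASTE3(a, b, c) PASTE2_EXPANDED(PASTE2_EXPANDED(a, b), c)

/* Defines a function named foo_bar. */
static int PASTE3(foo, _, bar)(void)
{
    return 42;
}
```

Swapping the nesting to PASTE2_EXPANDED(a, PASTE2_EXPANDED(b, c)) would force right-to-left pasting instead.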
The form of whitespace between tokens in preprocessor output
See section 9. Preprocessor Output, for the current textual format. This is also the format used by stringification. Normally, the preprocessor communicates tokens directly to the compiler's parser, and whitespace does not come up at all.
Older versions of GCC preserved all whitespace provided by the user and inserted lots more whitespace of their own, because they could not accurately predict when extra spaces were needed to prevent accidental token pasting.
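Stringification gives a convenient way to observe the whitespace rule from inside a program. In this sketch, STR and the two variables are hypothetical names: each run of whitespace between tokens in the argument becomes a single space, and no space is inserted where the source had none.

```c
#include <string.h>

#define STR(x) #x

/* Several spaces between tokens collapse to one. */
static const char *spaced = STR(a   +     b);

/* No whitespace in the source, none in the result. */
static const char *tight = STR(a+b);

static int whitespace_is_collapsed(void)
{
    return strcmp(spaced, "a + b") == 0 && strcmp(tight, "a+b") == 0;
}
```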
As an extension, GCC permits you to omit the variable arguments entirely when you use a variable argument macro. This is forbidden by the 1999 C standard, and will provoke a pedantic warning with GCC 3.0. Previous versions accepted it silently.
In a macro expansion, if `##' appeared before a variable arguments parameter, and the set of tokens specified for that argument in the macro invocation was empty, previous versions of GNU CPP would back up and remove the preceding sequence of non-whitespace characters (not the preceding token). This extension is in direct conflict with the 1999 C standard and has been drastically pared back.
In the current version of the preprocessor, if `##' appears between a comma and a variable arguments parameter, and the variable argument is omitted entirely, the comma will be removed from the expansion. If the variable argument is empty, or the token before `##' is not a comma, then `##' behaves as a normal token paste.
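The comma-removal rule can be sketched with a small formatting macro. This relies on the GNU extension described above, so it is not portable and may draw a pedantic warning; FORMAT, scratch, format_plain and format_value are hypothetical names.

```c
#include <stdio.h>

/* If the variable argument is omitted entirely, `##' removes the
   comma after fmt; otherwise the arguments are appended as usual. */
#define FORMAT(buf, size, fmt, ...) snprintf(buf, size, fmt, ## __VA_ARGS__)

static char scratch[32];

/* Variable argument omitted: expands to snprintf(scratch, ..., "hello"). */
static const char *format_plain(void)
{
    FORMAT(scratch, sizeof scratch, "hello");
    return scratch;
}

/* Variable argument present: expands to snprintf(scratch, ..., "v=%d", v). */
static const char *format_value(int v)
{
    FORMAT(scratch, sizeof scratch, "v=%d", v);
    return scratch;
}
```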
Traditional mode used to be implemented in the same program as normal preprocessing. Therefore, all the GNU extensions to the preprocessor were still available in traditional mode. It is now a separate program and does not implement any of the GNU extensions, except for a partial implementation of assertions. Even those may be removed in a future release.