Why does GCC emit a warning when using trigraphs, but not when using digraphs?

#1
    #include <stdio.h>

    int main(void)
    {
        ??< puts("Hello Folks!"); ??>
    }

The above program, when compiled with GCC 4.8.1 with `-Wall` and `-std=c11`, gives the following warnings:

    source_file.c: In function ‘main’:
    source_file.c:8:5: warning: trigraph ??< converted to { [-Wtrigraphs]
    ??< puts("Hello Folks!"); ??>
    ^
    source_file.c:8:30: warning: trigraph ??> converted to } [-Wtrigraphs]

But when I change the body of `main` to:

    <% puts("Hello Folks!"); %>

no warnings are emitted.

So, **Why does the compiler warn me when using trigraphs, but not when using digraphs?**

#2
Maybe because digraphs have no negative side effects, unlike trigraphs, as stated in the GCC documentation:

> Punctuators are all the usual bits of punctuation which are meaningful to C and C++. All but three of the punctuation characters in ASCII are C punctuators. The exceptions are ‘@’, ‘$’, and ‘`’. In addition, all the two- and three-character operators are punctuators. There are also six digraphs, which the C++ standard calls alternative tokens, which are merely alternate ways to spell other punctuators. This is a second attempt to work around missing punctuation in obsolete systems. It has no negative side effects, unlike trigraphs, but does not cover as much ground. The digraphs and their corresponding normal punctuators are:

    Digraph:     <%  %>  <:  :>  %:  %:%:
    Punctuator:  {   }   [   ]   #   ##
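
As a small illustration of the table above, here is a sketch of a C function written with digraphs in place of braces, brackets, and `#`. The names `VALUE_COUNT`, `sum_values`, and `check_sum` are invented for this example; it assumes a compiler that accepts digraphs in C (they have been part of the language since the 1995 amendment to C90).

```c
%:include <assert.h>
%:define VALUE_COUNT 3

/* Equivalent to:
   int sum_values(void) { int values[3] = { 1, 2, 3 }; ... } */
int sum_values(void)
<%
    int values<:VALUE_COUNT:> = <% 1, 2, 3 %>;
    int total = 0;

    /* <: and :> stand for [ and ] when indexing the array. */
    for (int i = 0; i < VALUE_COUNT; i++)
        total += values<:i:>;

    return total;
%>

/* Sanity check: the digraph spelling behaves like the usual one. */
void check_sum(void)
<%
    assert(sum_values() == 6);
%>
```

Because each digraph is a token in its own right, the text `<%` inside a string literal or character constant is left untouched, which is why no warning is needed.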




#3
Because trigraphs have the undesirable effect of *silently* changing code. This means that the same source file is valid both with and without trigraph replacement, but leads to *different* code. This is especially problematic in string literals, like `"<em>What??</em>"`.

Language design and language evolution should strive to avoid silent changes. Having the compiler warn about trigraphs is a good thing to have.

Contrast this with digraphs, which were *new tokens* that do not lead to silent changes.
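
To make the silent change concrete, here is a minimal sketch in C that mimics trigraph replacement on a string at run time. The helper names `trigraph_replacement` and `replace_trigraphs` are made up for this illustration and are not part of GCC or the standard library; the real replacement happens inside the compiler before tokenization.

```c
#include <assert.h>
#include <stddef.h>

/* Map the third character of a trigraph to its replacement.
   Returns 0 if the character does not complete a trigraph. */
char trigraph_replacement(char third)
{
    switch (third) {
    case '=':  return '#';
    case '/':  return '\\';
    case '\'': return '^';
    case '(':  return '[';
    case ')':  return ']';
    case '!':  return '|';
    case '<':  return '{';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Hypothetical helper: copy src to dst, replacing every trigraph
   (two question marks plus one of the nine characters above).
   dst must have room for at least as many bytes as src, including
   the terminating null character. */
void replace_trigraphs(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);

    while (*src != '\0') {
        char replacement = 0;

        /* src[2] is only read when src[0] and src[1] are both '?',
           so the read never goes past the terminator. */
        if (src[0] == '?' && src[1] == '?')
            replacement = trigraph_replacement(src[2]);

        if (replacement != 0) {
            *dst++ = replacement;
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Calling `replace_trigraphs("<em>What?\?</em>", out)` leaves `<em>What{/em>` in `out`. The input is spelled with the `\?` escape so that the call itself contains no trigraph. The result is still a perfectly valid string, which is exactly why the change goes unnoticed without a warning.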

#4
The GCC documentation on preprocessing gives a pretty good rationale for the warning (*emphasis mine*):

>Trigraphs are not popular and many compilers implement them incorrectly. Portable code should not rely on trigraphs being either converted or ignored. With -Wtrigraphs GCC will warn you when **a trigraph may change the meaning of your program if it were converted**.

and the GCC documentation on tokenization explains that digraphs, unlike trigraphs, have no potential negative side effects (*emphasis mine*):

> There are **also six digraphs**, which the C++ standard calls alternative tokens, which are merely alternate ways to spell other punctuators. This is a second attempt to work around missing punctuation in obsolete systems. **It has no negative side effects, unlike trigraphs**, but does not cover as much ground.

#5
Trigraphs are nasty because they use character sequences which could legally appear within valid code. A common case which used to cause compiler errors on code for classic Macintosh:

    unsigned int signature = '????'; /* Should be value 0x3F3F3F3F */

Trigraph processing would turn that into:

    unsigned int signature = '??^; /* Should be value 0x3F3F3F3F */

which would of course not compile. In some slightly rarer cases, such processing could yield code that compiles but means something different from what was intended, e.g.

    char *template = "????/1234";

which would get turned into

    char *template = "??S4"; // ??/ becomes \, and \123 becomes S

Not the string literal that was intended, but still perfectly legitimate nonetheless.

By contrast, digraphs are relatively benign because outside of some possible weird corner cases involving macros, no code containing processable digraphs would have a legitimate meaning in the absence of such processing.
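
One way to keep such literals safe whether or not trigraph processing is enabled is the standard `\?` escape, which spells a question mark without putting two `?` characters next to each other in the source. The sketch below is only an illustration: `signature_template`, `pack_four`, and `check_literals` are names invented here, and `pack_four` is a stand-in for the multi-character constant, whose value is implementation-defined.

```c
#include <assert.h>
#include <string.h>

/* The intended template text: four question marks, a slash, "1234".
   Each \? is an escape for '?', so no trigraph can form. */
const char *signature_template(void)
{
    return "?\?\?\?/1234";
}

/* Pack four bytes, most significant first, into one value. */
unsigned long pack_four(char a, char b, char c, char d)
{
    return ((unsigned long)(unsigned char)a << 24) |
           ((unsigned long)(unsigned char)b << 16) |
           ((unsigned long)(unsigned char)c << 8)  |
            (unsigned long)(unsigned char)d;
}

/* Checks that both spellings carry the intended content
   (the numeric check assumes an ASCII execution character set). */
void check_literals(void)
{
    assert(strlen(signature_template()) == 9);
    assert(signature_template()[4] == '/');
    assert(pack_four('?', '?', '?', '?') == 0x3F3F3F3FUL);
}
```

Splitting the literal into adjacent pieces so that the two question marks and the following character never sit together in one piece is another common workaround.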