Does strict aliasing prevent you from writing to a char array through a different type?

1

My understanding is that strict aliasing in C++ is defined in basic.lval 11:

(11) If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

  • (11.1) the dynamic type of the object,
  • (11.2) a cv-qualified version of the dynamic type of the object,
  • (11.3) a type similar (as defined in conv.qual) to the dynamic type of the object,
  • (11.4) a type that is the signed or unsigned type corresponding to the dynamic type of the object,
  • (11.5) a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
  • (11.6) an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
  • (11.7) a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
  • (11.8) a char, unsigned char, or std::byte type.

By my reading, per 11.8, this is always legal, since the program accesses the stored value of x through a glvalue of type unsigned char:

int x = 0xdeadbeef;
auto y = reinterpret_cast<unsigned char*>(&x);
std::cout << y[1];
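
Note that `std::cout << y[1]` prints the byte as a character. A minimal sketch of reading the same bytes numerically, still only through unsigned char glvalues, might look like this (the helper name `byte_at` is hypothetical and not part of the question):

```cpp
#include <cstddef>

// Hypothetical helper: returns byte `i` of the object representation of `v`.
// The read goes through an unsigned char glvalue, which 11.8 permits.
unsigned byte_at(const int& v, std::size_t i) {
    auto p = reinterpret_cast<const unsigned char*>(&v);
    return p[i];  // the caller must keep i < sizeof(int)
}
```

Which byte ends up at which index depends on the platform's byte order, so the individual values are not portable even though the access itself is allowed.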

I am curious about the reverse case: using a pointer of another type that aliases an array of unsigned char:

alignas(int) unsigned char x[4];
auto y = reinterpret_cast<int*>(x);
*y = 0xdeadbeef;

Is this a violation of strict aliasing? My reading is that it isn't, but I was just told on another thread that it is. Going by basic.lval alone, it seems to me that there is no UB, since the program does not attempt to access the stored value: it stores a new value without reading the old one, and as long as subsequent reads go through x, no violation occurs.

c++
language-lawyer
strict-aliasing
asked on Stack Overflow Aug 7, 2018 by zneak • edited Aug 9, 2018 by zneak

2 Answers

5

About the definition of "access":

http://eel.is/c++draft/defns.access

3.1 access [defns.access]
⟨execution-time action⟩ read or modify the value of an object

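In other words, storing a value is also an "access". The write through the int pointer modifies the stored value of an unsigned char object through a glvalue of type int, which is not in the list, so it is still UB.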
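
If the goal is only to get the bytes of an int into the array, one option that never touches the array through an int glvalue is `std::memcpy`. The following is a minimal sketch under that assumption; the function name `round_trip_through_bytes` is illustrative and does not come from the question:

```cpp
#include <cstring>

// Illustrative helper (not from the question): writes `value` into a
// suitably aligned byte buffer and reads it back using only byte copies.
int round_trip_through_bytes(int value) {
    alignas(int) unsigned char x[sizeof(int)];
    std::memcpy(x, &value, sizeof value);  // modifies only unsigned char objects
    int out;
    std::memcpy(&out, x, sizeof out);      // copies the bytes back into a real int
    return out;
}
```

Since int is trivially copyable, copying its bytes out and back yields the original value, and every access to `x` is a byte access.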

answered on Stack Overflow Aug 7, 2018 by (unknown user)

2

There are many constructs which invoke UB, but which quality compilers should process correctly anyway. Using character-typed storage to hold objects of other types is among them. The requirement that a constructor for a char[] yield a pointer to aligned storage wouldn't make sense otherwise.

The authors of C89 did not think it necessary to fully describe every situation where a quality implementation suitable for any particular purpose would need to behave predictably. The Rationale recognizes that implementations may be conforming while being of such low quality as to be essentially useless, and suggests that there was no perceived need to forbid implementations from behaving in ways that would impair their usefulness. Every subsequent C or C++ Standard has inherited parts of C89 which were never intended to be fully complete, and none of them has fully completed those parts.

The Standard makes no distinction between

  • actions which invoke UB, but which even the most obtuse compiler writer would recognize should behave predictably (e.g. struct foo {int x;} s; s.x=1;);

  • actions which quality compilers suitable for various purposes should process predictably, but which low-quality compilers, or high-quality compilers suitable only for other purposes, might not;

  • actions which some compilers may handle predictably, but where such treatment should not be generally expected from any other compilers--even those targeting the same purposes (platforms, application fields, etc.).

Declaring a char[] with a particular alignment, using the named array once to capture its address (and never using the named array again), and employing it as raw storage that can hold other types, should fall into the first category above (especially since--as noted above--alignment guarantees wouldn't serve much purpose otherwise). A compiler may not recognize any pointers' relationship to the original array, and might thus not realize that actions on such pointers could interact with a char[](*), but if the array is never again used as a char[] the compiler would have no reason to care.

(*) For example, given

char foo[10];

int test(int *p)
{
  if (foo[1])
    *p = 1;
  return foo[1];
}

an implementation might cache and reuse the first value read from foo[1], not recognizing that a write to *p might alter the underlying storage. If the named lvalue foo is never used after the first time its address is taken, however, it wouldn't matter what assumptions the compiler might make about whether it would be safe to cache reads of lvalue foo, because there wouldn't be any.
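
The usage pattern described above (name the array once to capture its address, then only use the resulting pointer) can be sketched as follows. Note that this sketch swaps in placement new rather than the plain cast discussed in this answer, so that it does not depend on how a particular compiler treats the cast; the class name `IntSlot` and its members are hypothetical:

```cpp
#include <new>

// Hypothetical helper type, not from the answer: raw storage for one int.
class IntSlot {
public:
    // The array name is used exactly once, here, to capture its address.
    // Placement new (an assumption of this sketch) starts an int's lifetime there.
    explicit IntSlot(int v) : ptr_(::new (static_cast<void*>(raw_)) int(v)) {}

    // Copying would leave ptr_ pointing into another object's storage.
    IntSlot(const IntSlot&) = delete;
    IntSlot& operator=(const IntSlot&) = delete;

    int get() const { return *ptr_; }
    void set(int v) { *ptr_ = v; }

private:
    alignas(int) unsigned char raw_[sizeof(int)];  // never read as char again
    int* ptr_;                                     // every later access goes through this
};
```

Because `raw_` is never again accessed as a char array, there are no char reads for the compiler to cache across writes through `ptr_`, which is the situation the `test` example above was illustrating.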

answered on Stack Overflow Aug 7, 2018 by supercat • edited Aug 9, 2018 by supercat

User contributions licensed under CC BY-SA 3.0