C unions and undefined behaviour

1

In the following example code, is there any undefined or implementation-defined behavior? Can I assign a value to one member of a union and read it back from another?

#include <stdio.h>
#include <stdint.h>

struct POINT
{
    union
    {
        float Position[3];
        struct { float X, Y, Z; };
    };
};

struct INT
{
    union
    {
        uint32_t Long;
        uint16_t Words[2];
        uint8_t Bytes[4];
    };
};

int main(void)
{
    struct POINT p;

    p.Position[0] = 10;
    p.Position[1] = 5;
    p.Position[2] = 2;
    printf("X: %f; Y: %f; Z: %f\n", p.X, p.Y, p.Z);

    struct INT i;

    i.Long = 0xDEADBEEF;
    printf("0x%4x%4x\n", i.Words[0], i.Words[1]);
    printf("0x%2x%2x%2x%2x\n", i.Bytes[0], i.Bytes[1], i.Bytes[2], i.Bytes[3]);

    return 0;
}

The output on my machine is:

X: 10.000000; Y: 5.000000; Z: 2.000000
0xbeefdead
0xefbeadde

It's printing the words/bytes in reverse because x86 is little endian, as expected.

c
union
undefined-behavior
asked on Stack Overflow Mar 8, 2020 by at77

3 Answers

3

is there any undefined or implementation defined behavior?

Some basic concerns:

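struct { float X, Y, Z; }; may have padding between X, Y and Z. If it does, the members no longer line up with the elements of Position, so p.Y or p.Z may never have been written, and printf("X: %f; Y: %f; Z: %f\n", p.X, p.Y, p.Z); would be undefined behavior because it reads uninitialized values.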
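The padding concern can be checked at build time rather than assumed. The following is a minimal sketch, assuming a C11 compiler (for static_assert and anonymous members); point_layout_matches is a hypothetical helper added here for illustration and is not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>

struct POINT
{
    union
    {
        float Position[3];
        struct { float X, Y, Z; };
    };
};

/* Compile-time checks: the build fails if the anonymous struct does not
   line up element-for-element with Position[3]. */
static_assert(offsetof(struct POINT, X) == 0 * sizeof(float), "X is not at Position[0]");
static_assert(offsetof(struct POINT, Y) == 1 * sizeof(float), "Y is not at Position[1]");
static_assert(offsetof(struct POINT, Z) == 2 * sizeof(float), "Z is not at Position[2]");
static_assert(sizeof(struct POINT) == 3 * sizeof(float), "unexpected padding");

/* Hypothetical runtime equivalent: returns 1 when each named member
   shares its address with the matching array element. */
int point_layout_matches(void)
{
    struct POINT p;
    return &p.X == &p.Position[0]
        && &p.Y == &p.Position[1]
        && &p.Z == &p.Position[2];
}
```

If any assertion fires on a given platform, the X/Y/Z view and the Position view cannot be used interchangeably there.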

i.Long = 0xDEADBEEF; printf("0x%4x%4x\n", i.Words[0], i.Words[1]); leads to implementation-defined behavior, as C does not require a particular endianness. (It appears OP knows of this already.)
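When the byte order of the output matters, one common way to avoid the dependence on endianness is to extract the pieces with shifts, which operate on the value rather than on its storage. A minimal sketch; split_be, high_word and low_word are hypothetical helper names, not part of the question's code.

```c
#include <stdint.h>

/* Split a 32-bit value into bytes, most significant byte first.
   The result is the same on little- and big-endian hosts. */
void split_be(uint32_t v, uint8_t out[4])
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Upper and lower 16 bits of the value, independent of storage order. */
uint16_t high_word(uint32_t v) { return (uint16_t)(v >> 16); }
uint16_t low_word(uint32_t v)  { return (uint16_t)(v & 0xFFFFu); }
```

With these helpers, 0xDEADBEEF always splits into 0xDEAD and 0xBEEF, in that order, whereas the union in the question yields whatever order the host stores.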

Can I assign a value to one member of a union and read it back from another?

Yes - within limitations. The other answers address this part well.

1

I don't think the authors of the Standard have ever reached a consensus on what constructs should be required or expected to behave usefully in what circumstances. Instead, they punt the issue as a "quality of implementation" matter, relying upon implementations to support whatever constructs their customers might need.

The C Standard specifies that reading a union object through a member other than the last member written will reinterpret the bytes therein using the new type. If one looks at the list of lvalue types that may be used to access struct or union objects, however, there is no provision to access structs or unions using lvalues of non-character member types. In most cases where a pointer or lvalue of a member type would be used, it would be visibly freshly derived from a pointer to, or lvalue of, the parent type, and if a compiler makes any reasonable effort to notice such derivation there would be no need for a general rule allowing use of those types. The question of when to recognize such derivation was left as a quality-of-implementation issue on the presumption that compiler writers who made any bona fide effort to meet the needs of their customers would probably do a better job than if the Standard tried to write out a set of precise rules.

Rather than making any effort to look for ways in which member-type pointers might be derived from struct or union objects, however, gcc and clang instead opt to go beyond what's actually specified to a far lesser degree than most committee members would have expected. They will treat an operation performed directly on an lvalue formed using value.member or ptr->member as an operation on the parent object. They will also recognize lvalues of the form value.member[index] or ptr->member[index]. On the other hand, despite the fact that (array)[index] is defined as being equivalent to (*((array)+(index))), they will not recognize (*((ptr->member)+(index))) as an operation on the object identified by ptr. They will also, generally needlessly, assume that objects of structure type may interact with unrelated pointers to objects of member type.
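To make the two spellings concrete, here is a minimal sketch of the forms described above. The union WORDS_OR_FLOATS and both function names are hypothetical, introduced only for illustration. The Standard defines the two bodies as equivalent; the claim in this answer is that gcc and clang, with strict aliasing enabled, only treat the subscript form as an access to the union, so the two may differ when i == j.

```c
#include <stdint.h>

union WORDS_OR_FLOATS   /* hypothetical type, not from the question */
{
    uint32_t w[4];
    float    f[4];
};

/* Subscript form, ptr->member[index]: per the answer, recognized by
   gcc and clang as an operation on the union object. */
uint32_t via_subscript(union WORDS_OR_FLOATS *u, int i, int j)
{
    u->w[i] = 1;
    u->f[j] = 2.0f;
    return u->w[i];
}

/* Pointer-arithmetic form, *(ptr->member + index): equivalent by
   definition, but per the answer not recognized the same way. */
uint32_t via_pointer_arith(union WORDS_OR_FLOATS *u, int i, int j)
{
    *(u->w + i) = 1;
    *(u->f + j) = 2.0f;
    return *(u->w + i);
}
```

When i and j differ, the two stores touch different elements and both functions return 1. The interesting case is i == j, where the result of the second function may depend on the compiler and on whether -fno-strict-aliasing is in effect.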

If one is writing code that would benefit from the ability to perform type punning, my recommendation would be to explicitly say in the documentation that reliable operation requires -fno-strict-aliasing. The purpose of the aliasing rules was to give compiler writers the freedom to perform optimizations that would not interfere with what their customers needed to do. Compiler writers were expected to recognize and support their customers' needs without regard for whether the Standard required them to do so.

answered on Stack Overflow Mar 8, 2020 by supercat

0

Union type punning has been allowed since C99, although only a footnote says so (which is part of the beauty of this language). With some restrictions, it is OK.

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_257.htm
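As an illustration of the reinterpretation the footnote describes, the sketch below reads the bits of a float through a uint32_t member. The names FLOAT_BITS, float_bits_union and float_bits_memcpy are hypothetical. The memcpy variant is not mentioned in this answer; it is included only as a commonly used alternative for comparison. Both assume float and uint32_t have the same size, which the static_assert checks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float is not 32 bits wide");

union FLOAT_BITS   /* hypothetical type, for illustration only */
{
    float    f;
    uint32_t u;
};

/* Write one member, read the other: the object representation of the
   float is reinterpreted as a uint32_t. */
uint32_t float_bits_union(float x)
{
    union FLOAT_BITS fb;
    fb.f = x;
    return fb.u;
}

/* Alternative: copy the object representation byte by byte. */
uint32_t float_bits_memcpy(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}
```

Both functions return the same bit pattern for a given input. What that pattern is depends on the platform's float format; on an IEEE 754 implementation, 1.0f is 0x3F800000.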

answered on Stack Overflow Mar 8, 2020 by 0___________ • edited Mar 8, 2020 by 0___________

User contributions licensed under CC BY-SA 3.0