C malloc offsets relative to struct definition locations (and padding)


C question:

Does malloc'ing a struct always result in linear placement from top to bottom of the data inside? As a second minor question: is there a standard on the padding size, or does it vary between 32 and 64 bit machines?

Using this test code:

#include <stdio.h>
#include <stdlib.h>

struct test {
    char a;
    /* char pad[3]; // I'm assuming this happens from the compiler */
    int b;
};

int main() {
    int size;
    char* testarray;
    struct test* testarraystruct;

    size = sizeof(struct test);
    testarray = malloc(size * 4);
    testarraystruct = (struct test *)testarray;
    testarraystruct[1].a = 123;

    printf("test.a = %d\n", testarray[size]); // Will this always be test.a?

    return 0;
}

On my machine, size is always 8. Therefore I check testarray[8] to see if it's the second struct's 'char a' field. In this example, my machine returns 123, but this obviously isn't proof it always is.

Does the C compiler guarantee that structs are laid out in linear order?

I am not claiming the way this is done is safe or the proper way, this is a curiosity question.

Does this change if this becomes C++?

Better yet, would my memory look like this if it malloc'd to 0x00001000?

0x00001000 char a      // First test struct
0x00001001 char pad[0]
0x00001002 char pad[1]
0x00001003 char pad[2]
0x00001004 int  b      // Next four bytes belong to int b
0x00001008 char a      // Second test struct
0x00001009 char pad[0]
0x0000100a char pad[1]
0x0000100b char pad[2]
0x0000100c int  b      // Next four bytes belong to int b

NOTE: This question assumes ints are 32 bits

asked on Stack Overflow Aug 19, 2014 by Water

2 Answers

  • As far as I know, malloc does not place any data itself; it only returns a contiguous block of memory. The members of a structure are laid out within that block, in declaration order, when you use the block as an object of that structure type.
  • That contiguous layout includes any padding the compiler inserts.
  • Padding depends on the alignment requirements of the target platform, which can differ between machines (e.g. 32-bit or 64-bit).
  • The CPU fetches memory in units whose size depends on whether it is a 32-bit or 64-bit machine.
  • For a 32-bit machine your structure will be:

    struct test {
        char a; /* followed by 3 bytes of padding */
        int b;
    };

  • Here the CPU fetch cycle is 32 bits, i.e. 4 bytes.

  • So in this case (for this example) the CPU takes two fetch cycles.
  • To make it clearer: in one fetch cycle the CPU reads 4 bytes of memory, so 3 bytes of padding are added after "char a".

  • For a 64-bit machine, int is usually still 4 bytes with 4-byte alignment, so the layout is typically the same:

    struct test {
        char a; /* followed by 3 bytes of padding */
        int b;
    };

  • Here the CPU fetch cycle is 8 bytes.

  • So in this case (for this example) the CPU takes one fetch cycle, because the whole 8-byte structure fits in a single fetch.

  • However, you can avoid the padding with #pragma pack(1), which is a compiler extension rather than standard C.

  • But this will not be efficient with respect to time, because reading a misaligned member can take more CPU fetch cycles (in this example, b no longer starts on a 4-byte boundary).
  • This is a tradeoff between CPU fetch cycles and padding.
answered on Stack Overflow Aug 20, 2014 by Adarsh • edited Aug 20, 2014 by Jonathan Leffler

For many CPU types, it is most efficient to read an N-byte quantity (where N is a power of 2 — 1, 2, 4, 8, sometimes 16) when it is aligned on an N-byte address boundary. Some CPU types will generate a SIGBUS error if you try to read an incorrectly aligned quantity; others will make extra memory accesses as necessary to retrieve an incorrectly aligned quantity. AFAICR, the DEC Alpha (subsequently Compaq and HP) had a mechanism that effectively used a system call to fix up a misaligned memory access, which was fiendishly expensive. You could control whether that was allowed with a program (uac — unaligned access control) which would stop the kernel from aborting the process and would do the necessary double reads.

C compilers are aware of the benefits and costs of accessing misaligned data, and go to lengths to avoid doing so. In particular, they ensure that data within a structure, or an array of structures, is appropriately aligned for fast access unless you hold them to ransom with quasi-standard #pragma directives like #pragma pack(1).

For your sample data structure, on a machine where sizeof(int) == 4 (most 32-bit and 64-bit systems), there will indeed be 3 bytes of padding after the initial 1-byte char field and before the 4-byte int. If you put short s; after the single character, there would be just 1 byte of padding. Indeed, the following 3 structures are all the same size on many machines:

struct test_1 {
    char a;
    /* char pad[3]; // I'm assuming this happens from the compiler */
    int b;
};

struct test_2 {
    char a;
    short s;
    int b;
};

struct test_3 {
    char a;
    char c;
    short s;
    int b;
};

The C standard mandates that the elements of a structure are laid out in the sequence in which they are defined. In struct test_3, a is at the lowest address (the standard mandates that there is no padding before the first element), c is at an address higher than a (though the standard does not mandate that it will be exactly one byte higher), s is at an address higher than c, and b is at an address higher than s. Further, the elements cannot overlap. There may be padding after the last element of a structure. For example, in struct test_4, on many computers there will be 7 bytes of padding between a and d, and 7 bytes of padding after b:

struct test_4 {
    char   a;
    double d;
    char   b;
};

This ensures that every element of an array of struct test_4 will have the d member properly aligned on an 8-byte boundary for optimal access (but at the cost of space; the size of the structure is often 24 bytes).

As noted in the first comment to the question, the layout and alignment of the structure is independent of whether the space is allocated by malloc() or on the stack or in global variables. Note that malloc() does not know what the pointer it returns will be used for. Its job is simply to ensure that no matter what the pointer is used for, there will be no misaligned access. That often means the pointer returned by malloc() will fall on an 8-byte boundary; on some 64-bit systems, the address is always a multiple of 16 bytes. That means that consecutive malloc() calls each allocating 1 byte will seldom produce addresses 1 byte apart.

For your sample code, I believe the standard does require that testarray[size] equals 123 after the assignment. At the very least, you would be hard-pressed to find a compiler where that is not the case.

For simple structures containing plain old data (POD — simple C data types), C++ provides the same layout as C. If the structure is a class with virtual functions, etc, then the layout rules depend on the compiler. Virtual bases and the dreaded 'diamond of death' multiple inheritance, etc, also make changes to the layout of structures.

answered on Stack Overflow Aug 20, 2014 by Jonathan Leffler

User contributions licensed under CC BY-SA 3.0