Some confusions about struct memory allocation mechanism?

1

During my project, I am confronted with C program.

  1. As shown below, htmp is a struct pointer. We first allocate a memory for it. But why should we allocate a memory for its element word again?
  2. If it's essential to allocate memory for each element of a struct, why not allocate memory for its other elements, id and next?
#define HASHREC bitewisehash  


typedef struct hashrec {
    char    *word;
    long long id;
    struct hashrec *next;
} HASHREC;

/* Move-to-front hashing and hash function from Hugh Williams, http://www.seg.rmit.edu.au/code/zwh-ipl/ */

/* Simple bitwise hash function */
unsigned int bitwisehash(char *word, int tsize, unsigned int seed) {
    char c;
    unsigned int h;
    h = seed;
    for (; (c =* word) != '\0'; word++) h ^= ((h << 5) + c + (h >> 2));
    return((unsigned int)((h&0x7fffffff) % tsize));
}

/* Insert string in hash table, check for duplicates which should be absent */
void hashinsert(HASHREC **ht, char *w, long long id) {
    HASHREC *htmp, *hprv;
    unsigned int hval = HASHFN(w, TSIZE, SEED);
    for (hprv = NULL, htmp = ht[hval]; htmp != NULL && scmp(htmp->word, w) != 0; hprv = htmp, htmp = htmp->next);
    if (htmp == NULL) {
        htmp = (HASHREC *) malloc(sizeof(HASHREC)); # allocate memory for htmp
        htmp->word = (char *) malloc(strlen(w) + 1); # why allocate memory again ?
        strcpy(htmp->word, w);    # 
        htmp->id = id;            # why not allocate memory for htmp->id ?
        htmp->next = NULL;        # why nor allocate memory for htmp->next?
        if (hprv == NULL) ht[hval] = htmp;
        else hprv->next = htmp;
    }
    else fprintf(stderr, "Error, duplicate entry located: %s.\n",htmp->word);
    return;
}
c
linux
dynamic-memory-allocation
asked on Stack Overflow Mar 15, 2018 by DM SCU • edited Apr 23, 2018 by pb2q

3 Answers

3

You need to separate in your mind two things (1) in what memory is the thing I want to store stored?; and (2) what variable (pointer) holds the address to where it is stored so I can find it again.

You first declare two pointers to struct hashrec:

HASHREC *htmp, *hprv;

A pointer is nothing but a variable that holds the address to something else as its value. When you first declare the two pointers, they are uninitialized and hold no address. You then, in a rather awkward manner, initialize both pointers within a for loop declaration, e.g. hprv = NULL, htmp = ht[hval] and later hprv = htmp, htmp = htmp->next so presumably both pointers now hold an address and point somewhere.

Following the loop (with an empty body), you test if (htmp == NULL), meaning that htmp does not point to an address (which can be the case if you have found the hash-index of interest empty).

Then in order to provide storage for one HASHREC (e.g. a struct hashrec) you need to allocate storage so you have a block of memory in which to store the thing you want to store. So you allocate a block to hold one struct. (See: Do I cast the result of malloc?)

Now, look at what you have allocated memory for:

typedef struct hashrec {
    char    *word;
    long long id;
    struct hashrec *next;
} HASHREC;

You have allocated storage for a struct that contains (1) a char *word; (a pointer to char - 8-bytes (4-bytes on x86)); (2) a long long id; (8-bytes on both) and (3) a pointer to hold the address of the next HASHREC in the sequence.

There is no question that id can hold a long long value, but what about word and next? They are both pointers. What do pointers hold? The address to where the thing they point to can be found. Where can word be found? The thing you want is currently pointed to by w, but there is no guarantee that w will continue to hold the word you want, so you are going to make a copy and store it as part of the HASHREC. So you see:

htmp->word = malloc(strlen(w) + 1);    /* useless cast removed */

Now what does malloc return? It returns the address to the beginning of a new block of memory, strlen(w) + 1 bytes long. Since a pointer holds the value of something else as its value, htmp->word now stores the address to the beginning of the new block of memory as its value. So htmp->word "points" to the new block of memory and you can use htmp->word as a reference to refer to that block of memory.

What happens next is important:

    strcpy(htmp->word, w);    # 
    htmp->id = id;            # why not allocate memory for htmp->id ?
    htmp->next = NULL;        # why nor allocate memory for htmp->next?

strcpy(htmp->word, w); copies w into that new block of memory. htmp->id = id; assigns the value of id to htmp->id and no allocation is required because when you allocate:

htmp = malloc(sizeof(HASHREC));  /* useless cast removed */

You allocate storage for a (1) char * pointer, (2) a long long id; and (3) a struct hashrec* pointer -- you have already allocated for a long long so htmp->id can store the value of id in the memory for the long long.

htmp->next = NULL;        # why nor allocate memory for htmp->next?

What is it that you are attempting to store that would require new allocation for htmp->next? (hint: nothing currently) It will point to the next struct hashrec. Currently it is initialize to NULL so that the next time you iterate to the end of all the struct hashrec next pointers, you know you are at the end when you reach NULL.

Another way to think of it is that the previous struct hashrec next can now point to this node you just allocated. Again no additional allocation is required, the previous node ->next pointer simply points to the next node in sequence, not requiring any specific new memory to be allocated. It is just used as a reference to refer (or point to) the next node in the chain.

There is a lot of information here, but when you go through the thought process of determining (1) in what memory is the thing I want to store stored?; and (2) what variable (pointer) holds the address to where it is stored so I can find it again... -- things start to fall into place. Hope this helps.

answered on Stack Overflow Mar 15, 2018 by David C. Rankin
1

When you allocate the memory for the struct, only enough memory to hold a pointer is allocated for the member word, that pointer also has to point to valid memory which you can then allocate.

Without making it point to valid memory, it's value is indeterminate and trying to dereference such pointer with an indeterminate value is undefined behavior.

answered on Stack Overflow Mar 15, 2018 by Iharob Al Asimi
1

malloc(size) will allocate memory of given size and returns starting address of the allocated memory. Addresses are stored in pointer variables.

In char *word, variable word is of pointer type and it can only hold a pointer of type char i.e., address of a character data in memory.

so, when you allocate memory to the struct type variable htmp, it will allocate memory as follows, 2 bytes for *word, 4 bytes for id and 2 bytes for *next

Now, lets assume that you want to store following into your struct:

{
word = Elephant
id = 1245
}

Now you can save id directly into id member by doing this htmp->id = 1245. But you also want to save a word "Elephant" in your struct, how to achieve this?

htmp->word = 'Elephant' will cause error because word is a pointer. So, now you allocate memory to store actual string literal Elephant and store the starting address in htmp->word.

Instead of using char *word in your struct, you could have used char word[size] where no need to allocate memory separately for member word. The reason behind not doing so, is that you want to select some random size for word, which can waste memory if you are storing less characters and which even fall shot if word is too big.

answered on Stack Overflow Mar 15, 2018 by Priyank

User contributions licensed under CC BY-SA 3.0