I was trying to understand how malloc allocates memory on the heap, and I came across a result I cannot explain. It would be great if someone could explain it.
First, let's take a look at the code I wrote:
#include<stdio.h>
#include<stdlib.h>
void print_int_heap(unsigned int *ptr, int len)
{
printf("PREV_SIZE: [%08x] SIZE: [%08x] MEM: [%08x] for INT malloc(%d)\n", *(ptr-2), *(ptr-1), ptr, len);
}
void print_char_heap(char *ptr, int len)
{
printf("PREV_SIZE: [%08x] SIZE: [%08x] MEM: [%08x] for CHAR malloc(%d)\n", *(ptr-2), *(ptr-1), ptr, len);
}
int main() {
unsigned int *ptr1 = malloc(20);
print_int_heap(ptr1, 20);
char *ptr2 = malloc(20);
print_char_heap(ptr2, 20);
return 0;
}
The output which I get for the above program is:
PREV_SIZE: [0x00000000] SIZE: [0x00000019] MEM: [0x0804b008] for INT malloc(20)
PREV_SIZE: [0x00000000] SIZE: [0x00000000] MEM: [0x0804b020] for CHAR malloc(20)
I can understand the output for the int malloc, but I do not understand why the chunk size reported for the char malloc is 0.
If ptr is an int*, *(ptr - 1) refers to the sizeof(int) bytes just before what ptr references. That will usually be a 32-bit quantity starting four bytes before ptr.
Similarly, if it is a char*, *(ptr - 1) refers to the sizeof(char) bytes just before what ptr references. sizeof(char) is always 1; normally that will be the 8-bit quantity in the single byte preceding the value of ptr.
These are obviously quite different things.
By the way, you are allowed to write ptr[-1]. But as the above analysis shows, that's really not what you want. You want to cast ptr to a pointer to the datatype of the object(s) you believe precede ptr, probably uint32_t.
Technically this is all Undefined Behaviour, but if your malloc implementation is stashing data just before the allocation and you know the type of that data, I'd say that it's fine to read it. (Although it's always a bit rude to stare at the internal data of a system function.)
Be aware that not all malloc implementations do the same thing. You may well find one which stores the length elsewhere, or not at all.
From The C Programming Language by Brian W. Kernighan and Dennis M. Ritchie:
Rather than allocating from a compiled-in fixed-sized array, malloc will request space from the operating system as needed. Since other activities in the program may also request space without calling this allocator, the space that malloc manages may not be contiguous. Thus its free storage is kept as a list of free blocks. Each block contains a size, a pointer to the next block, and the space itself. The blocks are kept in order of increasing storage address, and the last block (highest address) points to the first. The block returned by malloc() looks like
    points to next free block
       |
   ---------------------------------------
   |       |   size   |                  |
   ---------------------------------------
   |       |          |..address returned to the user
 (ptr-2) (ptr-1)     ptr -->
                      LSB              MSB
Here
void print_int_heap(unsigned int *ptr, int len) {
printf("PREV_SIZE: [%08x] SIZE: [%08x] MEM: [%08x] for INT malloc(%d)\n", *(ptr-2), *(ptr-1), ptr, len);
}
*(ptr-2) prints the value in the "next free block" field shown in the diagram above, *(ptr-1) prints the value in the "size" field, i.e. how much memory was allocated, and ptr prints the address returned to the user. Note that ptr has type unsigned int* here, so *(ptr-2) reads data starting 2*sizeof(int) bytes before where ptr points.
And here
void print_char_heap(char *ptr, int len){
printf("PREV_SIZE: [%08x] SIZE: [%08x] MEM: [%08x] for CHAR malloc(%d)\n", *(ptr-2), *(ptr-1), ptr, len);
}
while accessing *(ptr-1):
 next free block   (ptr-1)--> *(ptr-1) prints data from ? marked location.
    |                 |
   ---------------------------------------
   |       |   size |?|                  |
   ---------------------------------------
   |       |          |..address returned to the user
                     ptr -->
                      LSB              MSB
Here ptr has type char*, so *(ptr-1) reads only the sizeof(char) bytes, that is, the single byte, just before where ptr points.
It is also worth running valgrind on any program that allocates memory dynamically, to make sure no memory is leaked. Simply run
valgrind --leak-check=full -v ./your_exe
and analyze the messages valgrind prints. For example, it may show something like
==3193== Invalid read of size 4
==3193== at 0x8048459: print_int_heap
==3193== Invalid read of size 4
==3193== at 0x8048461: print_int_heap
When you perform arithmetic on a pointer, the arithmetic is done in units of the size of the object that the pointer points to. So with char *ptr, ptr-1 subtracts 1 byte from the address in ptr. But with unsigned int *ptr, ptr-1 subtracts sizeof(int) from the address in ptr.
So in your two functions, you're not subtracting the same number of bytes to get to the heap's bookkeeping data for the block.
Also, when you dereference a pointer, it only accesses the number of bytes in the pointer's data type. So in print_int_heap(), *(ptr-1) returns an unsigned int, while in print_char_heap() it returns a single char.
You should probably just write a single print_heap() function, and cast the argument to the appropriate type in the caller.