[C/C++] Yet another explanation of C/C++ pointers

For a moment, we can pretend that C/C++ have no arrays at all. Arrays are like syntactic sugar, nothing else.

Many C textbooks will tell you that a[i] is in fact syntactic sugar for *(a+i).
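A minimal sketch of that equivalence. The function name read_three_ways is made up for this illustration. It reads the same element in three ways and checks that they agree; the third form, i[a], is legal for the same reason, since it means *(i+a).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reads element i of a in three equivalent ways. */
int read_three_ways(const int *a, size_t i)
{
    int x = a[i];      /* the usual subscript */
    int y = *(a + i);  /* what the subscript actually means */
    int z = i[a];      /* also legal: *(i + a) */

    assert(x == y && y == z);
    return x;
}
```

The caller can pass a real array (int t[4]) or a pointer; the function cannot tell the difference.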

But even more: array declarations can be seen as syntactic sugar as well.

Instead of:

int a[128];
int b[128];

... you can think of:

int *a=<address of a chunk in global memory or local stack of size 128*sizeof(int)>;
int *b=<address of a chunk in global memory or local stack of size 128*sizeof(int)>;

... or:

unsigned char some_random_chunk_in_memory[100000];

int *a=(int*)some_random_chunk_in_memory;
int *b=(int*)(some_random_chunk_in_memory + sizeof(int)*128);

Then you can read/write to both a and b in several ways: a[idx] or *a or *(a+idx), no matter how a was declared/defined: as int a[size] or int *a.
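The chunk idea can be sketched as a small function. Assumptions here: the name carve_and_check and the constant N are invented for the example, and the chunk comes from malloc() rather than from an unsigned char array, because malloc'ed memory is suitably aligned for int and has no declared type, which sidesteps alignment and aliasing questions. Note also that the offset is counted in ints (chunk + N), not in bytes.

```c
#include <stdlib.h>

#define N 128

/* Hypothetical demo: two N-element "arrays" carved out of one chunk.
   Returns 0 if all access forms agree, 1 if not, -1 if malloc failed. */
int carve_and_check(void)
{
    int *chunk = malloc(2 * N * sizeof(int));
    if (chunk == NULL)
        return -1;

    int *a = chunk;      /* first N ints of the chunk */
    int *b = chunk + N;  /* next N ints of the chunk */

    for (int idx = 0; idx < N; idx++) {
        a[idx] = idx;          /* subscript form */
        *(b + idx) = idx * 2;  /* pointer arithmetic form */
    }

    *a = 1000;  /* same as a[0] = 1000 */

    int ok = a[0] == 1000 && *(a + 5) == 5 && b[5] == 10 && a + N == b;

    free(chunk);
    return ok ? 0 : 1;
}
```

No array is declared anywhere in this function, yet a and b are used exactly like int a[128] and int b[128].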

We can write big programs without arrays at all.

This is why the Hex-Rays and Ghidra decompilers so often show you pointers instead of arrays. There is no reliable way to tell a pointer from an array in machine code produced by a C/C++ compiler. It's just impossible.

Was it an array access or pointer dereference? Go figure.
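A small sketch of why this is hard. The function names are made up for the example. In C, a parameter declared as int a[] is adjusted by the language to int *a, so both functions below receive nothing but an address, and a compiler is likely to generate very similar (often identical) code for them. I'm not showing any particular compiler's output here, that part is left for you to check.

```c
#include <stddef.h>

/* Declared with array syntax; the parameter type is really int *. */
int sum_as_array(int a[], size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];      /* "array access" */
    return s;
}

/* Declared with pointer syntax; does the same work. */
int sum_as_pointer(int *p, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += *(p + i);  /* "pointer dereference" */
    return s;
}
```

Given only the compiled code of these two, a decompiler has to pick one representation, and a pointer is the safe choice.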

This is why both Hex-Rays and Ghidra are interactive tools with a GUI. The user is expected to help by setting the right data types for variables and giving hints.

Read more: there are a couple of other explanations of pointers in my Reverse Engineering for Beginners book. Also, see my C-notes.


UPD: As seen on lobste.rs.


