C is memory with syntactic sugar, and as such it is helpful to think of things in C as starting from memory. One of the pieces that I think is often overlooked is variables and data types. If you have the right mental model for variables and data types, it makes other concepts in C, and in other languages, easier. Let’s start with three definitions.
- Every variable is a starting memory address to the compiler.
- Every variable has a data type.
- A data type is a number of bytes to the compiler.
Yes, I am being simplistic, and yes, certain data types come with their own syntactic sugar, but I have found this to be a good mental model. In most assembly languages, data types don’t exist. You operate on bytes and offsets. C sits only one step above assembly, giving you useful abstractions so that you do not have to deal with individual bytes and instructions.
When you write a variable like `int x = 10;` what you are saying to the compiler is: there is a memory address which we have labeled `x`. Starting at that address we have `sizeof(int)` bytes. Copy the value 10 into those `sizeof(int)` bytes. Notice I said a variable was a starting location. Under the hood, every variable in C refers to its single starting memory address, and the compiler knows how many bytes to use based on the data type.
Most of this happens automatically in C. You say `int` and the compiler generates the assembly code to copy the correct number of bytes to the correct memory locations. How does a C compiler know the size of its data types? The sizes of the built-in types are fixed by the compiler for its target platform. The limits of the integer types are exposed as macros in the `limits.h` header, and the characteristics of the floating-point types are exposed as macros in the `float.h` header. Other data types such as structs and typedefs are defined in code.
Let’s think about copying variables.
```c
int x = 10;
int y = 20;
x = y;
```
If we copy an int from `y` to `x`, we are saying: we have a value occupying `sizeof(int)` bytes at the memory address starting at `y`; copy it to the `sizeof(int)` bytes starting at the memory address `x`. Under the hood, that is what the C compiler and its resulting assembly are doing.
Thinking of variables as memory addresses and data types as a number of bytes has helped clarify many concepts for me in C. Whenever I get stuck on a concept, I come back to what is going on at the memory level, and that usually helps me move forward.
Update: I removed a section on L-values and R-values. I was trying to make a point linking L-values and assignable memory locations, but in the end it was creating more confusion than it was helping.