Basics of Memory Addresses in C

Memory Addresses

It is helpful to think of everything in C in terms of computer memory. Let’s think of computer memory as an array of bytes where each address in memory holds 1 byte. If our computer has 4K of memory for example, it would have 4096 elements in the memory array. When we talk about pointers storing addresses, we are talking about a pointer storing an index to an element in the memory array. Dereferencing a pointer would be getting the value at that index in the array. All of this is of course a lie. How operating systems handle memory is much more complex than this. Memory is not necessarily contiguous and it is not necessarily handed out sequentially. But the analogy provides an easy way to think about memory in C to get started.

Confused about pointers, addresses and dereferencing? Take a look at this 5-Minute Guide to Pointers.

Say our computer has 4K of memory and the next open address is index 2048. We declare a new char variable i = 'a'. When the variable is declared, memory is set aside for its value and the variable name is linked to that location in memory. Our char i has the value 'a' stored at address 2048. A char is a single byte, so it only takes up index 2048. If we use the address-of operator (&) on our variable i, it returns the address 2048. If the variable were a different type, int for instance, it would take up 4 bytes (on most current platforms) and use elements 2048-2051 in the array. Using the address-of operator would still return 2048, though, because the int starts at that index even though it takes up 4 bytes. Let’s look at an example.

Running that you should get output like the following:

address of charvar = 0x7fff9575c05f
address of charvar - 1 = 0x7fff9575c05e
address of charvar + 1 = 0x7fff9575c060
address of intvar = 0x7fff9575c058
address of intvar - 1 = 0x7fff9575c054
address of intvar + 1 = 0x7fff9575c05c

In the first part of the example we declare a char variable, print out the address of the char, and then print out the addresses just before and just after the char in memory. We get the addresses before and after by using the & operator and then subtracting or adding one. In the second part we do the same thing with an int variable, printing out its address and the addresses right before and after it.

In the output we see the addresses in hexadecimal. What is important to notice is that the char addresses are 1 byte before and after, while the int addresses are 4 bytes before and after. Math on memory addresses, pointer math, is scaled by the sizeof the type being referenced. The size of a given type is platform dependent (only sizeof(char) is guaranteed to be 1), but for this example our char takes 1 byte and our int takes 4 bytes. Subtracting 1 from the address of a char gives a memory address that is 1 byte lower, while subtracting 1 from the address of an int gives a memory address that is 4 bytes lower.

Even though our example used the address-of operator to get the addresses of our variables, the operations are the same when using pointers that hold the address of a variable.

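Some commenters have brought up that computing &charvar - 1, an address before the start of the object, is technically undefined behavior. This is true. The C standard only defines pointer arithmetic that stays within an array or lands one element past its end (a single variable counts as an array of one element), so &charvar + 1 may be computed but not dereferenced. On some platforms even computing an invalid address can cause an error.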

Array Addresses

Arrays in C are contiguous memory areas that hold a number of values of the same data type (int, long, char *, etc.). Many programmers think arrays are pointers when they first use C. That isn’t true. A pointer stores a single memory address; an array is a contiguous area of memory that stores multiple values.

Running that you should get output like the following:

numbers = 0x7fff0815c0e0
numbers[0] = 0x7fff0815c0e0
numbers[1] = 0x7fff0815c0e4
numbers[2] = 0x7fff0815c0e8
numbers[3] = 0x7fff0815c0ec
numbers[4] = 0x7fff0815c0f0
sizeof(numbers) = 20

In this example we initialize an array of 5 ints. We then print the address of the array itself. Notice we didn’t use the address-of (&) operator. This is because, in most expressions, the array name decays to a pointer to the first element in the array. As you can see, the address of the array and the address of the first element in the array are the same. Then we loop through the array and print out the memory address at each index. Each int is 4 bytes on our computer and array memory is contiguous, so each int address is 4 bytes away from the next.

In the last line we print the size of the array. The size of an array is the sizeof(type) * number of elements in the array. Here the array holds 5 ints, each of which takes up 4 bytes. The entire array is 20 bytes.

Struct Addresses

Structs in C are also single contiguous memory areas, although their members are not always packed right next to each other. Like arrays they hold multiple values, but unlike arrays those values can be of different data types.

Running that you should get output like the following:

address of ball = 0x7fffd1510060
address of ball.category = 0x7fffd1510060
address of ball.width = 0x7fffd1510064
address of ball.height = 0x7fffd1510068
sizeof(ball) = 12

In this example we have our struct definition. Then we declare an instance ball of the struct measure and populate its width, height, and category members with values. Then we print out the address of the ball variable. Unlike an array, a struct does not decay to a pointer, so we use the address-of (&) operator; the address of a struct is always the same as the address of its first member. We then print out the address of each of the struct members. Category is the first member and we see that it has the same address as the ball variable. The width member is next, followed by the height member. Both have addresses higher than the category member.

You might think that because category is a char and chars take up 1 byte, the width member should be at an address 1 byte higher than the start. As you can see from the output, this isn’t the case. According to the C99 standard (C99 §6.7.2.1), a C implementation can add padding bytes after members to align them on byte boundaries. It cannot reorder the data members and it cannot add padding before the first member, but it can add padding bytes between members and at the end of the struct. In practice most compilers align each member to the natural alignment of its own type and round the struct size up to a multiple of the strictest member alignment, but this is entirely implementation specific.

In our example you can see that the char is followed by 3 padding bytes, so the width member starts 4 bytes into the struct, and the struct takes up a total of 12 bytes. What to take away?

  • The address of a struct variable is the same as the address of the first member in the struct.
  • Don’t assume that struct members will be a specific number of bytes away from another field; there may be padding bytes between them depending on the implementation. Use the address-of (&) operator on the member to get its address.
  • Use sizeof on the struct instance to get the total size of the struct. Don’t assume it is just the sum of its member fields; it may have padding.

Conclusion

Hope this post helps you to understand more about how addresses operate on different data types in C. In a future post we will go over some basics on pointers and arrays in C.

Update 1: Thanks to Sorito, I added a link back to the blog post about pointers, addresses, and dereferencing.
Update 2: Thanks to Keith Thompson and tjoff from Hacker News for helping clarify struct addresses and memory. I reworked the example code to be more clear about memory.



20 Responses to “Basics of Memory Addresses in C”

  1. Bob says:

    One important note is that the compiler may add padding bytes at the end or the middle of structs to optimize memory alignment.

    • Sunio says:

      Does this mean that if for instance, the struct has one integer and one char as elements, the compiler may add zero-padding bytes to align the memory to 4 bytes? What about the memory size reported by sizeof, for the char would be 1 byte as expected?

      • Keith Thompson says:

        Yes. sizeof (char) is 1 by definition. The size of a struct is *at least* the sum of the sizes of its elements. But given:

        struct foo {
        int i;
        char c;
        };

        it’s likely that sizeof (struct foo) == 2 * sizeof (int), as the compiler adds padding so that all the elements of an array of struct foo are properly aligned.

        On the other hand, sizeof (struct foo) *could* be just sizeof (int) + 1, if the system is able to access objects at arbitrary memory locations. (But on the x86, for example, misaligned accesses are supported, but they’re a bit slower than aligned accesses, and compilers typically add padding because it improves performance.)

    • Dennis Kubes says:

      I have reworked the section on structs to be more clear about byte alignment and taking into account suggestions. Reworked the example code as well.

  2. Keith Thompson says:

    > It is helpful to think of everything in C in terms of computer memory. Let’s think of computer memory as an array of bytes where each address in memory holds 1 byte.

    I find it more helpful to think of each individual object (that’s not part of a larger object) as an *independent* contiguous chunk of memory.

    Not all implementations treat all of memory as a single contiguous array. Just comparing the addresses of two distinct objects, other than for equality or inequality, has undefined behavior.

    If you declare:

    int x, y, z;

    then yes, it’s very likely that they’re allocated in consecutive N-byte slots of memory — but it’s not guaranteed, and it’s not even particularly useful to assume that they are.

    (And a minor point: Addresses printed with “%p” should be converted to void*. It’s possible for different pointer types to have different representations.)

  3. Keith Thompson says:

    BTW, the comp.lang.c FAQ, http://www.c-faq.com/, is an excellent source of information on this kind of thing (though it’s not organized as a tutorial). Section 6 in particular does a great job of explaining the relationship between arrays and pointers.

    • Dennis Kubes says:

      That is a very good resource. Thank you. If it isn’t already out there, somebody should do a post on resources for C programmers. I have seen the question come up a few times on different lists.

  4. Phoenix says:

    The very fact that Keith Thompson is replying on this post should be considered as an achievement.

    About the declaration :

    int x, y, z;

    I want to know whether any system purposefully allocates the three variables non-contiguously, maybe for security purposes.

  5. Sorito says:

    “Deferencing” – huh? Did you mean “Dereferencing”?

    I did not like the didactic qualities of this article. First three sentences are written well and define basic concepts. But then sentence four introduces a more advanced concept (“pointers”) without defining it first. If you define very basic concepts, you have to also define the more advanced concepts.

    So, sentence four should be more something like this: “Then there is something called pointer. Memory can contain all kinds of things, but when it contains a memory address, that’s a pointer. So when we talk about pointers storing addresses, we are talking about an element in the memory array storing an index to another element in the memory array”.

    • Sorito says:

      After those four sentences, I stopped reading, for the points outlined.

    • Dennis Kubes says:

      Thanks for the suggestion. I added a link to my previous post that goes into more detail about pointers, addresses, and dereferencing.

  6. [...] Excellent article (not long and very clear) in which Dennis Kubes explains the fundamentals of how memory works when programming in C. Basic concepts that clarify how arrays and structs behave in memory. [...]

  7. Ricardo says:

    I’m enjoying your series of posts about pointers and memory in C. Congratulations!

  8. [...] More here [...]

  9. netra says:

    I want to store an address at a particular address. How do I do that?
    Example:
    char *start (holds an address)
    ADDR -> a macro pointing to a particular memory address.
    I want to store the memory address held by start at ADDR.
    Thanks in advance,
    Netra

    • Dennis Kubes says:

      Why would you want to store an address in a particular address? That sounds like something that would be done when trying to take advantage of a vulnerability in a piece of software.

  10. […] This article was translated from Dennis Kubes by a reader of 伯乐在线. For reposting, please see the requirements at the end of the article. […]

  11. […] Original link: http://denniskubes.com/2012/08/17/basics-of-memory-addresses-in-c/ […]