I am working my way through Professional Assembly Language by Richard Blum. Why? I like lower level programming and I find assembly language interesting.
My setup is a Mint Linux x64 box. Professional Assembly Language, and many of the other books on assembly language, tend to use i386 32-bit assembly. In fact there are more books on 32-bit assembly language on the market than there are on 64-bit assembly language. It is nice to learn by going through book examples, but I don’t want to have to change them too much. I don’t mind changing command line switches, but I don’t want to have to convert all code to 64-bit assembly while learning.
Assembling and linking 32-bit assembly on a 64-bit machine can run into some issues. This post goes over how to get set up so you can assemble and link both 32-bit and 64-bit assembly on an x64 Linux machine.
You can compile assembly on Linux using just gcc. If you want to see how to do that, skip to the bottom of this post. Professional Assembly Language uses the GNU assembler and linker, as and ld. My guess is that this is for learning: the ability to see each step in the process is nice. We will use as and ld too. To start, I am assembling and linking this program from Chapter 4 of the book.
# cpuid.s View the CPUID Vendor ID string using c library calls
.section .data
output:
    .asciz "The processor Vendor ID is '%s'\n"
.section .bss
    .lcomm buffer, 12
.section .text
.globl _start
_start:
    movl $0, %eax
    cpuid
    movl $buffer, %edi
    movl %ebx, (%edi)
    movl %edx, 4(%edi)
    movl %ecx, 8(%edi)
    pushl $buffer
    pushl $output
    call printf
    addl $8, %esp
    pushl $0
    call exit
I assemble it with the as command.
as -o cpuid.o cpuid.s
cpuid.s: Assembler messages:
cpuid.s:16: Error: invalid instruction suffix for 'push'
cpuid.s:17: Error: invalid instruction suffix for 'push'
cpuid.s:20: Error: invalid instruction suffix for 'push'
The first error I ran into says I have invalid instructions. That makes sense: on an x64 machine as defaults to 64-bit mode, and pushl, a 32-bit push, is not a valid instruction there. To fix that I had to tell the as command to assemble 32-bit code. This is done by passing the --32 parameter.
as --32 -o cpuid.o cpuid.s
That worked and I got a 32-bit object file, cpuid.o, in the directory. Next I try to link it using the ld command.
ld -o cpuid2 cpuid.o
ld: i386 architecture of input file 'cpuid.o' is incompatible with i386:x86-64 output
cpuid.o: In function '_start':
(.text+0x1f): undefined reference to 'printf'
cpuid.o: In function '_start':
(.text+0x29): undefined reference to 'exit'
There are actually two errors here. The first is line one. The second is all the lines after. The first error is the “i386 architecture of input file 'cpuid.o' is incompatible with i386:x86-64 output” error. It is saying I assembled in 32-bit format but I am trying to link in 64-bit format. To fix that we tell the ld command to use a 32-bit architecture. This is done by passing the -m elf_i386 parameter. The -m parameter selects the linker emulation. The ld man page defines the emulation as the “personality of the linker, which gives the linker default values for the other aspects of the target system” and specifies that “the emulation can affect various aspects of linker behavior, particularly the default linker script”. I don’t know exactly what that means, but I think it means “act like a linker for a different architecture”.
ld -m elf_i386 -o cpuid2 cpuid.o
I run that and the first error has been fixed. We still get the second error.
cpuid.o: In function '_start':
(.text+0x1f): undefined reference to 'printf'
cpuid.o: In function '_start':
(.text+0x29): undefined reference to 'exit'
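Going back to the -m parameter for a moment: ld -V prints the linker version along with the emulations it was built to support, which is a quick way to confirm that elf_i386 is available at all. Here is a minimal sketch. has_emulation is a helper name I made up for illustration, and it assumes the one-name-per-line list that GNU ld prints.

```shell
# has_emulation NAME: succeed if NAME appears as its own line (ignoring
# surrounding whitespace) in the `ld -V` output read from stdin.
# Hypothetical helper, not part of binutils.
has_emulation() {
    grep -q "^[[:space:]]*$1[[:space:]]*\$"
}

# Only query ld if it is actually installed.
if command -v ld >/dev/null 2>&1; then
    if ld -V | has_emulation elf_i386; then
        echo "ld supports elf_i386"
    else
        echo "ld does not list elf_i386"
    fi
fi
```

The match is anchored to the whole line so that a longer name that merely starts with elf_i386 does not count as a hit.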
This is saying it can’t find the printf or exit functions. Those functions are in libc. We need to tell the linker how to find them. The book uses dynamic linking with the --dynamic-linker option, which sets the path of the dynamic linker that loads libc when the program runs. In the book it looks like a single dash, but the man page for ld lists it with a double dash (--). GNU ld accepts either form for multi-letter options like this one. Your dynamic linker location may be different. Mine was at /lib/ld-linux.so.2. It is possible to link statically instead of dynamically but we won’t cover that.
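Since the dynamic linker path can differ between machines, a small check like the one below can help pick the value to pass to --dynamic-linker. This is only a sketch: first_existing is a hypothetical helper, and the two candidate paths are simply ones that show up on my machine, not an exhaustive list.

```shell
# first_existing PATH...: print the first argument that exists on disk.
# Hypothetical helper for illustration.
first_existing() {
    for candidate in "$@"; do
        if [ -e "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate locations are assumptions; adjust them for your distribution.
first_existing /lib/ld-linux.so.2 /lib32/ld-linux.so.2 ||
    echo "no 32-bit dynamic linker found in the usual places"
```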
We also add the -lc parameter to link libc into our program.
ld --dynamic-linker /lib/ld-linux.so.2 -m elf_i386 -o cpuid -lc cpuid.o
When I run that I get a different error. Here is where things start to get interesting.
ld: cannot find -lc
It can’t find libc.
Looking into the Linker
We can see where the linker is looking for libraries using the --verbose parameter. We keep all of our dynamic linker and 32-bit parameters.
ld --dynamic-linker /lib/ld-linux.so.2 -m elf_i386 -lc --verbose
That gives a lot of output. At the end of the output there is a section where it attempts to open various libc.so files.
attempt to open //usr/local/lib/i386-linux-gnu/libc.so failed
attempt to open //usr/local/lib/i386-linux-gnu/libc.a failed
attempt to open //lib/i386-linux-gnu/libc.so failed
attempt to open //lib/i386-linux-gnu/libc.a failed
attempt to open //usr/lib/i386-linux-gnu/libc.so failed
attempt to open //usr/lib/i386-linux-gnu/libc.a failed
attempt to open //usr/local/lib32/libc.so failed
attempt to open //usr/local/lib32/libc.a failed
attempt to open //lib32/libc.so failed
attempt to open //lib32/libc.a failed
attempt to open //usr/lib32/libc.so failed
attempt to open //usr/lib32/libc.a failed
attempt to open //usr/local/lib/libc.so failed
attempt to open //usr/local/lib/libc.a failed
attempt to open //lib/libc.so failed
attempt to open //lib/libc.a failed
attempt to open //usr/lib/libc.so failed
attempt to open //usr/lib/libc.a failed
attempt to open //usr/i386-linux-gnu/lib32/libc.so failed
attempt to open //usr/i386-linux-gnu/lib32/libc.a failed
attempt to open //usr/x86_64-linux-gnu/lib32/libc.so failed
attempt to open //usr/x86_64-linux-gnu/lib32/libc.a failed
attempt to open //usr/i386-linux-gnu/lib/libc.so failed
attempt to open //usr/i386-linux-gnu/lib/libc.a failed
ld: cannot find -lc
All of its attempts failed.
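The list is easier to read if you boil it down to just the directories that were searched. The following is a rough sketch, assuming the "attempt to open ... failed" line format shown above. searched_dirs is a hypothetical helper name.

```shell
# searched_dirs: read `ld --verbose` output on stdin and print each
# directory in which the linker looked for libc.so and failed.
# Hypothetical helper; it relies on the message format shown above.
searched_dirs() {
    awk '/^attempt to open .*\/libc\.so failed$/ {
        path = $4                     # fourth word is the file that was tried
        sub(/^\/+/, "/", path)        # collapse the doubled leading slash
        sub(/\/libc\.so$/, "", path)  # keep only the directory part
        print path
    }'
}
```

Piping the earlier command into it, for example by appending 2>&1 | searched_dirs to the ld line, should leave one directory per line. The libc.a attempts are skipped because they repeat the same directories.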
Packages
At this point I am thinking “Maybe I don’t have the 32-bit libc on the box.” I read some tutorials. There are various suggestions on adding in multiple architectures. I try some different commands. None help. It looks like I have all of these packages and libraries installed already.
sudo dpkg --add-architecture i386
sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install libc6:i386 libncurses5:i386 libstdc++6:i386
sudo apt-get install multiarch-support
Then a Stack Overflow post recommended installing the gcc and g++ multilib packages. Looks like I didn’t have those installed after all.
sudo apt-get install gcc-multilib g++-multilib
That installs a bunch of packages. The important one, which installs a 32-bit libc, is libc6-dev-i386. A lot of the other packages look related to i386 32-bit development.
The following additional packages will be installed:
  g++-5-multilib gcc-5-multilib lib32asan2 lib32atomic1 lib32cilkrts5
  lib32gcc-5-dev lib32gomp1 lib32itm1 lib32mpx0 lib32quadmath0
  lib32stdc++-5-dev lib32ubsan0 libc6-dev-i386 libc6-dev-x32 libc6-x32
  libx32asan2 libx32atomic1 libx32cilkrts5 libx32gcc-5-dev libx32gcc1
  libx32gomp1 libx32itm1 libx32quadmath0 libx32stdc++-5-dev libx32stdc++6
  libx32ubsan0
When the install is done, I run the ld --verbose command again.
attempt to open //usr/local/lib/i386-linux-gnu/libc.so failed
attempt to open //usr/local/lib/i386-linux-gnu/libc.a failed
attempt to open //lib/i386-linux-gnu/libc.so failed
attempt to open //lib/i386-linux-gnu/libc.a failed
attempt to open //usr/lib/i386-linux-gnu/libc.so failed
attempt to open //usr/lib/i386-linux-gnu/libc.a failed
attempt to open //usr/local/lib32/libc.so failed
attempt to open //usr/local/lib32/libc.a failed
attempt to open //lib32/libc.so failed
attempt to open //lib32/libc.a failed
attempt to open //usr/lib32/libc.so succeeded
opened script file //usr/lib32/libc.so
opened script file //usr/lib32/libc.so
attempt to open /lib32/libc.so.6 succeeded
/lib32/libc.so.6
attempt to open /usr/lib32/libc_nonshared.a succeeded
attempt to open /lib32/ld-linux.so.2 succeeded
/lib32/ld-linux.so.2
/lib32/ld-linux.so.2
ld-linux.so.2 needed by /lib32/libc.so.6
found ld-linux.so.2 at /lib32/ld-linux.so.2
The linker is now able to find libc under /lib32/libc.so.6.
Another piece that is interesting is the output format and architecture at the top of the ld output.
OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")
OUTPUT_ARCH(i386)
This shows that we are linking a 32-bit ELF executable.
Linking
With the linker being able to find our 32-bit libc we can now go back to our ld command.
ld --dynamic-linker /lib/ld-linux.so.2 -m elf_i386 -o cpuid -lc cpuid.o
./cpuid
The linker works. It finds libc and creates an executable from our 32-bit assembly code. Running that we get the output below.
The processor Vendor ID is 'GenuineIntel'
Summary
I took the roundabout way to show a train of thought and maybe give some insight into how all the parts work together when trying to assemble and link 32-bit assembly on an x64 machine. In short, if you want to assemble and link 32-bit assembly on an x64 machine, make sure your computer is configured correctly for multi-architecture. Install all of the following packages if they aren’t already installed.
sudo dpkg --add-architecture i386
sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install libc6:i386 libncurses5:i386 libstdc++6:i386
sudo apt-get install multiarch-support
sudo apt-get install gcc-multilib g++-multilib
When assembling, use the --32 parameter. When linking, use the -m elf_i386 parameter. The entire flow looks like this.
as --32 -o cpuid.o cpuid.s
ld --dynamic-linker /lib/ld-linux.so.2 -m elf_i386 -o cpuid -lc cpuid.o
./cpuid
If you use the file command on the cpuid executable you will see we have assembled, linked, and executed a 32-bit executable on a 64-bit machine.
file cpuid
cpuid: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, not stripped
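If the file command is not available, the same fact can be read straight from the ELF header: the four magic bytes are followed by a class byte that is 1 for 32-bit and 2 for 64-bit. A minimal sketch, with elf_class being a helper name of my own:

```shell
# elf_class FILE: print 32 or 64 based on the ELF header, or "unknown".
# Hypothetical helper; it only looks at the first five bytes of the file.
elf_class() {
    # Bytes 0-3 must be the ELF magic: 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo unknown
        return 0
    fi
    # Byte 4 is e_ident[EI_CLASS]: 1 = ELFCLASS32, 2 = ELFCLASS64.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}
```

Running elf_class cpuid should print 32 for the executable built above.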
x64 Assembly Language
Can we still assemble x64 on the same machine? Yes. Easily. I took a hello world x64 example I found, saved it as hello.s, assembled it, linked it, and executed it.
as -o hello.o hello.s
ld -o hello hello.o
./hello
Using the file command again, we see we have assembled, linked, and executed a 64-bit ELF executable.
file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped
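Putting the two flows side by side, the only things that change are the assembler flag, the linker emulation, and whether libc is linked in. Here is a sketch of a helper that prints the commands for either case. build_cmds is a made-up name, --64 and -m elf_x86_64 are the 64-bit counterparts of the flags used earlier (the 64-bit flow above relies on the defaults, so I never typed them), and the 64-bit dynamic linker path is an assumption that may differ on your system.

```shell
# build_cmds NAME BITS [libc]: print the as and ld commands for NAME.s.
# BITS is 32 or 64; pass "libc" as the third argument to link
# dynamically against libc. Hypothetical helper for illustration.
build_cmds() {
    name=$1 bits=$2 libc=${3:-}
    case "$bits" in
        32) as_flag=--32; emulation=elf_i386
            interp=/lib/ld-linux.so.2 ;;
        64) as_flag=--64; emulation=elf_x86_64
            interp=/lib64/ld-linux-x86-64.so.2 ;;  # assumed path
        *)  echo "build_cmds: bits must be 32 or 64" >&2
            return 1 ;;
    esac
    echo "as $as_flag -o $name.o $name.s"
    if [ "$libc" = libc ]; then
        echo "ld --dynamic-linker $interp -m $emulation -o $name -lc $name.o"
    else
        echo "ld -m $emulation -o $name $name.o"
    fi
}
```

Because it only prints, you can look the commands over first and then pipe them to sh, for example build_cmds cpuid 32 libc | sh.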
Using GCC instead
Instead of using as and ld, we can use gcc to handle both stages. We still need to have our environment set up correctly, as with as and ld. We also need to change the .globl label in our assembly source code from _start to main, because gcc links in the C startup code, which provides its own _start and then calls main.
...
.globl main
main:
...
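If you would rather not edit the book's source by hand, the rename can be scripted. This is a small sketch using sed. to_gcc_entry is a hypothetical helper, and it assumes the source declares the entry point exactly as .globl _start followed by a _start: label, as cpuid.s does.

```shell
# to_gcc_entry: read assembly on stdin and write it to stdout with the
# _start entry point renamed to main. Hypothetical helper; it only
# touches the .globl line and the label itself.
to_gcc_entry() {
    sed -e 's/^\([[:space:]]*\.globl[[:space:]][[:space:]]*\)_start[[:space:]]*$/\1main/' \
        -e 's/^\([[:space:]]*\)_start:/\1main:/'
}
```

For example, to_gcc_entry < cpuid.s > cpuid_main.s writes a converted copy that can be handed to gcc and leaves the original untouched. The name cpuid_main.s is just one I picked.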
We use the -m32 parameter with gcc to assemble and link in 32-bit format in one step.
gcc -m32 cpuid.s -o cpuid
./cpuid
The processor Vendor ID is 'GenuineIntel'
file cpuid
cpuid: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.32, ...
And that’s it. One machine, both 32-bit and 64-bit assembly.