Go Internals: How much can we figure out by tracing a syscall in Go?
How many rabbit holes can tracing a system call in Go lead you into?
Introduction
(TGRH = The God of Rabbit Holes)
Me: How does Go make system calls?
TGRH: It depends.
Me: On what?
TGRH: On the OS/processor architecture combination, because each OS has its own ABI depending on the architecture. Also, Go has its own ABI - not one but two of them. Not to mention, Go has its own flavor of Assembly language too. Before you ask, yes you can use libc to make syscalls for every OS/Arch combination and be done with it - but that affects Go’s portability and binary sizes on Linux.
Me: What do all these words even mean and what are you even talking about?
TGRH: I told you, it depends..
Okay, what?
Long story short, once I got into the details - I realized truly explaining how Go makes system calls will require a much larger canvas.
I will instead be writing about all the rabbit holes I got into while tracing a system call in Go - each one touches upon some fundamental aspect of how computers/operating systems/programming languages work under the hood.
Minimal program that makes a syscall
The code below makes a system call using the golang.org/x/sys/unix package.
package main

import (
	"golang.org/x/sys/unix"
)

func main() {
	msg := []byte("hello syscall\n")
	unix.Write(1, msg) // fd = 1, i.e. stdout
}
We generate an executable from this using go build -o syscall_minimal.
Tracing a syscall in Linux
Let’s trace the syscall all the way from main to the actual point where the syscall is made.
For each function we will examine the Go code as well as the corresponding assembly generated. I will attach a debugger in VSCode for the source and use gdb to step through the assembly.
My operating system is Linux (Ubuntu) on an x86_64 (amd64) architecture. Go version: 1.24.4.
We will talk about each rabbit hole as and when we encounter them.
Trace: Main
Code
This is the main func. The unix.Write(..) function is the recommended way to make syscalls in Go - it takes two arguments: a file descriptor number and a byte slice holding the bytes to be written.
Assembly
Let’s see what parts of it we can make sense of.
Some weird hex values are being loaded into the rdx register and then at some region in the stack (0x22 + rsp) and (0x28 + rsp) - if you look closely: it’s actually the string we want to print: “hello syscall\n”
We load the following values into registers:
rax(eax): 1 - file descriptor
rbx: Address of the start of the string from the stack
rcx(ecx): 14 - length of the string
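rdi: 14 - capacity of the slice (the same as its length here)
We call the unix.Write(..) func
Even though 2 arguments are expected, four registers are loaded. This is because a Go slice is a three-word value (pointer, length, capacity) and each word gets its own register: rbx, rcx and rdi.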
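The "weird hex values" are easier to recognize once you remember that x86-64 is little-endian: the first character of the string ends up in the lowest byte of the integer. Here is a minimal sketch of that packing. The helper names packLE and unpackLE are mine, and the value shown is what the first 8 bytes of the message pack to, not a copy of the exact immediates in the disassembly.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// packLE packs up to 8 bytes of s into a uint64, least significant byte
// first, which is how an x86-64 store lays an integer out in memory.
func packLE(s string) uint64 {
	var buf [8]byte
	copy(buf[:], s)
	return binary.LittleEndian.Uint64(buf[:])
}

// unpackLE turns such a value back into the 8 bytes it would place in memory.
func unpackLE(v uint64) string {
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], v)
	return string(buf[:])
}

func main() {
	v := packLE("hello sy") // first 8 bytes of "hello syscall\n"
	fmt.Printf("%#x -> %q\n", v, unpackLE(v))
}
```

Read right to left, the hex digits spell out the characters: 0x68 is 'h', 0x65 is 'e', and so on.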
Rabbit Hole: Go has its own ABI - Application Binary Interface
What is an ABI?
Firstly, what even is an ABI?
If you make a raw system call in assembly in Linux (x86-64), the syntax is usually something like (pseudo-code):
mov systemcallnumber, %rax
mov arg1, %rdi
mov arg2, %rsi
mov arg3, %rdx
syscall
The same convention is followed for every system call - we load the syscall number and the arguments into these specific registers and then execute the syscall instruction; the kernel performs the call and returns the result in rax. This convention, i.e. the manner in which binary code interacts with the OS, is what is called the ABI.
A regular function call in Linux (x86-64) Assembly follows a separate (slightly different) register convention than a system call: the first four arguments go in rdi, rsi, rdx and rcx, whereas a system call uses r10 in place of rcx.
Every OS/Architecture pair will have a different ABI since registers and other internals will be different for the architecture.
If you compile C code and inspect the Assembly via a disassembler, it uses the same ABI as the underlying OS/Arch pair it is running on.
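The system call convention can also be exercised from Go without writing any Assembly. The sketch below assumes Linux and uses the standard library's syscall.Syscall, which takes the syscall number followed by up to three arguments. The helper names rawGetpid and agreesWithRuntime are made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// rawGetpid asks the kernel for the process ID directly.
// getpid takes no arguments, so the three argument slots are left as 0.
// The kernel's return value comes back as the first result.
func rawGetpid() int {
	pid, _, _ := syscall.Syscall(syscall.SYS_GETPID, 0, 0, 0)
	return int(pid)
}

// agreesWithRuntime reports whether the raw call and os.Getpid
// return the same process ID.
func agreesWithRuntime() bool {
	return rawGetpid() == os.Getpid()
}

func main() {
	fmt.Println(rawGetpid(), agreesWithRuntime())
}
```

The number-plus-arguments shape of syscall.Syscall maps directly onto the "number in rax, arguments in the following registers" convention described above.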
It is possible for programming languages to have their own ABIs
As we saw with C, it seems only intuitive that the Assembly generated for an OS/Arch pair follows the same ABI as, say, hand-written Assembly for that OS/Arch, right?
Not really. If you design a programming language - you own the compiler and the assembler. What is stopping you from using your own ABI for calling your own functions? Nothing.
And that is exactly what Go does.
The signature of the unix.Write call is
func Write(fd int, p []byte) (n int, err error)
If you go back to the Assembly code a couple of paragraphs above - we can clearly see these two function arguments being loaded: fd in rax, and the slice’s pointer, length and capacity in rbx, rcx and rdi (a slice is three words, so it takes three registers).
The System V ABI used by Linux (x86-64) specifies that the first two arguments of a function call go in rdi and rsi - but Go is clearly using its own ABI, which it is absolutely allowed to do.
Go’s ABI documentation confirms what we saw using GDB. It uses its own ABI for function calls and the first two args are passed in rax, rbx (refer to amd64 architecture section).
In fact, Go has more than one ABI - as you will find out next.
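The register usage can be reproduced with a small stand-in function. The sketch below has the same parameter types as unix.Write and shows that a []byte argument is made of three machine words - pointer, length, capacity. That would account for rbx, rcx and rdi being loaded in addition to rax. The names describe and sliceHeaderWords are hypothetical and not part of any library.

```go
package main

import (
	"fmt"
	"unsafe"
)

// describe takes the same parameter types as unix.Write and returns the
// three words that make up the slice argument.
// The noinline directive keeps it as a real call, so the argument
// registers stay visible in a disassembly of main.
//
//go:noinline
func describe(fd int, p []byte) (ptr unsafe.Pointer, length, capacity int) {
	return unsafe.Pointer(unsafe.SliceData(p)), len(p), cap(p)
}

// sliceHeaderWords returns how many machine words a []byte value occupies.
func sliceHeaderWords() uintptr {
	var p []byte
	return unsafe.Sizeof(p) / unsafe.Sizeof(uintptr(0))
}

func main() {
	msg := []byte("hello syscall\n")
	ptr, n, c := describe(1, msg)
	fmt.Println(sliceHeaderWords(), ptr != nil, n, c)
}
```

Building this with go build -gcflags=-S prints the generated Assembly, where the call to describe should be preceded by the same kind of register setup we saw for unix.Write.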
References
Trace: unix.Write / unix.write
We move to the next section, which is the library function unix.Write and its internal call to unix.write.
Code
Nothing much in the public Write() here apart from a simple call to an internal function, let’s take a quick peek inside.
We enter the internal unix.write method where:
Filename is zsyscall_linux.go, which suggests some sort of OS-level file separation for system calls
We call the next function Syscall with 4 params:
SYS_WRITE: the system call number for the write call on Linux amd64 (1)
file descriptor for stdout (1)
pointer to the string to be printed
length of the string to be printed
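To make the four parameters concrete, here is a rough stand-in for the internal wrapper. It is built on the standard library's syscall.Syscall rather than on golang.org/x/sys/unix and it assumes Linux. rawWrite is a made-up name, and the real unix.write may differ in details.

```go
package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

// rawWrite passes the syscall number, the file descriptor, a pointer to
// the first byte and the byte count down to syscall.Syscall.
func rawWrite(fd int, p []byte) (int, error) {
	var ptr unsafe.Pointer
	if len(p) > 0 {
		ptr = unsafe.Pointer(&p[0]) // address of the slice's backing array
	}
	n, _, errno := syscall.Syscall(
		syscall.SYS_WRITE,
		uintptr(fd),
		uintptr(ptr),
		uintptr(len(p)),
	)
	if errno != 0 {
		return -1, errno // syscall.Errno implements the error interface
	}
	return int(n), nil
}

func main() {
	n, err := rawWrite(1, []byte("hello syscall\n")) // fd = 1, i.e. stdout
	fmt.Println(n, err)
}
```

Note that the slice's capacity is not passed along: the kernel only needs a pointer and a length.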
Assembly:
Seems like we don’t have a corresponding assembly procedure for the public unix.Write as we are directly taken to the internal unix.write call - most probably because the compiler inlined the small wrapper.
The important parts are:
We know the current values in registers are:
rax: 1 (file descriptor)
rbx: Address of the start of the string
rcx: 14 (length of the string)
We store the following values into the stack before the func call to Syscall
%rsp: 0x1 (syscall number)
%rsp + 0x8: 1 (file descriptor)
%rsp + 0x10: (address of start of string)
%rsp + 0x18: (length of string)
In effect we are simply passing the function params down to the next call, although via the stack rather than registers this time - why?
Rabbit Hole: Go has not one but two ABIs
We previously discussed that Go defines its own ABI, and that this ABI specifies that registers rax, rbx and so on must be used for function calling.
But in the Assembly function call above it is passing all arguments on the stack. Why?
Turns out Go has not one but two ABIs.
ABIInternal (new): Calls between Go functions use ABIInternal - which is what we saw in the previous sections. It passes arguments in registers for performance gains.
ABI0 (old): Passes arguments on the stack and is used only in specific cases - for example when making calls to functions defined in Assembly. The Syscall() we are calling above is in fact defined in Assembly, as we will see next.
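The stack offsets we saw before the call to Syscall line up with a simple model of a stack-based call: every argument and result gets its own word-sized slot, one after the other. The sketch below uses a struct as a stand-in for that frame, which is my simplification and not how the compiler represents it. The names syscallFrame, argOffsets and frameSize are hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// word is the size of one machine word (8 bytes on amd64).
const word = unsafe.Sizeof(uintptr(0))

// syscallFrame mirrors the shape of
//
//	func Syscall(trap, a1, a2, a3 uintptr) (r1, r2 uintptr, err Errno)
//
// with one word-sized slot per argument and per result.
type syscallFrame struct {
	trap, a1, a2, a3 uintptr // arguments
	r1, r2, err      uintptr // results (Errno is a uintptr underneath)
}

// argOffsets returns the offset of each argument slot within the frame.
func argOffsets() [4]uintptr {
	var f syscallFrame
	return [4]uintptr{
		unsafe.Offsetof(f.trap),
		unsafe.Offsetof(f.a1),
		unsafe.Offsetof(f.a2),
		unsafe.Offsetof(f.a3),
	}
}

// frameSize returns the total size of arguments plus results.
func frameSize() uintptr {
	return unsafe.Sizeof(syscallFrame{})
}

func main() {
	fmt.Printf("argument offsets: %#x, frame size: %d bytes\n", argOffsets(), frameSize())
}
```

On a 64-bit machine the argument offsets are 0, 8, 16 and 24 bytes, the same four slots relative to rsp that unix.write filled in.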
Refer to the links below for a complete picture.
References
Trace: unix.Syscall
Code
In Go land - the next function takes us to this definition
This is just a declaration. On stepping in with the debugger, we find that it is actually implemented as an Assembly function, which directs us to the syscall package’s Syscall func.
Assembly
Back in Assembly land as well we have a single line:
Rabbit Hole: Go has its own flavor of Assembly
Of course every processor usually has its own flavor of Assembly. But if you scroll back up to the Assembly function Syscall - you’ll see that even though the filename is asm_linux_amd64.s, it does not look like x86-64 assembly.
This is because Go has its own flavor of Assembly too. I am still halfway through unpacking the reasons and historical context behind it - for example, if Go has its own Assembly, why does it still have so many arch-specific Assembly files?
References
Trace: We jump to the syscall
After this point, the tracing consists mostly of more of the same → setting up registers using the correct ABI and calling the next function. At the end of a web of internal functions - Go sets up registers according to the Linux System Call ABI and finally performs the system call as shown below.
Tracing a syscall on Mac (Apple Silicon)
I tried to execute the same code on my Mac M1 (arm64) and follow the code path. Go version: 1.23.3.
This time we won’t follow the Assembly since it’s just more of setting up registers and making calls - but we’ll have a quick look at the code.
We will follow a couple of code blocks and then delve into the final rabbit hole.
Trace: main
Code
We follow the same path as before initially
Trace: unix.Write/unix.write
We can see that we are now in an arch-specific file:
The file name is zsyscall_darwin_arm64.go
In the Linux trace at this point we were making a call to an Assembly function - let’s see if this is in Assembly too
Trace: syscall_syscall
We can figure out by reading the code that this is making a libc call. What does this mean?
If you look at the Linux syscall trace - Go does all the heavy lifting right till the end and makes the system call by itself.
In this case we are instead relying on the C library of the OS to make the call for us. This is because, unlike Linux, macOS does not guarantee a stable system call ABI - it may change between releases. So Go goes through the system’s C library (libc) to make the call.
Rabbit Hole: Difference in how Go makes Syscalls in Mac vs Linux and its implications
What are the implications of this:
Using libc on Mac means that Go Mac binaries are dynamically linked (binaries which require libraries to exist in the system to execute - libc in this case). Linux Go binaries are statically linked by default (they contain everything required for execution in themselves).
You can verify this with otool in Mac and ldd in Linux:
Consider a Docker container - the reason a Go Linux binary can execute on a scratch Docker container is that it is statically linked. We do not need any underlying libraries on the container - leading to less bloat on the container.
Surprisingly - the size of the Linux binary is slightly smaller than the macOS binary. Since it is self-contained, you would expect it to be larger than the macOS binary, right? But there are other factors as well - the overhead of the statically linked binary is balanced by some other differences in the two binaries.
Mac Binary size: ~1.92 MB
Linux Binary size: ~1.87 MB
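The otool/ldd check can also be approximated from Go itself with the standard debug/elf and debug/macho packages, both of which expose an ImportedLibraries method. This is a minimal sketch: importedLibs and selfLibs are names I made up, and universal ("fat") Mach-O files are not handled.

```go
package main

import (
	"debug/elf"
	"debug/macho"
	"fmt"
	"os"
)

// importedLibs lists the shared libraries a binary asks the dynamic
// loader for. An empty list means nothing is loaded at startup.
func importedLibs(path string) ([]string, error) {
	if f, err := elf.Open(path); err == nil {
		defer f.Close()
		return f.ImportedLibraries()
	}
	f, err := macho.Open(path)
	if err != nil {
		return nil, fmt.Errorf("%s is neither ELF nor thin Mach-O: %w", path, err)
	}
	defer f.Close()
	return f.ImportedLibraries()
}

// selfLibs runs importedLibs on the currently running executable.
func selfLibs() ([]string, error) {
	exe, err := os.Executable()
	if err != nil {
		return nil, err
	}
	return importedLibs(exe)
}

func main() {
	libs, err := selfLibs()
	if len(os.Args) > 1 {
		// Inspect the binary given on the command line instead.
		libs, err = importedLibs(os.Args[1])
	}
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d imported libraries: %v\n", len(libs), libs)
}
```

Pointed at the Linux build of syscall_minimal, I would expect this to report no imported libraries, and at the Mac build to list the system library that provides libc.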
Notes
We use different Go versions on Mac and Linux but the concepts discussed are the same across both versions - there is no change in ABI or system call convention.
An ABI covers not only registers and function calling conventions but a whole host of other things, like stack layout and alignment boundaries.
Outro
I started off with a very simple question - how does Go execute system calls? - and wound up finding out a bunch of stuff I wouldn’t have otherwise.
You can really just pull the thread on any question and go in as many directions as you want.
You can just do things :)