A Note on How Functions are Called

28 Jun 2023 | CPOL | 7 min read
This is a note on how compilers implement function calls to pass the parameters to and get the return value from the called functions.
The article explains how function calls are implemented in compilers, focusing on the example of a C program running on an X64 architecture.

Background

This note is not about the syntax for calling a function. It is about how compilers implement function calls: how the parameters are passed to the called function and how the return value gets back to the caller. The examples in this note were written on a 64-bit Linux Mint 17.3 Cinnamon computer. I used GNU GCC and GAS to build the C and assembly code, with Eclipse for C/C++ Developers as my IDE. If you want to run the attached code, you can refer to this link on how to set up your environment. Since the assembly code in the examples is 64-bit, you need a 64-bit machine to run it. I only ran the code on Linux. It may run fine on other operating systems, but I cannot guarantee it.

Warm-up with a C Function

This is a warm-up session. We will review how a function is called in a high-level language.

Image 1

The attached a-simple-c-function-call.zip contains a C program. The a-simple-c-function-call.c file is implemented as follows:

C++
#include <stdio.h>
    
// The function for addition
int add(int i, int j) {
    return i + j;
}
    
int main() {
    
    // Make a call to the add function
    int result = add(100, 200);
    
    printf("Expected Result = 300\n");
    printf("The actual Result = %d\n", result);
    
    return 0;
}
  • The add() function takes two integer parameters and returns an integer value;
  • The main() function makes a call to the add() function and prints out the result.

Running the C program, we can see the following result:

Image 2

This note is intended to answer the following questions:

  • Where does the calling function put the parameters so the called function can get them?
  • Where does the called function put the return value so the calling function can get it when the function returns?

Although this note was prepared in C on a 64-bit Linux machine, I hope that it has some reference value for other high-level languages.

A Little Background with the X64 Architecture

To answer these two questions, we need some knowledge of the computer architecture. The assembly language programs that I wrote for this note target X64.

Image 3

The most important parts that are related to this note are the following:

  • The general purpose registers RAX - R15 can be used for many purposes, but they are mostly used to store integer values (int, long, etc.) and perform integer calculations;
  • The XMM registers XMM0 - XMM15 are used to store floating point values (float, double, etc.) and perform floating point calculations;
  • The register RSP has a special role: it points to the top of the stack.

I am not going to discuss the X64 architecture or assembly language programming in much detail. If you are interested in exploring more, you can check out the following links:

How Function Calls Are Made

How high-level language functions are called at the assembly level has always been an interesting subject. Unfortunately, all the assembly experience that I had was on some cheap early CPUs. Implementing a function call would be very tedious on those CPUs. Fortunately, I found the link http://cs.lmu.edu/~ray/notes/gasexamples/. It showed me that a function call is actually pretty simple on an X86/X64 computer.

Image 4

The attached example program has 8 program files:

  • The how-function-calls-are-made.c is the entry point of the program written in C;
  • The 7 assembly files (.s), as1_*.s - as7_*.s, each demonstrate a single fact about how function calls are made.

The following is the how-function-calls-are-made.c file:

C++
#include <stdio.h>
    
int return_an_integer();
double return_a_double();
int first_6_int_parameters(int i1, int i2, int i3, int i4,
        int i5, int i6);
int the_7th_int_parameter(int i1, int i2, int i3, int i4,
        int i5, int i6, int i7);
double first_8_dbl_parameters(double d1, double d2,  double d3,  double d4,
        double d5,  double d6,  double d7,  double d8);
double the_9th_dbl_parameter(double d1, double d2,  double d3,  double d4,
        double d5,  double d6,  double d7,  double d8, double d9);
void pass_a_pointer(char* s);
    
int main() {
    
    // Get the return value - Integer and double
    // 1. Get an integer return value
    printf("Calling return_an_integer() => %d\n", return_an_integer());
    // 2. Get a double return value
    printf("Calling return_a_double() => %.3f\n", return_a_double());
    
    // Pass integer parameters into functions
    // 1. Pass the first 6 integer parameters to a function
    int result = first_6_int_parameters(1, 2, 1, 1, 2, 1);
    printf("Calling first_6_int_parameters() => %d\n", result);
    // 2. Pass more than 6 integer parameters to a function
    result = the_7th_int_parameter(1, 2, 1, 1, 2, 1, 10);
    printf("Calling the_7th_int_parameter() => %d\n", result);
    
    // Pass double parameters into functions
    // 1. Pass the first 8 double parameters to a function
    double dResult = first_8_dbl_parameters(0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1);
    printf("Calling first_8_dbl_parameters() => %.1f\n", dResult);
    // 2. Pass more than 8 double parameters to a function
    dResult = the_9th_dbl_parameter(0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 10.0);
    printf("Calling the_9th_dbl_parameter() => %.1f\n", dResult);
    
    // Pass a pointer
    char s[13];
    pass_a_pointer(s);
    printf("Calling pass_a_pointer() => %s\n", s);
    
    return 0;
}

The makefile can be used to compile/clean/run the program:

CC=gcc
SRC=src
BUILD=Default
    
all: $(BUILD)/how-function-calls-are-made
    
$(BUILD)/how-function-calls-are-made: $(BUILD)/main.o \
        $(BUILD)/as1_return_an_integer.o \
        $(BUILD)/as2_return_a_double.o \
        $(BUILD)/as3_first_6_int_parameters.o \
        $(BUILD)/as4_the_7th_int_parameter.o \
        $(BUILD)/as5_first_8_dbl_parameters.o \
        $(BUILD)/as6_the_9th_dbl_parameter.o \
        $(BUILD)/as7_pass_a_pointer.o
    $(CC) -o $(BUILD)/how-function-calls-are-made $(BUILD)/*.o
    
$(BUILD)/main.o: $(SRC)/how-function-calls-are-made.c
    $(CC) -c -g0 -o $(BUILD)/main.o $(SRC)/how-function-calls-are-made.c
    
$(BUILD)/as1_return_an_integer.o: $(SRC)/as1_return_an_integer.s
    as -c -o $(BUILD)/as1_return_an_integer.o $(SRC)/as1_return_an_integer.s
    
$(BUILD)/as2_return_a_double.o: $(SRC)/as2_return_a_double.s
    as -c -o $(BUILD)/as2_return_a_double.o $(SRC)/as2_return_a_double.s
    
$(BUILD)/as3_first_6_int_parameters.o: $(SRC)/as3_first_6_int_parameters.s
    as -c -o $(BUILD)/as3_first_6_int_parameters.o $(SRC)/as3_first_6_int_parameters.s
    
$(BUILD)/as4_the_7th_int_parameter.o: $(SRC)/as4_the_7th_int_parameter.s
    as -c -o $(BUILD)/as4_the_7th_int_parameter.o $(SRC)/as4_the_7th_int_parameter.s
    
$(BUILD)/as5_first_8_dbl_parameters.o: $(SRC)/as5_first_8_dbl_parameters.s
    as -c -o $(BUILD)/as5_first_8_dbl_parameters.o $(SRC)/as5_first_8_dbl_parameters.s
    
$(BUILD)/as6_the_9th_dbl_parameter.o: $(SRC)/as6_the_9th_dbl_parameter.s
    as -c -o $(BUILD)/as6_the_9th_dbl_parameter.o $(SRC)/as6_the_9th_dbl_parameter.s
    
$(BUILD)/as7_pass_a_pointer.o: $(SRC)/as7_pass_a_pointer.s
    as -c -o $(BUILD)/as7_pass_a_pointer.o $(SRC)/as7_pass_a_pointer.s
    
run: $(BUILD)/how-function-calls-are-made
    $(BUILD)/how-function-calls-are-made
    
clean:
    find $(BUILD) -type f -delete

1. How is the value returned from a function?

The GNU C implementation on an X64 computer is pretty simple. The called function puts the return value in a designated register before it returns. Let's take a look at the as1_return_an_integer.s and the as2_return_a_double.s files.

ASM
.globl return_an_integer
    
.text
return_an_integer:
    mov $2106, %rax
    
    # The return value is in %rax
    ret
ASM
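.globl return_a_double
    
.data
    v1: .double 2016.422
    
# Switch back to the code section so the function is not placed in .data
.text
return_a_double:
    # RIP-relative addressing also links in position-independent executables
    movsd v1(%rip), %xmm0
    
    # The return value is in %xmm0
    ret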
  • To return an integer/long value, we can put the value into the RAX register. The calling function in GNU C knows to pick the return value from this register;
  • To return a float/double value, we can put the value into the XMM0 register. The calling function in GNU C knows to pick the return value from this register.

2. How are the parameters passed into the called function?

Let's first take a look at how integer/long values are passed into a function in the files as3_first_6_int_parameters.s and as4_the_7th_int_parameter.s.

ASM
.globl  first_6_int_parameters
    
first_6_int_parameters:
    mov $0, %rax
    
    # First 6 integers - left to right
    # rdi, rsi, rdx, rcx, r8, r9
    add %rdi, %rax
    add %rsi, %rax
    add %rdx, %rax
    add %rcx, %rax
    add %r8, %rax
    add %r9, %rax
    
    # The return value is in %rax
    ret
ASM
.globl  the_7th_int_parameter
    
the_7th_int_parameter:
    mov $0, %rax
    
    # First 6 integers
    add %rdi, %rax
    add %rsi, %rax
    add %rdx, %rax
    add %rcx, %rax
    add %r8, %rax
    add %r9, %rax
    
    # The 7th and above integer parameters
    # are pushed to the stack by the caller
    # and it is the caller's responsibility to pop them
    # The order of the parameters is right to left
    add 8(%rsp), %rax
    
    # The return value is in %rax
    ret
  • In GNU C, the first 6 integer/long parameters (from left to right in the function declaration) are copied by the caller into the RDI, RSI, RDX, RCX, R8, and R9 registers respectively;
  • If there are more than 6 integer/long parameters, the 7th and later parameters are pushed onto the stack by the caller in right-to-left order, so the 7th parameter ends up closest to the top of the stack. When the call is made, the 8-byte return address is pushed on top of them. To read the 7th parameter, the called function therefore has to skip those 8 bytes and read from 8(%rsp).

For float/double parameters, let's take a look at the as5_first_8_dbl_parameters.s and as6_the_9th_dbl_parameter.s files.

ASM
.globl first_8_dbl_parameters
    
first_8_dbl_parameters:
    
    # First 8 double - left to right
    # xmm0, xmm1, xmm2, xmm3, xmm4, xmm5, xmm6, xmm7
    addsd  %xmm1, %xmm0
    addsd  %xmm2, %xmm0
    addsd  %xmm3, %xmm0
    addsd  %xmm4, %xmm0
    addsd  %xmm5, %xmm0
    addsd  %xmm6, %xmm0
    addsd  %xmm7, %xmm0
    
    # The return value is in %xmm0
    ret
ASM
.globl the_9th_dbl_parameter
    
the_9th_dbl_parameter:
    
    # First 8 double
    addsd  %xmm1, %xmm0
    addsd  %xmm2, %xmm0
    addsd  %xmm3, %xmm0
    addsd  %xmm4, %xmm0
    addsd  %xmm5, %xmm0
    addsd  %xmm6, %xmm0
    addsd  %xmm7, %xmm0
    
    # The 9th and above double parameters
    # are pushed to the stack by the caller
    # and it is the caller's responsibility to pop them
    # The order of the parameters is right to left
    addsd 8(%rsp), %xmm0
    
    # The return value is in %xmm0
    ret
  • In GNU C, the first 8 float/double parameters (from left to right in the function declaration) are copied by the caller into the XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, and XMM7 registers respectively;
  • Similar to the integer/long case, if there are more than 8 float/double parameters, the 9th and later parameters are pushed onto the stack by the caller in right-to-left order. To read the 9th parameter, the called function also has to skip the 8-byte return address on the top of the stack and read from 8(%rsp).

3. How are the pointers handled?

At the assembly level, a pointer is just an integer whose value is a memory address (64 bits wide on X64), so it is passed like any other integer parameter. Let us take a look at the as7_pass_a_pointer.s file.

ASM
.globl  pass_a_pointer
    
pass_a_pointer:
    
    # %rdi is the first parameter
    mov %rdi, %rax
    
    movb $72, (%rax)
    movb $101, 1(%rax)
    movb $108, 2(%rax)
    movb $108, 3(%rax)
    movb $111, 4(%rax)
    movb $32, 5(%rax)
    movb $87, 6(%rax)
    movb $111, 7(%rax)
    movb $114, 8(%rax)
    movb $108, 9(%rax)
    movb $100, 10(%rax)
    movb $33, 11(%rax)
    movb $0, 12(%rax)
    
    ret
  • The pass_a_pointer function takes a pointer to an array as the parameter. Since it is the only parameter, the pointer value is passed by the calling function in the RDI register;
  • A sequence of ASCII bytes (Hello World! followed by a terminating zero) is written to the memory starting from the address held in the RDI register.

Run the Example

Since assembly languages are CPU dependent, you will need an X64 Linux computer to run the programs. You can install GNU C/C++ and GAS with the following command:

sudo apt-get install build-essential

You can issue the following commands to check if the required compiler and assembler are installed successfully.

  • which gcc
  • which as
  • which make

If they are successfully installed, the which command will print the path of each program. If you are interested in running the program in Eclipse, this link is a pretty comprehensive note on using Eclipse for C/C++ development. If you just want to run the program, you can simply go to the directory where the makefile is located and run the following command:

make run

Image 5

Summary

This note should have answered the following questions with regard to GNU GCC:

  • How is the value returned from a function?
    • An integer/long value is returned in the RAX register;
    • A float/double value is returned in the XMM0 register.
  • How are the parameters passed into the called function?
    • The first 6 integer/long parameters (from left to right in the function declaration) are copied into the RDI, RSI, RDX, RCX, R8, and R9 registers respectively;
    • If there are more than 6 integer/long parameters, the 7th and later parameters are pushed onto the stack in right-to-left order;
    • The first 8 float/double parameters (from left to right in the function declaration) are copied into the XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, and XMM7 registers respectively;
    • If there are more than 8 float/double parameters, the 9th and later parameters are pushed onto the stack in right-to-left order;
    • When a function is called, the top of the stack holds the return address. When retrieving parameters from the stack, we need to skip those 8 bytes in the X64 architecture.
  • How are the pointers handled?
    • A pointer is an integer, so it is passed into and returned from a function the same way as an integer. An X64 CPU has a 64-bit address space.
  • Does anything else need attention?
    • According to the major reference for this note, http://cs.lmu.edu/~ray/notes/gasexamples/ (many thanks to Ray), the called function needs to preserve the RBP, RBX, R12, R13, R14, and R15 registers. All others are free to be changed by the called function.

It should be noted that the observations in this note were made with GNU GCC on 64-bit Linux, where GCC follows the System V AMD64 calling convention. The C language standard itself does not define an ABI, so compilers on other platforms (64-bit Windows, for example) may implement function calls differently. This note should have some value for programmers who want to understand what "passing by value/reference" means when programming in high-level languages.

Points of Interest

  • This is a note on how functions are called at the assembly language level;
  • I hope you like my posts and I hope this note can help you one way or another.

History

  • 26th April, 2016: Initial revision

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)