Laboratory 12 - Theory

Multi-module programming (asm+C)

In a multi-module program, regardless of the programming language of each module, one of the modules calls a subprogram defined in a different module. Here we assume that one of the modules is written in assembly language and the other one in a high-level programming language (C).

Motivation:

  • high execution speed in resolving tasks with minimal resource consumption;

Call code

Entry code

Return code

  • Restoring nonvolatile altered resources;
  • Removing local variables of the function;
  • Destroying the stack frame;
  • Returning to the calling code and removing the parameters.
Except for the volatile resources and direct results of the function, the status of the program after these steps must reflect the initial, pre-call state.

Declaring extern symbols:

  • In order to call a function written in assembly language from a C program, the function must be declared global in the assembly module and its name must be prefixed with the character '_'.
  • If the function is called from the C program as fun(), then the asm module will contain the following:
    global _fun
    segment code public code use32
    _fun:
    

Keeping the value of some registers untainted

High level languages require that certain registers have the same value after a function call as before the function call. For this purpose, if the subprogram defined in assembly language changes some of these registers, then their values at the entry point need to be stored (for example on the stack). These values will be restored before returning from the procedure.
  • PUSHAD and POPAD can be used for storing and restoring the values of the 8 general registers.

Passing parameters to the function

  • Parameters are passed using the stack, which offers a greater flexibility than passing parameters using registers (regarding the number of parameters);

  • When entering the function we save the caller's EBP and set EBP←ESP; before exiting the function we restore the saved value. Because ESP changes whenever we push values on the stack, the best way to access the parameters is through a base or an index register. EBP is the most suitable for this purpose, because addresses formed with EBP refer to the stack segment by default. The sequence that prepares the stack access is:
    push ebp
    mov ebp, esp

Reserving memory space for local defined data

Sometimes the procedure needs local data. If their values do not need to be kept between two consecutive function calls, then these are volatile data and they are stored on the stack. Otherwise, these are static data and they are stored in a segment other than the stack segment, for instance in the data segment. Reserving n bytes (n being a multiple of 4) for local data is done by decreasing ESP; the reserved bytes are then accessed relative to EBP (for example, [EBP-4] is the first local doubleword):
    sub esp, n
Hence:
  • the EBP register will be used to access parameters (for example [EBP+8] accesses the first parameter represented on 32 bits);
  • the first parameter accessible from the stack is the last parameter added on the stack by the caller program;
  • we reserve space on the stack for local variables, for example:
    sub esp,4*1
  • this method simplifies the way of accessing parameters, especially for functions with a variable number of parameters;
  • it is the responsibility of the caller to remove the parameters from the stack after the call (for example with add esp, 4*number_of_parameters).
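
The layout described above can be made concrete with a small model. The following C sketch is not part of the laboratory material: the Machine type and all helper names are hypothetical, and the stack is simulated with an array instead of the real ESP and EBP registers. It replays what the caller and the entry code do for a call with two 32-bit parameters, so that the position of each value relative to EBP can be checked.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 256u

/* Hypothetical model of a 32-bit stack that grows toward lower addresses. */
typedef struct {
    uint32_t mem[STACK_BYTES / 4]; /* one array slot per 4-byte stack cell */
    uint32_t esp;                  /* byte offset of the top of the stack  */
    uint32_t ebp;                  /* byte offset of the frame base        */
} Machine;

static void machine_init(Machine *m)
{
    m->esp = STACK_BYTES; /* empty stack: ESP points just past the last cell */
    m->ebp = 0;           /* stands for the caller's (unknown) EBP           */
}

/* Model of the push instruction: decrement ESP, then store the value. */
static void push(Machine *m, uint32_t value)
{
    assert(m->esp >= 4);
    m->esp -= 4;
    m->mem[m->esp / 4] = value;
}

/* Model of reading a doubleword such as [ebp+8]. */
static uint32_t load(const Machine *m, uint32_t address)
{
    assert(address % 4 == 0 && address < STACK_BYTES);
    return m->mem[address / 4];
}

/* Replays a cdecl call f(a, b) up to the end of the entry code. */
static void enter_frame(Machine *m, uint32_t a, uint32_t b,
                        uint32_t return_address)
{
    push(m, b);              /* caller: the last parameter is pushed first */
    push(m, a);              /* caller: the first parameter is pushed last */
    push(m, return_address); /* effect of the call instruction             */
    push(m, m->ebp);         /* entry code: push ebp                       */
    m->ebp = m->esp;         /* entry code: mov ebp, esp                   */
    m->esp -= 4;             /* entry code: sub esp, 4 (one local)         */
}
```

In this model, after machine_init(&m) and enter_frame(&m, 2, 3, ret), load(&m, m.ebp + 8) yields 2 and load(&m, m.ebp + 12) yields 3, while the saved return address sits at m.ebp + 4 and the local variable at m.ebp - 4.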

Returning values from the function

  • if the function returns an integer, then this will be returned in EAX;
  • if the function returns a string, then its address will be returned in EAX;
  • under the CDECL convention, the called function must preserve the registers EBX, ESI, EDI, EBP and ESP: they must hold the same values after the function call as before it.

Returning from the procedure

When returning from the procedure the following steps are necessary:
  • restoring the values of the registers (see section Keeping the value of some registers untainted);
  • restoring the stack so that it contains the return address on top:
    mov esp, ebp
    pop ebp 

Structure of a function:

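global _fun
segment code public code use32
_fun:
        push ebp                 ; save the caller's frame pointer
        mov ebp, esp             ; create the stack frame
        pushad                   ; save the general registers
        ;... code of the function ...
        popad                    ; restore the general registers
        mov eax, returned_value  ; set the result after popad, which overwrites EAX
        mov esp, ebp             ; destroy the stack frame
        pop ebp                  ; restore the caller's frame pointer
        ret                      ; return to the caller, which removes the parameters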
	

Using procedures defined in assembly within a C program

Example 1

In an assembly program we define a procedure called hello_world that does not have any parameters and does not return anything. The procedure prints the message "Hello World!" on the screen.

hello_world.asm

bits 32
extern _printf
global _hello_world
segment data public data use32
	mesaj db 'Hello world!', 0
segment code public code use32
_hello_world:
	push ebp
	mov ebp, esp
	push dword mesaj	; parameter of printf: the address of the string
	call _printf
	add esp, 4*1		; cdecl: the caller removes the parameter
	pop ebp
	ret

hello_world.c

#include <stdio.h>

void hello_world();

int main()
{
	hello_world();
	printf("This program just prints something on the screen!");
	return 0;
	
}

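Observe the keyword extern in hello_world.asm, which tells the assembler (or the compiler, in a C module) that the function / variable is defined in a different file (not in the current file). It is the linker's job to create a connection between this declaration of the function / variable and its definition. In the C module, the declaration void hello_world(); plays the same role, because function declarations in C are extern by default.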
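
On the C side, a function declaration at file scope refers to an external symbol whether or not the keyword extern is written. The sketch below illustrates this under one loud assumption: get_value_demo and call_through_declaration are hypothetical names, and the function is defined in C only so that the fragment compiles and links on its own. In a real asm+C project the definition would come from the assembly module instead.

```c
/* Two equivalent declarations of the same external function. */
extern int get_value_demo(void); /* extern written explicitly     */
int get_value_demo(void);        /* extern implied: same meaning  */

/* ASSUMPTION: stand-in definition so that this fragment links by itself.
   In an asm+C project this body would live in the .asm module,
   under the global label _get_value_demo. */
int get_value_demo(void)
{
    return 42;
}

/* Hypothetical caller: the compiler needs only the declaration;
   the linker connects the call to the definition. */
int call_through_declaration(void)
{
    return get_value_demo();
}
```

Writing the parameter list as (void) rather than () is a deliberate choice in this sketch: it lets the compiler reject calls that pass arguments to a function that takes none.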

Example 2

In an assembly program we define a procedure called return_10, which does not have any parameters and returns an integer.

return_10.asm

bits 32
global _return_10
segment data public data use32
segment code public code use32
_return_10:
	mov eax, 10	; the integer result is returned in EAX
	ret

return_10.c

#include <stdio.h>
int return_10();
int main()
{
	printf("The program returns the value %d!",return_10());
	return 0;
}

Example 3

In an assembly program we define a procedure called sum which has two integer parameters and returns their sum (an integer).

sum.asm

bits 32
global _sum
segment data public data use32
segment code public code use32
_sum:
	push ebp
	mov ebp, esp
	mov eax, [ebp+8]	; first parameter
	add eax, [ebp+12]	; second parameter; the sum is returned in EAX
	mov esp, ebp
	pop ebp
	ret

sum.c

#include <stdio.h>

int sum(int, int);

int main()
{
	
	printf("%d\n", sum(2, 3));
	return 0;
	
}



Example 4

In an assembly program we define a procedure called factorial which has a positive integer as parameter and returns its factorial (a positive integer).

factorial.asm

bits 32
global _factorial
segment data public data use32
segment code public code use32
_factorial:
	push ebp
	mov ebp, esp
	sub esp, 4		; reserve one local doubleword: m = [ebp-4]
	mov eax, [ebp+8]	; n
	cmp eax, 2
	jbe .trivial		; for n <= 2 the result is n itself
	.recursiv:
		dec  eax
		push eax            ; parameter of the recursive call: n-1
		call _factorial
		add  esp, 4         ; the caller removes the parameter
		mov  [ebp-4], eax   ; m = (n-1)!
		mov  eax, [ebp+8]   ; n
		mul  dword [ebp-4]  ; edx:eax ← n * m
		jmp  .final
	.trivial:
		xor  edx, edx
	.final:
	add esp, 4		; release the local variable
	mov esp, ebp
	pop ebp
	ret

factorial.c

#include <stdio.h>

int factorial(int);

int main()
{
	int n, f;
	printf("n = ");
	scanf("%d", &n);

	f = factorial(n);

	printf("factorial(%d) = %d\n", n, f);

	return 0;
}
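
When testing the assembly module, it can help to have a C version with the same control flow to compare results against. The following is a minimal sketch and not part of the laboratory material: factorial_ref is a hypothetical name, and unsigned 32-bit arithmetic is assumed in order to mirror the mul instruction, of which only the EAX half reaches the C caller.

```c
#include <stdint.h>

/* Hypothetical reference version that mirrors factorial.asm step by step. */
uint32_t factorial_ref(uint32_t n)
{
    if (n <= 2) {
        return n; /* cmp eax, 2 / jbe .trivial: the result is n itself */
    }

    uint32_t m = factorial_ref(n - 1); /* m = (n-1)!, kept in [ebp-4] */
    uint64_t product = (uint64_t)n * m; /* mul: the full result is edx:eax */

    return (uint32_t)product; /* only EAX is returned to the caller */
}
```

With this sketch, factorial_ref(5) gives 120, while factorial_ref(13) gives 1932053504 rather than 6227020800, because the upper half of the product (EDX) is discarded. It also makes visible that the trivial branch returns n unchanged, so an argument of 0 yields 0.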







Multi-module programming (asm+C) in Visual Studio

The following tutorial is based on Visual Studio 2015 and assumes that you have a version of Visual Studio installed on your computer. For more details, open the Files section of the General channel in MS Teams and follow the steps presented in the document procedura_instalare.doc.
The following example shows how to compile, run and debug the program from the Example section.

We use the command line for compiling/assembling the modules

The steps used for compiling the main.c program are:

  • Open the Visual Studio command line: in the Windows Start menu, navigate to Visual Studio and choose the option VS2015 x86 Native Tools Command Prompt, as in the figure below.

  • In the terminal window, navigate to the directory where the program sources are located. In the following example the sources are in the tmp folder; the dir command lists the content of the current directory. Besides the source files of the program, the tmp directory also contains the executable nasm.exe, used for assembling modulAsm.asm.

  • In the first step we assemble modulAsm.asm using the command:
nasm modulAsm.asm -fwin32 -o modulAsm.obj
(see the figure below). The result is the file modulAsm.obj.

  • Using the Visual Studio compiler (cl.exe) we compile main.c. This step must include link editing, so we pass the file modulAsm.obj to the linker through the /link option (cl main.c /link modulAsm.obj). The result is the program main.exe.

  • The program can be executed from command line using main.exe:

We can debug the program using Ollydbg, but in order to do this we must request debugging information in both the assemble and the compile steps:

> nasm modulAsm.asm -fwin32 -g -o modulAsm.obj
> cl /Z7 main.c /link modulAsm.obj
From Ollydbg, File -> Open we open main.exe.
In Visual C if we wish to include debugging information the options are /Z{7|i|I} (see https://msdn.microsoft.com/en-us/library/958x11bc.aspx).