Laboratory 11 - Theory

Multi-module programming (asm+asm)

Non-trivial programs (real application programs) tend to be large, consisting of thousands of lines of code, which increases code complexity. As a consequence, the following questions naturally arise:

  • how can the given problem be "broken down" into subproblems of minimal difficulty?
  • after the "breakdown", which of the identified subproblems are already known and have well-established solutions that can be reused?

Subprograms in assembly language

  • One way to "break down" the code into subproblems is code modularisation. Assembly language has no built-in concept of a subprogram, but we can write a sequence of instructions that can be called from anywhere in a program and that, once its job is finished, returns control to the code that made the call.
  • Such a sequence of instructions is called a subprogram or procedure. Control could be transferred to a subprogram with a plain jmp instruction, but then the processor would not know where to return when the subprogram is over. Therefore we have to save the return address when we call a subprogram, and the return from the subprogram is actually a jump to that return address.
  • The place where the return address is stored is the execution stack. We need a stack because a subprogram can call another subprogram, and so on, and the return addresses must be retrieved in the reverse order of the calls.
  • There are two instructions that allow us to call a procedure and to return from a procedure: call and ret.

Syntax:

call label
  • The call instruction is actually a jmp instruction which also puts on the stack the return address (the address of the instruction that follows the call instruction, not the jump destination).
  • The ret instruction transfers program control to a return address located on the top of the stack. The address is usually placed on the stack by a call instruction, and the return is made to the instruction that follows the call instruction. The optional operand specifies the number of stack bytes to be released after the return address is popped; the default is none. This operand can be used to release parameters from the stack that were passed to the called procedure and are no longer needed.
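
The description of call and ret above can be illustrated with a minimal sketch. The procedure double_it and the label .after are hypothetical names chosen for this example only; the program skeleton (bits 32, the start label, exit imported from msvcrt.dll) follows the conventions used in the rest of this laboratory.

```nasm
; sketch: what call and ret n do, shown next to a hand-written equivalent
bits 32
global start

extern exit
import exit msvcrt.dll

segment code use32 class=code
; hypothetical procedure: returns 2 * (its dword parameter) in eax
double_it:
	mov eax, [esp + 4]   ; the parameter lies just above the return address
	add eax, eax
	ret 4                ; pop the return address, then release the 4-byte parameter

start:
	; regular call
	push dword 21        ; parameter
	call double_it       ; pushes the address of the next instruction, then jumps
	; here eax = 42 and esp has the value it had before the push

	; the same transfer of control written by hand
	push dword 21        ; parameter
	push dword .after    ; what call would have pushed: the return address
	jmp double_it
.after:
	; here eax = 42 again

	push dword 0
	call [exit]
```

Because the procedure ends with ret 4, the caller does not need an add esp, 4 after the call. With a plain ret, the parameter would remain on the stack and the caller would have to release it.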

Remarks:

  • All data and code labels are visible within the entire program, therefore no duplicate labels should exist. To avoid duplication, the name of a label to be used in a procedure should begin with a single period.
  • A label beginning with a single period is treated as a local label, which means that it is associated with the previous non-local label.

Example:

.label: ; a label to be used in a procedure (a local label)
label:  ; a non-local label
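
As a minimal sketch of why local labels avoid duplication, the two procedures below both use a label named .repeat. The procedure names sum_to_n and power_of_two are hypothetical, and both procedures receive their argument in ecx purely to keep the example short.

```nasm
; sketch: the same local label name reused in two procedures
bits 32
global start

extern exit
import exit msvcrt.dll

segment code use32 class=code
; hypothetical procedure: eax = 1 + 2 + ... + ecx (ecx > 0 is assumed)
sum_to_n:
	xor eax, eax
.repeat:                 ; full name: sum_to_n.repeat
	add eax, ecx
	loop .repeat
	ret

; hypothetical procedure: eax = 2^ecx (ecx > 0 is assumed)
power_of_two:
	mov eax, 1
.repeat:                 ; full name: power_of_two.repeat, so there is no clash
	shl eax, 1
	loop .repeat
	ret

start:
	mov ecx, 4
	call sum_to_n        ; eax = 10
	mov ecx, 3
	call power_of_two    ; eax = 8

	push dword 0
	call [exit]
```

If both labels were written as repeat (without the period), the assembler would report a duplicate label definition.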
  • The NASM assembler provides a simple mechanism to build a program from multiple source files through its preprocessor.
  • Using a syntax very similar to that of the C preprocessor, NASM's preprocessor lets you include other source files into your code. This is done with the %include directive.
  • Therefore, a procedure can be defined either in the same file as the main program (see lab11_procedure.asm) or in a different file (see factorial.asm).

lab11_procedure.asm


; the program calculates and displays the factorial of a number
; the procedure factorial is defined in the code segment of the program
bits 32
global start

extern printf, exit
import printf msvcrt.dll
import exit msvcrt.dll

segment data use32 class=data
	format_string db "factorial=%d",  10, 13, 0

segment code use32 class=code
; procedure definition
factorial: 
	mov eax, 1
	mov ecx, [esp + 4]  ; read the parameter from the stack
	; WARNING!!! The return address is on top of the stack.
	; The parameter required by procedure is next to the return address.
	; (see the following diagram)
	;
	; The stack (after the procedure call)
	;
	;|-------------------|
	;|   return address  |  < esp
	;|-------------------|
	;|     00000006h     |  < esp+4 - the parameter required by the procedure
	;|-------------------|
	; ....

	.repeat: 
		mul ecx
	loop .repeat ; the case ecx = 0 is not considered
	
	ret
; "main" program       
start:
	push dword 6        ; pass the parameter to the procedure
	call factorial      ; call the procedure
	add esp, 4          ; release the parameter (factorial returns with a plain ret)

	; display the result
	push eax
	push format_string
	call [printf]
	add esp, 4*2

	push 0
	call [exit]
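
The listing above notes that the case ecx = 0 is not considered: with n = 0 the first mul would set eax to 0, and loop would then decrement ecx past zero and iterate 2^32 times. A possible guard, shown here only as a sketch, is a jecxz test before the loop; the local label .done is a name introduced for this example, and the caller is assumed to release the parameter itself.

```nasm
; sketch: factorial that also handles n = 0 (the jecxz guard is an addition)
bits 32
global start

extern printf, exit
import printf msvcrt.dll
import exit msvcrt.dll

segment data use32 class=data
	format_string db "factorial=%d", 10, 0

segment code use32 class=code
factorial:
	mov eax, 1
	mov ecx, [esp + 4]   ; read the parameter from the stack
	jecxz .done          ; n = 0: keep eax = 1 and skip the loop
.repeat:
	mul ecx              ; edx:eax = eax * ecx (edx is overwritten)
	loop .repeat
.done:
	ret                  ; plain ret: the caller releases the parameter

start:
	push dword 0         ; n = 0
	call factorial
	add esp, 4           ; release the parameter

	push eax
	push format_string
	call [printf]        ; prints factorial=1
	add esp, 4*2

	push dword 0
	call [exit]
```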

lab11_proc_main.asm - The procedure factorial is defined in another file (factorial.asm) and is included into this file using the %include directive.


;  the program calculates and displays the factorial of a number
;  the procedure factorial is defined in the file factorial.asm
bits 32
global start

import printf msvcrt.dll
import exit msvcrt.dll
extern printf, exit

; the code from factorial.asm will be inserted here
%include "factorial.asm"

segment data use32 class=data
	format_string db "factorial=%d", 10, 13, 0

segment code use32 class=code
start:
	push dword 6
	call factorial

	push eax
	push format_string
	call [printf]
	add esp, 4*2

	push 0
	call [exit]

factorial.asm


; we need to avoid multiple inclusion of this file
%ifndef _FACTORIAL_ASM_ ; if _FACTORIAL_ASM_ is not defined
%define _FACTORIAL_ASM_ ; then we define it

; procedure definition
factorial:                  ; int _stdcall factorial(int n)
	mov eax, 1
	mov ecx, [esp + 4]  ; read the parameter from the stack

	.repeat:                ; local label, so it cannot clash with labels of the including file
		mul ecx
	loop .repeat        ; the case ecx = 0 is not considered

	ret 4
%endif

Multi-module programs

A program written in assembly language can be split into several source files that are assembled separately into .obj files. To write a multi-module program, the following requirements should be fulfilled:

  • all segments should be declared using the public qualifier, because the code segment of the final program is built by concatenating the code segments of all modules; the same holds for the data segment
  • all data and code labels from a module that need to be "exported" to other modules should be declared using the global directive
  • data and code labels that are declared inside one module and used within another module should be "imported" there using the extern directive
  • a variable should be completely defined inside a single module (not one half in one module and one half in another). Also, the transfer of execution control from one module to another can be done only through control-transfer instructions (jmp, call, or ret).
  • the program entry point should be defined only inside the module that contains the "main program".

Each module will be assembled separately by using the command:

nasm -fobj module_name.asm
then the modules will be linked together by using the command:
alink module_1.obj module_2.obj ...  module_n.obj -oPE -subsys console -entry start
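
The requirements and commands above can be sketched with two small modules. The file names main_module.asm and factorial_module.asm are hypothetical, and the factorial procedure is the one used earlier in this laboratory, assumed here to follow the stdcall convention (it releases its own parameter with ret 4).

main_module.asm (contains the entry point and imports the procedure):

```nasm
bits 32
global start                 ; the entry point is defined only in this module

extern printf, exit
import printf msvcrt.dll
import exit msvcrt.dll

extern factorial             ; defined in factorial_module.asm

segment data use32 class=data public
	format_string db "factorial=%d", 10, 0

segment code use32 class=code public
start:
	push dword 6             ; parameter
	call factorial           ; control passes to the other module through call
	                         ; no add esp here: factorial ends with ret 4

	push eax
	push format_string
	call [printf]
	add esp, 4*2

	push dword 0
	call [exit]
```

factorial_module.asm (exports the procedure):

```nasm
bits 32
global factorial             ; make the label visible to other modules

segment code use32 class=code public
factorial:                   ; int _stdcall factorial(int n), n > 0 is assumed
	mov eax, 1
	mov ecx, [esp + 4]       ; read the parameter from the stack
.repeat:
	mul ecx
	loop .repeat
	ret 4                    ; return and release the parameter
```

Under these assumptions, each file would be assembled with nasm -fobj, and the two resulting .obj files would be linked with alink main_module.obj factorial_module.obj -oPE -subsys console -entry start.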

The stages (assembling / linking / debugging / running)

ASSEMBLING:

nasm -f obj module.asm
  • The option -f specifies the type of file that will be generated, in this case an .obj file.

LINKING:

alink module_1.obj ... module_n.obj -oPE -subsys console -entry start
  • The file "ALINK.TXT" in the nasm folder of asm_tools describes the available ALINK options.
  • ALINK options:
    -o xxx
    The option -o selects the type of the output file (-oPE in the command above), where xxx is one of:
    • COM = output COM file
    • EXE = output EXE file
    • PE = output Win32 PE file (.EXE)
    -subsys xxx
    The option -subsys sets the Windows subsystem to use (default=windows), where xxx is one of:
    • windows, win or gui => windows subsystem
    • console, con or char => console subsystem
    • native => native subsystem
    • posix => POSIX subsystem
    -entry name
    The option -entry specifies the program entry point (first instruction to be executed).

DEBUGGING:

OLLYDBG.EXE module.exe 

RUNNING:

module.exe