Laboratory 1 - Theory

Converting between different number bases

Learning:

  • how to convert decimal numbers to the corresponding number in a different base, especially base 16 and 2
  • how to convert a number in a different base, especially 16 and 2, to base 10
  • how to directly convert from base 16 to base 2 and vice versa

Theoretical considerations

A numeral system (or system of numeration) is a writing system for expressing numbers; that is, a mathematical notation for representing numbers of a given set, using digits or other symbols in a consistent manner. For every numeral system the number of distinct symbols, also called digits, is equal to the base (b). So, for base b=2 (binary numbers) the digits are 0 and 1. For the numeral system with base 16 the symbols are: 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F. For numeral systems with the base higher than 10 we use other symbols beside the digits we use in the decimal base. So, for hexadecimal numbers the letters A,B,C,D,E,F correspond to the values 10,11,12,13,14,15. Notation: the base is usually written in parenthesis at the end of the number, for example: 100101001(2), sau 17A6B(16).

Converting decimal numbers to a different base b

The easiest algorithm is to repeatedly divide by the base b to which we convert the number, keeping track of the remainders. So we divide the number by b, then continue dividing the quotient to b, and so on until the remainder becomes 0. Finally, the number in base b is formed out of the remainders in reverse order.

Example:

  1. Convert 347 from base 10 to base 16(H)
    Considering the remainders in reverse order we obtain 15B(H).
    347(D) = 15B(H)
  2. Convert 57 from base 10 to base 2(B).
    57(D) = 111001(B)
There is a quick way to convert numbers between base 2 and 16 or vice versa, taking into consideration that each hexadecimal digit corresponds to exactly 4 binary digits::

Decimal value Hexadecimal value Binary number corresp. to the hexadecimal digit
0 0 0000
1 1 0001
2 2 0010
3 3 0011
4 4 0100
5 5 0101
6 6 0110
7 7 0111
8 8 1000
9 9 1001
10 A 1010
11 B 1011
12 C 1100
13 D 1101
14 E 1110
15 F 1111


One needs to take into account, when converting from base 2 to base 16, that dividing the binary number into groups of 4 starts from right to left (eventually adding zeros to the left, if necessary). Then those nibbles (groups of 4 bits) are converted into hexadecimal

Example:

  1. Convert the decimal number 347 to base 2 and 16.
    347(D) = 15B(H) = 0001 0101 1011(B)

Other examples:

2 = 2(10) =10(2) 62(10) = 111110(2) 1995(10) = 11111001011(2) 1024(10) = 10000000000(2)

Converting a number from a different base to the decimal base

In order to convert a number from a base b to base 10 one can use the following formula. If the number in base b is represented as follows:
Nr(b) = Cn Cn-1 Cn-2 … C2C1 C0
then the value of the number in base 10 can be computed as:
Nr(10) = Cn * bn + Cn-1 * bn-1 + … + C2 * b2 +  C1 * b1+ C 0

Examples:

  1. Convert the hexadecimal number 3A8(H) to base 10:
    N = 3*162 + 10*16 1 + 8 = 3*256 + 160 + 8 = 936(10)
  2. Convert the hexadecimal number 86C(H) to base 10:
    86C(16) = 8 * 162 + 6 * 16 + 12 = 2156(10)
  3. Convert the binary number 1101101(2) to base 10:
    1101101(2) = 1*26+1*25+0*24+1*23+1*22+0*2+1=109(10)

Other examples:

1010011(2) = 83(10)  11100011(2) = 227(10) 1000000000(2) = 512(10)   11001(2) =25(10)

Bit. Sign bit. Complementary code. Representing signed integers.

Bit

  • The reason why computers computes use binary arithmetics is that base 2 is the most suitable for automation among all other bases. A bit represents a binary digit.
  • Bit = the basic information unit
  • A bit contains only two possible values: 0 or 1.
  • Depending on the context, a bit can have different meanings, such as 0 or 1, true or false, good or bad, etc. It all depends on the interpretation!
  • One byte is a sequence of 8 bits, numbered from 0 to 7 as follows::
    7
    high bit
    6543210
    low bit

Sign bit. Complementary code

If we interpret a certain bit configuration as a signed number, then, by convention, a single bit is used for representing the sign of the number. That is the high bit (bit 7) from the high byte of the location, where the number is represented. If the value of this number is 0, then the number is positive. If the value of this bit is 1, then the number is negative.

How should signed integers be represented?

Different methods where proposed:
  • Direct code: representing the absolute value of the number on the n-1 bits of the n bits of the location, and use high bit to represent the sign. Although this seem to be the easiest method, it proved to be less efficient than others. One problem is that using this representation: -7 + 7: -7 + 7 ≠ 0
  • Inverse code (one’s complement): representing the absolute value of the number on the n-1 bits of the n bits of the location, and if the number is negative invert all n bits of the representation. However, this representation is also not very efficient, for instance the same problem comes up: -7 + 7 ≠ 0)
  • Complementary code (two’s complement): for complementing a number representing on n bits, first one has to invert all bits of the representation (0 becomes 1 and 1 becomes 0) and then add 1 to the obtained value. This is the representation that is being used for signed numbers.

Alternative rules for the complementary code:

  • Starting from the right part of the representation, leave all bits unchanged up to the first bit with value 1 (including this bit); invert the rest of the bits up to and including bit n-1
  • or
  • Subtract the binary content of the location from 100..00, where the number of zeros is equal to the number of bits of the location that needs to be complemented
  • or
  • 3. Subtract the hexadecimal content of the location from 100..00, where the number of zeros is equal to the number of hexadecimal digits of the location that needs to be complemented.
The 3 alternative rules are equivalent with the first rule for computing the complementary code.

For example, let’s compute the complementary code for a byte containing the number (18)10:


Initial location:
After inverting the bits
Adding 1
 
Complement:

00010010
11101101
11101101+
00000001
11101110

So the complement of the number (18)10 = (12)16 = (00010010)2 , is (11101110)2 = (EE)16 = (238)10 .
Using the rule of the binary subtraction:


Initial location:
Complement:

100000000-
 00010010
 11101110

Using the rule of the hexadecimal subtraction:


Initial location:
Complement:
100-
 12
 EE

Representing signed integers

An integer between -2n-1 and 2n-1-1 is represented on a location of n bits as follows:
  • If the number is positive, then the location contains the binary representation of the number;
  • If the number is negative, then the location contains the binary representation of the complementary code of the number.

Observation. The number -2n-1 cannot be represented on n-1 bits, because there is no space left for the signed bit. Therefore, it is represented on n bits as 100..0
On the other hand in the signed representation this indicates a negative number. This number is its own complement, therefore, by convention, -2n-1 is represented in the complementary code on n bits as 100..0. The same number in the unsigned representation represents the number  2n-1.
 
Examples of different number representation on locations with different sizes:.


Location size (bytes)
Number in base 10 Representation of the complementary code (hexadecimal) Representation of the complementary code (binary)
1 0 00 00000000
2 0 0000 0000000000000000
1 1 01 00000001
2 1 0001 0000000000000001
1 -1 FF 11111111
2 -1 FFFF 1111111111111111
1 127 7F 01111111
2 127 007F 0000000001111111
1 -128 80 10000000
2 -128 FF80 1111111110000000
2 128 0080 0000000010000000
2 32767 7FFF 0111111111111111

Why the complementary code?

The implementation of integer operations needs to be efficient and to use, as much as possible, comon algorithms from the fundamental operations on integers regardless of the representation convention.
The complementary code best meets the aboe requirements so far. Two main reasons for that are:
  • The addition operations is executed the same in the signed and unsigned representation. It is a simple addition of the n bits of the location, ignoring the last transport.
  • The subtraction is reduced to the addition of the first number with the complement of the second
The multiplication and division have seperate algorithms for the signed and unsigned representation, however the percentage of additions and subtractions is a lot higher than the one of multiplications and divisions. Therefore, the complementary code is an efficient representation for integers. Another advantage is that, when working with positive numbers (this is decided from the beginning), the sign bit is not lost, and the representation interval is much larger.

Representation dimension

There are certain restrictions, the most important of which is the dimension of the representation, i.e. the maximum number of binary digits (number of bits) from the representation of an integer. We denote by n the representation dimension.
The possible values of n for computers nowadays are: 8, 16, 32 and 64.
Problem: if we have a number on 7 bits, how do we represent it on 8 bits?
Answer: it depends on the interpretation
  • In the unsigned representation we add zeros to the remaining high bits
  • In the signed representation we add the sign bit to the remaining high bits
    • Example: (10011)2 is represented on a byte as follows:
      • (00010011)2 in the unsigned representation
      • (11110011)2 in the signed representation
      In the next table we can see the intervals that can be represented on a location of different sizes, both for the signed and unsigned representation.


      No. of
      bytes

      Unsigned representation

      Signed representation

      1

      [0, 28-1]  = [ 0 , 255 ]

      [-27 , 27-1]    = [-128 , 127]

      2

      [0, 216-1] = [ 0 , 65535 ]

      [-215 , 215-1] = [-32 768 , 32 767]

      4

      [0, 232-1] = [ 0 , 4 294 967 295 ]

      [-231 , 231-1] = [-2 147 483 648 , 2 147 483 647]

      8

      [0, 264-1] = [ 0 , 18 446 824 753
      389 551 615 ]

      [-263 , 263-1] = [-9 223 412 376 694 775 808 ,
      9 223 412 376 694 775 807]

Tools for laboratories

Tools for laboratories:

  • Editor: Notepad++
  • Assembler: NASM
  • Linker: ALINK
  • Debugger: Olly DBG
The set of tools can be downloaded here: 🔗 ASM tools. Students that use other operating systems besides Windows can use an emulator (the tools have been succesfully tested with Wine), or run the tools in Windows virtual machine. The set of tools already contains the editor Notepad++ which has an integrated plugin which allows you to perform most operations with a few key combinations.

Instructions:

  • Download the set of tools and unzip it
  • Start the editor Notepad++ from the directory npp (don’t use the own version of Notepad++)
  • Use the editor to implement the given laboratory problems. You can use the following key combinations that can also be found in the menu Plugins -> ASM Plugin:
      ASM Code TemplateCtrl+Shift+NFills in the editor a minimal ASM program
      Build ASMCtrl+F7Assemblies the current program
      Run programCtrl+F6Runs the current program
      Debug programF6Debugs the current program, using Olly Debugger (the debugger documentation can be found in the ollydb directory from the archive)

Structure of a program in assembly

Example:

bits 32 ;assembling for the 32 bits architecture
global start

; we ask the assembler to give global visibility to the symbol called start 
;(the start label will be the entry point in the program) 
extern exit ; we inform the assembler that the exit symbol is foreign; it exists even if we won't be defining it
import exit msvcrt.dll  ; we specify the external library that defines the symbol
        ; msvcrt.dll contains exit, printf and all the other important C-runtime functions

; our variables are declared here (the segment is called data) 
segment data use32 class=data
; ... 

; the program code will be part of a segment called code
segment code use32 class=code
start:
; ... 

    ; call exit(0) ), 0 represents status code: SUCCESS
    push dword 0 ; saves on stack the parameter of the function exit
    call [exit] ; function exit is called in order to end the execution of
the program