# Basic Assembly Language I (Data Size)

ICS312 Machine-Level and Systems Programming
Henri Casanova

## Size of Data

- Labels merely declare an address in the data segment; they do not specify any data size
- The size of the data is inferred from the source or destination register:
  - `mov eax, [L]    ; loads 32 bits`
  - `mov al, [L]     ; loads 8 bits`
  - `mov [L], eax    ; stores 32 bits`
  - `mov [L], ax     ; stores 16 bits`
- This is why it’s really important to know the names of the x86 registers
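The point that the operand, not the label, decides how many bytes move can be sketched in C. This is only an illustration and not part of the original slides: the buffer `L` stands in for the label, and `bytes_changed`, `store_and_count` and `check_store_width` are hypothetical helper names. The `width` argument plays the role of the register size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: counts how many of the n bytes in buf differ from fill. */
int bytes_changed(const uint8_t *buf, size_t n, uint8_t fill)
{
    int count = 0;
    for (size_t i = 0; i < n; i++)
        if (buf[i] != fill)
            count++;
    return count;
}

/* Stores `width` zero bytes at the same address L and reports how many
 * bytes of L were overwritten. width = 4 mimics `mov [L], eax`,
 * width = 2 mimics `mov [L], ax`, width = 1 mimics `mov [L], al`.
 * width must be at most 4. */
int store_and_count(size_t width)
{
    uint8_t L[4];
    uint32_t zero = 0;

    memset(L, 0xAA, sizeof L);   /* L: 4 bytes, all AAh */
    memcpy(L, &zero, width);     /* the "register size" decides the store size */
    return bytes_changed(L, sizeof L, 0xAA);
}

/* The assertions document the expected behavior. */
void check_store_width(void)
{
    assert(store_and_count(4) == 4);  /* 32-bit store touches 4 bytes */
    assert(store_and_count(2) == 2);  /* 16-bit store touches 2 bytes */
    assert(store_and_count(1) == 1);  /* 8-bit store touches 1 byte  */
}
```

Calling `check_store_width()` from a `main` runs the checks; the address is the same in all three cases, only the width differs.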

## Size Reduction

- Sometimes one needs to decrease the data size
- For instance, you have a 4-byte integer, but you need to use it as a 2-byte integer for some purpose
- We simply use the fact that we can access the lower bits of some registers independently
- Example:
  - `mov ax, [L]     ; loads 16 bits into ax`
  - `mov bl, al      ; takes the lower 8 bits of ax and puts them in bl`

(Figure: `al` is the low byte of `ax`; its contents are copied into `bl`.)

## Size Reduction (continued)

- Of course, when doing a size reduction, one loses information
- So the reduced value may or may not represent the same integer as the original
- Example that “works”:
  - `mov ax, 00A2h   ; ax = 162 decimal`
  - `mov bl, al      ; bl = 162 decimal`
  - Decimal 162 is encodable on 8 bits (it’s < 256)
- Example that “doesn’t work”:
  - `mov ax, 0101h   ; ax = 257 decimal`
  - `mov bl, al      ; bl = 1 decimal`
  - Decimal 257 is not encodable on 8 bits because it is greater than 255
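The two examples can be replayed in C as a rough check. This is a sketch, not part of the original slides: `low_byte` is a hypothetical helper that mimics `mov bl, al` by keeping only the low 8 bits of a 16-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Mimics `mov bl, al`: keeps only the low 8 bits of the 16-bit value in ax. */
uint8_t low_byte(uint16_t ax)
{
    return (uint8_t)(ax & 0xFF);
}

/* The assertions document the expected values. */
void check_low_byte(void)
{
    assert(low_byte(0x00A2) == 162);  /* 162 fits in 8 bits: value preserved  */
    assert(low_byte(0x0101) == 1);    /* 257 does not fit: only the 1 survives */
}
```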

## Size Reduction and Sign

- Consider a 2-byte quantity: FFF4h
- If we interpret this quantity as unsigned, it is decimal 65,524
  - The computer does not know whether the content of registers/memory corresponds to signed or unsigned quantities
  - Once again it’s the responsibility of the programmer to do the right thing, using the right instructions (more on this later)
  - Under this interpretation size reduction “does not work”: the resulting 1-byte quantity F4h is decimal 244, not decimal 65,524 (which is way over 255!)
- If instead FFF4h is a signed quantity (using 2’s complement), then it corresponds to -000Ch (invert the bits to get 000Bh, then add 1), that is, to decimal -12
  - In this case, size reduction works: the 1-byte quantity F4h is also decimal -12

## Size Reduction and Sign (continued)

- This does not mean that size reduction always works for signed quantities
- For instance, consider FF32h, which is a negative number equal to -00CEh, that is, decimal -206
- A size reduction into a 1-byte quantity leads to 32h, which is decimal +50!
- This is because -206 is not encodable on 1 byte
  - The range of signed 1-byte quantities is decimal -128 to decimal +127
- So size reduction may or may not work, for signed and unsigned quantities alike
  - There will always be “bad” cases
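The signed and unsigned readings of FFF4h and FF32h can be checked with a short C sketch. It is an illustration only, not part of the original slides: `signed16` and `signed8` are hypothetical helpers that compute the 2's complement reading arithmetically, so the result does not depend on how a given compiler converts out-of-range values to signed types.

```c
#include <assert.h>
#include <stdint.h>

/* 2's complement reading of a 16-bit pattern. */
int signed16(uint16_t bits)
{
    return bits >= 0x8000 ? (int)bits - 0x10000 : (int)bits;
}

/* 2's complement reading of an 8-bit pattern. */
int signed8(uint8_t bits)
{
    return bits >= 0x80 ? (int)bits - 0x100 : (int)bits;
}

/* The assertions document the expected values. */
void check_sign_examples(void)
{
    uint16_t a = 0xFFF4;
    uint16_t b = 0xFF32;

    assert(a == 65524);                   /* unsigned reading of FFF4h     */
    assert((uint8_t)a == 244);            /* reduced to F4h: value changed */

    assert(signed16(a) == -12);           /* signed reading of FFF4h       */
    assert(signed8((uint8_t)a) == -12);   /* reduced to F4h: still -12     */

    assert(signed16(b) == -206);          /* signed reading of FF32h       */
    assert(signed8((uint8_t)b) == 50);    /* reduced to 32h: now +50       */
}
```

The same bit pattern passes or fails depending only on the interpretation chosen by the programmer.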

## Two Rules to Remember

- For unsigned numbers: size reduction works if all removed bits are 0
  - `00000000 XXXXXXXX` reduces to `XXXXXXXX`
- For signed numbers: size reduction works if the removed bits are all 0s or all 1s, AND the highest bit not removed is equal to the removed bits
  - This highest remaining bit is the new sign bit, and thus must be the same as the original sign bit
  - `aaaaaaaa aXXXXXXX` reduces to `aXXXXXXX`, where a = 0 or 1
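The two rules translate directly into bit tests. The following C sketch, which is not part of the original slides, applies them to a 2-byte to 1-byte reduction; `unsigned_reduction_ok` and `signed_reduction_ok` are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned rule: the removed (high) byte must be all 0s. */
bool unsigned_reduction_ok(uint16_t v)
{
    return (v >> 8) == 0x00;
}

/* Signed rule: the removed byte must be all 0s or all 1s,
 * and it must match the new sign bit (bit 7 of the low byte). */
bool signed_reduction_ok(uint16_t v)
{
    uint8_t removed = (uint8_t)(v >> 8);
    bool new_sign = (v & 0x80) != 0;

    return new_sign ? removed == 0xFF : removed == 0x00;
}

/* The assertions document the expected results for the values used in the slides. */
void check_reduction_rules(void)
{
    assert(unsigned_reduction_ok(0x00A2));    /* 162: removed bits all 0         */
    assert(!unsigned_reduction_ok(0x0101));   /* 257: a removed bit is 1         */
    assert(!unsigned_reduction_ok(0xFFF4));   /* 65,524: removed bits all 1      */

    assert(signed_reduction_ok(0xFFF4));      /* -12: all 1s, new sign bit 1     */
    assert(!signed_reduction_ok(0xFF32));     /* -206: all 1s, but new sign is 0 */
    assert(!signed_reduction_ok(0x00A2));     /* +162: all 0s, but new sign is 1 */
}
```

The last case shows why the second half of the signed rule matters: 00A2h passes the unsigned test but fails the signed one, because +162 is above +127.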

## Size Increase

- Size increase for unsigned quantities is simple: just add 0s to the left
- Size increase for signed quantities requires sign extension: the sign bit must be extended, that is, replicated into all the added bits
- Consider the signed 1-byte number 5Ah. This is a positive number (decimal 90), and so its 2-byte version is 005Ah
- Consider the signed 1-byte number 8Ah. This is a negative number (decimal -118), and so its 2-byte version is FF8Ah
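Both kinds of extension can be written out bit by bit. This C sketch is an illustration under the assumption of a 1-byte to 2-byte increase, and is not part of the original slides; `zero_extend8` and `sign_extend8` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned size increase: the added high byte is all 0s. */
uint16_t zero_extend8(uint8_t v)
{
    return (uint16_t)v;
}

/* Signed size increase: the added high byte replicates the sign bit (bit 7). */
uint16_t sign_extend8(uint8_t v)
{
    uint16_t high = (v & 0x80) ? 0xFF00 : 0x0000;

    return (uint16_t)(high | v);
}

/* The assertions document the expected values. */
void check_extension(void)
{
    assert(sign_extend8(0x5A) == 0x005A);  /* +90 stays +90                      */
    assert(sign_extend8(0x8A) == 0xFF8A);  /* -118 stays -118                    */
    assert(zero_extend8(0x8A) == 0x008A);  /* read as unsigned, 138 stays 138    */
}
```

Note that 8Ah extends to two different 2-byte patterns depending on whether it is treated as signed or unsigned.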

## Unsigned size increase

- Say we want to size increase an unsigned 1-byte number to be a 2-byte unsigned number
- This can be done in a few easy steps, for instance:
  - Put the 1-byte number into al
  - Set all bits of ah to 0
  - Access the number as ax
- Example:
  - `mov al, 0EDh`
  - `mov ah, 0`
  - `mov ..., ax`
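The three steps rely on `ax` being the pair `ah:al`. A minimal C model of that pairing, not part of the original slides and using the hypothetical helper `make_ax`, shows why `ah` has to be cleared first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of ax as the pair ah:al (ah is the high byte). */
uint16_t make_ax(uint8_t ah, uint8_t al)
{
    return (uint16_t)(((uint16_t)ah << 8) | al);
}

/* The assertions document the expected values. */
void check_make_ax(void)
{
    /* mov al, 0EDh / mov ah, 0: ax = 00EDh, decimal 237 */
    assert(make_ax(0x00, 0xED) == 0x00ED);
    assert(make_ax(0x00, 0xED) == 237);

    /* If ah still holds an old value (12h here), ax is not the number we wanted. */
    assert(make_ax(0x12, 0xED) == 0x12ED);
}
```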

## Unsigned size increase (continued)

- How about increasing the size of a 2-byte quantity to 4 bytes?
- This cannot be done in the same manner because there is no way to access the 16 highest bits of register eax separately!

(Figure: `eax` is 32 bits wide; its low 16 bits are `ax`, which is itself split into `ah` and `al`. The high 16 bits have no name.)

- Therefore, there is an instruction called movzx (Zero eXtend), which takes two operands:
  - Destination: 16- or 32-bit register
  - Source: 8- or 16-bit register, or 1 byte in memory, or 1 word in memory
  - The destination must be larger than the source!
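The difference between writing only `ax` and using movzx can be modeled in C. This sketch is not part of the original slides and rests on one stated assumption: in 32-bit code, a 16-bit write to `ax` leaves the upper 16 bits of `eax` unchanged. The names `write_ax` and `movzx_32_16` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models `mov ax, src`: only the low 16 bits of eax are replaced
 * (assumption: the upper 16 bits keep whatever they held before). */
uint32_t write_ax(uint32_t eax, uint16_t src)
{
    return (eax & 0xFFFF0000u) | src;
}

/* Models `movzx eax, src` with a 16-bit source: the upper 16 bits become 0.
 * The parameter and return types encode "destination larger than source". */
uint32_t movzx_32_16(uint16_t src)
{
    return (uint32_t)src;
}

/* The assertions document the expected values. */
void check_movzx_model(void)
{
    uint32_t eax = 0xDEADBEEFu;  /* arbitrary old contents of eax */

    assert(write_ax(eax, 0x00ED) == 0xDEAD00EDu);  /* old upper half remains     */
    assert(movzx_32_16(0x00ED) == 0x000000EDu);    /* clean 32-bit value, 237    */
}
```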