intVal3 TBYTE 1234 -- invalid TBYTE variable declaration gone unnoticed by assembler

Question

I'm currently learning assembly programming by following Kip Irvine's book "Assembly Language for x86 Processors".

In the book, the author states:

MASM uses the TBYTE directive to declare packed BCD variables. Constant initializers must be in hexadecimal because the assembler does not automatically translate decimal initializers to BCD. The following two examples demonstrate both valid and invalid ways of representing decimal -1234:
intVal TBYTE 80000000000000001234h ; valid
intVal TBYTE -1234 ; invalid 
The reason the second example is invalid is that MASM encodes the constant as a binary integer rather than a packed BCD integer.

I understand that the MASM assembler can't translate a decimal integer to BCD. But I came up with the following code, which assembled just fine (notice that intVal3 TBYTE -1234 is supposed to be invalid, yet it was accepted just like valid code):

.386
.MODEL FLAT, STDCALL
.STACK 4096
ExitProcess PROTO, dwExitCode: DWORD

.DATA
intVal1 TBYTE 800000000000001234h
intVal2 TBYTE -1234h
intVal3 TBYTE -1234      ; compiled despite being invalid

.CODE
main PROC
    invoke ExitProcess, 0
main ENDP
END main

Why has the invalid code gone unnoticed by the assembler? Is this an error that the assembler can't detect and that requires vigilance on the part of the programmer?

=============== EDIT 1 =================

I have checked the listing file as suggested by @PaulH; here is a screenshot:

[Screenshot of the listing file]

Judging from the result in the listing file and from what @PaulH said, I have arrived at the following conclusion (though I'm not sure it is entirely correct):

A TBYTE declaration simply stores the binary value of its initializer (whether that is 80000000000000001234h, -1234h or -1234) in the variable. Because a variable of type TBYTE is supposed to be used as a BCD integer, it is entirely up to the programmer to make sure that it is used correctly.

Cody Gray · Accepted Answer

The raison d'être of the TBYTE type is to have something that is the same width as the x87 FPU's internal registers, meaning that it can be used to spill the contents of one of those registers to memory without losing any precision.

Normally, when you save a floating-point value in memory, you represent it either as a single-precision (32-bit; DWORD) or double-precision (64-bit; QWORD) value. This is fine, except that it loses precision. If you want to spill a temporary intermediate value during a computation, then you often cannot afford to lose precision by truncating the value because that will affect the final result.

The name TBYTE just means that values of this type are 10 bytes wide—the same width as is used internally for floating-point values on the x87. (By default, at least, assuming you haven't decreased the FPU's precision.)

So, TBYTE really has nothing inherently to do with binary-coded decimal (BCD). I have no idea what Kip Irvine is talking about there. You could certainly store a BCD value in a TBYTE, but you could just as well store a smaller BCD value in a QWORD or DWORD. As the name suggests, BCD is just an encoding that allows you to store decimal digits in binary form.

The reason why

intVal3 TBYTE -1234

assembles is that, to the assembler (MASM), all you've done is declare a 10-byte value initialized to -1234. It implicitly sign-extends -1234 to fill 10 bytes, resulting in the value 0xFFFFFFFFFFFFFFFFFB2E, as you see in the hex dump. The same thing happens for -1234h, except that the h suffix means the value is interpreted as hexadecimal rather than decimal.

Notice that this is basically the same thing that would happen if you did

myValue QWORD -1234

because the assembler is going to extend -1234 to be 8 bytes long.
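To make the extension concrete, here is a minimal sketch that declares the same initializer at four widths. The variable names are hypothetical, and the values in the comments assume ordinary two's-complement sign extension, as in the hex dump mentioned above. The 64-bit and 80-bit lines also assume a version of MASM recent enough to accept integer initializers of that size.

```asm
.386
.MODEL FLAT, STDCALL
.STACK 4096
ExitProcess PROTO, dwExitCode:DWORD

.DATA
; Hypothetical names: one initializer, four storage widths.
val16 WORD  -1234       ; FB2Eh
val32 DWORD -1234       ; FFFFFB2Eh
val64 QWORD -1234       ; FFFFFFFFFFFFFB2Eh
val80 TBYTE -1234       ; FFFFFFFFFFFFFFFFFB2Eh (the value quoted above)

.CODE
main PROC
    invoke ExitProcess, 0
main ENDP
END main
```

Assembling this with a listing file (for example `ml /c /Fl` on a source file of your choosing) should show the extended value next to each declaration.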

As Ped7g says in a comment, the thing to remember above all when programming in assembly language is:

In the end, it doesn't matter how you specify the memory content in the source, … the code which operates upon that memory defines its "meaning" (type).

The assembler just stores bytes. With TBYTE, it stores 10 of them. With QWORD, it stores 8 of them. With DWORD, it stores 4 of them. You get the picture. How your code interprets these bytes is up to you, because you have to write that code.
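As a small illustration of that point, the sketch below stores one 4-byte pattern and then reads it in different ways. The name `rawBits` is hypothetical; the only thing that changes between the loads is the instruction that interprets the bytes, not the bytes themselves.

```asm
.386
.MODEL FLAT, STDCALL
.STACK 4096
ExitProcess PROTO, dwExitCode:DWORD

.DATA
rawBits DWORD 0FFFFFB2Eh            ; hypothetical name; just four bytes in memory

.CODE
main PROC
    mov   eax, rawBits              ; EAX = 0FFFFFB2Eh: -1234 if signed, 4294966062 if unsigned
    movsx ecx, WORD PTR rawBits     ; low word read as signed:   ECX = 0FFFFFB2Eh (-1234)
    movzx edx, WORD PTR rawBits     ; low word read as unsigned: EDX = 0000FB2Eh (64302)
    invoke ExitProcess, 0
main ENDP
END main
```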

Peter Cordes points out (see comments) that the x87 FPU does have instructions designed to load and store BCD values: FBLD and FBSTP. These can be used as a slow way to turn a binary integer into decimal digits.

Both of these instructions take an m80bcd value as their sole operand: an 80-bit packed BCD value, which is the same length as a TBYTE. So, it is possible that Kip Irvine is talking about this use for TBYTE values.

However, I don't believe that MASM implicitly converts TBYTE initializers to a BCD format, as that would be very inconvenient when you were using TBYTE to store an extended-precision floating-point value, as discussed above. With MASM or any other assembler, you are still on your own to represent the value assigned to a TBYTE appropriately as BCD or floating-point, whichever one you wanted.
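For the BCD case, "representing the value yourself" might look roughly like the sketch below. The m80bcd format holds 18 decimal digits, two per byte, in the low nine bytes, with the sign in the top bit of the tenth byte, so -1234 is written out by hand in hex. All variable names here are hypothetical, and this is only an illustration of the encoding, not a recommendation to use these instructions.

```asm
.386
.MODEL FLAT, STDCALL
.STACK 4096
ExitProcess PROTO, dwExitCode:DWORD

.DATA
bcdIn   TBYTE 80000000000000001234h ; packed BCD -1234, hand-encoded in hex
binOut  DWORD ?                     ; receives the value as a binary integer
binIn   DWORD 5678                  ; ordinary binary integer
bcdOut  TBYTE ?                     ; receives the value as packed BCD

.CODE
main PROC
    finit                           ; start from a clean FPU state
    fbld  bcdIn                     ; ST(0) = -1234.0, decoded from packed BCD
    fistp binOut                    ; binOut = 0FFFFFB2Eh (-1234 in binary)
    fild  binIn                     ; ST(0) = 5678.0
    fbstp bcdOut                    ; bcdOut = 00000000000000005678h (packed BCD)
    invoke ExitProcess, 0
main ENDP
END main
```

Had `bcdIn` been declared as `TBYTE -1234`, FBLD would have been handed the sign-extended binary pattern, which is not a valid packed BCD encoding of -1234.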

And anyway, now that you've heard of FBLD and FBSTP, you can pretty much forget about them again. I don't think they were ever very commonly used, and they certainly don't get any use now. Even on older CPUs, like the original Pentium (P5) and Pentium II (P6), these instructions took ~150 clock cycles. On newer CPUs, they have gotten even slower (Skylake has a throughput of 1 FBSTP per 266 cycles). Therefore, even if you did want to work with 80-bit BCD values, you'd be better off writing out the necessary instructions yourself. (Ask a new question about that if you need help.)

Tags: x86, assembly, masm

Thor