Making a DOS program with NASM

The tutorials here use NASM (the Netwide Assembler) as the assembler and ALINK as the linker.
Both are free and can be found online.

Turning our source code into an executable is a 2-step process.
1) Assemble the source code (text) into an object file (.obj)
2) Link the object file(s) into the executable (.exe)

How to assemble and link:

NASM -f obj [source_filename]
ALINK [object_filename1] [object_filename2] [object_filename_etc] [-o output_filename]


Some syntax options are:
NASM [-f format {bin/obj/win32/as86}] [-o output_filename] source_filename
ALINK [object_filename1] [object_filename2] [object_filename_etc] [-o output_filename] [-oXXX (XXX is the output format {exe/com/pe})]


Technicalities of source code.
Assembly code is probably the simplest, most straightforward code there is. However, as with all source code, the code itself is a communication to the assembler. You have already seen Instruction syntax, which is the way to code CPU instructions (opcodes). Now you'll see Directives, which are instructions to the assembler. Directives help you form your program.

1) Labels
2) Defining Variables
3) Segments
4) Procedures
5) Comments


IMPORTANT: When reading/writing code, pay attention to every punctuation mark, because Periods, Colons, Semicolons, and Single/Double Quotes have specific meanings to the assembler.


1) Labels
Labels are markers in source code. A label marks its own location, and its name is used as a pointer to that location. The pointer's value is the offset of that location from the start of the label's segment.
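
As a small illustration, here is a sketch of a label used as a jump target. The label name 'again' and the segment names are made up for this example, and the program ends with DOS function 4Ch (INT 21h), which is an assumption added here so the program returns to DOS cleanly.

```nasm
segment stack stack
        resb 100h         ; reserve 256 bytes of stack

segment code
..start:
        mov cx, 3         ; CX = loop counter
again:                    ; label: marks the offset of the DEC instruction
        dec cx            ; CX = CX - 1, sets the Zero flag when CX reaches 0
        jnz again         ; jump back to 'again' while CX is not 0
        mov ax, 4C00h     ; DOS function 4Ch: exit with return code 0
        int 21h
```

The assembler replaces 'again' in the JNZ instruction with the location it marks, so the offset never has to be counted by hand.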


2) Defining Variables
To define a variable, we use a Data Allocation Directive. By defining a variable, we instruct the assembler to set aside space for that variable (allocate) and to place an initial value there (optional). A variable is given a name, a size, and an initial value. When the assembler processes the source code and encounters a variable, it replaces the variable's name with the variable's offset from its segment. So a variable's name is really a LABEL, and a LABEL is our visual handle for a numerical POINTER. Labels are great, because they keep us from having to manually count the offset values of our variables and procedures.

  DB - define Byte --- size = 1 byte
  DW - define Word --- size = 2 bytes
  DD - define Doubleword --- size = 4 bytes
  DQ - define Quadword --- size = 8 bytes
  DT - define Tenbyte --- size = 10 bytes

These directives can define multiple values at a time, separated by commas, each with its own initial value. A value can be written as a number, a character in quotes, or an expression.

  Byte_Var DB 23          ; defines byte with a value of 23

  Word_Var DW 1023          ; defines word with a value of 1023

  String_var DB "hello there"          ; defines 11 bytes in series. Each BYTE has the ASCII value for its respective character in the string being defined.

  Char_var DB 'A'          ; defines byte to the ascii value of 'A' - 41h

  Char_var DB 'A'+10          ; Here's an expression. Defines byte to the ascii value of 'A'+10, which equals ascii 'K' - 4Bh

  Word_var DW -1024          ; defines a negative word. See Signed Numbers.

  String_var DB "hello there",0          ; defines 12 bytes in series. The last byte is the Numerical value of Zero. This is a Null Terminated String which is sometimes used.

  String_var DB "hello there",0Dh,0Ah,'$'          ; defines 14 bytes in series. The byte value 0Dh (13d) is Carriage Return. 0Ah (10d) is Line Feed. The last byte is the ASCII char '$'. This is a DollarSign-Terminated string used by DOS.
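
To tie the definitions above together, here is a minimal sketch of a complete program that defines several values with one directive and prints a DollarSign-Terminated string. The file name hello.asm and the names 'table' and 'msg' are made up for this example. The use of DOS function 09h (print string) and 4Ch (exit) through INT 21h is an assumption added here, not something covered in the text above.

```nasm
; Assumed build steps, with this file saved as hello.asm:
;   NASM -f obj hello.asm
;   ALINK hello.obj -oEXE -o hello.exe

segment stack stack
        resb 100h                         ; reserve 256 bytes of stack

segment data
table   db 1, 2, 3, 4                     ; four bytes defined by one directive
msg     db "hello there", 0Dh, 0Ah, '$'   ; 14 bytes, DollarSign-Terminated

segment code
..start:
        mov ax, data
        mov ds, ax        ; DS = segment address of 'data'
        mov dx, msg       ; DX = offset of msg, so DS:DX points at the string
        mov ah, 09h       ; DOS function 09h: print string up to the '$'
        int 21h
        mov ax, 4C00h     ; DOS function 4Ch: exit with return code 0
        int 21h
```

The 'table' variable is not used by the code; it is only there to show the comma-separated form.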



3) Segments
You may or may not have read about segments, as in SEGMENT:OFFSET addressing. It is the way memory is addressed in Real mode. Your program will have its Code and Stack Segment registers set by DOS, but the Data Segment register must be set by your own code. If you have more than a segment's worth of code (over 64 KB), you'll have to perform a FAR CALL or FAR JMP when you need to transfer execution to code outside the current Code Segment. Or perhaps you'll have more than one segment's worth of data; in that case DS (Data Segment) or ES (Extra Segment) must be made to point to the segment that contains the variable you wish to access.
So the programmer needs a way to get the different segment addresses into the proper segment registers. This is not hard to do: the assembler keeps track of code and data and where they belong in relation to the segments you've defined. Part of the Assembly Language syntax includes ways to define segments, and ways to move Segment addresses AND Offset addresses of code and data into registers.

Consequently, the programmer must know when there will be more than a segment's worth of code or data, and must decide how to organize that code or data under different segments. Keep in mind that one segment is 64 kilobytes. I think that's a lot!
It'll be a long time until you (the beginning assembly programmer) need more than 64 KB for anything.

Defining Segments
The following applies to NASM when assembling code into the OBJ format. NASM can also assemble code straight into a COM executable, but different rules apply to defining a COM file's segments.
Use the SEGMENT directive, followed by the name you wish to give that segment, optionally followed by "qualifiers" that give certain attributes to the segment (some are PRIVATE, PUBLIC, COMMON and STACK).
exp: segment [ name ]
exp: segment [ name ] stack
exp: segment [ name ] private


Here's a situation:

You've declared a variable and you want to get its value into a register. The following can be done in NASM:

segment data_area
var1 db 80h

segment code_area
..start:

mov ax, data_area         ;get the segment address of data_area into AX
mov ds, ax         ;get var1's segment address into DS (data segment)
mov si, var1         ;get var1's offset address into SI
mov al, [ds:si]         ;move the value at RAM location DS:SI into AL: AL = 80h


In the sample code above, we moved the OFFSET address of var1 into the SI register.
You may ask, "what's an offset?" An offset is the distance of one thing from another. In a variable's case, that distance is measured in bytes.
var1's offset is measured from the data_area segment marker, which marks the start of that segment.
How does the assembler know that var1 is addressable as an offset from data_area?
Because var1 was declared under the data_area marker; that act tells the assembler to include var1 in that segment.
The code_area marker begins our code's segment, ending the data_area segment. Any label declared beneath code_area will have its OFFSET measured from the code_area marker.
When assembling to OBJ files (which normally get linked into an EXE), an Entry Point must be specified. The Entry Point marks the first instruction of the program. "..start" is the label NASM uses to mark the Entry Point.

You can move the segment address of any segment into a register by using its name, as shown above.

In any case, here's a way to get a variable's segment address into a register regardless of which segment it's in:

mov ax, seg var1         ;get var1's segment address into AX
mov ds, ax         ;compare this method with the one previously shown
mov si, var1         ;get var1's offset address into SI



Look in NASM's documentation for SEG, WRT and SEGMENT (or SECTION) to learn more.
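
As a rough sketch of why SEG is handy, the following program keeps two variables in two different data segments and reaches one through DS and the other through ES. The segment names data1 and data2 and the variable names var_a and var_b are made up for this example, and the DOS exit call (function 4Ch of INT 21h) is an assumption added so the program ends cleanly.

```nasm
segment stack stack
        resb 100h              ; reserve 256 bytes of stack

segment data1
var_a   db 11h

segment data2
var_b   db 22h

segment code
..start:
        mov ax, seg var_a
        mov ds, ax             ; DS = segment that contains var_a
        mov ax, seg var_b
        mov es, ax             ; ES = segment that contains var_b

        mov al, [var_a]        ; read through DS (the default): AL = 11h
        mov bl, [es:var_b]     ; read through ES (override): BL = 22h
        add al, bl             ; AL = 33h

        mov ax, 4C00h          ; DOS function 4Ch: exit with return code 0
        int 21h
```

Because SEG looks up the segment from the variable itself, the code would not need to change if var_b were later moved into a different segment.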



If you fill a segment with more than 64 KB of data or code, the assembler will probably notice and issue an error. You will then need to make another segment to hold the rest of your code or data.



4) Procedures
Procedures don't exist as a special construct in NASM. TASM has you define a procedure by opening and closing it (proc/endp). NASM doesn't care about that. Just put a label at the top of your procedure/function/routine (whatever you want to call it), and jump to it, call it, whatever. The thing that truly makes a procedure is that it has a RET instruction to return to the calling code (which only works if it was reached with a CALL).
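
Here is a minimal sketch of that idea. The routine name double_ax is made up for this example, and the DOS exit call (function 4Ch of INT 21h) is an assumption added so the program ends cleanly. A stack segment is included because CALL and RET use the stack to store the return address.

```nasm
segment stack stack
        resb 100h         ; reserve 256 bytes of stack

segment code
..start:
        mov ax, 5
        call double_ax    ; pushes the return address, then jumps to the label
                          ; execution resumes here with AX = 10
        mov ax, 4C00h     ; DOS function 4Ch: exit with return code 0
        int 21h

double_ax:                ; a plain label is all NASM needs
        add ax, ax        ; AX = AX * 2
        ret               ; pops the return address and jumps back to the caller
```

The routine is placed after the exit call so that execution cannot fall into it by accident.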

5) Comments
Comments require a Semicolon. They are allowed on a line-by-line basis.
exp:

mov ax, 34       ; moves 34 into the AX register

Anything after the semicolon (on that line) is ignored by the assembler.




The following is a skeleton of common ASM code. Copy and save it as blank.asm. Use it as a template to speed up writing new code.


;Project Name:
;Author:
;Original Date:
;Date Modified:
;Purpose:
;Description:

segment stack stack
resb 100h

segment data

segment code

..start:

mov ax, data
mov ds, ax

;code goes here




As you develop your code and write newer versions, it's good to keep all the older code somewhere, so that if someone plagiarizes or steals your code, you can prove it is yours by showing how YOU wrote the code from its starting stage into its final form.









Now here's asm code, NASM style, with comments showing the values of the operations. If you don't understand the following, please come back and read it again after you've done a few tutorials.





segment data

wordvar_1 dw 132
wordvar_2 dw 476
wordvar_3 dw 796

segment code

..start:
mov ax, data
mov ds, ax

mov ax, wordvar_1         ;moves wordvar_1's offset address into AX
mov ax, [wordvar_1]         ;moves the word stored in memory, whose offset address is represented by the label 'wordvar_1', into AX

mov ax, [wordvar_1]         ;AX = 132
mov ax, wordvar_1         ;AX = 0
mov ax, [ds:0]         ;AX = 132

mov ax, [wordvar_2]         ;AX = 476
mov ax, wordvar_2         ;AX = 2
mov ax, [ds:2]         ;AX = 476

mov ax, [wordvar_3]         ;AX = 796
mov ax, wordvar_3         ;AX = 4
mov ax, [ds:4]         ;AX = 796