::/ \::::::. :/___\:::::::. /| \::::::::. :| _/\:::::::::. :| _|\ \::::::::::. Feb/March 98 :::\_____\::::::::::. Issue 3 ::::::::::::::::::::::......................................................... Extending NASM by mammon_ Programmers transitioning to NASM from a commercial assembler such as MASM or TASM immediately notice the lack of any high-level language structures -- the assembly syntax accepted by NASM is only slightly more sophisticated than what you would find in a debugger. While this has its good side --smaller code size, nothing hidden from the programmer-- it does make coding a bit more tedious. For this reason NASM comes with a preprocessor that is both simple and powerful; by writing NASM macros, the high-level functionality of other assemblers can be emulated rather easily. As thw following macros will demonstrate, most of the high-level asm features in commercial assemblers really do not do anything very elaborate; they simply are more convenient for the programmer. The macros that I will detail below provide some basic C and ASM constructs for use in NASM. I have made the complete file available at http://www.eccentrica.org/Mammon/macros.asm The macro file can be included in a .asm file with the NASM directive %INCLUDE "macros.asm" Comments on the usage of each macro are included in the file. Macro Basics ------------ The fundamenal structure of a NASM macro is %macro {macroname} {# parameters} %endmacro The actual code resides on the line between the %macro and %endmacro tags; this code will be inserted into your program wherever NASM finds {macroname}. Thus you could create a macro to push the contents of each register such as: %macro SAVE_REGS 0 push eax push ebx push ecx push edx %endmacro Once you have defined this macro, you can use it in your code like: SAVE_REGS call ReadFile ...which the preprocessor will expand to push eax push ebx push ecx push edx call ReadFile before assembling. It should be noted that all preprocessing takes place in a single stage immediately before compiling starts; to preview what the pre- processor will send to the assembler, you can invoke nasm with the -e option. The %macro tag requires that you declare the number of paramters that will be passed to the macro. This can be a single number or a range, with a few quirks: %macro LilMac 0 ; takes 0 arguments %macro LilMac 5 ; takes 5 arguments %macro LilMac 0-3 ; takes 0-3 arguments %macro LilMac 1-* ; takes 1 to unlimited arguments %macro LilMac 1-2+ ; takes 1-2 arguments %macro LilMac 1-3 0, "OK" ; takes 1-3 arguments, 2-3 default to 0 & "OK" The last three examples bear some explanation. The "-*" operator in the %macro tag specifies that the macro can handle any number of parameters; in other words, there is no maximum number, and the minimum is whatever number is to the left of the "-*" operator. The "+" operator means that any additional arguments will be appended to the last argument instead of causing an error, so that: LilMac 0, OK, This argument is one too many will result in argument 1 being 0 and argument 2 being "OK, This argument is one too many." Note that this is a good way to pass commas as part of an argu- ment (normally they are only separators). Providing defualt arguments after the number of arguments allows a macro to be called with fewer arguments than it expects. %macro SAVE_VARS 1-4 ecx, ebx, eax will fill a missing 4th argument with eax, 3rd with ebx, and 2nd with ecx. Note that you have to provide defaults starting with the last argument and working backwards. The parameters to the macro are available as %1 for the first argument, %2 for the second, and so on, with %0 containing a count of all the arguments. There is an equivalent to the DOS "SHIFT" command called %rotate which will rotate the parameters to either the left or to the right depending on whether a positive or negative value was supplied: Before: %1 %2 %3 %4 Before: %1 %2 %3 %4 Before: %1 %2 %3 %4 %rotate 1 %rotate -1 %rotate 2 After: %4 %1 %2 %3 After: %2 %3 %4 %1 After: %3 %4 %1 %2 So that rotating by 1 will put the value at %1 into %4, and rotating by -1 will put the value of %1 into %2. High-Level Calls ---------------- Perhaps the buggest complaint about NASM is its primitive call syntax. In MASM and TASM, the parameters to a call may be appended to the call itself: call MessageBox, hOwner, lpszText, lpszTitle, fuStyle where in NASM the parameters must be pushed onto the stack prior to the call: push fuStyle push lpszTitle push lpszText push hOwner call MessageBox Using NASM's "-*" macro feature along with the %rep directive make a high-level call easy to replicate: %macro call 2-* %define _func %1 %rep &0-1 %rotate 1 push %1 %endrep call _func %endmacro The %define directive simply defines the variable _func [underscores should prefix variable names in macros so you do not mistakenly use the same name later in the program] as %1, the name of the function to call. The %rep and %endrep directives enclose the instructions to be repeated, and %rep takes as a parameter the number of repetitions [in this case set to the number of macro parameters minus 1]. Thus, the above macro cycles through the arguments to call and pushes them last-argument first [C syntax] before making the call. Overloading an existing instruction such as call will cause warnings at compile time [remember, the preprocessor thinks you are doing a recursive macro invoke] so usually you will want to name the macro "c_call" or something similar. The following macros provide facilities for C, Pascal, fastcall, and stdcall call syntaxes. ;==============================================================-High-Level Call ; ccall FuncName, param1, param2, param 3... ;Pascal: 1st-1st, no clean ; pcall FuncName, param1, param2, param 3... ;C: Last-1st, stack cleanup ; stdcall FuncName, param1, param2, param 3... ;StdCall: last-1st, no clean ; fastcall FuncName, param1, param2, param 3... ;FastCall: registers/stack %macro pcall 2-* %define _j %1 %rep %0-1 %rotate -1 push %1 %endrep call _j %endmacro %macro ccall 2-* %define _j %1 %assign __params %0-1 %rep %0-1 %rotate -1 push %1 %endrep call _j %assign __params __params * 4 add esp, __params %endmacro %macro stdcall 2-* %define _j %1 %rep %0-1 %rotate -1 push %1 %endrep call _j %endmacro %macro fastcall 2-* %define _j %1 %assign __pnum 1 %rep %0-4 %rotate -1 %if __pnum = 1 mov eax, %1 %elif __pnum = 2 mov edx, %1 %elif __pnum = 3 mov ebx, %1 %else push %1 %endif %assign __pnum __pnum+1 %endrep call _j %endmacro ;==========================================================================-END Switch-Case Blocks ------------------ One of the most awkward C constructs to code in assembly is the SWITCH-CASE block. It is also rather difficult to re-create as a macro due to variable number and length of CASE statements. NASM's preprocessor has a context stack which allows you to create a set of local variables and addresses which is specific to a particular invocation of a macro. Thus it becomes possible to refer to labels which will be created in a future macro by giving them context-dependent names: %macro MacPart1 0 %push mac ;create a context called "mac" jmp %$loc ;jump to context-specific label "loc" %endmacro %macro MacPart2 0 %ifctx mac ;if we are in context 'mac' %$loc: ;define label 'loc' xor eax, eax ;code at this label... ret %endif ;end the if block %pop ;destroy the 'mac' context %endmacro As you can see, the context is created and named with a %push directive, and destroyed with a $pop directive. NASM has a number of preprocessor conditional IF/ELSE statements; in the above example, the %ifctx [if current context equals] directive is used to determine if a 'mac' context has been created [Note that the 'base' NASM conditionals include %if, %elif, %else, and %endif; these carry over to the %ifctx directive, such that there is available %ifctx, %ifnctx, %elifctx, %elifnctx, %else, and %endif; all %if directives must be closed with an %endif directive]. Finally, %$ is used to prefix the name of a context- specific variable or label. Non-context-specific local labels use the %% prefix: %macro LOOP_XOR %%loop: pop eax xor eax, ebx test eax, eax jnz %%loop %endmacro The SWITCH-CASE macro that follows uses the syntax: SWITCH Variable CASE Int BREAK CASE Int BREAK DEFAULT ENDSWITCH Which could be implemented as follows: card db 0 ;card_variable Jack EQU 11 Queen EQU 12 King EQU 13 ... SWITCH card CASE Jack add edx, Jack BREAK CASE Queen add edx, Queen BREAK CASE King add edx, King BREAK DEFAULT add d, [card] ENDSWITCH Note that SWITCH moves the variable into eax and CASE moves the value into ebx. ;===========================================================-SWITCH-CASE Blocks %macro SWITCH 1 %push switch %assign __curr 1 mov eax, %1 jmp %$loc(__curr) %endmacro %macro CASE 1 %ifctx switch %$loc(__curr): %assign __curr __curr+1 mov ebx, %1 cmp eax, ebx jne %$loc(__curr) %endif %endmacro %macro DEFAULT 0 %ifctx switch %$loc(__curr): %endif %endmacro %macro BREAK 0 jmp %$endswitch %endmacro %macro ENDSWITCH 0 %ifctx switch %$endswitch: %pop %endif %endmacro ;==========================================================================-END If-Then Blocks -------------- While the preprocessor provides support for if-then directives, it is a slight bit of work to cause that to generate the equivalent assembly language 'if' code [ the preprocessor 'if' is resolved before compile time, not at run time]. Using macros, you can create if-then blocks with the following structure: IF Value, Cond, Value ;if code here ELSIF Value, Cond, Value ;else-if code here ELSE ;else code here ENDIF An example being: IF [Passwd], e, [GoodVal] ;e == equals or je jmp Registered ELSE jmp FormatHardDrive ENDIF The trickiest part about this macro sequence is the 'Cond' parameter. NASM allows condition codes [the 'cc' in 'jcc' that you findin opcode refs] to be passed to macros; these condition codes are simply the 'jcc' with the 'j' cut off -- 'jnz' becomes 'nz', 'jne' becomes 'ne', 'je' becomes 'e', and so on. The reason for this is that the condition code is appended to a 'j' later in the macro: %macro Jumper %1 %2 %3 ;JUMPER Reg1, cc, Reg2 cmp %1, %3 j%+2 Gotcha jmp error %endmacro The above code appends %2 to the 'j' with the directive j%+2. Note that if you use j%- instead of j%+, NASM will insert the *inverse* condition code, so that jz becomes jnz, etc. For example, calling the macro %macro Jumper2 %1 j%-1 JmpHandler %endmacro with the invocation 'Jumper2 nz' would assemble the code 'jz JmpHandler'. The condition codes can be a bit tricky to work with; it is advisable to add a sequence such as the following to the macro file: %define EQUAL e %define NOTEQUAL ne %define G-THAN g %define L-THAN l %define G-THAN-EQ ge %define L-THAN-EQ le %define ZERO z %DEFINE NOTZERO nz so that you could call the IF macro as follows: IF PassWd, EQUAL, GoodVal ;if code here ...etc etc. Note also that the IF-THEN-ELSE macros put the passed values into eax and ebx for compatison, so these registers will need to be preserved. ;===========================================================-IF-THEN-ELSE Loops %macro IF 3 %push if %assign __curr 1 mov eax, %1 mov ebx, %3 cmp eax, ebx j%+2 %%if_code jmp %$loc(__curr) %%if_code: %endmacro %macro ELSIF 3 %ifctx if jmp %$end_if %$loc(__curr): %assign __curr __curr+1 mov eax, %1 mov ebx, %3 cmp eax, ebx j%+2 %%elsif_code jmp %$loc(__curr) %%elsif_code: %else %error "'ELSIF' can only be used following 'IF'" %endif %endmacro %macro ELSE 0 %ifctx if jmp %$end_if %$loc(__curr): %assign __curr __curr+1 %else %error "'ELSE' can only be used following an 'IF'" %endif %endmacro %macro ENDIF 0 %$loc(__curr): %$end_if: %pop %endmacro ;==========================================================================-END For/While Loops --------------- The DO...FOR and DO...WHILE do nothing differnet from the previous macros, but are simply a different application of the same principles. The syntax for calling these macros is: DO ;code to do here FOR min, Cond, max, step DO ;code to do here WHILE variable, Cond, value It is perhaps easiest to illustrate this by comparing the macros with C code. for( x = 0; x <= 100; x++) { SomeFunc() } Equates to: DO call SomeFunc FOR 0, l, 100, 1 Likewise, for( x = 0; x != 100; x--) { SomeFunc() } Equates to: DO call SomeFunc FOR 0, e, 100, -1 The WHILE macro is similar: while( CurrByte != BadAddr) {SomeFunc() } Equates to: DO call SomeFunc WHILE CurrByte, ne, BadAddr Once again, eax and ebx are used in the FOR and WHILE macros. ;====================================================-DO-FOR and DO-WHILE Loops %macro DO 0 %push do jmp %$init_loop %$start_loop: push eax %endmacro %macro FOR 4 %ifctx do pop eax add eax, %4 cmp eax, %3 j%-2 %%end_loop jmp %$start_loop %$init_loop: mov eax, %1 jmp %$start_loop %%end_loop: %pop %endif %endmacro %macro WHILE 3 %ifctx do pop eax mov ebx, %3 cmp eax, ebx j%+2 %%end_loop jmp %$start_loop %$init_loop: mov eax, %1 jmp %$start_loop %%end_loop: %pop %endif %endmacro ;==========================================================================-END Data Declarations ----------------- Declaring data is relatively simple in assembly, but sometimes it helps to make code more clear if you create macros that assign meaningful data types to variables, even if those macros simply resolve to a DB or a DD. The following macros demonstrate this concept. They are invoked as follows: CHAR Name, String ;e.g. CHAR UserName, "Joe User" INT Name, Byte ;e.g. INT Timeout, 30 WORD Name, Word ;e.g. WORD Logins DWORD Name, Dword ;e.g. DWORD Password Note that when invoked with a name but not a value, these macros create empty [DB 0] variables. ;============================================================-Data Declarations %macro CHAR 1-2 0 %1: DB %2,0 %endmacro %macro INT 1-2 0 %1: DB %2 %endmacro %macro WORD 1-2 0 %1: DW %2 %endmacro %macro DWORD 1-2 0 %1: DD %2 %endmacro ;==========================================================================-END Procedure Declarations ---------------------- Procedure declarations are another matter of convenience. It is often useful in your code to clearly delineate the start and end of a procedure; each of the PROC macros below does that, as well as creating a stack fram for the procedure. The ENTRYPROC macro creates a procedure named 'main' and declares main as a global symbol; the standard PROC declares the provided name as global. These macros can be used as follows: PROC ProcName Parameter1, Parameter2, Parameter3 ;procedure code here ENDP ENTRYPROC ;entry-procedure code here ENDP Note that the Parameters to PROC are set up to EQU to offsets from ebp, e.g. ebp-4, ebp-8, etc. I have also included support for local variables, which will EQU to positive offsets from ebp' these may be used as follows: PROC ProcName Parameter1, Parameter2, Parameter3... LOCALDD Dword_Variable LOCALDW Word_Variable LOCALDB Byte_Variable ;procedure code here ENDP ;=======================================================-Procedure Declarations %macro PROC 1-9 GLOBAL %1 %1: %assign _i 4 %rep %0-1 %2 equ [ebp-_i] %assign _i _i+4 %rotate 1 %endrep push ebp mov ebp, esp %push local %assign __ll 0 %endmacro %macro ENDP 0 %ifctx local %pop %endif pop ebp %endmacro %macro ENTRYPROC 0 PROC main %endmacro %macro LOCALVAR 1 sub esp, 4 %1 equ [ebp + __ll] %endmacro %macro LOCALDB 1 %assign __ll __ll+1 LOCALVAR %1 %endmacro %macro LOCALDW 1 %assign __ll __ll+2 LOCALVAR %1 %endmacro %macro LOCALDD 1 %assign __ll __ll+4 LOCALVAR %1 %endmacro ;==========================================================================-END Further Extension ----------------- Continued experimentation will of course prove fruitful. It is recommended that you read/print out chapter 4 of the NASM manual for reference. In addition, it is very helpful to test your macros by cpmpiling the source with "nasm -e", which will output the preprocessed source code to stdout and will not compile the program. ____________________________________________________________________________ ______ _____. ____ ``` ._____/\______.________._________. ._\___ |__\_ /. \\ | | | _ | | (_ | __/ |CE , .=|_____/\______| |----)____| |______|______|======[ Win 32 ASM ]===. '===============| :===================================================' NASM specific Win32 coding by Tamas Kaproncai Contents ======== 0. Preface 1. Compiling 2. Include files 3. Library files 4. Importing API functions 5. Calling API functions 6. WinMain 7. Window procedure 8. Sections 9. Self modification 0. Preface ========== I will introduce the win32 coding and I will focus on the NASM specific part. Downloadable working examples: ftp://ftp.szif.hu/pub/demos/tool/w32nasm.zip http://rs1.szif.hu/~tomcat/win32 There is another tutorial on this topic, called: "The Win32 NASM Coding Toolkit v0.02 by Gij" that uses the LCC linker and the resource compiler which comes with LCC. 1. Compiling ============ I'm working with the following free programs in connection with NASM: - linker: ALINK v1.5 by Anthony A.J. Williams. - resource compiler: GoRC v0.50b by Jeremy Gordon. The process of compiling a win32 program involves a number of steps which can be divided into three main processes: preparing the include files, preparing the library files, and writing the actual program. The compiling flow chart ------------------------ .h -> ? -> .inc \ .asm -> NASM -> .obj \ .rc -> GORC -> .res -> ALINK -> .exe / .dll -> IMPLIB -> .lib (? means handwork) 2. Include files ================ The include files (*.inc) must be generated from existing header files (*.h) that come with win32-compatible C or Pascal compilers. Files needed: WIN32N.INC (Thanks for the inital MASM version to S.L.Hutchesson). The compiler will be NASM version 0.97 http://www.cryogen.com/nasm Usage: nasmw -fobj -w+orphan-labels -pwin32n.inc %1.asm 3. Library files ================ Files Needed: WIN32.LIB The linker will be ALINK http://www.geocities.com/SiliconValley/Network/4311/#alink Usage: alink -oPE %1 win32.lib %1.res %2 %3 More lib files can be created with IMPLIB. Example: IMPLIB DDRAW.DLL 4. Importing API functions ========================== EXTERN MessageBoxA IMPORT MessageBoxA use32.dll 5. Calling API functions ======================== PUSH UINT MB_OK PUSH LPCTSTR title1 PUSH LPCTSTR string1 PUSH HWND NULL CALL [MessageBoxA] 6. WinMain ========== You don't need to use the name, WinMain: You must start the program with the label, ..start: At the begening there is nothing special in the stack, so you should call GetModuleHandleA for hInstance and GetCommandLineA for the command line. (Command line consists the full path, the file name and the parameters). You can exit the program with: RETN or you should call the ExitProcess function: PUSH UINT 0 ; the error code CALL [ExitProcess] 7. Window procedure =================== There are four parameters on the top of the stack: PUSH EBP MOV EBP,ESP %DEFINE hwnd EBP+8 ;handle of window %DEFINE message EBP+12 ;message %DEFINE wParam EBP+16 ;first message parameter %DEFINE lParam EBP+20 ;second message parameter You can handle the messages depends on WPARAM [wParam] and the rest you can pass to DefWindowProcA: PUSH LPARAM [lParam] PUSH WPARAM [wParam] PUSH UINT [message] PUSH HWND [hwnd] CALL [DefWindowProcA] POP EBP RETN 16 8. Sections =========== You need a code section: SECTION CODE USE32 CLASS=CODE and a data section: SECTION DATA USE32 CLASS=DATA You don't need bss section, instead of you should append every RESB, RESW, RESD, RESQ to the end of the source code. This zero data not will be included to the exe file. 9. Self modification ==================== You can include your code and data together in one section: SECTION CODE USE32 CLASS=CODE In that case you need another object file, with only one line source: SECTION CODE USE32 CLASS=DATA ALINK will combine the properties of these two sections. EXTERN MessageBoxA EXTERN ExitProcess SECTION CODE USE32 CLASS=CODE ..start: PUSH UINT MB_OK PUSH LPCTSTR title1 PUSH LPCTSTR string1 PUSH HWND NULL CALL MessageBoxA PUSH UINT NULL CALL ExitProcess SECTION DATA USE32 CLASS=DATA string1: db 'Hello world!',13,10,0 title1: db 'Hello',0 ::/ \::::::. :/___\:::::::. /| \::::::::. :| _/\:::::::::. :| _|\ \::::::::::. :::\_____\:::::::::::.......................................................FIN