(C) 2004-2006 Julián Albo.
Use and distribution allowed under the terms of the GPL license.
Last revision date: 28-may-2006
Current Pasmo version: 0.6.0 (in progress)
Pasmo is a multiplatform Z80 and 8080 cross-assembler, easy to compile
and easy to use.
It can generate object code in several formats suitable for many Z80
machines and emulators.
Pasmo generates fixed-position code; it cannot be used to create relocatable
object files for use with linkers.
Pasmo is compatible with the syntax used in several old assemblers: it
supports several styles of numeric and string literals and provides
several names for the most used directives.
However, in Pasmo the Z80 mnemonics, register and flag names, and
directives are reserved words; some programs may need conflicting
symbol names changed.
Pasmo can also generate the 8086 equivalent of the Z80 assembly code. It can create COM files for MS-DOS, by using the binary generation mode, or CMD files for CP/M 86, by using the --cmd generation mode. This feature is experimental; use it with care.
Download Pasmo from
http://www.arrakis.es/~ninsesabe/pasmo/.
Several binary executables are provided on the web site. If your platform
is not among them, or you want a more recent version, you must
download the source package and compile it. If you want to compile it
on Windows you can use Cygwin or MinGW with the Makefile provided; with
other compilers you may need to create a project, workspace or whatever
your compiler or IDE uses.
To compile you need gcc version 2.95 or later, with the C++ language
included (usually a package called g++-something).
Other compilers may also be used; any reasonably standard-compliant
C++ compiler should compile it with few or no corrections.
From version 0.5.2 a configure script is provided. You can use the
usual './configure ; make ; make install' procedure.
An official Debian package is also available.
Pasmo is invoked from the command line as:
pasmo [options] [file.asm] [file.bin] [file.symbol [file.publics] ]
Where file.asm is the source file, file.bin is the object file to be created, and optionally file.symbol is the file where the symbol table will be written and file.publics is the file for the public symbols table. Either symbol file name can be an empty string to suppress generation, or - to write to standard output. When the --public option is used this is handled differently, see below. The source and object files can also be specified with the options --input and --output. If the --link option is used the source file must be omitted.
Options can be zero or more of the following:
The -d option is intended for debugging Pasmo itself, but can also be useful to find errors in asm code. When it is used, the debug information is shown on standard output. Error messages go to error output unless the --err option is used.
If none of the code generation options is specified, then --bin mode is used by default.
The --bin mode just dumps the generated code, starting from the first position used, without any header. This mode can be used for direct generation of CP/M or MSX COM files, provided that you use an ORG 100H directive at the beginning of the code, or to generate blocks of code to be INCBINed in other programs.
The --com mode is similar to --bin, but code is assembled or linked starting at 0100 hex without the need to ORG it. The two options exist to make it easier to assemble, without changes, both code written for DR ASM or similar assemblers and code written for use with a relocatable assembler such as DR RMAC and a linker.
The --dump mode generates a human readable uppercase hexadecimal dump. Each line begins with the address of the code as a four digit hexadecimal number followed by a colon and a space, and continues with 16 two digit hexadecimal numbers separated by spaces.
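The line layout described for --dump can be sketched in a few lines of Python. This is only an illustration of the format as described here, not Pasmo's own code; the function name hex_dump is made up for the example, and leaving a short last line unpadded is an assumption.

```python
def hex_dump(code, origin):
    """Format code as lines of up to 16 bytes, each prefixed by its address."""
    lines = []
    for offset in range(0, len(code), 16):
        chunk = code[offset:offset + 16]
        address = (origin + offset) & 0xFFFF
        # Four digit address, colon, space, then two digit bytes separated by spaces.
        lines.append(f"{address:04X}: " + " ".join(f"{b:02X}" for b in chunk))
    return lines

# LD A,1 / RET placed at 8000 hex
for line in hex_dump(bytes([0x3E, 0x01, 0xC9]), 0x8000):
    print(line)  # prints "8000: 3E 01 C9"
```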
The --hex mode generates code in Intel HEX format. This format can be used with the LOAD or HEXCOM CP/M utilities, can be transmitted more easily than a binary format, and is also used in some PROM programming tools.
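For readers unfamiliar with Intel HEX, the following Python sketch builds single records by hand: a colon, the byte count, the 16 bit address, the record type, the data and a checksum that makes all bytes sum to zero modulo 256. The function name hex_record is hypothetical, and how Pasmo splits the code into records is not shown here.

```python
def hex_record(address, data, record_type=0x00):
    """Build one Intel HEX record (type 0 is data, type 1 is end of file)."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type]) + bytes(data)
    # The checksum is the two's complement of the sum of all preceding bytes.
    checksum = (-sum(body)) & 0xFF
    return ":" + (body + bytes([checksum])).hex().upper()

print(hex_record(0x0100, bytes([0x3E, 0x01, 0xC9])))  # :030100003E01C9F4
print(hex_record(0x0000, b"", record_type=0x01))      # :00000001FF
```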
The prl format is used in several variants of the Digital Research CP/M operating system. Pasmo supports it only to create RSX files for use in CP/M Plus; generating PRL files for MP/M is not supported because I don't have an MP/M system, real or emulated, on which to test it.
The REL relocatable format is used by Digital Research and Microsoft assemblers and compilers, and many others. This version of Pasmo has preliminary support for it, limited to the program segment, PUBLIC symbols, and sources that do not use ORG directives. Note that for compatibility you probably must use the --nocase mode, or write all public symbols in upper case. Remember also that in REL files identifiers are truncated to 6 characters.
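Because of the 6 character limit, two different public symbols can end up with the same name in a REL file. A small Python check such as the following can be run over a list of public names before assembling. It is a sketch under two assumptions: that truncation keeps the first 6 characters, and that names are upper cased as with --nocase. The function names are invented for the example.

```python
def rel_name(symbol):
    """Name as it would appear in a REL file: upper case, first 6 characters."""
    return symbol.upper()[:6]

def find_collisions(symbols):
    """Map each truncated name to the original symbols that share it."""
    seen = {}
    for symbol in symbols:
        seen.setdefault(rel_name(symbol), []).append(symbol)
    return {name: group for name, group in seen.items() if len(group) > 1}

print(find_collisions(["PrintChar", "PrintCls", "Start"]))
# {'PRINTC': ['PrintChar', 'PrintCls']}
```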
The --cmd option generates a CP/M 86 CMD file, using the 8080 memory model of CP/M 86. Used in conjunction with the --86 option, it can easily generate CP/M 86 executables from CP/M 80 sources with minimal changes.
The --tap option generates a tap file with a code block, with the loading position set to the beginning of the code so you can load it from Basic with a LOAD "" CODE instruction.
The --tzx option is the same as --tap but uses the tzx format instead of tap.
The --cdt option generates a cdt file with a code block, with the loading position set to the beginning of the code and the start address set to the start point specified in the source, if any, so you can use RUN "" to execute it or LOAD "" to load it.
With the --tapbas option a tap file is generated with two parts: a Basic loader and a code block with the object code. The Basic loader does a CLEAR before the initial address of the code, loads the code, and executes it if an entry point is defined (see the END directive). That way you can directly launch the code in an emulator, or transfer it to a tape for use in a real Spectrum.
The --tzxbas option is the same as --tapbas but uses the tzx format instead of tap.
The --cdtbas option is the same as --tapbas but uses the cdt format instead of tap, with a Locomotive Basic loader instead of a Spectrum Basic one.
The --plus3dos option generates the object file in plus3dos format, used by the Spectrum +3 disks. The file can be loaded from Basic with a LOAD "filename" CODE instruction.
The --amsdos option generates the object file with an Amsdos header, used by the Amstrad CPC for disk files. The file generated can be loaded from Basic with LOAD "filename", address or executed with RUN "filename" if an entry point has been specified in the source (see the END directive).
The --msx option generates the object file with a header for use with BLOAD in MSX Basic.
The symbol table generated contains all identifiers used in the program, with the locals represented as an 8 digit hexadecimal number in order of use, unless the --public option is used. In that case only the symbols specified in PUBLIC directives are listed.
The symbol table format is a list of EQU directives. That way you can INCLUDE it in another source to create programs composed of several blocks.
Source code files must be valid text files on the platform used. The use of, for example, Unix text files with Pasmo on Windows is unsupported and the result is undefined (it may depend on the compiler used to build Pasmo, for example). The result of using a file that contains vertical tab or form feed characters is also undefined.
Some symbols have several meanings depending on their use and context;
this is caused by the intent to be source compatible with several old
assemblers and to allow the use of operators commonly used in other
languages.
The recommended way to avoid mistakes is to always separate operators
and their operands with white space, especially inside macros.
Everything after a ; in a line is a comment (unless the ; is part of a
string literal, of course).
There are no multiline comments; you can use IF 0 .... ENDIF instead
(but see INCLUDE).
If the comment begins with ;; instead of a single ; it will not be
included in macro expansion.
String literals are written to the object file without any character set translation. Therefore any character that has a different meaning on the platform where Pasmo is running and on the destination machine must be avoided, and the code of the character should be used instead. That also means that using Pasmo on any machine that uses a non ASCII compatible character set may be difficult, and that a source written in UTF-8 may give undesired results. This may be changed in future versions of Pasmo.
A line may begin with a decimal number followed by blanks. This number is ignored by the assembler; it is allowed for compatibility with old assemblers. The line number reported in errors is the sequential number of the line in the file, not this number.
Blanks are significant only in string literals and when they separate lexical elements. Any number of blanks has the same meaning as one. A blank between operators and operands is allowed but not required, except when the same character has another meaning as a prefix ('$' and '%', for example).
Numeric literals can be written in decimal, binary, octal and hexadecimal formats. Several formats are accepted to obtain compatibility with the source format of several assemblers.
A literal that begins with $ is a hexadecimal constant, except if the literal is only the $ symbol; in that case it is an operator, see below.
A literal that begins with # is a hexadecimal constant, except if there are two consecutive # characters, see the ## operator.
A literal that begins with & can be a hexadecimal, octal or binary constant, depending on the character that follows the &: H means hexadecimal, O octal and X binary; if it is none of these, the character must be a valid hexadecimal digit and the constant is hexadecimal. See also the use of & in macros.
A literal that begins with % is a binary constant. See also the use of % in macro arguments.
A literal that begins with a decimal digit can be decimal, binary, octal or hexadecimal. If the digit is 0 and the following character is an X, the number is hexadecimal. If not, the suffix of the literal is examined: D means decimal, B binary, H hexadecimal and O or Q octal; in any other case the literal is taken as decimal. Take care: FFFFh, for example, is not a hexadecimal constant but an identifier; to write it with the suffix notation you must write 0FFFFh.
All numeric formats can have $ signs between the digits to improve readability. They are ignored.
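As a rough summary of these rules, here is a Python sketch that converts a literal to its value. It is not the lexer used by Pasmo, and the name parse_number is invented for the example. It covers the $, # and % prefixes, the 0x prefix and the suffix letters; the & forms and the lone $ operator are left out to keep it short, and no check is made that a suffixed literal begins with a decimal digit.

```python
def parse_number(text):
    """Return the value of a numeric literal written in one of the styles above."""
    first = text[0]
    body = text[1:] if first in "$#%" else text
    digits = body.replace("$", "")  # '$' signs between digits are ignored
    if first in "$#":
        base = 16
    elif first == "%":
        base = 2
    elif digits[:2].upper() == "0X":
        digits, base = digits[2:], 16
    else:
        # No prefix: the last character may be a suffix that selects the base.
        bases = {"D": 10, "B": 2, "H": 16, "O": 8, "Q": 8}
        suffix = digits[-1].upper()
        if suffix in bases:
            digits, base = digits[:-1], bases[suffix]
        else:
            base = 10
    return int(digits, base)

for literal in ("$FF", "#1$000", "%1010", "0x1F", "0FFFFh", "101b", "17q", "1234"):
    print(literal, parse_number(literal))
```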
There are two formats of string literals: single or double quote delimited.
A string literal delimited with single quotes is the simpler format, all characters are included in the string without any special interpretation, with the only exception that two consecutive single quotes are taken as one single quote character to be included in the string. For example: the single quote delimited string 'That''s all folks' generates the same string as the double quote delimited "That's all folks".
A string literal delimited with double quotes is interpreted in a way similar to the C and C++ languages. The \ character is taken as the escape character, with the following interpretations: n is a new line character (0A hex), r is a carriage return (0D hex), t is a tabulator (09 hex) and a is a bell (07 hex); x indicates that the next two characters are the hexadecimal code of the character to insert; an octal digit begins an octal number of up to three digits, and the character with that code is inserted; the characters \ and " insert themselves in the string; any other character is reserved for future use.
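The escape rules for double quoted strings can be tried out with the following Python sketch, which decodes the text between the quotes into bytes. It is an illustration of the rules as listed, not Pasmo's code; the name decode_double_quoted is hypothetical, the source is assumed to be plain ASCII, and reducing an octal value above 377 to one byte is an assumption.

```python
SIMPLE_ESCAPES = {"n": 0x0A, "r": 0x0D, "t": 0x09, "a": 0x07, "\\": 0x5C, '"': 0x22}

def decode_double_quoted(body):
    """Decode the text between the double quotes into the bytes it stands for."""
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        i += 1
        if ch != "\\":
            out.append(ord(ch))
            continue
        esc = body[i]
        i += 1
        if esc in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[esc])
        elif esc == "x":
            # Exactly two hexadecimal digits follow the x.
            out.append(int(body[i:i + 2], 16))
            i += 2
        elif esc in "01234567":
            # The digit just read begins an octal number of up to three digits.
            j = i
            while j < len(body) and j < i + 2 and body[j] in "01234567":
                j += 1
            out.append(int(body[i - 1:j], 8) & 0xFF)
            i = j
        else:
            raise ValueError("escape \\" + esc + " is reserved")
    return bytes(out)

print(decode_double_quoted(r"Hi\x21\n"))  # b'Hi!\n'
```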
A string literal of length 1 or 2 can be used as a numeric constant, with the numeric value of the first character as the low order byte and that of the second, if present, as the high order byte. This allows expressions such as 'A' + 80h to be evaluated as expected.
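In other words, the value is built as in this small Python sketch (the name string_value is invented, and ASCII characters are assumed):

```python
def string_value(text):
    """Numeric value of a 1 or 2 character string literal."""
    if not 1 <= len(text) <= 2:
        raise ValueError("only strings of length 1 or 2 have a numeric value")
    value = ord(text[0])            # first character: low order byte
    if len(text) == 2:
        value |= ord(text[1]) << 8  # second character: high order byte
    return value

print(hex(string_value("A") + 0x80))  # 0xc1
print(hex(string_value("AB")))        # 0x4241
```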
Identifiers are the names used for labels, EQU and DEFL symbols, and macro names and parameters. The names of the Z80 mnemonics, registers and flags, and of Pasmo operators and assembler directives, are reserved and cannot be used as identifiers, except as macro parameters. Reserved names are case insensitive, even if case sensitive mode is used.
In the following 'letter' means an English letter character in upper or lower case. Characters that correspond to letters in other languages are not allowed in identifiers.
Identifiers begin with a letter, '_', '?', '@' or '.', followed by zero or more letters, decimal digits, '_', '?', '@', '.' or '$'. The '$' characters are ignored, but a reserved word with a '$' embedded or appended is not recognized as such.
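The character rules for identifiers translate directly into a regular expression. The Python sketch below checks only the characters; it does not know the reserved words, and the function names are made up for the example.

```python
import re

# First character: letter, '_', '?', '@' or '.'; the rest may also be digits or '$'.
IDENTIFIER = re.compile(r"[A-Za-z_?@.][A-Za-z0-9_?@.$]*")

def is_identifier(text):
    """True if text has the shape of an identifier (reserved words are not checked)."""
    return IDENTIFIER.fullmatch(text) is not None

def canonical_name(text):
    """Name with the ignored '$' characters removed."""
    return text.replace("$", "")

print(is_identifier("_loop"), is_identifier("1abc"))  # True False
print(canonical_name("my$label"))                     # mylabel
```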
Identifiers that begin with '_' are special when using autolocal mode, see the --alocal option and the chapter about labels for details. The check for autolocal is done before stripping '$', so $_name is not considered local.
Identifiers are case sensitive if the option --nocase is not used. When using --nocase, they are always converted to upper case.
File names are used in the INCLUDE and INCBIN directives. They follow special rules.
A file name that begins with a double quote character must end with another double quote, and the file name contains all characters between them without any special interpretation.
A file name that begins with a single quote character must end with another single quote, and the file name contains all characters between them without any special interpretation.
In any other case all characters until the next blank or the end of line are considered part of the file name. Blank characters are space and tab.
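The three cases can be summarized with this Python sketch, which extracts the file name from the text that follows INCLUDE or INCBIN. It is an illustration only; the name parse_file_name is invented, and skipping the blanks between the directive and the name is an assumption.

```python
def parse_file_name(argument):
    """Extract the file name from the text following INCLUDE or INCBIN."""
    text = argument.lstrip(" \t")
    if text[:1] in ('"', "'"):
        # Quoted: everything up to the matching quote, taken literally.
        end = text.find(text[0], 1)
        if end < 0:
            raise ValueError("missing closing quote")
        return text[1:end]
    # Unquoted: everything up to the next blank (space or tab) or end of line.
    for i, ch in enumerate(text):
        if ch in " \t":
            return text[:i]
    return text

print(parse_file_name('"my file.asm"'))  # my file.asm
print(parse_file_name("table.asm ; x"))  # table.asm
```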
A label can be placed at the beginning of any line, before any assembler mnemonic or directive. It can optionally be followed by a ':', but using it with directives is not recommended, for compatibility with other assemblers; however, it may be needed if a macro with the same name as the label is already defined.
A line that has a label with no mnemonic nor directive is also valid.
The label has special meaning in the MACRO, EQU and DEFL directives, in any other case the value of the current code generation position is assigned to the label.
Labels can be used before their definition, but the result of doing this with labels assigned with DEFL is undefined.
The value of a label cannot be changed unless DEFL is used in all assignments of that label. If the value assigned to a label is different in the two passes of the assembly the program is illegal, but it is not guaranteed that an error is generated. However, it is legal to assign a value that is undefined in the first pass (by using an expression that contains a label not yet defined, for example).
In the default mode a label is global unless declared as LOCAL inside a MACRO, REPT or IRP block, see the LOCAL directive for details.
In the autolocal mode, enabled by the --alocal command line
option, all labels that begin with a '_' are local.
Their scope ends at the next non-local label or at the next PROC, LOCAL,
MACRO, REPT, IRP, IRPC, ENDP or ENDM directive.
The check for autolocals is done before stripping the '$' in the
identifier, thus $_this_label_is_not_autolocal.
Both automatic and explicit local labels are represented in the symbol table listing as 8 digit hexadecimal numbers, corresponding to the first use of the label in the source.
List of directives supported in Pasmo, in alphabetical order.
All numeric values are taken as 16 bit unsigned, using two's complement or truncating when required. Logical operators return FFFF hex for true and 0 for false; in their arguments 0 is false and any other value is true.
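The effect of the 16 bit rule can be checked with a few lines of Python. This is a model of the arithmetic described above, with invented helper names, not code taken from Pasmo.

```python
MASK = 0xFFFF

def to_word(value):
    """Reduce any integer to 16 bits: negatives wrap (two's complement), large values truncate."""
    return value & MASK

def logical(flag):
    """Result of a logical operator: FFFF hex for true, 0 for false."""
    return 0xFFFF if flag else 0

def is_true(value):
    """Any nonzero 16 bit value counts as true."""
    return to_word(value) != 0

print(hex(to_word(-1)))       # 0xffff
print(hex(to_word(0x12345)))  # 0x2345
print(hex(logical(3 < 5)))    # 0xffff
```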
Parentheses may be used to group parts of expressions. They are also used to express indirections in the Z80 instructions that allow or require it. This can cause some errors when a parenthesized expression is used in a place where an indirection is allowed. Pasmo uses some heuristics to allow the expression to be correctly interpreted, but they are far from perfect.
In the bracket only mode parentheses have the single meaning of grouping expressions and brackets are required for indirections, thus solving the ambiguities.
Short circuit evaluation: the && and || operators and the conditional expression are short circuited. This means that if one of their operands need not be evaluated, it can include undefined symbols or divisions by 0 without generating an error (but it still must have correct syntax). In the conditional expression this applies to the branch not taken, in the && operator to the second operand if the first is false, and in the || operator to the second operand if the first is true.
Table of operators by order of precedence, those in the same line have the same precedence:
## (see note)
$, NUL, DEFINED
*, /, MOD, %, SHL, SHR, <<, >>
+, - (binary)
EQ, NE, LT, LE, GT, GE, =, !=, <, >, <=, >=
NOT, ~, !, +, - (unary)
AND, &
OR, |, XOR
&&
||
HIGH, LOW
?
The ## operator is a special case: it is processed during macro expansion, see the chapter about macros.
Note that the precedence is not the same as in some old assemblers, especially MASM. Always mark precedence with parentheses in code intended to be used with several different assemblers.
Note also that HIGH and LOW are operators, not functions. To avoid confusion with the precedence rules the syntax required is not HIGH (argument), but (HIGH argument), or even (HIGH (argument) ) inside macros if the argument contains macro parameters.
There are two types of macro directives: the proper MACRO directive and the repetition directives REPT and IRP. In addition the ENDM and EXITM directives control the end of the macro expansion.
A macro parameter is an identifier that, when the macro is expanded, is substituted by the value of the argument applied. The identifier used can have the same name as a keyword; the keyword is not recognized as such in that case. Be careful, readers of the macro code may get confused by that.
By default inside a macro a parameter is expanded by substituting it with
the corresponding argument in the macro call.
If a MACRO is defined inside another macro directive the external parameters
are not substituted; with the other macro directives the parameter
substitution is done beginning with the outermost directive.
The NUL operator can be used to check if the argument passed to the parameter
is not empty. The .SHIFT directive can be used to work with an undetermined
number of arguments.
Identifier pasting: inside a macro the operator ## can be used to join two identifiers, resulting in another identifier. This is intended to allow the creation of identifiers that depend on macro arguments.
Forced expansion: the & is used to explicitly mark the expansion of a macro argument following it without whitespace. It also allows parameter expansion inside a string literal. Unlike the ## operator, when used to create identifiers it does not suppress whitespace preceding it.
Macro arguments are passed literally by default. This is not desirable when the value of an expression is wanted, because precedence rules can modify the result of the expansion inside other expressions. In those cases the % operator can be used at the beginning of a macro argument: it evaluates the expression that follows, takes its value as a decimal number literal and passes it as the effective macro argument.
There are two types of macro arguments: the first is a comma separated list
of items, where each item is an arbitrary list of tokens. The second is a
comma separated list of items between angle brackets. Each item can be any
token, or another angle bracket delimited expression. White space between
items is ignored.
The nested angle bracket delimited arguments can be used to include
whitespace or commas in the argument, or to pass arguments to be used as
arguments of another macro.
A macro defined with the MACRO directive is called by using its name as if it were a directive or instruction. The macro arguments are evaluated and assigned to the MACRO parameters. If there are fewer arguments than parameters, the remaining parameters are assigned an empty value. If there are more, the remaining arguments are stored but not assigned to any parameter; they can only be accessed by using the .SHIFT directive inside the macro.
There are some special cases when using the name of an already defined macro name:
This is a label with the same name as the macro:
macroname: ....
This is a redefinition of macroname:
macroname MACRO ...
In the following cases MACRO is taken as the beginning of the arguments of macroname. I don't know of any good use for that, and it may be banned in future versions. If you consider it useful, please send me code samples of its use.
otherlabel macroname MACRO ...
otherlabel: macroname MACRO ...
IRP parameter, argument list
Repeats the block of code between the IRP directive and its corresponding ENDM one time for each of the arguments.
IRPC parameter, character list
Repeats the block of code between the IRPC directive and its corresponding ENDM one time for each character in the list. The character list can be a literal string or a macro argument between angle brackets.
name MACRO [ list of parameters]
or:
MACRO name [ , list of parameters]
In all cases, list of parameters is a comma separated list of identifiers, and name is the name assigned to the macro created.
REPT count
REPT count, varname
REPT count, varname, initial
REPT count, varname, initial, increment
Update: In 0.6.0 the loop var is no longer a symbol; it is expanded like a macro argument. That makes it much more useful with the new macro expansion capabilities.
The assumption of Pasmo is that, being a cross-assembler, it will be
used on a machine with many available resources. So I do not make
any effort to provide means to do things that can be easily done
with other utilities, unless I think (or other people convince me)
that including them in Pasmo would be much more convenient.
For example, if you want to create a sin table you can write a program
in your favourite language that writes a file with the table and
INCLUDE that file, and if you want to automate that type of things you
can use make.
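As an illustration of that approach, here is a small Python sketch that produces a 256 entry sine table as a label followed by DB lines, ready to be INCLUDEd. The label sintable, the file name sintable.asm, the amplitude of 127 and the layout of 8 values per line are all arbitrary choices made for this example.

```python
import math

def sine_values(entries=256, amplitude=127):
    """One full sine period as unsigned bytes (negative values in two's complement)."""
    return [round(amplitude * math.sin(2 * math.pi * i / entries)) & 0xFF
            for i in range(entries)]

def table_source(values, label="sintable", per_line=8):
    """Render the values as a label followed by DB lines."""
    lines = [label + ":"]
    for start in range(0, len(values), per_line):
        row = ", ".join(str(v) for v in values[start:start + per_line])
        lines.append("\tDB " + row)
    return "\n".join(lines) + "\n"

def write_table(path="sintable.asm"):
    """Write the table as a text file with the line endings of the current platform."""
    with open(path, "w") as out:
        out.write(table_source(sine_values()))
```

Calling write_table() creates the file, which the main source can then bring in with INCLUDE sintable.asm. A make rule that runs the script before calling pasmo keeps the table up to date.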
Taking that into account, I am open to suggestions to improve Pasmo and to patches that implement them. In the latter case please take care to write things in a portable way, without operating system or compiler dependencies.
Note:
Please do not send patches during the current development of version 0.6.0;
I'm still rewriting and rearranging some parts and integrating patches will
be difficult.
E-mail me about the feature desired or bug, instead.
Pasmo has a simple code generator that uses absolute memory addresses. That makes it difficult to adapt it to generate relocatable code for use with linkers. I don't have plans to do it for the moment; maybe someone wants to contribute?
Update: Starting with version 0.6.0 pasmo can generate linkable code in REL format, and link modules in that format. This feature is currently unfinished and poorly tested, please use with care and report bugs.
Some people have suggested adding support for Game Boy programming. There are two problems: the simplified way used to generate code in Pasmo, and my nonexistent knowledge of the Game Boy.
Thanks to all the people who have made suggestions and reported or corrected bugs. And to those who show me the beautiful things they do with Pasmo.
You can use Pasmo to convert any binary file to .tap, just write a tiny program called for example convert.asm:
ORG address_to_load_the_file
INCBIN file.bin
Assemble it with: pasmo --tap convert.asm file.tap, and you have it. The same may be done for the other formats supported.
To obtain the code of an instruction you can do:
echo 'ld a,b' | pasmo --input - -o - --dump
Pasmo emits a warning when using an expression that looks like a
nonexistent Z80 instruction, such as 'ld b, (nn)', but the simplified
way used to detect that also warns in cases like
'ld b,(i1+i2)*(i3+i4)'.
A way to avoid the warning in that case is to prefix the parenthesized
expression with '+' or '0 +'.
In the bracket only mode the problem does not exist; in that case
the parentheses are always taken as expressions (and the programmer
is supposed to know that), thus the warning is not emitted.
More suggestions about that are welcome.
Update: The new parser in 0.6.0 does a much better job, warning only if the expression is entirely inside parentheses.
There is no way to include a file whose name contains blanks, single quotes and double quotes at the same time. Does anyone use file names like that?
That's all folks!
Send comments and criticisms to:
julian.notfound@gmail.com