Re: Compiler Deisgn.

Gabor DEAK JAHN <djg@argus.vki.bke.hu>
16 May 1998 14:23:21 -0400

          From comp.compilers


From: Gabor DEAK JAHN <djg@argus.vki.bke.hu>
Newsgroups: comp.compilers
Date: 16 May 1998 14:23:21 -0400
Organization: Compilers Central
References: 98-05-058
Keywords: disassemble

On 12 May 1998 22:17:10 -0400, Matthew Webb <Matthew.Webb@net1.demon.co.uk>
wrote:


> I have written a disassembler and a coming assembler. I have not
> studied compiler design and so do not know the best way of doing it. My
> disassembler/assembler are basically just one massive case statement on
> the bytes or text strings.
> Can anyone give another structure other than a case statement please?


For the disassembler you could use a table. After stripping the opcode
modifiers and overrides (like ES: or the 32-bit overrides on the 80x86),
table[opcode] should return the mnemonic and a code for the possible
addressing modes. With that code you can go on to decipher the rest
of the instruction.
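
A minimal sketch of the table idea, written in Python only for brevity. The opcode and prefix bytes are real 80x86 encodings, but the table covers just four instructions, and the mode codes, the names and the assumption of 16-bit code are illustrative rather than taken from the post.

```python
# Prefix bytes that are stripped before the table lookup.
SEGMENT_PREFIXES = {0x26: "es", 0x2E: "cs", 0x36: "ss", 0x3E: "ds"}
OPERAND_SIZE_PREFIX = 0x66  # the "32-bit override" when running 16-bit code

# Hypothetical codes describing what follows the opcode byte.
NO_OPERAND, IMM8, ACC_IMM = range(3)

# table[opcode] -> (mnemonic, addressing-mode code); a tiny subset only.
OPCODE_TABLE = {
    0x90: ("nop", NO_OPERAND),
    0xC3: ("ret", NO_OPERAND),
    0xCD: ("int", IMM8),
    0xB8: ("mov", ACC_IMM),  # mov ax, imm16 (mov eax, imm32 with 0x66)
}


def disassemble_one(code, pos=0):
    """Decode one instruction; return (text, position of the next one)."""
    wide = False
    # Strip overrides first. Segment overrides are simply skipped here,
    # because none of the four instructions has a memory operand.
    while pos < len(code) and (
        code[pos] in SEGMENT_PREFIXES or code[pos] == OPERAND_SIZE_PREFIX
    ):
        if code[pos] == OPERAND_SIZE_PREFIX:
            wide = True
        pos += 1
    if pos >= len(code) or code[pos] not in OPCODE_TABLE:
        raise ValueError("unknown or missing opcode")
    mnemonic, mode = OPCODE_TABLE[code[pos]]
    pos += 1
    # The mode code tells us how many operand bytes to read.
    size = {NO_OPERAND: 0, IMM8: 1, ACC_IMM: 4 if wide else 2}[mode]
    if pos + size > len(code):
        raise ValueError("truncated instruction")
    value = int.from_bytes(code[pos:pos + size], "little")
    if mode == NO_OPERAND:
        text = mnemonic
    elif mode == IMM8:
        text = f"{mnemonic} 0x{value:02x}"
    else:
        text = f"{mnemonic} {'eax' if wide else 'ax'}, 0x{value:x}"
    return text, pos + size
```

Adding an instruction then means adding a table row rather than another case branch; a full 80x86 table would also need mode codes for ModR/M operands.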


For the assembler (still assuming 80x86 or a similarly complicated
syntax) I used a mixed approach. Because the syntax is fixed, I didn't use
full-scale parsing but cut the source line into label, mnemonic and
argument parts. The mnemonic is looked up in a simple table (by binary
search) that contains the addresses of the analyzing
routines. I parse the arguments with a combination of recursive
descent and table-driven parsing. The main idea is that during
parsing I construct a 32-bit variable whose bits represent different
aspects of the complicated addressing modes of the 80x86. When I am
finished with the arguments, this variable contains the actual
addressing mode. The analyzing routines mentioned before use this
variable to find out the opcodes. They look like:


    if Analyze (opcode1, AddressMode, RegToMem) then
    if Analyze (opcode2, AddressMode, MemToReg) then
    if Analyze (opcode3, AddressMode, SegToMem) then
    if Analyze (opcode4, AddressMode, MemToMem) then Error;


That is, each call to Analyze checks whether AddressMode matches the
third argument (these are constants). If a call finds a match, it
emits the opcode (in reality the argument list of Analyze is somewhat
more complicated) and returns false, which ends the chain; otherwise it
returns true and the search goes on. If every check fails, we signal
the error.
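
The Analyze chain can be sketched roughly as follows, in Python for brevity. The bit layout of the mode variable, the operand classifier and all names are assumptions made for illustration, since the post does not give them. The three MOV opcodes (0x89, 0x8B, 0x8C) are real 80x86 encodings, but only the opcode byte is emitted here, not the ModR/M byte that would follow it.

```python
from enum import IntFlag


class Mode(IntFlag):
    # Hypothetical bit assignments; the real 32-bit layout is not given.
    DST_REG = 1 << 0
    DST_MEM = 1 << 1
    SRC_REG = 1 << 2
    SRC_MEM = 1 << 3
    SRC_SEG = 1 << 4


# Constants playing the role of RegToMem, MemToReg and SegToMem.
REG_TO_MEM = Mode.SRC_REG | Mode.DST_MEM
MEM_TO_REG = Mode.SRC_MEM | Mode.DST_REG
SEG_TO_MEM = Mode.SRC_SEG | Mode.DST_MEM

REGS = {"ax", "bx", "cx", "dx", "si", "di", "bp", "sp"}
SEGS = {"es", "cs", "ss", "ds"}


def operand_mode(text, is_dest):
    """Classify one operand and return its bit (a very crude parser)."""
    text = text.strip().lower()
    if text.startswith("[") and text.endswith("]"):
        return Mode.DST_MEM if is_dest else Mode.SRC_MEM
    if text in REGS:
        return Mode.DST_REG if is_dest else Mode.SRC_REG
    if text in SEGS and not is_dest:
        return Mode.SRC_SEG
    raise ValueError(f"cannot classify operand {text!r}")


def address_mode(dest, src):
    """Combine the bits of both operands into one mode value."""
    return operand_mode(dest, True) | operand_mode(src, False)


def analyze(out, opcode, mode, pattern):
    """Emit opcode on a match; return True when the search must go on."""
    if mode != pattern:
        return True
    out.append(opcode)
    return False


def assemble_mov(out, mode):
    # Same shape as the nested if chain above: stop at the first match.
    if analyze(out, 0x89, mode, REG_TO_MEM):
        if analyze(out, 0x8B, mode, MEM_TO_REG):
            if analyze(out, 0x8C, mode, SEG_TO_MEM):
                raise ValueError("unsupported operand combination")
```

For example, `assemble_mov(out, address_mode("[bx]", "ax"))` appends 0x89 to `out`, while a memory-to-memory MOV falls through all three checks and raises the error.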

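The earlier steps, cutting the line into label, mnemonic and arguments and then finding the analyzing routine by binary search, might look something like this in Python. The line syntax (a label ending in a colon, `;` for comments, comma-separated arguments) and the stub handlers are assumptions; the post does not describe them.

```python
import bisect


def split_line(line):
    """Cut a source line into (label, mnemonic, list of arguments)."""
    line = line.split(";", 1)[0].strip()  # drop a trailing comment
    label = None
    parts = line.split(None, 1)
    # Only a leading token that ends in ':' counts as a label, so a
    # segment override such as es:[bx] in the arguments is left alone.
    if parts and parts[0].endswith(":"):
        label = parts[0][:-1]
        parts = parts[1].split(None, 1) if len(parts) > 1 else []
    mnemonic = parts[0].lower() if parts else None
    args = [a.strip() for a in parts[1].split(",")] if len(parts) > 1 else []
    return label, mnemonic, args


# Hypothetical analyzing routines; real ones would emit opcodes.
def handle_int(args):
    return ("int", args)


def handle_mov(args):
    return ("mov", args)


def handle_ret(args):
    return ("ret", args)


# Table sorted by mnemonic so that it can be binary searched.
TABLE = sorted([("mov", handle_mov), ("ret", handle_ret), ("int", handle_int)])
NAMES = [name for name, _ in TABLE]


def lookup(mnemonic):
    """Binary search for the mnemonic; return its routine or None."""
    i = bisect.bisect_left(NAMES, mnemonic)
    if i < len(NAMES) and NAMES[i] == mnemonic:
        return TABLE[i][1]
    return None
```

In Python a dict would be the usual choice for this lookup; the sorted table with `bisect` is kept only to mirror the binary search described above.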



Bye,
    Gábor


-------------------------------------------------------------------
DEAK JAHN, Gabor,
Budapest,
Hungary.


E-mail: djg@argus.vki.bke.hu, deakjahn@ludens.inf.elte.hu
--

