Generic Assembler?

"Chris Williams" <thesagerat@yahoo.co.jp>
28 Feb 2005 00:50:51 -0500

          From comp.compilers


From: "Chris Williams" <thesagerat@yahoo.co.jp>
Newsgroups: comp.compilers
Date: 28 Feb 2005 00:50:51 -0500
Organization: http://groups.google.com
Keywords: assembler
Posted-Date: 28 Feb 2005 00:50:51 EST

I have started work (i.e. research and structure-creating algorithms)
on a compiler that is intended to work solely as a scripted
preprocessor--preprocessing the source code in levels based on
symbological and structural definitions, then translating that into a
lower level format via scripting. This process would be repeated for
varying layers of languages, with the output of each being fed into
the next layer.
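
As a rough illustration of the layering idea, here is a minimal Python
sketch in which each layer is just a function from text to text and the
output of one is fed into the next. The names run_layers,
strip_line_comments and drop_blank_lines are hypothetical stand-ins;
real layers would be driven by rule files rather than hard-coded.

```python
from functools import reduce
from typing import Callable, List

# A layer is any function that rewrites source text into lower-level text.
Stage = Callable[[str], str]


def run_layers(source: str, layers: List[Stage]) -> str:
    """Feed the output of each layer into the next one, in order."""
    return reduce(lambda text, layer: layer(text), layers, source)


# Hypothetical toy layers, standing in for rule-driven translators.
def strip_line_comments(text: str) -> str:
    """Remove everything from '//' to the end of each line."""
    return "\n".join(line.split("//", 1)[0].rstrip() for line in text.splitlines())


def drop_blank_lines(text: str) -> str:
    """Discard lines that contain only whitespace."""
    return "\n".join(line for line in text.splitlines() if line.strip())


result = run_layers("a = 1; // set a\n\nb = 2;", [strip_line_comments, drop_blank_lines])
print(result)
```

Keeping every layer to the same text-in, text-out shape is what lets the
layers be chained in any number.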


C ruleset -> Generic Assembler Language -> Platform Assembler -> Bytes
-> Object File


The chain above shows, for instance, the various rulesets that could be
applied. (The middle three would probably appear as a single file
though, with the last simply being a library of functions that
abstracts an object file.)
Below is an example of what such a rule file might look like.


iso_c.rules
-----------
|rules uses="assembler.rules" > |> Tell it we are defining new rules
      which produce code that needs to be parsed by the assembler.rules
      ruleset <|


      |comment single >//<
      |comment multiple >/*|.*<*/<


      |structure if >
            if|space*<(|code if_condition<)|space*<{|code if_code<}|space*<
            |optional >
                  |optional multiple > |> can be any number of these <|
                        |structure else_if >
                              else|space+<if|space*<(|code else_if_condition<)|space*<{|code else_if_code<}|space*<
                        <
                  <
                  |structure else >
                        else|space*<{|code else_code<}|space*<
                  <
            <
      >>
            |writeln "cmp " . if_condition . ", 0"<


            |exists else_if > |- if () {} else if () {}
                  |writeln "je ELSE_IF_" . |id else_if[0]<<
                  |writeln if_code<
                  |writeln "jmp STRUCT_END_" . |id if<<


                  |for L = 0 # L < else_if.length # L++ >
                        |writeln "ELSE_IF_" . |id else_if[L]< . ":"< |> Label <|
                        |writeln "cmp " . else_if_condition[L] . ", 0"<


                        |exists else_if[L + 1] > |> Check if there is another else if <|
                              |writeln "je ELSE_IF_" . |id else_if[L + 1]<<
                              |writeln else_if_code[L]<
                              |writeln "jmp STRUCT_END_" . |id if<<


                        >> |> No more else ifs <|
                              |exists else > |> Check if there is an else <|
                                    |writeln "je ELSE_" . |id else<<
                                    |writeln else_if_code[L]<
                                    |writeln "jmp STRUCT_END_" . |id if<<


                                    |writeln "ELSE_" . |id else< . ":"< |> Label <|
                                    |writeln else_code<


                              >> |> No else <|
                                    |writeln "je STRUCT_END_" . |id if<<
                                    |writeln else_if_code[L]<
                              <
                        <
                  <


            >> |> No else if statements <|
                  |exists else > |- if () {} else {}
                        |writeln "je ELSE_" . |id else<<
                        |writeln if_code<
                        |writeln "jmp STRUCT_END_" . |id if<<


                        |writeln "ELSE_" . |id else< . ":"< |> Label <|
                        |writeln else_code<


                  >> |- if () {}
                        |writeln "je STRUCT_END_" . |id if<<
                  <
            <


            |writeln "STRUCT_END_" . |id if< . ":"< |> Label for end <|
      < |> End of if structure <|


< |> Finished defining rules <|
-----------


In the above, we define a regular expression for an if/else if/else
structure, and then a body of code which will be called for each
instance of that structure with various variables already initialised
by the parser that we can access to be able to create a proper
assembler representation.
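
For comparison, the same idea (a pattern with named captures plus a
body that emits assembler text for each match) can be sketched in
Python. This is only an approximation under stated assumptions: it
handles a plain if/else with no "else if" and no nested braces or
parentheses, and the names IF_ELSE and apply_rule are made up for the
example, not part of any rule engine.

```python
import re
from itertools import count

# Pattern for `if (cond) { code }` with an optional `else { code }`.
# Assumption: no nested braces in the bodies, no nested parentheses
# in the condition.
IF_ELSE = re.compile(
    r"\bif\s*\((?P<if_condition>[^()]*)\)\s*\{(?P<if_code>[^{}]*)\}\s*"
    r"(?:else\s*\{(?P<else_code>[^{}]*)\}\s*)?"
)


def apply_rule(source: str) -> str:
    """Replace each matched if/else with generic assembler-like text."""
    ids = count(1)  # unique suffix per structure, like |id ...< above

    def lower_if(match) -> str:
        n = next(ids)
        out = [f"cmp {match['if_condition'].strip()}, 0"]
        if match["else_code"] is not None:
            # if () {} else {}
            out += [
                f"je ELSE_{n}",
                match["if_code"].strip(),
                f"jmp STRUCT_END_{n}",
                f"ELSE_{n}:",
                match["else_code"].strip(),
            ]
        else:
            # if () {}
            out += [f"je STRUCT_END_{n}", match["if_code"].strip()]
        out.append(f"STRUCT_END_{n}:")
        return "\n".join(out) + "\n"

    return IF_ELSE.sub(lower_if, source)


print(apply_rule("if (x) { a(); } else { b(); }"))
```

The bodies are emitted untranslated here; in the layered scheme they
would themselves be passed through the rules again.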


Anyway, my hope is to have the assembler.rules file be a
processor-specific file that will accept processor-specific assembler
_and_ a generic assembler language which would be required of any
assembler.rules file (and which files like iso_c.rules would be
required to use exclusively for portability). But I only know x86
family assembler and wanted to ask whether there are any particular
instructions that seem fairly universal and/or can very easily be
spoofed in two instructions on any non-supporting chip.
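
As a rough illustration of the "spoofing" idea, the Python sketch below
lowers one hypothetical generic instruction, a three-operand add, for
two made-up target styles: one that has a three-operand add natively
and one (x86-like) that only has the two-operand form and so needs a
mov followed by an add. The function name lower_add, the style names
and the output syntax are all assumptions for the example.

```python
from typing import List


def lower_add(dst: str, a: str, b: str, style: str) -> List[str]:
    """Lower the generic `add dst, a, b` (dst = a + b) for a target style."""
    if style == "three_operand":
        # The target supports the generic form directly.
        return [f"add {dst}, {a}, {b}"]
    if style == "two_operand":
        # Two-operand add computes dst = dst + src.
        if dst == a:
            return [f"add {dst}, {b}"]
        if dst == b:
            # Addition is commutative, so the operands can be swapped
            # instead of overwriting b before it is read.
            return [f"add {dst}, {a}"]
        # General case: spoofed with two instructions.
        return [f"mov {dst}, {a}", f"add {dst}, {b}"]
    raise ValueError(f"unknown target style: {style}")


print(lower_add("r0", "r1", "r2", "two_operand"))
print(lower_add("r0", "r1", "r2", "three_operand"))
```

The aliasing check matters: copying a into dst first would destroy b
when dst and b are the same register.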


Certainly I can just take C and all of the various binary operations it
has plus "return", but I would like to get as many extra ones in there
as I can, if they seem to be fairly universal (so that it would be
worthwhile to write things other than language definitions in the
generic assembler). Does anyone know of any instructions like this?


As to the language itself, again I only know x86, so I am a bit worried
that there are instruction sets which are entirely different from what
I know (as different as Lisp is from C). If there is something like
this, could you please point me to some reference material on it, so I
can study it?


Also kindly note that I have never created a compiler and am only on
about page 14 of the Dragon Book, so...be gentle.


Thank you,
Chris Williams

