Re: Avoiding Branch Misprediction in Virtual Machine (VM) Portably

Eliot Miranda <eliotm@pacbell.net>
Mon, 14 May 2007 21:16:42 GMT

          From comp.compilers

Related articles
Avoiding Branch Misprediction in Virtual Machine (VM) Portably moron451-compiler1@yahoo.com (2007-05-12)
Re: Avoiding Branch Misprediction in Virtual Machine (VM) Portably anton@mips.complang.tuwien.ac.at (2007-05-13)
Re: Avoiding Branch Misprediction in Virtual Machine (VM) Portably moron451-compiler1@yahoo.com (2007-05-13)
Re: Avoiding Branch Misprediction in Virtual Machine (VM) Portably eliotm@pacbell.net (Eliot Miranda) (2007-05-14)
| List of all articles for this month |

From: Eliot Miranda <eliotm@pacbell.net>
Newsgroups: comp.compilers
Date: Mon, 14 May 2007 21:16:42 GMT
Organization: SBC http://yahoo.sbc.com
References: 07-05-044
Keywords: interpreter, VM
Posted-Date: 16 May 2007 03:03:24 EDT

moron451-compiler1@yahoo.com wrote:
> Hello,
>
> I've read several web sites and papers about coding virtual machines
> using a threaded code technique (Anton Ertl's papers for example). He
> recommends the use of GNU C's labels-as-values as a somewhat portable
> way to do threaded coding. However, I don't want to rely on a non-
> portable method like that. I want my VM to be 100% ANSI C. That
> seems to only leave me with the "Giant Switch" method. However... I
> recently thought about a different method that still uses the switch
> statement, but instead of one giant switch, it uses one for each
> instruction (see below). I was wondering if anyone can tell me if
> this is a good idea or not.
>


[clever code snipped]


> [Ignoring the detail that there's no "label" keyword in C, I don't see
> the point. A reasonable compiler will recognize collapse the common code
> sequences and jumps to jumps so you'll end up with the same object code
> you would have from the giant switch. Also, if you're going to call a
> routine for each operator, you can use a much smaller loop approximately
> like this:
>
> extern int *pc:
>
> void (*op)()[] = { &add, &sub, &mul, ... };
>
> for(;;) (op[*pc++])();
>
> Or make the code a list of function pointers, so it's
>
> for(;;) (*op++)();
>
> -John]


You might also think of the extension to the vanilla list of function
pointers I used successfully a while back, i.e.


http://compilers.iecc.com/comparch/article/91-03-121


The idea is to write code that can be optimized by post-processing the
compiler-generated assembler. Using macros the code can be compiled as
vanilla C or can be post-processed.


Each operation is written as a function, e.g.


void
pushLit()
TBEGIN
          *--sp = (obj)*tcip++;
TEND


tcip is the threaded code instruction pointer, sp is the stack pointer.
        Depending on the compiler sp and tcip can be held in global
registers. The default implementations of TBEGIN and TEND are simply
#define TBEGIN {
#define TEND }


and the interpreter is John's while(1) (*tcip++)();


But when one want's direct threaded code TEND gets defined as planting
the direct threaded jump before each operation function's epilog and
TBEGIN gets defined so that the operation function's prolog an be
identified. You then compile using -S to produce assembler and edit


opfunc:
          function prolog
          code
          threaded jump next op
          function prolog


down to


opfunc:
          code
          threaded jump next op


using e.g. sed.


This code is ANSI C with the default macros but can generate efficient
direct threaded code with special purpose macros and edit scripts. Its
a horrible hack but it works.


--
The surest sign that intelligent life exists elsewhere in Calvin &
the universe is that none of it has tried to contact us. Hobbes.
--
Eliot ,,,^..^,,, Smalltalk - scene not herd



Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.