Re: Managing the JIT

Barry Kelly
Sat, 01 Aug 2009 00:05:33 +0100

          From comp.compilers

From: Barry Kelly
Newsgroups: comp.compilers
Date: Sat, 01 Aug 2009 00:05:33 +0100
Organization: Compilers Central
References: 09-07-079 09-07-093 09-07-108 09-07-113 09-07-117
Keywords: code, incremental
Posted-Date: 01 Aug 2009 16:49:52 EDT

BGB / cr88192 wrote:

> basically, pretty much any capability of the assembler is available from the
> textual interface.
> however, the textual interface provides capabilities not available if direct
> function calls were used, such as using multi-pass compaction (AKA: the
> first pass assumes all jumps/... to be full length, but additional passes
> allow safely compacting the jumps).

If the assembler function interface encoded jumps specially (which it
would need to do anyway because of fixups, such as jumps to non-local
entrypoints), it could do jump optimization and simply blit the
surrounding code.

To take a concrete example: the Delphi compiler has a built-in assembler
which can use limited Delphi syntax to access global symbols (vars and
procs), record field offsets, that kind of thing, directly in asm
expressions. The built-in assembler translates the instructions directly
into machine code with corresponding fixups, except for jumps, which it
records in a higher-level branch format.

The normal compiler's code generation similarly produces machine code,
fixups and branches. Built-in assembler code can be embedded in the
middle of a normal procedure, so it's important that both produce the
same format.

Then, branches can be optimized - to the point of conflating / shrinking
/ inverting / eliminating chained and adjacent branches - but the only
work needed is blitting blobs of machine code and adjusting fixup
offsets, so high-level data about assembler format isn't needed.
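As a rough sketch of that idea in C: the Chunk layout and the function
names here are invented for illustration and are not Delphi's actual
data structures. Each chunk is an opaque blob of machine code, optionally
followed by an unconditional jump to another chunk. The layout pass
starts every jump in its short form (EB rel8) and widens it to the long
form (E9 rel32) only when the displacement doesn't fit; emission is then
just memcpy plus rebasing fixup offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical unit of generated code: a blob plus an optional jump. */
typedef struct {
    const uint8_t *bytes;  /* opaque, already-encoded machine code */
    size_t len;
    int fixup;             /* offset of a linker slot inside bytes, or -1 */
    int target;            /* index of the chunk jumped to afterwards, or -1 */
    int is_long;           /* set by LayoutChunks */
    size_t addr;           /* final offset of this chunk, set by LayoutChunks */
} Chunk;

static size_t JumpSize(const Chunk *c)
{
    if (c->target < 0) return 0;
    return c->is_long ? 5 : 2;          /* E9 rel32 or EB rel8 */
}

/* Assign addresses, widening short jumps until nothing changes.
   Jumps only ever grow, so the loop terminates. Returns total size. */
size_t LayoutChunks(Chunk *c, size_t n)
{
    size_t total, i;
    int changed;

    for (i = 0; i < n; i++) c[i].is_long = 0;
    do {
        changed = 0;
        total = 0;
        for (i = 0; i < n; i++) {
            c[i].addr = total;
            total += c[i].len + JumpSize(&c[i]);
        }
        for (i = 0; i < n; i++) {
            long disp;
            if (c[i].target < 0 || c[i].is_long) continue;
            /* rel8 is relative to the end of the 2-byte jump */
            disp = (long)c[c[i].target].addr
                 - (long)(c[i].addr + c[i].len + 2);
            if (disp < -128 || disp > 127) { c[i].is_long = 1; changed = 1; }
        }
    } while (changed);
    return total;
}

/* Blit the blobs, encode the jumps, and rebase fixup offsets.
   Returns the number of fixup offsets written to fixups[]. */
size_t EmitChunks(const Chunk *c, size_t n, uint8_t *out, size_t *fixups)
{
    size_t nfix = 0, i;

    for (i = 0; i < n; i++) {
        uint8_t *p = out + c[i].addr;
        long disp;
        int k;

        if (c[i].len > 0) memcpy(p, c[i].bytes, c[i].len);
        if (c[i].fixup >= 0) fixups[nfix++] = c[i].addr + (size_t)c[i].fixup;
        if (c[i].target < 0) continue;

        p += c[i].len;
        disp = (long)c[c[i].target].addr
             - (long)(c[i].addr + c[i].len + JumpSize(&c[i]));
        if (c[i].is_long) {
            *p++ = 0xE9;                 /* jmp rel32, little-endian */
            for (k = 0; k < 4; k++)
                *p++ = (uint8_t)((uint32_t)disp >> (8 * k));
        } else {
            *p++ = 0xEB;                 /* jmp rel8 */
            *p = (uint8_t)disp;
        }
    }
    return nfix;
}
```

Nothing in either pass looks inside the blobs, which is the point: the
optimizer needs branch records and fixup offsets, not knowledge of the
instruction encoding. A real implementation would also handle
conditional jumps and the conflating / inverting cases.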

> what does the binary interface buy you?...

Speed and memory - elimination of a whole pass, both emitting and
parsing the intermediate text.

> note that wrapping every single opcode with a function would likely be far
> more work than writing most of the assembler.

Opcodes have patterns; addressing modes similarly have patterns, and
usually apply in the same way to a subset of the opcodes. So one only
needs a simple interface, along the lines of:

    // etc.
    GenEffectiveAddress // i.e. addressing mode like r/m & sib on x86
    GenFixup // for linker to keep track of
    GenBranch // for jump optimization to keep track of

This kind of low-level interface doesn't need more than a couple of
hundred lines of C, including implementation, if even that.
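To give a feel for the scale, here is a minimal sketch for 32-bit x86.
Only the three Gen* names above come from this post; the Emitter struct,
GenByte, GenDword, the parameter lists and the fixed-size buffers are
assumptions made up for the example. GenEffectiveAddress covers just the
[base + disp] forms, with no index register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI };   /* x86 register numbers */
enum { CODE_MAX = 256, NOTE_MAX = 16 };            /* arbitrary sketch limits */

/* A position in the code plus an id: a symbol for fixups, a label for branches. */
typedef struct { size_t offset; int id; } Note;

typedef struct {
    uint8_t code[CODE_MAX];  size_t len;
    Note fixups[NOTE_MAX];   size_t nfixups;    /* for the linker */
    Note branches[NOTE_MAX]; size_t nbranches;  /* for jump optimization */
} Emitter;

void GenByte(Emitter *e, uint8_t b)
{
    assert(e->len < CODE_MAX);
    e->code[e->len++] = b;
}

void GenDword(Emitter *e, uint32_t v)
{
    int k;
    for (k = 0; k < 4; k++) GenByte(e, (uint8_t)(v >> (8 * k)));  /* little-endian */
}

/* Emit ModRM (plus SIB and displacement if needed) for [base + disp]. */
void GenEffectiveAddress(Emitter *e, int reg, int base, int32_t disp)
{
    int mod;
    if (disp == 0 && base != EBP)         mod = 0;  /* [base]; EBP has no mod=00 form */
    else if (disp >= -128 && disp <= 127) mod = 1;  /* [base + disp8] */
    else                                  mod = 2;  /* [base + disp32] */

    GenByte(e, (uint8_t)((mod << 6) | ((reg & 7) << 3) | (base & 7)));
    if (base == ESP) GenByte(e, 0x24);              /* SIB: no index, base = ESP */
    if (mod == 1)      GenByte(e, (uint8_t)disp);
    else if (mod == 2) GenDword(e, (uint32_t)disp);
}

/* Reserve a 32-bit slot and note where it is so the linker can patch it. */
void GenFixup(Emitter *e, int symbol)
{
    assert(e->nfixups < NOTE_MAX);
    e->fixups[e->nfixups].offset = e->len;
    e->fixups[e->nfixups].id = symbol;
    e->nfixups++;
    GenDword(e, 0);
}

/* Record a jump at the current position without encoding it yet;
   short vs. long form is chosen later by the jump optimizer. */
void GenBranch(Emitter *e, int label)
{
    assert(e->nbranches < NOTE_MAX);
    e->branches[e->nbranches].offset = e->len;
    e->branches[e->nbranches].id = label;
    e->nbranches++;
}
```

With this, "mov eax, [ebp+8]" is GenByte(&e, 0x8B) followed by
GenEffectiveAddress(&e, EAX, EBP, 8), which produces the bytes 8B 45 08.
The same helper serves every opcode that takes an r/m operand, which is
why the interface stays small.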

> printf-like interface, as a very convenient and usable way to drive the
> assembler...

Convenience doesn't always add up to performance. Of course, for a
compiler like Delphi, compilation speed is a key priority.

> the overall performance difference either way is likely to be small, as in
> this case, the internal processing is likely to outweigh the cost of parsing
> (figuring out which opcode to use, ...).

The hot path for lexing, parsing, optimizing, codegen'ing and assembling
of a chunk of source text need never blow the CPU cache, if you're
careful. I find it hard to see the same kind of efficiency coming from
an intermediate text format.

-- Barry

