|Facts about the Java class file format firstname.lastname@example.org (Markus Pilz) (1998-10-17)|
|Re: Facts about the Java class file format email@example.com (Tim Harris) (1998-10-21)|
|Re: Facts about the Java class file format jgm@CS.Cornell.EDU (Greg Morrisett) (1998-10-24)|
|Re: Facts about the Java class file format firstname.lastname@example.org (Stefan Monnier) (1998-10-30)|
|Re: Facts about the Java class file format Jan.Vitek@cui.unige.ch (1998-10-30)|
|Re: Facts about the Java class file format email@example.com (1998-10-30)|
|Re: Facts about the Java class file format firstname.lastname@example.org (1998-11-01)|
|Re: PowerPC CodePack (Was: Facts about the Java class file format) email@example.com (1998-11-06)|
|From:||Tim Harris <firstname.lastname@example.org>|
|Date:||21 Oct 1998 01:38:33 -0400|
|Organization:||University of Cambridge Computer Laboratory|
Markus Pilz <email@example.com> wrote:
> o The theoretical minimum average number of bits needed to encode the
> opcode is 4 bits instead of the 8 or 16 used today.
If my understanding of section 4.3 of your report is correct then this
figure has been calculated from the mean, across the 4016 classes
which you studied, of the mean number of bits of information in each
opcode. This is 3.46 bits, so if I recall correctly, a corollary
of the source coding theorem says that a prefix code can be
constructed with binary code words of (at most) mean length 4.
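
To make the entropy argument concrete, here is a minimal sketch that computes the empirical entropy H of an opcode distribution and the mean code word length L of a Huffman (prefix) code built for it. The opcode counts are invented for illustration and are not taken from the report. The general guarantee for an optimal prefix code is H <= L < H + 1, so the achieved mean length depends on the actual distribution.

```python
import heapq
import math
from collections import Counter

def entropy_bits(counts):
    """Shannon entropy, in bits per symbol, of an empirical distribution."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def huffman_lengths(counts):
    """Return {symbol: code word length in bits} for a Huffman code."""
    if len(counts) == 1:
        return {sym: 1 for sym in counts}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(n, i, {sym: 0}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in a.items()}
        merged.update({s: d + 1 for s, d in b.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def mean_code_length(counts, lengths):
    """Mean code word length in bits, weighted by symbol frequency."""
    total = sum(counts.values())
    return sum(n * lengths[s] for s, n in counts.items()) / total

# Hypothetical opcode counts for a single class file.
opcodes = Counter({"aload_0": 40, "getfield": 20, "invokevirtual": 15,
                   "iload": 10, "return": 8, "ifeq": 4, "goto": 3})
H = entropy_bits(opcodes)
L = mean_code_length(opcodes, huffman_lengths(opcodes))
print(f"entropy H = {H:.2f} bits, Huffman mean length L = {L:.2f} bits")
```

Note that this code is tailored to one distribution, which is exactly the per-class situation discussed next.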
However, I am not clear about how useful this result is in practice
since (assuming I understand your report correctly) the mean length of
4 would be achieved by tailoring the encoding to each class file,
rather than being a fixed encoding which is used for all class files.
Given that many classes have been observed to be small and do not
contain many bytecode operations, this seems to raise the problem of
how to distribute or describe the encoding used in a particular case.
Defining the encoding explicitly before the bytecode data would
offset some of the benefit of the more compact representation that it
provides. Similarly, the start-up effects of a scheme like LZ or the
compressed parse trees used with slim binaries may limit how closely
they can approach the theoretical minimum. I would be interested
to see how close a practical scheme can come!
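
The overhead concern can be illustrated with a rough size model. The sketch below compares a plain 8-bit-per-opcode encoding with a per-class entropy code whose table must be shipped alongside the data. The table cost of 12 bits per distinct opcode (8 for the opcode, 4 for its code length) is an assumption chosen for illustration, as are the opcode counts; the payload uses the ideal bound of n * H bits, which a real coder would only approach.

```python
import math
from collections import Counter

def ideal_payload_bits(counts):
    """Lower bound on the coded payload: n * H bits for n opcodes."""
    total = sum(counts.values())
    return -sum(n * math.log2(n / total) for n in counts.values())

def table_bits(counts, bits_per_entry=12):
    """Assumed cost of describing the code: one entry per distinct opcode.
    The 12 bits per entry is an illustrative guess, not a measured figure."""
    return len(counts) * bits_per_entry

def per_class_total_bits(counts):
    """Payload plus the table that describes the per-class code."""
    return ideal_payload_bits(counts) + table_bits(counts)

def fixed_byte_bits(counts):
    """Size with the ordinary one-byte-per-opcode encoding."""
    return 8 * sum(counts.values())

# Hypothetical small class: 10 opcodes, 7 of them distinct.
small = Counter({"aload_0": 3, "getfield": 2, "invokespecial": 1, "return": 1,
                 "areturn": 1, "iconst_0": 1, "putfield": 1})
# The same mix, 100 times as many instructions.
large = Counter({op: n * 100 for op, n in small.items()})

for name, counts in (("small", small), ("large", large)):
    print(name, round(per_class_total_bits(counts)), "bits tailored vs",
          fixed_byte_bits(counts), "bits fixed")
```

Under these assumptions the tailored code loses on the small class, where the table dominates, and wins only once there are enough instructions to amortise it.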
A (clearly less thorough!) examination of about fifty class files
seemed to indicate that 6 bits could be required when the same
encoding was used for all of the files (presumably as a consequence of
the larger total number of distinct operations seen and the fact that the
frequently used bytecodes differ somewhat between classes).
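
One way to see why a shared encoding costs more: coding a class with a code designed for the pooled distribution can never beat a code designed for that class alone (Gibbs' inequality). The sketch below uses two invented classes with deliberately different opcode mixes; the counts are hypothetical and only illustrate the direction of the effect, not the 6-bit figure above.

```python
import math
from collections import Counter

def entropy_bits(counts):
    """Bits per opcode with an ideal code tailored to `counts`."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def cross_entropy_bits(counts, model_counts):
    """Bits per opcode when `counts` is coded with an ideal code
    designed for the distribution given by `model_counts`."""
    total = sum(counts.values())
    model_total = sum(model_counts.values())
    return -sum((n / total) * math.log2(model_counts[s] / model_total)
                for s, n in counts.items())

# Two hypothetical classes that favour different opcodes.
class_a = Counter({"aload_0": 6, "getfield": 4, "areturn": 2})
class_b = Counter({"iload": 5, "iadd": 4, "istore": 2, "ireturn": 1})
pooled = class_a + class_b  # distribution a single shared code would target

for name, c in (("A", class_a), ("B", class_b)):
    print(f"class {name}: tailored {entropy_bits(c):.2f} bits, "
          f"shared {cross_entropy_bits(c, pooled):.2f} bits per opcode")
```

The gap between the two figures grows as the classes' frequently used opcodes diverge.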
Something else that came to mind while reading the report is whether
there is much to be gained by analyzing the values which occur as
operands to bytecode operations -- for example whether there is a useful
dominance of low values among the operands of aload/iload/etc.
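
A measurement along these lines could start from a simple tally. The sketch below assumes the bytecode has already been decoded into (mnemonic, operand) pairs; that decoding step is not shown, and the sample instructions are invented. For context, the JVM instruction set already folds local variable slots 0 to 3 into one-byte forms such as aload_0 and iload_3, so the open question is mainly about the operands that remain explicit.

```python
from collections import Counter

def operand_histogram(instructions, mnemonics):
    """Count operand values for the selected mnemonics only."""
    return Counter(operand for op, operand in instructions if op in mnemonics)

def fraction_below(hist, limit):
    """Share of tallied operands whose value is strictly below `limit`."""
    total = sum(hist.values())
    if total == 0:
        return 0.0
    return sum(n for value, n in hist.items() if value < limit) / total

# Hypothetical decoded instruction stream: (mnemonic, operand) pairs.
sample = [("iload", 4), ("iload", 5), ("aload", 4), ("iload", 4),
          ("bipush", 100), ("aload", 6), ("iload", 17), ("aload", 4)]

hist = operand_histogram(sample, {"aload", "iload"})
print("operand counts:", dict(hist))
print("share of slots below 8:", fraction_below(hist, 8))
```

If most operands fall below a small limit, a shorter operand field with an escape for rare large values might pay off, though that would need real class files to confirm.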
[It also occurs to me that small size is important when you're
transferring a Java app, but less important when you're running it.
Netscape ships their Java code in zip files, where it's typically
compressed by about 50%. How much better than that is anyone likely
to do and still have a format that's useful for execution? -John]