Re: Internal Representation of Strings

"Bartc" <bartc@freeuk.com>
Mon, 16 Feb 2009 11:41:36 GMT

          From comp.compilers


From: "Bartc" <bartc@freeuk.com>
Newsgroups: comp.compilers
Date: Mon, 16 Feb 2009 11:41:36 GMT
Organization: Compilers Central
References: 09-02-051 09-02-068 09-02-078
Keywords: storage, design, comment
Posted-Date: 17 Feb 2009 15:59:20 EST

"Tony" <tony@my.net> wrote in message
> "Chris F Clark" <cfc@shell01.TheWorld.com> wrote in message
>>> What are some good ways/concepts of internal string representation?
>>
>> One such pointer is this (old but valuable) paper by Paul Abrahams:
>>
>> http://delivery.acm.org/10.1145/60000/51610/p61-abrahams.pdf?key1=51610&key2=4789464321&coll=GUIDE&dl=GUIDE&CFID=21920990&CFTOKEN=76966469
>
> The replies to my OP are appreciated. At this point in my research, I
> know that [I don't want a] null-terminated implementation (maybe just
> for literals?). I'm leaning toward a Pascal-style string but with
> either a 32-bit length or 7/16/32-bit lengths.


I'm thinking of the following representation for short strings of 2 to 256
characters, designed for use as array and record elements.


The short strings have a fixed maximum size, but can contain a shorter
string within that. I'd like to store a length but don't want to sacrifice a
byte (the string might only be 8 chars for example). So I store the length
using the final two bytes as follows, for a string with 8 bytes:


0,0    Length is 0
0,N    Length is N (1 to 6)
X,0    Length is 7 (X is a nonzero character)
X,Y    Length is 8 (X and Y are nonzero characters)


(But it needs a bit of logic to translate to the Ptr+Length form used
everywhere else, and it doesn't allow embedded zeros.)
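
As a rough illustration of the translation logic, here is a minimal C sketch
of the table above, generalised from the 8-byte example to a field of any
size from 2 to 256 bytes. The names short_len and short_store are made up
for this sketch, and it assumes character bytes are never zero.

```c
#include <stddef.h>
#include <string.h>

/* Decode the length of a short string held in a fixed field of
   `size` bytes (2..256), using the final two bytes as in the table.
   Assumes character bytes are never zero. */
size_t short_len(const unsigned char *field, size_t size)
{
    unsigned char x = field[size - 2];
    unsigned char y = field[size - 1];

    if (x == 0)
        return y;            /* 0,0 -> 0;  0,N -> N (at most size-2) */
    if (y == 0)
        return size - 1;     /* X,0 -> one byte short of full */
    return size;             /* X,Y -> field completely full */
}

/* Store `len` bytes from `src` into the field. Returns 0 on success,
   -1 if the string does not fit or contains a zero byte. */
int short_store(unsigned char *field, size_t size,
                const unsigned char *src, size_t len)
{
    if (size < 2 || size > 256 || len > size)
        return -1;
    if (memchr(src, 0, len) != NULL)
        return -1;           /* embedded zeros cannot be represented */

    memset(field, 0, size);  /* also clears the next-to-last byte */
    memcpy(field, src, len);
    if (len <= size - 2)
        field[size - 1] = (unsigned char)len;
    return 0;
}
```

Getting to Ptr+Length is then just the field address plus short_len(field,
size); the cost is a couple of compares on every access.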


--
Bartc
[This sounds awfully complicated for an in-memory design. Why not just
use a four-byte length and code more compactly if needed on I/O? -John]
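
One way to read the moderator's suggestion, sketched in C: keep a plain
four-byte length in memory and shrink it only when writing out. The struct
layout, the name str_pack, and the external format (one length byte below
255, otherwise 0xFF followed by a four-byte little-endian length) are all
assumptions made for this sketch, not something proposed in the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* In-memory form: four-byte length plus pointer. */
struct str {
    uint32_t len;
    const unsigned char *ptr;
};

/* Hypothetical compact external form: lengths below 255 take one
   prefix byte; longer strings take 0xFF plus a 4-byte little-endian
   length. Returns bytes written, or 0 if `out` is too small. */
size_t str_pack(const struct str *s, unsigned char *out, size_t cap)
{
    size_t hdr = (s->len < 255) ? 1 : 5;

    if (cap < hdr || cap - hdr < s->len)
        return 0;
    if (hdr == 1) {
        out[0] = (unsigned char)s->len;
    } else {
        out[0] = 0xFF;
        out[1] = (unsigned char)(s->len & 0xFF);
        out[2] = (unsigned char)((s->len >> 8) & 0xFF);
        out[3] = (unsigned char)((s->len >> 16) & 0xFF);
        out[4] = (unsigned char)((s->len >> 24) & 0xFF);
    }
    if (s->len != 0)
        memcpy(out + hdr, s->ptr, s->len);
    return hdr + s->len;
}
```

Unlike the final-two-bytes scheme, this allows embedded zeros, at the price
of four bytes of length per string in memory.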

