Re: Internal Representation of Strings

"Tony" <tony@my.net>
Sat, 21 Feb 2009 08:10:07 -0600

          From comp.compilers

From: "Tony" <tony@my.net>
Newsgroups: comp.compilers
Date: Sat, 21 Feb 2009 08:10:07 -0600
Organization: Compilers Central
References: 09-02-051 09-02-068 09-02-078 09-02-084 09-02-090
Keywords: storage
Posted-Date: 21 Feb 2009 09:34:51 EST

"Bartc" <bartc@freeuk.com> wrote in message news:09-02-090@comp.compilers...
> "Bartc" <bartc@freeuk.com> wrote in message
> news:09-02-084@comp.compilers...
>
>> I'm thinking of the following representation for short strings 2 to 256
>> characters, designed for use as array and record elements.
>
>> So I store the length using the final two bytes as follows, for a string
>> with 8 bytes:
>>
>> 0,0 Length is 0
>> 0,N Length is N (up to 6)
>> X,0 Length is 7
>> X,Y Length is 8
>
>> [This sounds awfully complicated for an in-memory design. Why not just
>> use a four byte length and code more compactly if needed on I/O. -John]
>
> I use a 4-byte length in other places, but where a string is short, it
> does seem attractive to use all available bytes and not waste one for
> a length or terminator (and if I wanted up to 8 useable characters,
> that would mean using an odd 9-byte field).
>
> Extracting the length is perhaps 3 x86 instructions for most strings,
> maximum 5. Compared with one instruction to pick up the length from a
> regular string. (Of course I haven't actually tried it yet..)
>
> --
> Bartc
> [In a world where laptops have a gigabyte of RAM, what's the point in
> trying to save a few bytes with structures in memory? -John]
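
For reference, the 8-byte layout quoted above can be sketched in C roughly
as follows. This is only an illustration, not Bartc's code: the names FIELD,
short_store and short_length are made up here, and the sketch assumes that
strings never contain a zero byte, since a zero in either of the last two
bytes is what distinguishes the four cases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { FIELD = 8 };  /* field width in bytes (hypothetical name) */

/* Store s (0..FIELD characters, none of them zero) into a FIELD-byte
   field.  Returns 0 on success, -1 if s does not fit. */
static int short_store(unsigned char f[FIELD], const char *s)
{
    size_t n;

    assert(f != NULL && s != NULL);
    n = strlen(s);
    if (n > FIELD)
        return -1;

    memset(f, 0, FIELD);
    memcpy(f, s, n);
    if (n <= FIELD - 2)
        f[FIELD - 1] = (unsigned char)n;  /* 0,N (0,0 when empty) */
    /* n == 7: byte 6 is a character, byte 7 stays zero  -> X,0 */
    /* n == 8: bytes 6 and 7 are both characters         -> X,Y */
    return 0;
}

/* Recover the length from the final two bytes of the field. */
static size_t short_length(const unsigned char f[FIELD])
{
    if (f[FIELD - 2] == 0)
        return f[FIELD - 1];                      /* 0,0 or 0,N */
    return f[FIELD - 1] == 0 ? FIELD - 1 : FIELD; /* X,0 or X,Y */
}
```

In this sketch the length check costs at most two comparisons, and a full
8-character string uses every byte of the field with nothing left over for
a count or terminator.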


What if every node in a parse tree contains a string? Those strings are
likely to be very small, many of them a single character. It just seems
like low-overhead strings always have a place. (No, I haven't built a
compiler yet.)


Tony
[Let's say you have a gigantic parse tree with 10,000 nodes. That means
you'd have 40K of length words. Who cares? -John]


