Make an editor for a language

"Davide Marino" <dfsgm@tin.it>
27 Feb 1999 22:54:18 -0500

          From comp.compilers

From: "Davide Marino" <dfsgm@tin.it>
Newsgroups: comp.compilers
Date: 27 Feb 1999 22:54:18 -0500
Organization: TIN
Keywords: question, comment

I need to make an editor for a programming language. It should be able
to color tokens according to their lexical class. I use flex for the
lexical analysis, and I will then use yylex() and yyleng to color the
text. My problem is finding a fast way to color the text while the user
is typing. Obviously it is impossible to rescan the whole text for each
letter the user types. But there are some cases (for example, inside a
comment) where a small modification can change all the lexical
values. My idea is to keep the whole text in memory in this form:


- there is a doubly linked list with one element for each line of text,
      with links to the next and previous lines.


- each element of the list is an array of chars laid out like this, from
left to right:


              4 chars: pointer to the previous line
              4 chars: pointer to the next line
              1 char: length (in chars) of the line
              n chars: the text (of length n) of the line
              2k chars: pairs (position of the first char of the token in
                        the line, kind of the token), where k is the number
                        of tokens in the line

              N.B. it is very simple to find the start of the sequence of
              pairs: it is at offset n + 9, where n is the length of the line
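The layout above could be sketched in C roughly as follows. All names here (`line_record_new`, `line_tokens`, and so on) are hypothetical, since the post gives no code. The two 4-byte link slots are left as opaque zeroed bytes, because a native pointer is not 4 bytes on every platform; an implementation might store line indices or 32-bit offsets there instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constants: two 4-byte links plus a 1-byte length = 9 bytes. */
enum { LINK_BYTES = 4, HEADER_BYTES = 2 * LINK_BYTES + 1 };

/* Bytes needed for a record with n text chars and k (position, kind) pairs. */
size_t line_record_size(size_t n, size_t k)
{
    return HEADER_BYTES + n + 2 * k;
}

/* Allocate a zeroed record and copy the text in; the pairs are filled later. */
unsigned char *line_record_new(const char *text, size_t n, size_t k)
{
    assert(n <= 255);                       /* the length must fit in one byte */
    unsigned char *rec = calloc(1, line_record_size(n, k));
    if (rec == NULL)
        return NULL;
    rec[2 * LINK_BYTES] = (unsigned char)n; /* length byte follows the links */
    memcpy(rec + HEADER_BYTES, text, n);
    return rec;
}

size_t line_length(const unsigned char *rec)
{
    return rec[2 * LINK_BYTES];
}

unsigned char *line_text(unsigned char *rec)
{
    return rec + HEADER_BYTES;
}

/* Start of the (position, kind) pairs: offset n + 9, as noted above. */
unsigned char *line_tokens(unsigned char *rec)
{
    return rec + HEADER_BYTES + line_length(rec);
}
```

Note that the layout as described does not record k itself, so a real implementation would need some way to know where the pairs end (for example the allocation size or a terminator pair).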


In addition, there is one special line: the line where the cursor is.
This line is laid out like this:


              4 chars: pointer to the previous line
              4 chars: pointer to the next line
              1 char: actual length (in chars) of the line
              255 chars: room for the text (really of length n) of the line
              2*255 chars: room for pairs (position of the first char of the
                           token in the line, kind of the token)

              N.B. it is very simple to find the start of the sequence of
              pairs: it is at offset 255 + 9 = 264


The idea is that it is very simple to modify a line if there is spare
room in memory to add chars.
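A minimal sketch of such a fixed-capacity cursor line, assuming the 255-char limit described above. The struct and function names are hypothetical. Inserting or deleting a char is a single `memmove` within the line, with no reallocation:

```c
#include <stddef.h>
#include <string.h>

enum { MAX_LINE = 255 };

/* Hypothetical layout of the cursor line: every member is a char array,
   so the fields sit at offsets 0, 4, 8, 9 and 264. */
struct edit_line {
    unsigned char prev[4];              /* link to the previous line */
    unsigned char next[4];              /* link to the next line */
    unsigned char len;                  /* actual length n of the text */
    char text[MAX_LINE];                /* room for up to 255 chars */
    unsigned char tokens[2 * MAX_LINE]; /* room for (position, kind) pairs */
};

/* Insert c at pos. Returns 1 on success, 0 if the line is full or pos is
   past the end of the text. */
int edit_line_insert(struct edit_line *l, size_t pos, char c)
{
    if (l->len >= MAX_LINE || pos > l->len)
        return 0;
    memmove(l->text + pos + 1, l->text + pos, l->len - pos);
    l->text[pos] = c;
    l->len++;
    return 1;
}

/* Delete the char at pos. Returns 1 on success, 0 if pos is out of range. */
int edit_line_delete(struct edit_line *l, size_t pos)
{
    if (pos >= l->len)
        return 0;
    memmove(l->text + pos, l->text + pos + 1, l->len - pos - 1);
    l->len--;
    return 1;
}
```

After either call the token pairs for the line are stale and have to be recomputed; the sketch leaves that to the caller.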


The cursor is a pair (pointer to the line, position in the line).


When the user modifies a line (typing or deleting a char), the editor
finds the start of the token where the modification was made, then
calls yylex() and reads yyleng to update the pairs (start of token,
type of token), continuing until it finds that no more changes need to
be made.
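The update step described above might look like the following sketch. To keep it self-contained, a small hand-written `next_token()` stands in for the flex-generated yylex(); it and every other name here are assumptions, not part of the post. The sketch handles a single line and a stateless lexer only, so it does not address the multi-line comment case, and a scanner that uses start conditions or trailing context would need a more conservative restart point.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum kind { K_IDENT = 1, K_NUMBER, K_OTHER };
struct tok { unsigned char start, kind; };

/* Toy stand-in for yylex(): skips blanks, lexes one token starting at or
   after pos, and returns the offset just past it (0 if no token is left). */
static size_t next_token(const char *s, size_t n, size_t pos, struct tok *out)
{
    while (pos < n && s[pos] == ' ')
        pos++;
    if (pos >= n)
        return 0;
    size_t end = pos + 1;
    if (isalpha((unsigned char)s[pos])) {
        out->kind = K_IDENT;
        while (end < n && isalnum((unsigned char)s[end])) end++;
    } else if (isdigit((unsigned char)s[pos])) {
        out->kind = K_NUMBER;
        while (end < n && isdigit((unsigned char)s[end])) end++;
    } else {
        out->kind = K_OTHER;
    }
    out->start = (unsigned char)pos;
    return end;
}

/* Full scan of one line; out must have room for n tokens. */
size_t lex_line(const char *text, size_t n, struct tok *out)
{
    size_t k = 0, pos = 0, end;
    while ((end = next_token(text, n, pos, &out[k])) != 0) { k++; pos = end; }
    return k;
}

/* Incremental update after one char was inserted (delta = +1) or deleted
   (delta = -1) at edit_pos. old[] holds the tokens from before the edit,
   text is the line after the edit. Returns the new token count and stores
   the number of next_token() calls in *relexed. */
size_t relex_line(const struct tok *old, size_t nold,
                  const char *text, size_t n, size_t edit_pos, int delta,
                  struct tok *out, size_t *relexed)
{
    assert(delta == 1 || delta == -1);
    size_t first = 0, j, k, end;
    /* Restart at the last old token that begins before the edit. */
    while (first + 1 < nold && old[first + 1].start < edit_pos)
        first++;
    size_t pos = (nold > 0 && old[first].start < edit_pos) ? old[first].start : 0;
    for (k = 0; k < first; k++)
        out[k] = old[k];                    /* untouched prefix */
    size_t resync_from = edit_pos + (delta > 0 ? 1 : 0);
    struct tok t;
    j = first;
    *relexed = 0;
    while ((end = next_token(text, n, pos, &t)) != 0) {
        ++*relexed;
        out[k++] = t;
        while (j < nold && (int)old[j].start + delta < (int)t.start)
            j++;
        /* Past the edit, a new token that matches a shifted old token means
           nothing further can change: copy the rest, shifted by delta. */
        if (t.start >= resync_from && j < nold &&
            (int)old[j].start + delta == (int)t.start && old[j].kind == t.kind) {
            for (j++; j < nold; j++, k++) {
                out[k].start = (unsigned char)(old[j].start + delta);
                out[k].kind = old[j].kind;
            }
            break;
        }
        pos = end;
    }
    return k;
}
```

For example, inserting `x` after `foo` in `foo = 12 + bar` re-lexes only `foox` and `=` under this scheme; the remaining three tokens are copied with their start positions shifted by one.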


What do you think about this kind of data structure?


Is it fast enough to change colors while the user is typing?


Do you think it is possible to reduce the memory it uses?


Right now there is an overhead of:


- 9 bytes for each line of text (8 for pointers, 1 for the length of the
line)


- 2 bytes for each token (1 for start position in line, 1 for the kind)


- about 700 bytes for the current line


So a text of 1000 lines with 3000 tokens has an overhead of about
9000+6000+700=15700 bytes
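The estimate above can be written as a small helper, which makes it easy to try other file sizes. The function name is hypothetical. With the cursor-line layout exactly as described (9 + 255 + 2*255 bytes) the current line takes 774 bytes rather than the rounded 700, which would bring the total to 15774.

```c
#include <stddef.h>

/* Overhead in bytes: 9 per line (two 4-byte links plus a length byte),
   2 per token (start position plus kind), plus the cursor line buffer. */
size_t overhead_bytes(size_t lines, size_t tokens, size_t cursor_line_bytes)
{
    return 9 * lines + 2 * tokens + cursor_line_bytes;
}
```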
[People did a lot of context sensitive editors in the 1970s and early
1980s. I don't recall any good solution for this problem other than some
ad-hoc pattern matching. -John]

