Syntax analysis in real time

jacob@jacob.remcomp.fr (Jacob Navia)
22 Jul 1996 11:00:10 -0400

          From comp.compilers

Related articles
Smart textual editors gupta@csc.ti.com (1996-07-15)
Syntax analysis in real time jacob@jacob.remcomp.fr (1996-07-22)
Re: Syntax analysis in real time sanjay@src.dec.com (1996-07-23)

From: jacob@jacob.remcomp.fr (Jacob Navia)
Newsgroups: comp.compilers
Date: 22 Jul 1996 11:00:10 -0400
Organization: Compilers Central
References: 96-07-103
Keywords: parse, tools

The editor is one of the most visible tools of a compiler system. Often
neglected or taken for granted, it tends to be integrated into an 'IDE'
(Integrated Development Environment) under the Windows operating system.
See the MSVC or Borland compilers, for instance: they feature an editor
integrated with the compiler.


I'm developing an environment like that, and it can be downloaded for
free from:


ftp.cs.princeton.edu, in the lcc/contrib lcc-win32.tar.gz archive.


Keep in mind that discussions about editors tend to degenerate very quickly
into emotional arguments. KEEP COOL!


This discussion originated with the following request:
>
> I was looking for smart textual editors. By smart, I mean editors
> which might be doing data flow analysis and such things even while the
> editing is in progress so that they can point out the errors to the
> programmer. (e.g. if certain part of the code is unreachable, then it
> might give hints to the programmer etc.)


> [I've seen plenty of syntax editors, none of which seemed to me to be worth
> the grief they cause. (Many common editing operations are hard to express
> in syntactical terms, e.g. moving parens around.) But I've never seen one
> that tries to analyze the semantics of the code you're typing. -John]


I replied to that with:


|> Having implemented a syntax analyzing editor, I think the editor you want
|> is out of the question for the foreseeable future (2-3 years...).


|> My editor limits its job to displaying keywords in a different color from
|> normal program text, and to highlighting comments as well. The updates
|> are done in real time, of course.


One reader asked:
> What platform are you working on? I think most Unix editors do this.
> (Emacs and xemacs, at least, do, and I've been told that the latest
> releases of vim do as well.)
>
Well, not quite. There is a BIG difference between having a semi-batch
mode and really doing real-time syntax analysis. I'm working on Windows
3.11, Windows NT, and Windows 95.


|> This simple analysis is very difficult to do in real time: If you happen
|> to type a '/' just before a '*' all the text until the end of the file
|> will be a comment, and has to be changed (redrawn). The editor has to
|> scan each character you type looking for 'interesting' ones, like '/', to
|> avoid rescanning the whole file at each character typed.
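
To make the idea of 'interesting' characters concrete, here is a minimal
sketch in C. The function name needs_rescan and its three-character
interface are hypothetical and are not taken from the editor described in
this post. It only answers the question "could this keystroke open or close
a comment or a string?", and it is deliberately conservative.

```c
#include <stdbool.h>

/* Hypothetical helper (an assumption, not the editor's real code).
 * `typed` has just been inserted between `prev` and `next`.
 * Returns true when the keystroke could open or close a comment or a
 * literal, so that text beyond the current token may need recoloring. */
bool needs_rescan(char prev, char typed, char next)
{
    switch (typed) {
    case '/':
        /* "/" next to "*" can open or close a block comment;
         * "/" next to "/" can start a line comment. */
        return prev == '*' || next == '*' || prev == '/' || next == '/';
    case '*':
        /* "*" next to "/" can open or close a block comment. */
        return prev == '/' || next == '/';
    case '"':
    case '\'':
        return true;    /* a quote toggles string/character state */
    case '\\':
        return true;    /* may escape a quote or continue a line */
    case '\n':
        return true;    /* a newline ends a // comment */
    default:
        return false;   /* ordinary character: no rescan needed */
    }
}
```

With this filter, typing the '/' in "a/b" costs nothing extra, while typing
a '/' directly before a '*' triggers the recoloring. Deletions would need a
similar test on the characters that become adjacent.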


> You need some heuristics. If you insert a /* in a line containing text,
> xemacs will only mark the comment to the end of the line. If you insert
> a /* on an empty line, xemacs will not mark anything until you enter the
> */.


Well, this is precisely the difference. Syntax analysis should help the
programmer avoid compile-time errors, such as a piece of code that is
unintentionally commented out (or the opposite). The color should change
instantaneously, so that the user immediately sees the consequences of
inserting a '/' and a '*'. The heuristics you mention are of course easier
for xemacs to implement, but they fail to convey the information you need
immediately.


> Xemacs knows quite a bit about the language, which it uses to do some
> very good automatic indenting. You can fool it, but it doesn't happen
> too much in actual practice.


Here I decided the other way around: I thought indenting should be a batch
process, and my editor will indent the program text only when you ask it
to do so. The only indentation done while you type is:
    A: A new line is indented to the same column as the preceding line.
    B: If you type an opening brace and then return, the indentation
          increases by one tab.
    C: If you type a closing brace and then return, the indentation
          decreases by one tab.
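
The three rules above can be sketched as a single C function. This is only
an illustration under assumptions: the name next_indent, the idea of
measuring indentation in columns, and the tab_width parameter (which must
be greater than zero) are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

static int is_blank(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/* Hypothetical helper: given the line the user just finished, return the
 * column at which the next line should start. tab_width must be > 0. */
int next_indent(const char *line, int tab_width)
{
    int indent = 0;
    size_t i = 0;

    /* Rule A: start from the indentation of the preceding line. */
    for (; line[i] == ' ' || line[i] == '\t'; i++) {
        if (line[i] == '\t')
            indent += tab_width - indent % tab_width;  /* next tab stop */
        else
            indent++;
    }

    /* Find the last non-blank character after the indentation. */
    size_t end = strlen(line);
    while (end > i && is_blank(line[end - 1]))
        end--;

    if (end > i) {
        char last = line[end - 1];
        if (last == '{')                /* Rule B: one tab deeper */
            indent += tab_width;
        else if (last == '}')           /* Rule C: one tab shallower */
            indent = indent >= tab_width ? indent - tab_width : 0;
    }
    return indent;
}
```

Only the last character of the line is examined, so a brace followed by a
trailing comment is not recognized. That keeps the per-keystroke work tiny,
which fits the view that real indenting belongs in a batch pass.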


|> The algorithms I used were based on simple heuristics: I limited myself
|> to the current screenful of text, keeping a pointer to the comment just
|> before the start of the text (if there was any). I rescanned the text
|> starting there, since the user can't modify anything outside the
|> currently displayed screen. This had a certain impact on the
|> responsiveness of the editor for fast typists. With the progress of
|> hardware, more extensive analysis becomes possible.
|> hardware, more extensive analysis becomes possible.
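
A minimal sketch of that screen-limited rescan, in C, might look as follows.
It assumes the editor already knows whether the first visible character
lies inside a /* */ comment that began above the screen; the function
color_screen, the NORMAL/COMMENT codes and the colors array are
hypothetical names. Strings and // comments are left out to keep the
sketch short.

```c
#include <stdbool.h>
#include <stddef.h>

enum { NORMAL = 0, COMMENT = 1 };   /* assumed color codes */

/* Hypothetical sketch: classify each of the `len` visible characters in
 * `text` as NORMAL or COMMENT. `in_comment` says whether the screen starts
 * inside a block comment opened above it. Returns the comment state at
 * the end of the screen. */
bool color_screen(const char *text, size_t len, bool in_comment,
                  unsigned char *colors)
{
    size_t i = 0;

    while (i < len) {
        if (in_comment) {
            colors[i] = COMMENT;
            if (text[i] == '*' && i + 1 < len && text[i + 1] == '/') {
                colors[i + 1] = COMMENT;    /* the closing slash */
                i += 2;
                in_comment = false;
                continue;
            }
            i++;
        } else if (text[i] == '/' && i + 1 < len && text[i + 1] == '*') {
            colors[i] = colors[i + 1] = COMMENT;
            i += 2;
            in_comment = true;
        } else {
            colors[i] = NORMAL;
            i++;
        }
    }
    return in_comment;
}
```

The work per keystroke is bounded by the size of the screen, not by the
size of the file. The returned state tells the caller whether the text
below the screen is now inside a comment.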


> I believe the basic xemacs heuristic involves being able to find the
> beginning of a function by scanning backwards, and starting its syntax
> analysis there.


I try to avoid scanning backwards, since it is rather difficult. The
complication arose when the // comment was introduced. This introduced an
asymmetry in the scanning of C program text: the parser must first test
whether there is a // somewhere to know where the comments are. But it
should ignore any '//' that appears within a character string, so first we
have to find the character strings, etc.
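
Scanning forward makes this ordering problem manageable. The following C
sketch finds the first '//' on a line that is not inside a string or
character literal. The name find_line_comment is hypothetical, and the
sketch assumes the line neither starts inside a /* */ comment nor contains
one.

```c
#include <stddef.h>

/* Hypothetical helper: return the index of the first "//" in `line` that
 * lies outside string and character literals, or -1 if there is none.
 * Assumes the line contains no block comments. */
long find_line_comment(const char *line)
{
    char quote = 0;     /* 0 outside a literal, else the opening quote */

    for (size_t i = 0; line[i] != '\0'; i++) {
        char c = line[i];

        if (quote) {
            if (c == '\\' && line[i + 1] != '\0')
                i++;                    /* skip the escaped character */
            else if (c == quote)
                quote = 0;              /* literal ends here */
        } else if (c == '"' || c == '\'') {
            quote = c;                  /* literal starts here */
        } else if (c == '/' && line[i + 1] == '/') {
            return (long)i;             /* a real line comment */
        }
    }
    return -1;
}
```

For the line  s = "http://x";  the function returns -1, because the two
slashes sit inside the string. Doing the same job from right to left would
require knowing, at each quote, whether it opens or closes a literal, which
is exactly the information a backward scan does not have.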


|> Just before I save a file, I check for obvious syntax errors (mismatched
|> parentheses and such). I am thinking of doing this in real time too. This
|> would be doable.


> Are there editors which don't show matching parentheses in real time?
> Even the stone age vi does. The current version of xemacs (and I
> suppose emacs) even handle parentheses in quotes or comments correctly
> when showing the match.


I'm sorry, but 'vi' doesn't understand anything about C syntax. It will
accept parentheses that appear within character strings, and so on.
Here, as I said in my mail, the problem is not that it is difficult, but
how to show the correspondence to the user. I had a feature that put the
cursor at the position of the matching parenthesis/bracket, but I had to
take it away because people found it distracting or simply annoying. I
have now opted for a 'batch' mode, i.e. the F11 function key will match


o parentheses
o brackets
o braces
o #ifdef/#endif


only when the user is interested in this functionality, and NOT by default.
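
As an illustration of such an on-demand check, here is a minimal C sketch
that matches parentheses, brackets and braces while skipping strings,
character constants and both kinds of comments. It is not the code behind
F11: the function first_mismatch and the MAX_DEPTH limit are assumptions
made for this example, and #ifdef/#endif matching is left out.

```c
#include <stddef.h>

#define MAX_DEPTH 256   /* assumed nesting limit for this sketch */

/* Hypothetical batch check: return the index of the first bracket that
 * has no partner, or -1 if everything matches. If openers remain at the
 * end of the text, the innermost unclosed one is reported. */
long first_mismatch(const char *s)
{
    size_t pos[MAX_DEPTH];      /* positions of the open brackets */
    char want[MAX_DEPTH];       /* closer expected for each of them */
    int depth = 0;
    size_t i = 0;

    while (s[i] != '\0') {
        char c = s[i];

        if (c == '/' && s[i + 1] == '*') {          /* block comment */
            i += 2;
            while (s[i] != '\0' && !(s[i] == '*' && s[i + 1] == '/'))
                i++;
            if (s[i] != '\0')
                i += 2;
            continue;
        }
        if (c == '/' && s[i + 1] == '/') {          /* line comment */
            while (s[i] != '\0' && s[i] != '\n')
                i++;
            continue;
        }
        if (c == '"' || c == '\'') {                /* literal */
            i++;
            while (s[i] != '\0' && s[i] != c) {
                if (s[i] == '\\' && s[i + 1] != '\0')
                    i++;                            /* escaped char */
                i++;
            }
            if (s[i] != '\0')
                i++;
            continue;
        }
        if (c == '(' || c == '[' || c == '{') {
            if (depth == MAX_DEPTH)
                return (long)i;                     /* too deep: stop */
            pos[depth] = i;
            want[depth] = c == '(' ? ')' : c == '[' ? ']' : '}';
            depth++;
        } else if (c == ')' || c == ']' || c == '}') {
            if (depth == 0 || want[depth - 1] != c)
                return (long)i;                     /* stray closer */
            depth--;
        }
        i++;
    }
    return depth > 0 ? (long)pos[depth - 1] : -1;
}
```

Because the whole buffer is scanned forward in one pass, a parenthesis
inside a string or a comment never counts, and the editor only has to move
the cursor to the returned position when the user asks for the check.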
The editor should be TRANSPARENT. It should give all complex functionality
that the user needs, only when he wants it. My editor does complexity
analysis (McCabe, Halstead), and could do it in real time. But this would
distract the user, so it will do it only when you ask for it.


--
Jacob Navia Logiciels/Informatique
41 rue Maurice Ravel Tel (1) 48.23.51.44
93430 Villetaneuse Fax (1) 48.23.95.39
France
--

