[ALMOST PATCHES] for oddities in the lex'r code. from Squeak on 1999-06-14 (genius-list)

From: Squeak <squeak_at_xirr.com>
Date: Mon, 14 Jun 1999 09:57:22 -0500 (CDT)

*
in lexer.l: line 72: ^[ ]*load[ ]+<([^>]|\\>)*>[ ]*$ {
suggest: ^[ \t]*load[ \t]+(<([^>]|\\>)*>[ \t]*)+$ {
with appropriate support code to handle multiple file names,
since the following matches line 107 instead.

genius> load <sys1.gel> <sys2.gel>
line 2: Can't open file: '<sys1.gel>'
line 2: Can't open file: '<sys2.gel>'

see below for more complete fix
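
A rough sketch of what the "appropriate support code" might look like, in
plain C. The helper name split_sys_names is made up for illustration and is
not in genius; it assumes the matched text is handed over as an ordinary C
string and that "\>" is meant to keep a '>' from ending a name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of genius: collect every <name> group
 * from the text of a load line. Each name is returned as a freshly
 * malloc'd string without the angle brackets; the caller frees them.
 * Returns the number of names stored, at most max. A backslash keeps
 * the following character from ending the name (the backslash itself
 * stays in the result).
 */
static int split_sys_names(const char *text, char **names, int max)
{
    int count = 0;

    assert(text != NULL && names != NULL);
    while (count < max) {
        const char *start = strchr(text, '<');
        const char *end;
        size_t len;

        if (start == NULL)
            break;
        start++;                        /* skip the '<' */
        end = start;
        while (*end != '\0' && *end != '>') {
            if (*end == '\\' && end[1] != '\0')
                end++;                  /* step over the escaped character */
            end++;
        }
        if (*end != '>')
            break;                      /* unterminated name: stop here */
        len = (size_t)(end - start);
        names[count] = malloc(len + 1);
        if (names[count] == NULL)
            break;
        memcpy(names[count], start, len);
        names[count][len] = '\0';
        count++;
        text = end + 1;                 /* continue after the '>' */
    }
    return count;
}
```

The action for the load rule could then call something like
split_sys_names(yytext, names, max) and load each returned name in turn.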

*
in lexer.l:
for the load commands, the lexical analyzer is doing a fairly full grammar
check. This doesn't look all that swell, but more importantly it makes for
some small oddities in the language.

genius> load x;load y
line 2: Can't open file: 'x;load'
line 2: Can't open file: 'y;load'

Very low priority to fix I'd say, but I would think load would be a
keyword that took arguments of token type STRING:

lexer.l:
        /* each action copies the match without its two delimiters */
load { return (LOAD);}
"<"([^>]|\\>)*">" { char *s; s=strdup(yytext+1); s[strlen(s)-1]=0;
                        yylval.id=s; return (SYSSTRING); }
\"([^"]|\\\")*\" { char *s; s=strdup(yytext+1); s[strlen(s)-1]=0;
                        yylval.id=s; return (STRING); }
'([^']|\\')*' { char *s; s=strdup(yytext+1); s[strlen(s)-1]=0;
                        yylval.id=s; return (STRING); }

Then in parse.y:

loadcommand: LOAD {}
| loadcommand STRING { do_load_command($2); }
| loadcommand SYSSTRING { do_system_load_command($2); }

That way you'd handle arbitrarily long file lists and be able to embed it
in functions blah blah.

Even if no one needs to use loads in functions or multiple loads on a
line, it would still be cleaner code to treat a load command as a keyword
with arguments of a certain type, rather than a single lexical token.
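
The delimiter stripping in the string rules can be checked on its own. This
is only a sketch: strip_delims is a hypothetical name, and it assumes the
intent is to drop exactly the first and last character of the matched text
(the '<' and '>', or the quotes).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy a matched token such as <sys.gel> or "x.gel"
 * without its first and last character. Returns a malloc'd string that
 * the caller must free, or NULL if allocation fails or the token is
 * shorter than two characters.
 */
static char *strip_delims(const char *token)
{
    size_t len;
    char *s;

    assert(token != NULL);
    len = strlen(token);
    if (len < 2)
        return NULL;
    s = malloc(len - 1);            /* len - 2 characters plus the NUL */
    if (s == NULL)
        return NULL;
    memcpy(s, token + 1, len - 2);
    s[len - 2] = '\0';
    return s;
}
```

Every string handed to the parser this way is heap allocated, so whichever
grammar action consumes it would also be the place to free it.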

Oh yeah and my malloc happy code reminds me: I think there are some memory
leaks in the lexer yacc'r. Mine has tons of em, so obviously I haven't
found a way around them, and they only happen as fast as the user
actually types them, so typically this shouldn't be a problem.

*
in lexer.l: line 120: ";" { DO_RET; return SEPAR; }
suggest: ";" {DO_RET; return ';';}
in parse.y: line 65: %token SEPAR EQUALS
suggest: %token EQUALS
in parse.y: line 72: %left SEPAR
suggest: %left ';'
in parse.y: line 102,107, and 108:
suggest replacing "SEPAR" with "';'"
since the bison manual suggests using single characters as their own tokens.

*
in lexer.l:
Some patterns contain literal tabs, which are hard to read and prone to
conversion errors. It also contains the rule "[ \t][ \t]*", which is
equivalent to "[ \t]+" but not as easy to read, along with some
small redundancies in that area. For a patch that escapes the tabs and very
slightly reorders the whitespace removal (for negligible gains in speed, but
perhaps gains in straightforwardness?) see
http://www.xirr.com/~squeak/genius-0.4.3-prettylex.diff

This one is not self-extracting (you can't just type "sh *prettylex.diff")
because of the literal tabs I guess. It works dandy as a patch file
though.
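
The equivalence of "[ \t][ \t]*" and "[ \t]+" can be spot-checked with the
POSIX regex functions. This is only a sketch: it uses regcomp/regexec
rather than flex, match_len_at_start is a made-up helper name, and the C
compiler (not the regex library) turns \t into a tab inside the pattern
strings.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Return the length of the match of an extended regular expression at
 * the very start of text, or -1 if it does not match there or the
 * pattern fails to compile.
 */
static long match_len_at_start(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    long len = -1;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, text, 1, &m, 0) == 0 && m.rm_so == 0)
        len = (long)m.rm_eo;        /* end offset of the leading match */
    regfree(&re);
    return len;
}
```

Calling it with "^[ \t][ \t]*" and with "^[ \t]+" on the same input should
give the same length for any text that starts with blanks, and -1 for both
otherwise.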
Received on Mon Jun 14 1999 - 07:28:10 CDT
