All Questions
35 questions
0
votes
0
answers
40
views
Regex Derivative DFA: This Should Work, But It's Breaking in Unexpected Ways
I’m working on a DFA-based lexer using regex derivatives for tokenizing lexemes. I've built a setup that, theoretically, should handle regex simplification and DFA transitions accurately. For the most ...
0
votes
1
answer
293
views
How does lex match tokens
I am learning lex. I made a simple lex file containing one rule:
%%
"Hello" puts("response\n");
%%
After running lex file.l, I’d like to inspect the outputted file file.yy.c. I presume that the ...
1
vote
1
answer
274
views
warning: 'extern' ignored on this declaration
In my .l file I have extern struct symbols;
I try to use it in the .y file as %union {struct symbols symp;}, but when I try to run it, this message appears: warning: 'extern' ignored on this declaration [-...
1
vote
1
answer
490
views
Rule cannot be matched in Flex (LEX) when using ^
I'm trying to run some simple regex in Flex but I cannot get the operator "^" to work as a negation.
This is my code:
%%
[^abc] printf("rule triggered\n");
.
\n
It ...
0
votes
1
answer
102
views
How to handle OR operator between STRING expressions inside parenthesis with PLY
I want to interpret sentences like:
"i + want + to + turn + ( on | off ) + the + lights"
to get sentences like:
"i want to turn on the lights"
"i want to turn off the lights"
I try defining ...
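
The expansion described in this excerpt can be sketched without PLY, as a rough illustration of the expected result rather than the asker's approach. The helper name `expand` is hypothetical, and the sketch assumes parentheses are never nested and never contain a `+`:

```python
import itertools
import re


def expand(template):
    """Expand 'a + ( x | y ) + b' into every plain sentence it describes."""
    slots = []
    for part in template.split("+"):
        part = part.strip()
        # A parenthesised part holds alternatives separated by '|'.
        group = re.fullmatch(r"\(\s*(.*?)\s*\)", part)
        if group:
            slots.append([alt.strip() for alt in group.group(1).split("|")])
        else:
            slots.append([part])
    # One sentence per combination of alternatives, in left-to-right order.
    return [" ".join(words) for words in itertools.product(*slots)]


for sentence in expand("i + want + to + turn + ( on | off ) + the + lights"):
    print(sentence)
```

A real grammar with nested groups would need a proper parser, which is where a tool such as PLY comes in.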
0
votes
1
answer
83
views
Why are some of my tokens not being recognized?
I'm trying to build a simple HTML lexer, and I've defined my tokens like this:
tokens = [
'text',
'num',
'id',
'url',
'newline',
'space',
'bigger',
'sp',
'del',
...
0
votes
1
answer
661
views
Conditionals for flex
Is it possible to place conditional statements for rules in flex? I need this in order to match a specific rule only when some condition is true. Something like this:
%option c++
%option noyywrap
%%
...
0
votes
1
answer
57
views
Scanning a language with non-delimited strings with nested tokens
I want to create a lexer/parser for a language that has non-delimited strings.
Which part of the language is a string is defined by the command preceding it.
For example it has statements that look ...
0
votes
1
answer
2k
views
Does each call to `yylex()` generate a token or all the tokens for the input?
I am trying to understand how flex works under the hood.
In the following first example, it seems that main() calls yylex() only once, and yylex() generates all the tokens for the entire input.
In ...
0
votes
1
answer
376
views
Make flex handle escaped newlines automatically
I am looking for a rule in flex that handles the escaped newlines and gives me a token ignoring that newline.
Eg:
I have a rule in my lex specification like:
\"(\.|[^\"])*\"
to capture all the ...
1
vote
1
answer
9k
views
Lex program to recognise valid arithmetic expressions and also to recognise valid identifiers and operators
The program below checks arithmetic expressions such as a+b and a-b and prints valid or invalid:
%{
#include<stdio.h>
#include<stdlib.h>
int c,d,bo=0,bc=0;
%}
operand [a-zA-Z0-9]+
...
1
vote
3
answers
8k
views
Lex program for counting the number of comment lines
This program counts the number of comment lines, both single-line and multi-line comments, and prints the total, with file.txt as input.
file.txt
//hellow world
/*hello world1*/
/*...
4
votes
1
answer
398
views
Flex: trying to generate a C++ lexer using Flex; "unrecognized rule" error
I am trying to generate a lexer using flex. This is my definition file lexer.l:
%{
#include <iostream>
using namespace std;
//#define YY_DECL extern "C" int yylex()
%}
staffType "grand" | "...
0
votes
1
answer
306
views
In a Flex Lexer, why are last chars moved to the beginning of a buffer before loading new input?
I'm trying to understand a Lexer (source) I'm porting to JavaScript and am stuck understanding how data from an input is read into a buffer. It's a standard Lexer so I'm hoping someone can give me ...
0
votes
1
answer
159
views
flex 2.5.35 gives error when ctrl-M used in lex file
I have a simple lex file.
%{
#include <stdio.h>
%}
space_char [ \t\^M]
space {space_char}+
%%
%%
int yywrap(void) {
return 1;
}
int main(void) {
yylex();
return ...
0
votes
2
answers
344
views
How to detect a partial, unfinished token and join its pieces obtained from two consecutive portions of input?
I am writing a toy terminal, where I use Flex to parse normal text and control sequences that I get from the tty. One detail of the Cocoa machinery is that it reads from the tty in chunks of 1024 bytes so that any ...
0
votes
1
answer
759
views
Python lexer lexical analysis token priority rule order dealing with ambiguities --- why STRING has priority over WORD?
I am studying lexers in the Programming Languages course by Westley Weimer.
The notes are here
https://www.udacity.com/wiki/cs262/unit-2#quiz-rule-order
{Video, if you care to watch, last 40 seconds.}
...
2
votes
1
answer
2k
views
What is the difference between t_ignore, pass and t.lexer.skip() in ply.lex?
All three can be used to skip, ignore or pass over the characters. For example:
def t_error(t):
    pass
def t_error(t):
    t.lexer.skip()
def t_default(t): # put at the extreme end and assuming there ...
0
votes
1
answer
172
views
What is the order of preference when we mix function and string type token definitions in ply.lex?
tokens = (
    'NUMBER2',
    'NUMBER1',
)
def t_NUMBER1(t):
    r'[0-9]+'
    return t
t_NUMBER2 = r'[0-9][0-9]'
If I use the above token specifications in ply.lex then which token ...
0
votes
1
answer
214
views
Explain the syntax of reserved.get(t.value,'ID') in lex.py
Code taken from ply.lex documentation: http://www.dabeaz.com/ply/ply.html#ply_nn6
reserved = {
'if' : 'IF',
'then' : 'THEN',
'else' : 'ELSE',
'while' : 'WHILE',
...
}
tokens = ['...
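
The call in the title relies on the standard `dict.get(key, default)` method, which returns the value stored under `key` or falls back to `default` when the key is absent. A minimal sketch outside PLY follows; the helper name `token_type` is hypothetical, and the dictionary holds only the four entries visible in the excerpt:

```python
# Reserved words map to their own token types; anything else is an identifier.
reserved = {
    'if': 'IF',
    'then': 'THEN',
    'else': 'ELSE',
    'while': 'WHILE',
}


def token_type(value):
    """Return the reserved-word token type, or 'ID' if value is not reserved."""
    return reserved.get(value, 'ID')


print(token_type('while'))    # WHILE
print(token_type('counter'))  # ID
```

Inside a PLY rule the same lookup is applied to `t.value`, so one identifier pattern can cover both keywords and ordinary names.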
3
votes
1
answer
4k
views
Use PLY to match a normal string
I am writing a parser using PLY. The question is similar to this one: How to write a regular expression to match a string literal where the escape is a doubling of the quote character? However, I ...
10
votes
3
answers
13k
views
How should I handle lexical errors in my Flex lexer?
I'm currently trying to write a small compiler using Flex+Bison but I'm kind of lost in terms of what to do with error handling, especially how to make everything fit together. To motivate the ...
6
votes
2
answers
9k
views
How to make lex/flex recognize tokens not separated by whitespace?
I'm taking a course in compiler construction, and my current assignment is to write the lexer for the language we're implementing. I can't figure out how to satisfy the requirement that the lexer must ...
2
votes
1
answer
3k
views
Not able to run JFlex generated lexer Java file
So I used JFlex to generate a file called Yylex.java without any problems. When I try to compile it with the command javac Yylex.java, I get 30 errors, originating with this one:
Yylex.java:13: ...
1
vote
1
answer
2k
views
Removing comments with JFlex, but keeping line terminators
I'm writing a lexical specification for JFlex (it's like flex, but for Java). I have a problem with TraditionalComment (/* */) and DocumentationComment (/** */). So far I have this, taken from JFlex User'...
1
vote
0
answers
232
views
Is there a lexer generator that can take standard lex files and produce a lexer in JavaScript?
I'm playing with CodeMirror, a browser-based editor written in JavaScript. It has a pluggable syntax highlighting component. I'd like to be able to take standard lex files for an arbitrary language ...
4
votes
1
answer
3k
views
ANTLR lexer can't lookahead at all
I have the following grammar:
rule: 'aaa' | 'a' 'a';
It can successfully parse the string 'aaa', but it fails to parse 'aa' with the following error:
line 1:2 mismatched character '<EOF>' ...
5
votes
2
answers
295
views
Define <LINE-START> and <LINE-END> in a lexer
I am trying to implement a front end which attempts to conform to a subset of this specification.
It seems that many things are clearly defined in the reference, except <LINE-START> and <...
3
votes
2
answers
642
views
Regular expression for "not belonging to" in OCaml
I would like to define non-line-termination-character = <any character other than %x000D / %x000A> in lexer.mll. I have tried let non_line_termination_character = [^('\x0D' '\x0A')], but it gave ...
1
vote
1
answer
3k
views
Clear buffers before calling YYACCEPT in yacc/lex
Is there any way to clear the parser buffers before calling YYACCEPT in yacc?
If I do not clear the buffers, it causes problems when I call yyparse a second time.
Also note that I am using some ...
6
votes
3
answers
2k
views
semicolon insertion ala google go with flex
I'm interested in adding semicolon insertion à la Google Go to my flex file.
From the Go documentation:
Semicolons
Like C, Go's formal grammar uses semicolons to terminate statements;
...
3
votes
2
answers
3k
views
C++ istream with lex
I have a working grammar (written in lex and bison) that parses polynomial expressions. It is like your standard, text-book calculator-like syntax. Here is a very simplified version of the grammar:
...
2
votes
2
answers
482
views
How to return more than one token on Bison parser?
My grammar is something like this
decl:
attributes; {/*create an object here with the attributes $$?*/ }
attributes:
|
att1 attributes {$$ = $1;}
|
att2 attributes {$$ = $1;}
|
...
2
votes
1
answer
743
views
How to write own parser for (f)lex?
I generated a lexer with flex.
[ \t\n\r\v] /* skip whitespace */
[_a-zA-Z]([_a-zA-Z]|[0-9])* printf("IDENT\n");
[0-9]+ printf("INTEGER\n");
[0-9]+\. printf("DOUBLE\n");
Now I ...
7
votes
2
answers
8k
views
Unable to compile output of lex
When I attempt to compile the output of this trivial lex program:
# lex.l
integer printf("found keyword INT");
using:
$ gcc lex.yy.c
I get:
Undefined symbols:
"_yywrap", referenced from:
...