All Questions

0 votes
0 answers
40 views

Regex Derivative DFA: This Should Work, But It's Breaking in Unexpected Ways

I’m working on a DFA-based lexer using regex derivatives for tokenizing lexemes. I've built a setup that, theoretically, should handle regex simplification and DFA transitions accurately. For the most ...
Jacques's user avatar
  • 51
0 votes
1 answer
293 views

How does lex match tokens

I am learning lex. I made a simple lex file containing one rule: %% “Hello” puts(“response\n”); %% After running lex file.l, I’d like to inspect the outputted file file.yy.c. I presume that the ...
Maslin's user avatar
  • 149
1 vote
1 answer
274 views

warning: 'extern' ignored on this declaration

In my .l file I have extern struct symbols; and I try to use it in my .y file as %union {struct symbols symp;}, but when I try to run it, this message appears: warning: 'extern' ignored on this declaration [-...
rema's user avatar
  • 13
1 vote
1 answer
490 views

Rule cannot be matched in Flex (LEX) when using ^

I'm trying to run some simple regex in Flex but I cannot get the operator "^" to work as a negation. This is my code: %% [^abc] printf("rule triggered\n"); . \n It ...
Johnny Lerec's user avatar
0 votes
1 answer
102 views

How to handle OR operator between STRING expressions inside parenthesis with PLY

I want to interpret sentences like: "i + want + to + turn + ( on | off ) + the + lights" to get sentences like: "i want to turn on the lights" "i want to turn off the lights" I try defining ...
BC Coder's user avatar
0 votes
1 answer
83 views

Why are some of my tokens not being recognized?

I'm trying to build a simple HTML lexer, and I've defined my tokens like this: tokens = [ 'text', 'num', 'id', 'url', 'newline', 'space', 'bigger', 'sp', 'del', ...
Agus's user avatar
  • 1
0 votes
1 answer
661 views

Conditionals for flex

Is it possible to place conditional statements for rules in flex? I need this in order to match a specific rule only when some condition is true. Something like this: %option c++ %option noyywrap %% ...
Liviu's user avatar
  • 152
0 votes
1 answer
57 views

Scanning a language with non-delimited strings with nested tokens

I want to create a lexer/parser for a language that has non-delimited strings. Which part of the language is a string is defined by the command preceding it. For example it has statements that look ...
Aaron L's user avatar
  • 96
0 votes
1 answer
2k views

Does each call to `yylex()` generate a token or all the tokens for the input?

I am trying to understand how flex works under the hood. In the following first example, it seems that main() calls yylex() only once, and yylex() generates all the tokens for the entire input. In ...
Tim's user avatar
  • 98.9k
0 votes
1 answer
376 views

Make flex handle escaped newlines automatically

I am looking for a rule in flex that handles the escaped newlines and gives me a token ignoring that newline. Eg: I have a rule in my lex specification like: \"(\.|[^\"])*\" to capture all the ...
pranavk's user avatar
  • 1,835
1 vote
1 answer
9k views

Lex program to recognise valid arithmetic expression and also to recognise valid identifies and operators

The program below checks arithmetic expressions like a+b and a-b and outputs valid or invalid: %{ #include<stdio.h> #include<stdlib.h> int c,d,bo=0,bc=0; %} operand [a-zA-Z0-9]+ ...
Pollux 01's user avatar
  • 121
1 vote
3 answers
8k views

lex program on counting no of comment lines

This program counts the number of comment lines (single-line and multi-line comments) and outputs the total, with file.txt as input. file.txt: //hellow world /*hello world1*/ /*...
Pollux 01's user avatar
  • 121
4 votes
1 answer
398 views

Flex: trying to generate a C++ lexer using Flex; "unrecognized rule" error

I am trying to generate a lexer using flex. This is my definition file lexer.l: %{ #include <iostream> using namespace std; //#define YY_DECL extern "C" int yylex() %} staffType "grand" | "...
dylhunn's user avatar
  • 1,394
0 votes
1 answer
306 views

In a Flex Lexer, why are last chars moved to the beginning of a buffer before loading new input?

I'm trying to understand a Lexer (source) I'm porting to JavaScript and am stuck understanding how data from an input is read into a buffer. It's a standard Lexer so I'm hoping someone can give me ...
frequent's user avatar
  • 28.4k
0 votes
1 answer
159 views

flex 2.5.35 gives error when ctrl-M used in lex file

I have a simple lex file. %{ #include <stdio.h> %} space_char [ \t\^M] space {space_char}+ %% %% int yywrap(void) { return 1; } int main(void) { yylex(); return ...
Dharmendra's user avatar
0 votes
2 answers
344 views

How to detect partial unfinished token and join its pieces that are obtained from two consequent portions of input?

I am writing a toy terminal, where I use Flex to parse normal text and control sequences that I get from the tty. One detail of the Cocoa machinery is that it reads from the tty in chunks of 1024 bytes so that any ...
Stanislav Pankevich's user avatar
0 votes
1 answer
759 views

Python lexer lexical analysis token priority rule order dealing with ambiguities --- why STRING has priority over WORD?

I am studying lexers in the Programming Languages course by Westley Weimer. The notes are here: https://www.udacity.com/wiki/cs262/unit-2#quiz-rule-order {Video, if you care to watch, last 40 seconds.} ...
goughgough's user avatar
2 votes
1 answer
2k views

What is the difference between t_ignore, pass and t.lexer.skip() in ply.lex?

All three can be used to skip, ignore or pass over the characters. For example: def t_error(t): pass def t_error(t): t.lexer.skip() def t_default(t): # put at the extreme end and assuming there ...
aste123's user avatar
  • 1,242
0 votes
1 answer
172 views

What is the order of preference when we mix function and string type token definitions in ply.lex?

tokens = ( NUMBER2, NUMBER1, ) def t_NUMBER1(t): r '[0-9]+' return t t_NUMBER2 = r '[0-9][0-9]' If I use the above token specifications in ply.lex then which token ...
aste123's user avatar
  • 1,242
0 votes
1 answer
214 views

Explain the syntax of reserved.get(t.value,'ID') in lex.py

Code taken from ply.lex documentation: http://www.dabeaz.com/ply/ply.html#ply_nn6 reserved = { 'if' : 'IF', 'then' : 'THEN', 'else' : 'ELSE', 'while' : 'WHILE', ... } tokens = ['...
aste123's user avatar
  • 1,242
3 votes
1 answer
4k views

Use PLY to match a normal string

I am writing a parser using PLY. The question is similar to this one: How to write a regular expression to match a string literal where the escape is a doubling of the quote character?. However, I ...
Loi.Luu's user avatar
  • 383
10 votes
3 answers
13k views

How should I handle lexical errors in my Flex lexer?

I'm currently trying to write a small compiler using Flex+Bison but I'm kind of lost in terms of what to do with error handling, especially how to make everything fit together. To motivate the ...
hugomg's user avatar
  • 69.8k
6 votes
2 answers
9k views

How to make lex/flex recognize tokens not separated by whitespace?

I'm taking a course in compiler construction, and my current assignment is to write the lexer for the language we're implementing. I can't figure out how to satisfy the requirement that the lexer must ...
millimoose's user avatar
  • 39.9k
2 votes
1 answer
3k views

Not able to run JFlex generated lexer Java file

So I used JFlex to generate a file called Yylex.java without any problems. When I try to compile it with the command javac Yylex.java, I get 30 errors, originating with this one: Yylex.java:13: ...
John Roberts's user avatar
  • 5,946
1 vote
1 answer
2k views

Removing comments with JFlex, but keeping line terminators

I'm writing a lexical specification for JFlex (it's like flex, but for Java). I have a problem with TraditionalComment (/* */) and DocumentationComment (/** */). So far I have this, taken from JFlex User'...
Adam Stelmaszczyk's user avatar
1 vote
0 answers
232 views

Is there a lexer generator that can take standard lex files and produce a lexer in JavaScript?

I'm playing with CodeMirror, a browser-based editor written in JavaScript. It has a pluggable syntax highlighting component. I'd like to be able to take standard lex files for an arbitrary language ...
Bartosz Milewski's user avatar
4 votes
1 answer
3k views

ANTLR lexer can't lookahead at all

I have the following grammar: rule: 'aaa' | 'a' 'a'; It can successfully parse the string 'aaa', but it fails to parse 'aa' with the following error: line 1:2 mismatched character '<EOF>' ...
K J's user avatar
  • 4,743
5 votes
2 answers
295 views

Define <LINE-START> and <LINE-END> in a lexer

I am trying to implement a front end which attempts to conform to a subset of this specification. It seems that many things are clearly defined in the reference, except <LINE-START> and <...
SoftTimur's user avatar
  • 5,462
3 votes
2 answers
642 views

Regular expression for "not belonging to" in OCaml

I would like to define non-line-termination-character = <any character other than %x000D / %x000A> in lexer.mll. I have tried let non_line_termination_character = [^('\x0D' '\x0A')], but it gave ...
SoftTimur's user avatar
  • 5,462
1 vote
1 answer
3k views

Clear buffers before calling YYACCEPT in yacc/lex

Is there any way to clear the parser buffers before calling YYACCEPT in yacc? If I do not clear the buffers, it causes some problems when I call yyparse for the second time. Also note that I am using some ...
nav_jan's user avatar
  • 2,543
6 votes
3 answers
2k views

semicolon insertion ala google go with flex

I'm interested in adding semi-colon insertion ala Google Go to my flex file. From the Go documentation: Semicolons Like C, Go's formal grammar uses semicolons to terminate statements; ...
Aaron Yodaiken's user avatar
3 votes
2 answers
3k views

C++ istream with lex

I have a working grammar (written in lex and bison) that parses polynomial expressions. It is like your standard, text-book calculator-like syntax. Here is a very simplified version of the grammar: ...
Nick's user avatar
  • 509
2 votes
2 answers
482 views

How to return more than one token on Bison parser?

My grammar is something like this decl: attributes; {/*create an object here with the attributes $$?*/ } attributes: | att1 attributes {$$ = $1;} | att2 attributes {$$ = $1;} | ...
Hohenheimsenberg's user avatar
2 votes
1 answer
743 views

How to write own parser for (f)lex?

I generated a lexer with flex. [ \t\n\r\v] /* skip whitespace */ [_a-zA-Z]([_a-zA-Z]|[0-9])* printf("IDENT\n"); [0-9]+ printf("INTEGER\n"); [0-9]+\. printf("DOUBLE\n"); Now I ...
multiholle's user avatar
7 votes
2 answers
8k views

Unable to compile output of lex

When I attempt to compile the output of this trivial lex program: # lex.l integer printf("found keyword INT"); using: $ gcc lex.yy.c I get: Undefined symbols: "_yywrap", referenced from: ...
dstnbrkr's user avatar
  • 4,335