1,091 questions
-1
votes
1
answer
45
views
Should an escaped unicode hex value '\u0000' in a JSON string be validated by the lexer or the parser?
Preamble
I am creating my own JSON lexer and eventually a full parser, purely as a learning experience because that is what I enjoy doing. As I understand it, the lexer's job is to tokenize the data (...
-5
votes
0
answers
37
views
Which way is better to code and execute my programming language? [closed]
I'm currently working on my own programming language. It is based on JavaScript, the syntax would be similar, but I would try to make it suitable not only for web development and server-side ...
0
votes
0
answers
16
views
Antlr4 lexer seems to have a problem processing token 'AX', and no semantic predicate runs on rule REG
In the following example, the input token 'AX' seems to cause errors for an unknown reason. The parse tree shows that other rule matches that contain register tokens such as 'DX' are working fine. I'...
1
vote
1
answer
34
views
Granularity of tokens for lexer
I want to build a little lexer and parser by myself. I want the lexer to produce a vector of tokens that I feed into the parser later. Now I am thinking about what belongs in which stage.
Let's look at ...
0
votes
0
answers
40
views
Regex Derivative DFA: This Should Work, But It's Breaking in Unexpected Ways
I’m working on a DFA-based lexer using regex derivatives for tokenizing lexemes. I've built a setup that, theoretically, should handle regex simplification and DFA transitions accurately. For the most ...
1
vote
0
answers
33
views
C++ code doesn't give any output and gets stuck
I am making a lexer and parser for an 8-bit CPU. My lexer is working fine, but when I added an AST class for parsing, this problem started. What's the problem, and how do I solve it?
The code takes a string ...
2
votes
1
answer
53
views
Error: unknown type name 'ASTNode' in Bison parser when integrating with Flex lexer
I'm working on a project where I'm using Bison to generate a parser and Flex to generate a lexer. My parser is meant to generate an Abstract Syntax Tree (AST), and I've defined the ASTNode structure ...
2
votes
0
answers
107
views
Get the groups that returned a match in Java Regex
I'm coding a simple Lexer in Java and the class looks like this:
public class Lexer {
// This enum contains all the possible tokens accepted by the language and their respective regular expression
...
0
votes
1
answer
65
views
How to use EBNF to drive the Parser?
I have a simple EBNF:
<program> ::= <function>
<function> ::= int <id> ( ) { <statement> }
<statement> ::= return <expr> ;
<expr> ::= <int>...
0
votes
1
answer
70
views
I'm writing a programming language in Python, but I have a problem with the lexer function
I'm writing a programming language in Python, but I have a problem with the lexer function.
I'll leave you the code, which is fully functional:
import sys
inputError = """
(!) Error.
(...
0
votes
1
answer
41
views
Why is antlr4-parse generating java.lang.ArrayIndexOutOfBoundsException error?
Command antlr4-parse -Dlanguage=Python3 test.g4 gives me this error.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 2
at org.antlr....
0
votes
1
answer
44
views
I am getting a "tokens unused" error while using sly
WARNING: C:\Program Files\adbms\strg_eng\done.py:66: Rule 'change_tname' defined, but not used
WARNING: C:\Program Files\adbms\strg_eng\done.py:70: Rule 'create_db' defined, but not used
WARNING: C:\...
0
votes
1
answer
88
views
Unclear reason for terminal collision
I am new to lexers and parsers in general and to working with Lark in particular. I am using the versions lark 1.1.9 and interegular 0.3.3.
I started writing a grammar which produces a warning when ...
0
votes
1
answer
105
views
Lexing Issue in ANTLR4 Grammar for Fortran 2018: Token Misclassification
I am developing a Fortran 2018 grammar in ANTLR4 using the ISO standard. I am encountering an issue during the lexing phase with some of the lexer rules. Specifically, certain keywords are being ...
1
vote
0
answers
121
views
Parsing a function call statement using Participle golang
I want to create a simple parser for parsing function calls i.e. function(arg1, arg2, kw1=v1, kw2=v2)
I started fairly simple by creating a parser grammar for the Key and Value as a Property struct ...
0
votes
1
answer
37
views
Why does my JavaScript lexer split my floats as an identifier and a float?
I am currently trying to make my own little programming language for the first time. I am, as of now, creating the basic lexer. I am trying to allow floating point values in my code, but alas, it gets ...
1
vote
1
answer
105
views
String augmentation and concatenation in ANTLR
I am having issues with ANTLR augmented strings.
My main issue is that if I want augmentedStrings to be read correctly, I have to keep string as a parser rule.
But this causes the string to have quote, body, ...
1
vote
1
answer
77
views
When is ANTLR skip applied?
Does ANTLR wait to find the longest string that matches a token that should be skipped, or does it skip as soon as it matches that token?
Suppose we have two tokens in our grammar like this:
TEST:
[\...
0
votes
0
answers
87
views
Conflict between lexer rules in ANTLR4 for Fortran grammar
I'm developing a Fortran parser using ANTLR4, adhering to the ISO Fortran Standard 2018 specifications. While implementing lexer rules, I encountered a conflict between the NAME and LETTERSPEC rules. ...
0
votes
1
answer
139
views
How to link custom lexer with ANTLR generated parser
I am new to creating compilers and interpreters and as an exercise, I created a handwritten lexer in Java that spits out tokens looking like the following.
public Token(TokenType type, String lexeme, ...
1
vote
0
answers
87
views
Is there a way to see where the Go lexer inserts semicolons?
In the Go spec, the section about semicolons says that semicolons are automatically inserted into the token stream if some rules are true.
When the input is broken into tokens, a semicolon is ...
0
votes
0
answers
40
views
ANTLR4: separating expressions with '\n' while letting the parser ignore '\n', as in Kotlin
I am writing a grammar with ANTLR4 and have encountered a problem with expression separation.
I need to be able to separate expressions with a newline, as in Kotlin, Swift or JavaScript.
For example:
a = 5
b = ...
0
votes
0
answers
25
views
Does Bison/Flex have a feature to declare a specific terminal as the error recovery reset point?
I am currently using the Bison feature error in my grammar rules in various places similar to the following:
nonterminal:
someother_nonterminal SEMI
| another_nonterminal SEMI
| error ...
0
votes
1
answer
68
views
ANTLR4 matches to lexer rule instead of parser rule
Here is my short ANTLR4 language:
grammar test;
prog: (decl | expr)+
;
decl: doc | quiz
;
doc: '%doc' paramlist
;
quiz: '%quiz' paramlist STR? '%quiz' ENDL
;
paramlist: '(' VAR '=' PARAMVAL {, VAR '...
0
votes
1
answer
65
views
String goes from normal characters to garbage for an as of yet indiscernible reason
For my own personal learning I'm trying to make a parser for JSON in C. Currently I am having some trouble with the lexer. Everything works the way I want apart from the STRING token. For whatever reason ...
0
votes
1
answer
45
views
Trouble in defining Array in ANTLR4
I'm implementing a lexer and parser for a simple language, and I got stuck at defining arrays for this language.
The requirement is to write a grammar for multi-dimensional arrays so that:
All elements of ...
0
votes
1
answer
51
views
Go lang else is called first when it should not
// Invalid token found, print error message and exit.
func abort(message string) {
panic(message)
}
// Skip whitespace except newlines, which we will use to indicate the end of a statement.
func (...
1
vote
0
answers
93
views
ANTLR Lexer fails when parsing C code with preprocessor directives
We're using the ObjectiveC preprocessor parser and lexer grammars for parsing directives in C code like #define, #include, #ifndef, etc. Below are relevant portions of the grammar (shortened for ...
0
votes
0
answers
84
views
Parsing and evaluating typed expressions (BNF-like grammar for earley / Nearley.js)
I'm playing with nearley.js and moo at the Nearley Parser Playground to learn about parsing.
I understand the calculator example but can't wrap my head around the recursion when trying to write ...
0
votes
3
answers
374
views
Which runs first in bash? lexer, or expander?
I am trying to understand bash's parser and lexer mechanism. (My ultimate goal is implementing a bash-like shell).
The first case
$ test='o a'
$ ech$test
a
(^ edit: I removed double quotes for second ...
0
votes
0
answers
149
views
Parsing "case of" statement with Happy
Another day, another question:
I'm trying to create a parser in Happy which can recognize Haskell integers, identifiers, function applications, let in and case of statements. I don't really know ...
2
votes
1
answer
137
views
Haskell parser created with Alex and Happy doesn't work because of main function, can someone tell me why?
Basically, I just created a parser to print the derivation tree of a Reverse Polish Notation expression; it recognizes regular expressions built in RPN, tokenizes them, and recognizes only the expressions ...
-2
votes
1
answer
59
views
where does `input(yytext, yyleng)` function come from in flex scanner?
I'm doing Stanford's CS143: Compilers on edX.
I saw this code in this repo:
/* string ends, we need to deal with some escape characters */
<STRING>\" {
std::string input(yytext, yyleng);...
1
vote
0
answers
161
views
building a parser that recognizes extended backus-naur form grammar in C
I am trying to build a parser that recognizes grammar expressions written in the extended Backus-naur form.
I have already built a lexer that takes a string as input and turns it into a list of tokens.
...
3
votes
0
answers
122
views
How to push a synthetic token (which is not related to any symbol) in Menhir Incremental?
I am parsing an Elm-like syntax. Look at the pattern matching (inside pattern matching):
case a of
1 -> case b of
"hello" -> 'h'
_ -> 'w'
_ -> 'e'
We see there is ...
1
vote
1
answer
83
views
Drawing a CFG using antlr4, graphviz and Python, and the parser is empty
I am using two files, cfg_from_stding.py and cfg_extractor_visitor.py; I will share a link in case of any doubt. I will also share the lexer and the Go file; the Go file is simple and correct.
I ...
2
votes
1
answer
156
views
Can you conditionally change ANTLR lexer modes?
I'm working on a language where there is an outer grammar that defines objects and an inner grammar that defines code. The inner grammar is embedded in various places. The inner grammar starts with ...
0
votes
1
answer
34
views
Multiline inputs are not possible through code
I have a simple converting script between a custom language and JS
function lexer(code) {
// lexes code into parts
const codeParts = [];
const regex = /"(.*?)"/g; // quote split regex
...
0
votes
0
answers
109
views
How do I test out my parser grammar in ANTLR4?
I wrote a parser grammar in ANTLR4 which looks like this:
parser grammar IFJ23;
tokens {
Identifier, Type,
LeftBracket, RightBracket,
LeftCurlyBracket, RightCurlyBracket,
Semicolon, ...
0
votes
0
answers
48
views
Why does having EOF in parser rules influence the lexer in ANTLR4?
Start with simple grammar:
grammar Simple;
file : lines ;
lines : (ID | INT | STRING)+ '\r'? '\n';
ID : [a-zA-Z_]*;
INT : [0-9]*;
STRING : '"' ~["\r\n]* '"';
COMMA : '...
2
votes
0
answers
380
views
Sphinx: custom Pygments' lexer not found
I created a custom Pygments lexer and style:
acetexlexer.py (lexer file),
acedracula.py (style file),
that work pretty well since the following command returns the expected result:
pygmentize -O ...
0
votes
0
answers
51
views
Per-rule hidden terminals are not captured in the internal.tokens file generated by Xtext
I have the following minimalistic grammar:
grammar org.example.minimalDSL hidden (WS, SL_COMMENT, ML_COMMENT)
...
Class:
(Documentation=documentation)?
'class' name=ValidId '{' (attributes+=...
1
vote
1
answer
55
views
How can the ANTLR4 lexer consume arbitrary input and stop at existing rules?
Can the ANTLR4 lexer consume arbitrary input and stop at existing rules?
I expect it to consume more characters into one token.
Small rules
lexer grammar PhpLexer;
options {
superClass = PhpLexerBase;
...
0
votes
1
answer
68
views
How can I add the -ml flag (or any flag, really) to ocamllex in a dune file?
Here is my current dune file:
(library
(name parsing)
(libraries toto fmt menhirLib)
(modules parser lexer lex_and_parse)
)
(ocamllex lexer)
(menhir
(modules parser)
(flags --explain -v))
I know ...
0
votes
0
answers
178
views
Using ANTLR4-Intellij-Adaptor library with UVL grammar - IntelliJ Custom Language Support
I am currently developing an IntelliJ plugin to support UVL as a custom language. As there already exists an ANTLR4 grammar for UVL 1, I wanted to use this grammar for the parser and lexer as ...
-1
votes
1
answer
101
views
Why doesn't Pygments discover my custom lexer after installation?
I've developed and tested a custom Pygments lexer, as described here... I then prepared a pyproject.toml file whose contents are as follows:
[build-system]
requires = ["setuptools"]
build-...
-2
votes
1
answer
73
views
Creation of a programming language
I have several classes for a Programming Language.
Class Lexer:
from rply import LexerGenerator
class Lexer():
def __init__(self):
self.lexer = LexerGenerator()
def _add_token(self):
...
-1
votes
1
answer
160
views
Trouble with ANTLR4 Fortran 2018 Grammar - Unexpected Errors and Mismatched Input
I've been working on creating an ANTLR4 grammar for Fortran 2018 based on the BNF rules provided in the J3 Fortran 2018 document. I've directly converted each rule mentioned in the document into ...
0
votes
1
answer
31
views
Error recognizing character inside character literal in ANTLR4
I need to recognize a "character literal" in my lexical analyzer, however I am having some problems.
The lexical analyzer specification is as follows:
lexer grammar LexerRules;
INT ...
0
votes
1
answer
135
views
Compilation error with flex and lex.yy.cc on macOS
I have written a flex file and am using the flex command to create lex.yy.cc. When I try to compile lex.yy.cc to create the lexer, it gives me an error from lex.yy.cc.
error: out-of-line definition of '...