I would like to define non-line-termination-character = <any character other than %x000D / %x000A> in lexer.mll. I have tried let non_line_termination_character = [^('\x0D' '\x0A')], but it gave me a syntax error.

I think let non_line_termination_character = [^'\x0D'] intersect [^'\x0A'] would work, but I don't know how to express the intersection in ocamllex.

Could anyone help?

PS: The relevant rule is in section 12.2.4, "Regular expressions", of the OCaml manual: http://caml.inria.fr/pub/docs/manual-ocaml/manual026.html

2 Answers

3

The syntax of a character set in ocamllex doesn't allow parentheses. The following works for me:

let non_line_termination_character = [^ '\x0d' '\x0a' ]
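For context, here is a minimal sketch of how that definition might sit inside a lexer.mll. The token type and the rule name are hypothetical and only there to make the fragment complete; they are not from the question.

```ocaml
(* lexer.mll: illustrative sketch; the token type below is an assumption *)
{
  type token = TEXT of string | NEWLINE | EOF
}

(* Any character other than CR (0x0D) and LF (0x0A); no parentheses inside [ ] *)
let non_line_termination_character = [^ '\x0d' '\x0a']

rule token = parse
  | non_line_termination_character+ as text { TEXT text }
  | ('\x0d' '\x0a' | '\x0d' | '\x0a')       { NEWLINE }
  | eof                                     { EOF }
```

Running ocamllex on such a file should produce a lexer.ml whose token function splits the input into runs of non-terminator characters and line breaks.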

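There's no general operator for intersecting regular expressions in ocamllex. But # is the character-set difference operator, so for two character sets a and b you can write a # (a # b): removing from a everything that is in a but not in b leaves exactly the characters that are in both.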

let nona = [^ 'a']
let nonb = [^ 'b']
let nonab = nona # (nona # nonb)
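The identity behind this trick can be checked in plain OCaml by modelling character sets as sets of chars. This is only a sketch of the set algebra, assuming an 8-bit character universe; it does not exercise ocamllex itself, and the helper names (CS, all_chars, complement_of) are made up for the illustration.

```ocaml
(* Model ocamllex character sets as OCaml sets of chars. *)
module CS = Set.Make (Char)

(* Every 8-bit character: the universe that [^ ...] complements against. *)
let all_chars = CS.of_list (List.init 256 Char.chr)

(* [^ c1 c2 ...]: everything except the listed characters. *)
let complement_of chars = CS.diff all_chars (CS.of_list chars)

let nona = complement_of ['a']
let nonb = complement_of ['b']

(* a # (a # b), with # modelled as set difference. *)
let nonab = CS.diff nona (CS.diff nona nonb)

let () =
  (* The double difference agrees with a direct intersection ... *)
  assert (CS.equal nonab (CS.inter nona nonb));
  (* ... and with the single negated set [^ 'a' 'b']. *)
  assert (CS.equal nonab (complement_of ['a'; 'b']));
  print_endline "a # (a # b) equals the intersection of a and b"
```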

(Weirdly, my tests show this works for every character set I try, except it fails for your specific example of non-CR and non-LF. It actually seems like a bug. But maybe I'm missing something obvious.)

2

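The set described by [^'\x0D'] includes '\x0A', and the set described by [^'\x0A'] includes '\x0D', so the union of the two sets includes every character. What you need is a single negated set that lists both characters. I think this is what you were trying for: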

[^'\x0D' '\x0A']
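A small OCaml check of that reasoning, again treating character sets as plain sets of chars. It is a sketch under the assumption of an 8-bit character universe, and the helper names are hypothetical.

```ocaml
module CS = Set.Make (Char)

(* Every 8-bit character. *)
let all_chars = CS.of_list (List.init 256 Char.chr)

(* [^ c1 c2 ...]: everything except the listed characters. *)
let complement_of chars = CS.diff all_chars (CS.of_list chars)

let non_cr = complement_of ['\x0D']
let non_lf = complement_of ['\x0A']
let non_cr_lf = complement_of ['\x0D'; '\x0A']

let () =
  (* The union of the two negated sets excludes nothing. *)
  assert (CS.equal (CS.union non_cr non_lf) all_chars);
  (* The single negated set is the intersection of the two. *)
  assert (CS.equal (CS.inter non_cr non_lf) non_cr_lf);
  (* CR and LF are both rejected; an ordinary character is accepted. *)
  assert (not (CS.mem '\x0D' non_cr_lf));
  assert (not (CS.mem '\x0A' non_cr_lf));
  assert (CS.mem 'x' non_cr_lf)
```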