Things Worth Trying:: Hacking Erlang


Things Worth Trying:

code injection
meta programming
reverse engineering byte code
anything that makes Ericsson cringe...

Hacking Erlang
building strange and magical creations
http://jacobvorreuter.com/hacking-erlang http://github.com/JacobVorreuter


Step 1
understanding the abstract format


The Abstract Format


a tree-like structure representing parsed Erlang code, made up of a list of forms


What are forms?

Forms are tuples that represent top-level constructs like function declarations and attributes
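To make the idea concrete, here is a minimal sketch of what a list of forms can look like. The module name forms_demo and the tiny hello module it describes are illustrative assumptions, not taken from the talk.

```erlang
-module(forms_demo).
-export([forms/0]).

%% Hand-written abstract forms for this hypothetical source file:
%%   -module(hello).
%%   -export([world/0]).
%%   world() -> ok.
%% Each top-level construct is one tuple; the integers are line numbers.
forms() ->
    [{attribute, 1, module, hello},
     {attribute, 2, export, [{world, 0}]},
     {function, 3, world, 0,
      [{clause, 3, [], [], [{atom, 3, ok}]}]}].
```

Two of the three forms are attributes and one is a function declaration, which matches the two kinds of top-level construct named above.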

Taking a step back:

Where do forms come from?



Scanning Source Code


the first step in compiling

Forms are generated by grouping and interpreting tokens scanned from source code.


use regular expressions to tokenize string input
generate a list of tuples, each representing an atomic unit of source code

erl_scan
This module contains functions for tokenizing characters into Erlang tokens.
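As a small illustration of the scanning step, the sketch below tokenizes one line of source with erl_scan:string/1. The module name scan_demo and the sample function add/1 are assumptions made for the example.

```erlang
-module(scan_demo).
-export([tokens/0]).

%% Scan a one-line function definition into a flat list of tokens.
%% With default options every token carries its line number:
%%   [{atom,1,add},{'(',1},{var,1,'X'},{')',1},{'->',1},
%%    {var,1,'X'},{'+',1},{integer,1,1},{dot,1}]
tokens() ->
    {ok, Tokens, _EndLocation} = erl_scan:string("add(X) -> X + 1."),
    Tokens.
```

Each tuple is one atomic unit of source code; the trailing dot token is what marks the end of a form.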


erl_parse
This module is the basic Erlang parser which converts tokens into the abstract form of either forms, expressions, or terms.
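Continuing the same hypothetical add/1 example, this sketch feeds scanned tokens to erl_parse:parse_form/1 to obtain a single form. The module name parse_demo is an assumption.

```erlang
-module(parse_demo).
-export([form/0]).

%% Tokens in, one abstract form out. For this input the result is:
%%   {function,1,add,1,
%%    [{clause,1,[{var,1,'X'}],[],
%%      [{op,1,'+',{var,1,'X'},{integer,1,1}}]}]}
form() ->
    {ok, Tokens, _EndLocation} = erl_scan:string("add(X) -> X + 1."),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.
```

parse_form/1 expects exactly one form's worth of tokens, ending in the dot token, so a whole file has to be split into forms before parsing.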


compile
This module provides an interface to the standard Erlang compiler. It can generate either a new file which contains the object code, or return a binary which can be loaded directly.
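Putting the three steps together, the following sketch scans, parses, compiles, and loads a module that never existed as a file. The names compile_demo, hello, and world/0 are hypothetical; only erl_scan, erl_parse, compile:forms/1, and code:load_binary/3 come from OTP.

```erlang
-module(compile_demo).
-export([run/0]).

%% Build a module from source strings, compile the forms to a binary,
%% load that binary, and call the freshly loaded function.
run() ->
    Source = ["-module(hello).",
              "-export([world/0]).",
              "world() -> ok."],
    Forms = [form(Line) || Line <- Source],
    {ok, Module, Binary} = compile:forms(Forms),
    {module, Module} = code:load_binary(Module, "hello.beam", Binary),
    Module:world().

%% Scan and parse a single form given as a string.
form(Line) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Line),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.
```

Because compile:forms/1 returns the object code as a binary, nothing is written to disk; the file name passed to code:load_binary/3 is only recorded for bookkeeping.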


dynamic_compile
The dynamic_compile module performs the actions we've just seen, plus takes care of macro expansion and inclusion of external header files.
http://github.com/JacobVorreuter/dynamic_compile


moving on...

the parse_transform debate...


"yeah, you can do everything with macros anyway"
"Programmers are strongly advised NOT to engage in parse transformations"


wait! parse_transforms are cool and have their place in the language...in moderation.

How do parse_transforms work?

If the option {parse_transform, Module} is passed to the compiler, a user written function parse_transform/2 is called by the compiler before the code is checked for errors.
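A minimal sketch of such a module is shown below. The transform itself (rewriting the atom ping to pong) and the names ping_to_pong and ping_server are invented for illustration; in a real project the transform would more commonly be enabled with a -compile({parse_transform, Module}) attribute in the source file.

```erlang
-module(ping_to_pong).
-export([parse_transform/2, demo/0]).

%% Called by the compiler with the module's forms and the compile options.
%% It must return a (possibly modified) list of forms.
parse_transform(Forms, _Options) ->
    [rewrite(Form) || Form <- Forms].

%% Walk the abstract format generically and replace the atom literal ping.
%% Function names are bare atoms inside {function,...}, so they are untouched.
rewrite({atom, Anno, ping}) -> {atom, Anno, pong};
rewrite(Node) when is_tuple(Node) ->
    list_to_tuple([rewrite(E) || E <- tuple_to_list(Node)]);
rewrite(Nodes) when is_list(Nodes) -> [rewrite(E) || E <- Nodes];
rewrite(Other) -> Other.

%% Compile a hypothetical module with the transform switched on.
demo() ->
    Forms = [form("-module(ping_server)."),
             form("-export([reply/0])."),
             form("reply() -> ping.")],
    {ok, Mod, Bin} = compile:forms(Forms, [{parse_transform, ?MODULE}]),
    {module, Mod} = code:load_binary(Mod, "ping_server.beam", Bin),
    Mod:reply().

form(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.
```

The source says reply() returns ping, yet the compiled module returns pong, which is exactly the kind of surprise that fuels the debate above.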


a pizza example


encode pizza

remember, at runtime all references to record instances have been replaced with indexed tuples.
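The point about records can be checked directly. The slide's own record definition did not survive in this transcript, so the pizza record below, with fields size and toppings, is an assumed stand-in.

```erlang
-module(pizza_demo).
-export([as_tuple/0]).

%% Hypothetical record; the field names are assumptions for illustration.
-record(pizza, {size, toppings = []}).

as_tuple() ->
    Pizza = #pizza{size = large, toppings = [cheese]},
    %% At runtime the record is a plain tuple: the record name comes
    %% first, followed by the fields in declaration order.
    {pizza, large, [cheese]} = Pizza,
    %% Field names exist only at compile time, via record_info/2.
    {tuple_size(Pizza), record_info(fields, pizza)}.
```

Since the field names are gone at runtime, code that needs them (for example, an encoder) has to capture them at compile time, which is where a parse transform can help.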


expand_records.erl


intermission

Act II
compiling custom syntax


Compiling Custom Syntax



leex - A regular expression based lexical analyzer generator for Erlang, similar to lex or flex.
yecc - An LALR-1 parser generator for Erlang, similar to yacc.

leex
The leex module takes a definition file with the extension .xrl as input and generates the source code for a lexical analyzer as output.
<Header>
Definitions.
<Macro Definitions>
Rules.
<Token Rules>
Erlang code.
<Erlang Code>

example_scanner.xrl
Definitions.
A = [a-z][0-9a-zA-Z_]*
I = [0-9]+
WS = ([\000-\s]|%.*)

Rules.
\       : {token,{module,TokenLine}}.
\       : {token,{function,TokenLine}}.
\       : {token,{'->',TokenLine}}.
\       : {token,{'[',TokenLine}}.
\       : {token,{']',TokenLine}}.
{A}     : {token,{atom,TokenLine,list_to_atom(TokenChars)}}.
{I}     : {token,{integer,TokenLine,list_to_integer(TokenChars)}}.
\       : {token,{'<-',TokenLine}}.
\       : {token,{'||',TokenLine}}.
\       : {token,{heart,TokenLine}}.
\ {WS}  : {end_token,{dot,TokenLine}}.
{WS}+   : skip_token.

Erlang code.
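To show the leex workflow end to end, the sketch below writes a much smaller definition file, generates a scanner from it, and runs the scanner. Everything about it is an assumption made for illustration: the module name leex_demo, the file name tiny_scanner.xrl, the use of TMPDIR (falling back to /tmp) as a writable directory, and the pattern chosen for the heart token. It is not the scanner from the talk.

```erlang
-module(leex_demo).
-export([run/0]).

%% Generate, compile, load, and run a tiny scanner with leex.
run() ->
    Dir = os:getenv("TMPDIR", "/tmp"),   % assumed writable directory
    Xrl = filename:join(Dir, "tiny_scanner.xrl"),
    Definition =
        "Definitions.\n"
        "I = [0-9]+\n"
        "WS = [\\000-\\s]+\n"
        "Rules.\n"
        "{I} : {token,{integer,TokenLine,list_to_integer(TokenChars)}}.\n"
        "\\<3 : {token,{heart,TokenLine}}.\n"
        "{WS} : skip_token.\n"
        "Erlang code.\n",
    ok = file:write_file(Xrl, Definition),
    %% leex writes tiny_scanner.erl next to the .xrl file.
    {ok, ErlFile} = leex:file(Xrl),
    {ok, Scanner, Beam} = compile:file(ErlFile, [binary]),
    {module, Scanner} = code:load_binary(Scanner, ErlFile, Beam),
    %% The generated module exports string/1, among other functions.
    Scanner:string("1 <3 22").
```

The result is a tuple of the form {ok, Tokens, EndLine}, where the tokens have the same shape as the ones produced by erl_scan, so they can be handed to a yecc-generated parser.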


yecc
The yecc module takes a BNF* grammar definition as input, and produces the source code for a parser.
<Header>
<Non-terminals>
<Terminals>
<Root Symbol>
<End Symbol>
<Erlang Code>
* Backus-Naur Form (BNF) is a metasyntax used to express context-free grammars: that is, a formal way to describe formal languages


example_parse.yrl
The header provides a chance to add documentation before the module declaration in your parser.
Header "%% Copyright (C)"
"%% @Author Jacob Vorreuter".
We could do something like this, but whatever


Terminal symbols are literal strings forming the input of a formal grammar and cannot be broken down into smaller units without losing their literal meaning.


Terminals atom integer heart module function '[' ']' '->' '<-' '||'.

These terminal symbols are the products of the regular expressions in our lexical analyzer


Nonterminal symbols are the rules within the formal grammar consisting of a sequence of terminal symbols or nonterminal symbols. Nonterminal symbols may self reference to specify recursion.
Nonterminals element module_declaration function_declaration function_body comprehension.


Here we are declaring symbols that will be further defined as descendants of the root symbol.
The root symbol is the most general syntactic category which the parser ultimately will parse every input string into.


Rootsymbol element.
The end symbol is a declaration of the end_of_input symbol that your scanner is expected to use.

Endsymbol dot.
The Erlang code section can contain any functions that we need to call from our symbol definitions.
example_parse.yrl
Nonterminals element module_declaration function_declaration function_body comprehension.
Terminals atom integer heart module function '[' ']' '->' '<-' '||'.
Rootsymbol element.

element -> module_declaration : '$1'.
element -> function_declaration : '$1'.

module_declaration -> module atom :
    {attribute,line_of('$2'),module,value_of('$2')}.

function_declaration -> function atom '->' function_body :
    {function,line_of('$2'),value_of('$2'),0,
     [{clause,line_of('$2'),[],[],'$4'}]}.

function_body -> comprehension : ['$1'].

%% the empty list is written {nil,Line} in the abstract format
comprehension -> '[' ']' : {nil,line_of('$1')}.
comprehension -> '[' integer '<-' integer '||' heart ']' :
    {lc,line_of('$2'),{var,line_of('$2'),'A'},
     [{generate,line_of('$2'),
       {var,line_of('$2'),'A'},
       {call,line_of('$2'),
        {remote,line_of('$2'),{atom,line_of('$2'),lists},{atom,line_of('$2'),seq}},
        ['$2','$4']}}]}.

Endsymbol dot.

Erlang code.
value_of(Token) -> element(3, Token).
line_of(Token) -> element(2, Token).
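The same generate, compile, load cycle applies to yecc. The sketch below uses a deliberately tiny grammar for lists of integers rather than the grammar from the talk. The names yecc_demo and tiny_list_parser.yrl, and the use of TMPDIR (falling back to /tmp), are assumptions; the tokens are borrowed from erl_scan so that no custom scanner is needed.

```erlang
-module(yecc_demo).
-export([run/0]).

%% Generate, compile, load, and run a tiny parser with yecc.
run() ->
    Dir = os:getenv("TMPDIR", "/tmp"),   % assumed writable directory
    Yrl = filename:join(Dir, "tiny_list_parser.yrl"),
    Grammar =
        "Nonterminals list elements.\n"
        "Terminals integer '[' ']' ','.\n"
        "Rootsymbol list.\n"
        "list -> '[' ']' : [].\n"
        "list -> '[' elements ']' : '$2'.\n"
        "elements -> integer : [value_of('$1')].\n"
        "elements -> integer ',' elements : [value_of('$1') | '$3'].\n"
        "Erlang code.\n"
        "value_of(Token) -> element(3, Token).\n",
    ok = file:write_file(Yrl, Grammar),
    %% yecc writes tiny_list_parser.erl next to the .yrl file.
    {ok, ErlFile} = yecc:file(Yrl),
    {ok, Parser, Beam} = compile:file(ErlFile, [binary]),
    {module, Parser} = code:load_binary(Parser, ErlFile, Beam),
    %% Tokens such as {'[',1} and {integer,1,1} match the declared terminals.
    {ok, Tokens, _EndLocation} = erl_scan:string("[1,2,3]"),
    Parser:parse(Tokens).
```

This toy grammar returns ordinary Erlang terms; the grammar in the talk instead returns abstract forms, which is what allows its output to be passed straight to compile:forms.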

example4.erl


custom syntax in the wild...



Lisp Flavored Erlang
Prolog Interpreter for Erlang
Erlang implementation of the Django Template Language

END