Compiler Construction-II Project "C2ASM" (Cross Compiler)


COMPILER

CONSTRUCTION-II
PROJECT
“C2ASM”
(Cross Compiler)
COURSE TITLE: COMPILER CONSTRUCTION-II
(BSCS-603)

PROJECT TITLE: C2ASM, Cross Compiler from C to ASSEMBLY

HELP IN DOCUMENTATION BY:


Iqbal Ahmed Siddiqui. He is my classmate. Thanks a lot.

HELP IN GRAMMAR MAKING BY:


Rana Kashif. He is my senior and a very good friend of mine. He
always tries his best to help me whenever I need him. Thanks a
lot.

COURSE SUPERVISOR:
Sir Tafseer Ahmed Khan
Madam Sadaf Alvi

DEPARTMENT: COMPUTER SCIENCE


PREFACE:
At last the project has ended and, more importantly, it ended well. I have achieved
all of my stated goals, THANKS TO ALLAH. It was an iterative process: make something,
review it after some time, and then remake or amend it if necessary. Breathtaking work indeed.
I would like to share all of my experiences with this project. Some are GOOD and
some are BAD. So, the BAD ones first.
In my experience, difficult projects like this have very little scope in Pakistan, so hardly
anybody puts as much effort into such a project as they would into a database project. The other
reason is that things are not well documented. If you want to do more than parsing, you are on your
own. Really frustrating!
My worst experience was trying to get help from Ullman's book for INTERMEDIATE
CODE GENERATION. I think he should rewrite his book and include a solid worked example, as he
did for the LEXICAL BOX, because it takes a lot of time to go from ABSTRACTION to
IMPLEMENTATION. I think that, more than teaching, he only invents new jargon, his own
terms. When an average student like me tries to get help from his book, he or she gets lost
in difficult terms and in the end gains nothing more than a HEADACHE.
I took only the LEXICAL BOX and the Three Address Code idea from his book; the rest I worked
out myself with the help of my previous programming knowledge and imagination. So, unless he
bothers to rewrite his book, for an average student like me it is no more than theoretical stuff
for impressing others, and nothing more!
Now comes the Assembly GIANT. I witnessed the EAX register losing all of its upper 16
bits as soon as I assigned it a value and moved to the next instruction. I tried to locate the
fault, but I have not found it yet.
Here comes the good part of the compiler project. The only thing I get is SELF-SATISFACTION.
Making a COMPILER project proves that one is able to create LOGIC and can go from
ABSTRACTION to REALITY, which means he or she is not merely a TOOL USER. Sounds
exaggerated? Make one, then re-evaluate.
Lastly, I would like to thank Sir Tafseer Ahmed Khan. He is one of the three great teachers of
our department whom I have seen so far. I SALUTE you; you are indeed the MASTER of the
GAME, simply the BEST!

Thanks,
M Owais Khan Afridi,
C2ASM Programmer.
We need to make transition diagrams for identifiers and keywords, and for the other
token classes.

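FA For Identifiers And Keywords :-

[Transition diagram: Start goes on l/- to a state that loops on l/-/d; Other leads to the accepting state marked *.]

FA For Relational Operators :-

For ‘==’ and ‘=’ :-

[Transition diagram: Start goes on = to a state; a second = accepts ‘==’; Other leads to the accepting state marked *.]

For ‘<>’, ‘<=’, ‘<’ :-

[Transition diagram: Start goes on < to a state; = accepts ‘<=’; > accepts ‘<>’; Other leads to the accepting state marked *.]

For ‘>’, ‘>=’ :-

[Transition diagram: Start goes on > to a state; = accepts ‘>=’; Other leads to the accepting state marked *.]

FA For Arithmetic Operators :-

For ‘%’, ‘/’, ‘*’, ‘+’, ‘-’ :-

[Transition diagram: Start goes to an accepting state on each of %, /, *, +, -.]

FA For Punctuations :-

For ( , ) , { , } , , , ;

[Transition diagram: Start goes to an accepting state on each of ( , ) , { , } , , , ;]

FA For Numbers :-

[Transition diagram: Start goes on d to a state; L/l leads to an accepting state; Other leads to the accepting state marked *.]

FA For Errors :-

[Transition diagram: a single edge labelled ^.]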
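
The relational-operator automata above can be sketched as a small hand-written scanner. This is only an illustration and not the project's actual lexical box: the names `RelopScan` and `scan_relop` are hypothetical, and the sketch assumes that the "Other" edges mean "accept the shorter operator without consuming the lookahead character".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Result of one scan: the recognized lexeme ("" if none) and the number of
// characters consumed from the input.
struct RelopScan {
    std::string lexeme;
    std::size_t consumed;
};

// Hypothetical helper: recognizes '==', '=', '<>', '<=', '<', '>=' and '>'
// starting at position pos. One character of lookahead decides between the
// one-character and two-character operators.
RelopScan scan_relop(const std::string& src, std::size_t pos) {
    assert(pos <= src.size());
    if (pos == src.size()) return {"", 0};
    const char first = src[pos];
    const char next = (pos + 1 < src.size()) ? src[pos + 1] : '\0';
    switch (first) {
    case '=':
        if (next == '=') return {"==", 2};
        return {"=", 1};   // "Other" edge: assignment operator
    case '<':
        if (next == '=') return {"<=", 2};
        if (next == '>') return {"<>", 2};
        return {"<", 1};   // "Other" edge
    case '>':
        if (next == '=') return {">=", 2};
        return {">", 1};   // "Other" edge
    default:
        return {"", 0};    // not handled by these automata
    }
}
```

For example, `scan_relop("a<>b", 1)` yields the lexeme `<>` with two characters consumed, while `scan_relop("x=1", 1)` yields `=` with one.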

TOKEN SET DESCRIPTION:


The token set of the language represents the terminals
in the grammar of the language. The token set given below contains more tokens than the
implemented grammar requires. Further attributes are also mentioned for some of the tokens;
these may be required in the implementation of the code generator but are not essential for the
working of the parser.

Keywords      Punctuations      Numbers      Identifiers

void          (                 123          variable names
main          )                 231          function names
int           {                 0 etc
long          }
if            ,
else          ;
for
while
do
return

OPERATORS

Relational operators      Arithmetic operators      Assignment operator

==                        +                         =
<>                        -
>                         /
>=                        *
<                         %
<=

Following are the Tokens generated by Lexical Analyzer.

Format:
(class , value)
Keywords:
(void , ----)
(main, ---)
(dt, int/long)
(if, ---)
(else, ---)
(for, ---)
(while, ---)
(do, ---)
(return, ---)
Identifiers:
(Id, _1/_fld/…….etc )

Punctuations:
braces_open={
braces_close=}
paranthesis_open=(
paranthesis_close=)
comma= ,
semicolon= ;
square_open= [
square_close= ]

Operators:
relop= ‘==’ , ‘<>’ , ‘>=’ , ‘<=’ , ‘>’ , ‘<’
assignop= ‘=’
add_sub= ‘+’ , ‘-’
mul_div_mod= ‘*’ , ‘/’ , ‘%’

Numbers:
int_const = 0,1,2,232,2323,…….etc
long_const = 1L,2L,232l,2312123L,…….etc
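
The (class , value) token format above can be sketched in C++ as follows. This is a minimal illustration, not the project's code: the `Token` struct and the `classify` helper are hypothetical names, the value is left empty where the list above shows "---", and the sketch assumes the lexeme has already been isolated by the scanner.

```cpp
#include <cassert>
#include <cctype>
#include <initializer_list>
#include <string>

struct Token {
    std::string cls;    // token class, e.g. "dt", "Id", "int_const"
    std::string value;  // attribute value; empty where the list shows "---"
};

// Hypothetical helper: maps an already-isolated lexeme (keyword, identifier
// or number) to a (class, value) pair.
Token classify(const std::string& lexeme) {
    assert(!lexeme.empty());
    // Both data-type keywords share the class "dt"; the value tells them apart.
    if (lexeme == "int" || lexeme == "long") return {"dt", lexeme};
    for (const char* kw : {"void", "main", "if", "else", "for", "while", "do", "return"})
        if (lexeme == kw) return {kw, ""};
    if (std::isdigit(static_cast<unsigned char>(lexeme[0]))) {
        // A trailing L or l marks a long constant, as in 232l or 2312123L.
        const char last = lexeme.back();
        return {(last == 'L' || last == 'l') ? "long_const" : "int_const", lexeme};
    }
    return {"Id", lexeme};
}
```

With this sketch, `classify("long")` gives (dt, long), `classify("_fld")` gives (Id, _fld), and `classify("232l")` gives (long_const, 232l).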

GRAMMAR:
Features
The grammar for the C-Language subset has the following notable features.
1. Multiple Global Variable Declarations
2. Multiple Global Function Declarations
3. Main program
4. Variable Declarations at the start of the program as in C-Language
5. for, while and do-while loops
6. Nested loops
7. Function calls, which differ from original C-Language function calls: a call is written
inside square brackets
8. The ‘return’ keyword
9. The arguments to functions or the right-hand side of an assignment operator can be function
calls
10. Recursion is allowed.

Productions
The grammar for the language has been split into three sections to make it easier to understand:

1. Main Program Productions
2. Statement Productions
3. Expression Productions

Following are the productions for each section of the grammar.

Main Program Productions :

1. <Program> → dt id <data-or-function> <Program>
    Selection-Set = dt

2. <Program> → void <function-or-main>
    Selection-Set = void

3. <data-or-function> → , id <data-or-function>
    Selection-Set = ,

4. <data-or-function> → ;
    Selection-Set = ;

5. <data-or-function> → ( <argument-list> ) <function-body>
    Selection-Set = (

6. <function-or-main> → main ( void ) <function-body>
    Selection-Set = main

7. <function-or-main> → id ( <argument-list> ) <function-body> <Program>
    Selection-Set = id

8. <function-body> → { <variable-declarations> <statements> }
    Selection-Set = {

9. <variable-declarations> → dt id <variable-list> ; <variable-declarations>
    Selection-Set = dt

10. <variable-declarations> → ε
    Selection-Set = id , { , while , for , do , return , if , [ , }

11. <variable-list> → , id <variable-list>
    Selection-Set = ,

12. <variable-list> → ε
    Selection-Set = ;

13. <argument-list> → void
    Selection-Set = void

14. <argument-list> → dt id <argument>
    Selection-Set = dt

15. <argument> → , dt id <argument>
    Selection-Set = ,

16. <argument> → ε
    Selection-Set = )

17. <compound-statement> → { <statements> }
    Selection-Set = {

Statement Productions :

18. <statements> → <statement> <statements>
    Selection-Set = id , { , while , for , do , return , if , [

19. <statements> → ε
    Selection-Set = }

20. <statement> → id = <right-hand-side> ;
    Selection-Set = id

21. <statement> → <compound-statement>
    Selection-Set = {

22. <statement> → while ( <expression> ) <statement>
    Selection-Set = while

23. <statement> → for ( id = <expression> ; id relop <expression> ; id = <expression> ) <statement>
    Selection-Set = for

24. <statement> → do <compound-statement> while ( <expression> ) ;
    Selection-Set = do

25. <statement> → return <right-hand-side> ;
    Selection-Set = return

26. <statement> → if ( <expression> ) <statement> <optional-else>
    Selection-Set = if

27. <statement> → [ <function-call> ] ;
    Selection-Set = [

28. <optional-else> → else <statement>
    Selection-Set = else

29. <optional-else> → ε
    Selection-Set = id , while , { , for , do , return , if , } , [

30. <right-hand-side> → <expression>
    Selection-Set = ( , id , int_const , long_const

31. <right-hand-side> → [ <function-call> ]
    Selection-Set = [

32. <function-call> → id ( <optional-expression-list> )
    Selection-Set = id

33. <optional-expression-list> → ε
    Selection-Set = )

34. <optional-expression-list> → <expression-list-element> <expression-listf>
    Selection-Set = ( , id , int_const , long_const , [

35. <expression-listf> → , <expression-list-element> <expression-listf>
    Selection-Set = ,

36. <expression-listf> → ε
    Selection-Set = )

37. <expression-list-element> → <expression>
    Selection-Set = ( , id , int_const , long_const

Expression Productions :

38. <expression> → <arithmetic> <relational>
    Selection-Set = ( , id , int_const , long_const

39. <relational> → relop <arithmetic>
    Selection-Set = relop

40. <relational> → ε
    Selection-Set = ) , ; , ,

41. <arithmetic> → <T> <Subract>
    Selection-Set = ( , id , int_const , long_const

42. <Subract> → - <T> <Subract>
    Selection-Set = -

43. <Subract> → ε
    Selection-Set = relop , ) , ; , ,

44. <T> → <U> <Add>
    Selection-Set = ( , id , int_const , long_const

45. <Add> → + <U> <Add>
    Selection-Set = +

46. <Add> → ε
    Selection-Set = - , relop , ) , ; , ,

47. <U> → <V> <Multiply>
    Selection-Set = ( , id , int_const , long_const

48. <Multiply> → * <V> <Multiply>
    Selection-Set = *

49. <Multiply> → ε
    Selection-Set = + , - , relop , ) , ; , ,

50. <V> → <W> <divide>
    Selection-Set = ( , id , int_const , long_const

51. <divide> → / <W> <divide>
    Selection-Set = /

52. <divide> → ε
    Selection-Set = * , + , - , relop , ) , ; , ,

53. <W> → <X> <mod>
    Selection-Set = ( , id , int_const , long_const

54. <mod> → % <X> <mod>
    Selection-Set = %

55. <mod> → ε
    Selection-Set = / , * , + , - , relop , ) , ; , ,

56. <X> → ( <expression> )
    Selection-Set = (

57. <X> → id
    Selection-Set = id

58. <X> → int_const
    Selection-Set = int_const

59. <X> → long_const
    Selection-Set = long_const
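
Because every production is chosen by a single lookahead token, the expression productions 38 to 59 map directly onto a recursive descent recognizer. The following is a minimal sketch under stated assumptions, not the project's parser: the class name `ExprRecognizer` is hypothetical, tokens are represented only by their class names, and the five right-recursive tail non-terminals (`<Subract>`, `<Add>`, `<Multiply>`, `<divide>`, `<mod>`) are folded into one loop per level. As in the grammar, '-' binds loosest and '%' binds tightest.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class ExprRecognizer {
public:
    explicit ExprRecognizer(std::vector<std::string> tokens)
        : toks_(std::move(tokens)) {}

    // True if the whole token sequence can be derived from <expression>.
    bool accepts() {
        pos_ = 0;
        return expression() && pos_ == toks_.size();
    }

private:
    const std::string& peek() const {
        static const std::string end_marker = "$";
        return pos_ < toks_.size() ? toks_[pos_] : end_marker;
    }
    bool match(const std::string& t) {
        if (peek() != t) return false;
        ++pos_;
        return true;
    }
    // Productions 38-40: <expression> -> <arithmetic> <relational>
    bool expression() {
        if (!level(0)) return false;
        if (match("relop")) return level(0);  // production 39
        return true;                          // production 40 (epsilon)
    }
    // Productions 41-55: level 0 is <arithmetic>, level 4 is <W>.
    bool level(std::size_t n) {
        static const char* const ops[] = {"-", "+", "*", "/", "%"};
        if (n == 5) return primary();
        assert(n < 5);
        if (!level(n + 1)) return false;
        while (match(ops[n]))                 // tail non-terminal as a loop
            if (!level(n + 1)) return false;
        return true;
    }
    // Productions 56-59: <X>
    bool primary() {
        if (match("(")) return expression() && match(")");
        return match("id") || match("int_const") || match("long_const");
    }

    std::vector<std::string> toks_;
    std::size_t pos_ = 0;
};
```

For instance, the token sequence `id relop id` is accepted, while `id relop id relop id` is rejected because `<relational>` allows at most one relop.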

Intermediate Code Generation Or Syntax Box:

To generate intermediate code from the TOKENs being parsed by a PARSER, such as a recursive
descent parser, we need a well-defined architecture that includes action symbols (with the type
system implemented as a part of them), symbol tables and an ATOM SET.
To get this work started, we first make an ATTRIBUTED TRANSLATION
GRAMMAR from the above LL(1) grammar.

ATTRIBUTED TRANSLATION GRAMMAR:

Convention:
• ACTION SYMBOLS written in small letters are used for type checking; no
INTERMEDIATE CODE is generated for them.
• ACTION SYMBOLS written in CAPITAL LETTERS denote
ATOMS, meaning INTERMEDIATE CODE is generated for them.
• Those in Bold Italic belong to TOKEN SET.
• Attributes are written in parentheses after a symbol, e.g. id(i); a line of the form
k ← t below a production gives its attribute assignment.

1. <Program> → dt(t) id(i) {Set_type(i,t)} <data-or-function>(k) <Program>
    k ← t

2. <Program> → void <function-or-main>

3. <data-or-function>(k) → , id(i) {Set_type(i,k)} <data-or-function>(k)

4. <data-or-function>(k) → ;

5. <data-or-function>(k) → ( {isreturn=false} <argument-list> ) <function-body>

6. <function-or-main> → main {main_flag=true} ( void ) <function-body>

7. <function-or-main> → id(i) {Set_type(i,22)} ( <argument-list> ) <function-body> <Program>

8. <function-body> → { {PROC_MARKER(START)} {Update-Scope} <variable-declarations>
    <statements> {PROC_MARKER(END)} {Update-Scope}
    {If it is not main( ), Calc_Local_Offset}
    {Check whether the function has to return something: test whether the isreturn flag is TRUE or not}
    {If it is main( ), then main_func=false}
    {De-register the function if it was local: set func_index=-10}
    {Reset the return flag: isreturn=false} }

9. <variable-declarations> → dt(t) id(i) {Set_type(i,t)} <variable-list>(t) ; <variable-declarations>

10. <variable-declarations> → ε

11. <variable-list>(t) → , id(i) {Set_type(i,t)} <variable-list>(t)

12. <variable-list>(t) → ε

13. <argument-list> → void {Set the parameter info of this particular function to void}
    {Do function binding} {Register the function's start, set func_index}

14. <argument-list> → dt(t) id(i) {Set_type_args(i,t)} <argument> {Do function binding}
    {Register the function's start, set func_index}
    {Calculate the function parameters' offsets using calc_param_offset( )}
    {Store the total SIZE of the PARAMETERS}

15. <argument> → , dt(t) id(i) {Set_type_args(i,t)} <argument>

16. <argument> → ε

17. <compound-statement> → { <statements> }

18. <statements> → <statement> <statements>

19. <statements> → ε

20. <statement> → id(i) {Chk_ident(i)} = <right-hand-side>(r) {Chk_types(i,r)} ; {ASSIGN(r,i)}

21. <statement> → <compound-statement>

22. <statement> → while {newlabel(start,end)} {LABEL(start)} ( <expression>(k) )
    {CONDJUMP(k,0,end)} <statement> {JUMP(start)} {LABEL(end)}

23. <statement> → for {newlabel(start,end,t1,t2)} ( id(i) {Chk_ident(i)} = <expression>(e)
    {Chk_types(i,e)} {ASSIGN(e,i)} ; {LABEL(start)} id(i2) {Chk_ident(i2)} relop(r)
    <expression>(e2) {Chk_types(i2,e2)} {JUMPF(i2,e2,end)}
    {JUMP(t1)} ; {LABEL(t2)} id(i4) {Chk_ident(i4)} = <expression>(e4)
    {Chk_types(i4,e4)} {ASSIGN(e4,i4)} {JUMP(start)} ) {LABEL(t1)} <statement>
    {JUMP(t2)} {LABEL(end)}

24. <statement> → do {newlabel(start)} {LABEL(start)} <compound-statement>
    while ( <expression>(k) ) ; {CONDJUMP(k,1,start)}

25. <statement> → return {isreturn=true} <right-hand-side>(r) ;
    {Check against the function's return type} {RETURN(binding,r)}

26. <statement> → if {newlabel(end,end2)} ( <expression>(k) ) {CONDJUMP(k,0,end)} <statement>
    {JUMP(end2)} {LABEL(end)} <optional-else> {LABEL(end2)}

27. <statement> → [ <function-call>(f) ] {Check whether the return value is being caught or
    not, if available} ;

28. <optional-else> → else <statement>

29. <optional-else> → ε

30. <right-hand-side>(r) → <expression>(e)
    r ← e

31. <right-hand-side>(r) → [ <function-call>(f) ]
    {Find the function's return value and assign its reference to the right-hand side, i.e. r}
    r ← f

32. <function-call>(f) → id(i) {Chk_func(i)} ( <optional-expression-list> )
    {Chk_for_parameter_mismatch} {CALL(n,id.name)}
    n ← total number of parameters

33. <optional-expression-list> → ε

34. <optional-expression-list> → <expression-list-element> <expression-listf>

35. <expression-listf> → , <expression-list-element> <expression-listf>

36. <expression-listf> → ε

37. <expression-list-element> → <expression>(e) {Chk_types(e, expected argument type)}
    {PARAM(e)}

--CONVENTIONS:
* <Subract/Add/Multiply/Divide/Mod>(p,q)
    p = inherited attribute, q = synthesized attribute
* <arithmetic/T/U/V/W/X>(p)
    p = synthesized attribute
* <relational>(k,t1)
    k = synthesized attribute, t1 = inherited attribute

38. <expression>(k) → <arithmetic>(t1) <relational>(k,t1)

39. <relational>(k,t1) → relop <arithmetic>(t2) {Chk_types(t1,t2)} {newtemp(k)} {CMP(t1,t2,k)}

40. <relational>(k,t1) → ε {ASSIGN(t1,k)}

41. <arithmetic>(p) → <T>(p1) <Subract>(p1,p)

42. <Subract>(p,q) → - <T>(q1) {Chk_types(p,q1)} {newtemp(r1)} {SUB(p,q1,r1)} <Subract>(r1,q)

43. <Subract>(p,q) → ε {ASSIGN(p,q)}

44. <T>(p) → <U>(p1) <Add>(p1,p)

45. <Add>(p,q) → + <U>(q1) {Chk_types(p,q1)} {newtemp(r1)} {ADD(p,q1,r1)} <Add>(r1,q)

46. <Add>(p,q) → ε {ASSIGN(p,q)}

47. <U>(p) → <V>(p1) <Multiply>(p1,p)

48. <Multiply>(p,q) → * <V>(q1) {Chk_types(p,q1)} {newtemp(r1)} {MUL(p,q1,r1)} <Multiply>(r1,q)

49. <Multiply>(p,q) → ε {ASSIGN(p,q)}

50. <V>(p) → <W>(p1) <divide>(p1,p)

51. <divide>(p,q) → / <W>(q1) {Chk_types(p,q1)} {newtemp(r1)} {DIV(p,q1,r1)} <divide>(r1,q)

52. <divide>(p,q) → ε {ASSIGN(p,q)}

53. <W>(p) → <X>(p1) <mod>(p1,p)

54. <mod>(p,q) → % <X>(q1) {Chk_types(p,q1)} {newtemp(r1)} {MOD(p,q1,r1)} <mod>(r1,q)

55. <mod>(p,q) → ε {ASSIGN(p,q)}

56. <X>(p) → ( <expression>(p) )

57. <X>(p) → id(i) {Chk_ident(i)}
    p ← i

58. <X>(p) → int_const(i)
    p ← i

59. <X>(p) → long_const(i)
    p ← i
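
To make the role of the action symbols concrete, the while statement (production 22) can be sketched as code that wraps an already-translated condition and body in labels and jumps. This is only an illustration: `TextAtom` and `WhileTranslator` are hypothetical names, operands are plain strings rather than the compiler's own structures, and the sketch assumes that CONDJUMP k,0,end means "jump to end when k equals 0".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified atom: operands are plain names; "X" marks an unused field.
struct TextAtom {
    std::string op, arg1, arg2, result;
};

class WhileTranslator {
public:
    std::vector<TextAtom> atoms;  // the generated atom stream

    // cond: atoms that evaluate the condition into cond_result (attribute k).
    // body: atoms of the already-translated loop body.
    void translate_while(const std::vector<TextAtom>& cond,
                         const std::string& cond_result,
                         const std::vector<TextAtom>& body) {
        assert(!cond_result.empty());
        const std::string start = newlabel();                  // {newlabel start,end}
        const std::string end = newlabel();
        atoms.push_back({"LABEL", "X", "X", start});           // {LABEL start}
        atoms.insert(atoms.end(), cond.begin(), cond.end());   // <expression>
        atoms.push_back({"CONDJUMP", cond_result, "0", end});  // {CONDJUMP k,0,end}
        atoms.insert(atoms.end(), body.begin(), body.end());   // <statement>
        atoms.push_back({"JUMP", "X", "X", start});            // {JUMP start}
        atoms.push_back({"LABEL", "X", "X", end});             // {LABEL end}
    }

private:
    // Hypothetical label naming; the real newlabel() returns an int index.
    std::string newlabel() { return "L" + std::to_string(next_label_++); }
    int next_label_ = 0;
};
```

Translating a loop whose condition is a single CMP atom and whose body has two atoms therefore produces seven atoms: LABEL, CMP, CONDJUMP, the two body atoms, JUMP and LABEL.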

SYMBOL TABLES FORMAT:

1) Symbol table for Local/Main/Global/Temporary Variables/Function Names/Parameters.

Specific instances in our code are syn_identifier and args_identifier, each of which is a
Vector of STL (C++).

Field              Datatype / values
Name               String
Datatype           0, 1, 22
Scope              0, 1, 2, ……., n
Binding            -2, -1, Func_indx
Function_or_Not    0 or 1
Type               0, 1, 2, 3, 4
Offset             -10 or 2n, n=1,2,…,n
Xtra               -10 or 2n, n=1,2,…,n

• Name
It is a String class object.
• Datatype
0=Int
1=Long
22=Void
• Scope
0=Global Scope
1,2,3…….n= ScopeStack( ).Top
Where “ScopeStack( ).Top” is a method that gives the CURRENT SCOPE.
• Binding
-2= Identifier bound globally
-1= Identifier bound to Main()
Func_indx=Index of the variable [must be a function variable] to which a Local/Temp
variable is bound

• Function_or_not
Function_or_not=0, if the identifier is not a function
Function_or_not=1, if the identifier is a function
• Type
0=Global Identifier
1=Main Identifier
2=Local Identifier
3=Parameter Identifier
4=Temporary Identifier

• Offset
-10 = Undefined
2n=Where n=1,2,3,4,……..n
2n must be calculated by a programmer-defined function for the locals and parameters of
functions

• Xtra
-10 = Undefined
2n=Where n=1,2,3,4,……..n
2n must be calculated by a programmer-defined function for the TOTAL SUM of all
PARAMETERS
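
The record layout described above could look roughly like this in C++. It is a sketch under stated assumptions rather than the project's declaration: the struct name `SymbolEntry`, its member names and the `find_symbol` lookup are hypothetical, and the lookup rule (latest matching entry in the requested scope or in the global scope) is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One row of the identifier symbol table, using the codes listed above.
struct SymbolEntry {
    std::string name;
    int datatype = 22;        // 0 = int, 1 = long, 22 = void
    int scope = 0;            // 0 = global, otherwise the current scope number
    int binding = -2;         // -2 = global, -1 = main(), else index of the owning function
    int function_or_not = 0;  // 1 if the identifier is a function
    int type = 0;             // 0 global, 1 main, 2 local, 3 parameter, 4 temporary
    int offset = -10;         // -10 = undefined, otherwise 2n
    int xtra = -10;           // -10 = undefined, otherwise total size of parameters
};

// Hypothetical lookup: returns the index of the latest entry called `name`
// that is visible from `scope`, or -1 if there is none.
long find_symbol(const std::vector<SymbolEntry>& table,
                 const std::string& name, int scope) {
    assert(scope >= 0);
    for (std::size_t i = table.size(); i-- > 0;) {
        const SymbolEntry& e = table[i];
        if (e.name == name && (e.scope == scope || e.scope == 0))
            return static_cast<long>(i);
    }
    return -1;
}
```

Storing the entries in a `std::vector<SymbolEntry>` matches the report's statement that the tables are STL vectors, so an entry's position in the vector can serve as its index.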

2) Symbol tables for Numbers, either Long or Integer CONSTANTs.

Specific instances in our code are number_long and number_int, each of which is a
Vector of STL (C++).

Int/Long NUMBER

3) Symbol table for Labels.

The specific instance in our code is labels, which is a Vector of STL (C++).

STRING

4) Since we generate Three Address Code as the Intermediate Code, each element
of the three address code should have a representation, which is as follows.

            Op        Type    Arg1              Arg2              Result
Datatype    String    Int     Expr Structure    Expr Structure    Expr Structure

• op
op=OP-Code as in ASSEMBLY; these are the ATOMS generated by the Syntax
Box. In our case it can have values like LABEL, ASSIGN, CONDJUMP, JUMP,
RETURN, PARAM, CALL, JUMPF, CMP, ADD, SUB, MUL, DIV, MOD,
PROC_MARKER.

• Type
It is an internal representation of each ATOM as an INTEGER. For example, LABEL has
type 25, CALL has type 31, etc.

• Expr
It is a structure designed to keep the ATOM set small and free of REDUNDANT
information already present in other tables. It has the following structure.

            Index    Datatype    Whichtable
Datatype    Int      Int         Int

- Index
It points to the ORIGINAL position of identifiers in other tables, which may be
syn_identifier’s or args_identifier’s INDEX.
- Datatype
It has a value
0=int
1=long
22=void
23=int_const
24=long_const
- Whichtable
As discussed in the TABLE formats of our compiler, there are 5 tables with which
we run the whole SYNTAX Box and Code Generator. To facilitate the programming,
these tables are ASSUMED to have INTEGER NUMBERs attached to them, which
are as follows.
0=syn_identifier
1=args_identifier
2=number_long
3=number_int
4=labels
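
The Expr structure and the five-field atom record can be sketched as plain C++ structs. This is an illustrative sketch, not the project's declarations: the report only shows the lower-case type name `expr` in its helper signatures, so the names `Expr`, `Atom`, `make_expr` and `make_assign` here are assumptions, as is the use of -10 as the default for every unused field.

```cpp
#include <cassert>
#include <string>

const int UNDEFINED = -10;  // the report's marker for "not used / no meaning"

// Reference to an entry in one of the five tables.
struct Expr {
    int index = UNDEFINED;       // position of the entry in its table
    int datatype = UNDEFINED;    // 0 int, 1 long, 22 void, 23 int_const, 24 long_const
    int whichtable = UNDEFINED;  // 0 syn_identifier ... 4 labels
};

// One element of the three address code.
struct Atom {
    std::string op;        // e.g. "ASSIGN"
    int type = UNDEFINED;  // internal integer code, e.g. 26 for ASSIGN
    Expr arg1, arg2, result;
};

Expr make_expr(int index, int datatype, int whichtable) {
    assert(whichtable >= 0 && whichtable <= 4);
    Expr e;
    e.index = index;
    e.datatype = datatype;
    e.whichtable = whichtable;
    return e;
}

// ASSIGN atom: Arg1 holds the right-hand side, Arg2 is unused and
// Result holds the left-hand side.
Atom make_assign(const Expr& rhs, const Expr& lhs) {
    // The left-hand side must live in syn_identifier or args_identifier.
    assert(lhs.whichtable == 0 || lhs.whichtable == 1);
    Atom a;
    a.op = "ASSIGN";
    a.type = 26;
    a.arg1 = rhs;
    a.result = lhs;
    return a;
}
```

Because an atom stores only table numbers and indices, the names and offsets of identifiers stay in the symbol tables and are not duplicated in the intermediate code.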

ATOM SET or INTERMEDIATE CODE:


Now that we have enough information about the table
formats used in this compiler, we can show the Atom Set or Intermediate Code generated in the 3
ADDRESS FORMAT as follows.

Op             Type               Arg1                   Arg2                Result
               (Internal Code)
LABEL          25                 X                      X                   Label
ASSIGN         26                 R.H.S                  X                   L.H.S
CONDJUMP       27                 ID or Expression       0 or 1              Label
JUMP           28                 X                      X                   Label
RETURN         29                 Function’s Location    X                   ID or Expression
PARAM          30                 X                      X                   ID or Expression
CALL           31                 No. of Arguments       X                   Function Name
JUMPF          32                 ID                     Expression          Label
CMP            100                ID or Expression       ID or Expression    ID or Expression
PROC_MARKER    34                 Function Name          X                   X
ADD            15                 ID or Expression       ID or Expression    ID
SUB            16                 ID or Expression       ID or Expression    ID
MUL            17                 ID or Expression       ID or Expression    ID
DIV            18                 ID or Expression       ID or Expression    ID
MOD            19                 ID or Expression       ID or Expression    ID

Note: Since Arg1, Arg2 and Result are all Expr type structures, they are all defined in terms
of Expr’s fields. Fields that have an X in their place are NOT USED. We have used “-10” for
everything that is UNDEFINED or has no relevant meaning in that context.

LABEL:
It outputs a label in the code.
Result.Index=Pointer to “labels “ symbol table particular entry
Result.Datatype= -10
Result.Whichtable=Table Number,here it’s 4

ASSIGN:
It’ll perform the assignment operation like a=b or a=v+f….etc in the program.
Arg1=R.H.S
Arg1.Index=Points to “syn_identifier” or “args_identifier” or “number_long” or
“number_int” SYMBOL table’s particular entry.
Arg1.Datatype=0 for int
1 for long
23 for int_const
24 for long_const
Arg1.Whichtable
0=syn_identifier
1=args_identifier
2=number_long
3=number_int

Result =L.H.S
Result.Index=Points to “syn_identifier” or “args_identifier” SYMBOL table’s
particular entry.
Result.Datatype=
0 for int
1 for long
Result.Whichtable=
0=syn_identifier
1=args_identifier

CONDJUMP:
Arg1:
Arg1.Index=Points to “syn_identifier” SYMBOL table’s
particular entry.
Arg1.Datatype=
0 for int
1 for long
Arg1.Whichtable=
0=syn_identifier
Arg2:
Arg2.Index= -10
Arg2.Datatype=0 or 1
We’ll use this value for comparison with Arg1
Arg2.Whichtable= -10
Result:
Result.Index=Pointer to “labels “ symbol table particular entry
Result.Datatype= -10
Result.Whichtable=Table Number,here it’s 4

JUMP:
Result:
Result.Index=Pointer to “labels “ symbol table particular entry
Result.Datatype= -10
Result.Whichtable=Table Number,here it’s 4

RETURN:
Arg1:
Arg1.Index=Pointer to “syn_identifier” table’s particular entry
Arg1.Datatype=0 or 1 or 22
Arg1.Whichtable=Table Number,Here it’s 0
Result:
Result.Index=Pointer to “syn_identifier“ or “args_identifier” or “number_long”
or ”number_int” symbol table’s particular entry
Result.Datatype=0 for int
1 for long
23 for int_const
24 for long_const
Result.Whichtable
0=syn_identifier
1=args_identifier
2=number_long
3=number_int

PARAM:
Result:
Result.Index=Pointer to “syn_identifier“ or “args_identifier” or “number_long”
or ”number_int” symbol table’s particular entry
Result.Datatype=0 for int
1 for long
23 for int_const
24 for long_const
Result.Whichtable
0=syn_identifier
1=args_identifier
2=number_long
3=number_int

CALL:
Arg1:
Arg1.Index=Number Of arguments expected
Arg1.Datatype= -10
Arg1.WhichTable= -10
Result:
Result.Index=Pointer to “syn_identifier“ symbol table’s particular entry
Result.Datatype=0 for int
1 for long
22 for Void
It Shows Function’s return type
Result.Whichtable
0=syn_identifier
JUMPF:
Arg1:
Arg1.Index=Pointer to “syn_identifier” symbol table’s particular entry.
Arg1.Datatype=0 for int
1 for long
Arg1.Whichtable
0=syn_identifier
1=args_identifier
Arg2:
Arg2.Index=Pointer to “syn_identifier” or “args_identifier” or
“number_int” or “number_long” symbol table’s particular
entry.
Arg2.Datatype=0 for int
1 for long
23 for int_const
24 for long_const
Arg2.Whichtable
0=syn_identifier
1=args_identifier
2=number_long
3=number_int
Result:
Result.Index=Pointer to “labels” symbol table’s particular entry.
Result.Datatype= -10
Result.Whichtable
4=labels
CMP:
Arg1:
Arg1.Index=Pointer to “syn_identifier” or “args_identifier” or
“number_int” or “number_long” symbol table’s particular
entry.
Arg1.Datatype=0 for int
1 for long
23 for int_const
24 for long_const
Arg1.Whichtable
0=syn_identifier
1=args_identifier
2=number_long
3=number_int
Arg2:
Arg2.Index=Pointer to “syn_identifier” or “args_identifier” or
“number_int” or “number_long” symbol table’s particular
entry.
Arg2.Datatype=0 for int
1 for long
23 for int_const
24 for long_const
Arg2.Whichtable
0=syn_identifier
1=args_identifier
2=number_long
3=number_int

Result:
Result.Index=Pointer to “syn_identifier”
symbol table’s particular entry.
Result.Datatype=0 for int
1 for long
Result.Whichtable
0=syn_identifier

ADD/SUB/MUL/DIV/MOD:
Arg1:
Arg1.Index=Pointer to “syn_identifier” or “args_identifier” or
“number_int” or “number_long” symbol table’s particular
entry.
Arg1.Datatype=0 for int
1 for long
23 for int_const
24 for long_const
Arg1.Whichtable
0=syn_identifier
1=args_identifier
2=number_long
3=number_int
Arg2:
Arg2.Index=Pointer to “syn_identifier” or “args_identifier” or
“number_int” or “number_long” symbol table’s particular
entry.
Arg2.Datatype=0 for int
1 for long
23 for int_const
24 for long_const
Arg2.Whichtable
0=syn_identifier
1=args_identifier
2=number_long
3=number_int
Result:
Result.Index=Pointer to “syn_identifier”symbol table’s particular entry.
Result.Datatype=0 for int
1 for long
Result.Whichtable
0=syn_identifier

PROC_MARKER:
Arg1:
Arg1.Index= -1 For main() or Function’s Index For OTHER FUNCTIONS
Arg1.Datatype= 1 for Start or 0 for End
Arg1.Whichtable=Saving The TOTAL LENGTH of OFFSETS of Local
Variables/Temporaries of a Particular Function

Action Symbols:
Many action symbols are used in this compiler. Many of them
are used for type checking and for preparing other information that is used by the code
generator. We have implemented them as helper functions with the following declarations. Their
names suggest their respective functionality.

//Helper Functions
void settype(int index,int type);
void settype_args(int index,int type);
expr newtemp(int type);
string newtempname(void);
long chk_ident(long index);
bool chk_types(expr v1,expr v2);
void setatom(int op,expr arg1,expr arg2,expr result);
int newlabel(void);
long chk_func(long index);
void args_info(long index,long &init_arg,long &fin_arg);
int calc_param_offset();
int calc_local_offset();

CODE GENERATOR:
The code generator takes the stream of ATOMS as input and
produces ASSEMBLY CODE. We have coded a function for each ATOM, so when a particular
atom is seen, its corresponding function is called and generates the ASSEMBLY code for it, which
can then be TESTED on an assembler.
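
The "one function per atom" idea can be sketched as a dispatch table. This is a hedged illustration only: `TextAtom`, `Lines`, the `gen_*` functions and `generate` are hypothetical names, operands are plain strings, only four atoms are covered, and the emitted instructions are generic x86-style placeholders rather than the output the real C2ASM produces.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified atom with plain-text operands.
struct TextAtom {
    std::string op, arg1, arg2, result;
};
using Lines = std::vector<std::string>;

// One generator function per atom.
void gen_label(const TextAtom& a, Lines& out) { out.push_back(a.result + ":"); }
void gen_jump(const TextAtom& a, Lines& out) { out.push_back("    jmp " + a.result); }
void gen_assign(const TextAtom& a, Lines& out) {
    out.push_back("    mov eax, " + a.arg1);
    out.push_back("    mov " + a.result + ", eax");
}
void gen_add(const TextAtom& a, Lines& out) {
    out.push_back("    mov eax, " + a.arg1);
    out.push_back("    add eax, " + a.arg2);
    out.push_back("    mov " + a.result + ", eax");
}

// Walks the atom stream and calls the function registered for each atom.
Lines generate(const std::vector<TextAtom>& atoms) {
    using Handler = void (*)(const TextAtom&, Lines&);
    static const std::map<std::string, Handler> handlers = {
        {"LABEL", gen_label}, {"JUMP", gen_jump},
        {"ASSIGN", gen_assign}, {"ADD", gen_add}};
    Lines out;
    for (const TextAtom& a : atoms) {
        const auto it = handlers.find(a.op);
        assert(it != handlers.end());  // this sketch covers only four atoms
        it->second(a, out);
    }
    return out;
}
```

Keeping the handlers in a table means supporting another atom only requires writing one more function and registering it, which mirrors the per-atom organisation described above.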

MY CODING CONVENTION:
I have mapped all keywords, punctuations, operators, tokens, and atoms (intermediate
code) to INTEGER NUMBERS to ease programming. The following list shows the mapping.
I have used these numeric counterparts throughout my
implementation, since numbers are easier to handle than strings and more efficient.

"int" = 0
"long" = 1
"{" = 2
"}" = 3
"(" = 4
")" = 5
"," = 6
";" = 7
"==" = 8
"<>" = 9
">=" = 10
"<=" = 11
">" = 12
"<" = 13
"=" = 14
"+" = 15
"-" = 16
"*" = 17
"/" = 18
"%" = 19
"[" = 20
"]" = 21
"void" = 22
"int_const" = 23
"long_const" = 24
"label" = 25
"assign" = 26
"condjump" = 27
"jump" = 28
"return" = 29
"param" = 30
"call" = 31
"jumpf" = 32
"cmp" = 100
"proc_marker" = 34

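The mapping above can be captured in a single lookup table. The sketch below is illustrative: the function names `code_of` and `is_relop` are hypothetical, and returning -10 for an unknown symbol is an assumption borrowed from the report's convention for undefined values.

```cpp
#include <cassert>
#include <map>
#include <string>

// Returns the integer code of a keyword, punctuation, operator, token or
// atom name, or -10 if the symbol is not in the mapping.
int code_of(const std::string& symbol) {
    assert(!symbol.empty());
    static const std::map<std::string, int> codes = {
        {"int", 0}, {"long", 1}, {"{", 2}, {"}", 3}, {"(", 4}, {")", 5},
        {",", 6}, {";", 7}, {"==", 8}, {"<>", 9}, {">=", 10}, {"<=", 11},
        {">", 12}, {"<", 13}, {"=", 14}, {"+", 15}, {"-", 16}, {"*", 17},
        {"/", 18}, {"%", 19}, {"[", 20}, {"]", 21}, {"void", 22},
        {"int_const", 23}, {"long_const", 24}, {"label", 25}, {"assign", 26},
        {"condjump", 27}, {"jump", 28}, {"return", 29}, {"param", 30},
        {"call", 31}, {"jumpf", 32}, {"cmp", 100}, {"proc_marker", 34}};
    const auto it = codes.find(symbol);
    return it == codes.end() ? -10 : it->second;
}

// The six relational operators occupy the contiguous codes 8..13, so a
// class test reduces to a range check.
bool is_relop(int code) { return code >= 8 && code <= 13; }
```

One benefit of contiguous numbering is visible in `is_relop`: a whole token class can be tested with a range comparison instead of several string comparisons.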