DLP Practicals 1 To 7

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 24

CE442 Design of Language Processors 16CE076

PRACTICAL 1
Aim: Implement a simple ‘C’ program and generate the following codes for that. (1)
Preprocessed code (3) Object Code (2) Assembly Code (4) Executable Code.

Software Requirements: Unix/Linux Telnet


Hardware Requirements: PC / Laptop
Knowledge Requirements: Basic knowledge of processed code , object code , Assembly
code and Executable code.

Theory:
Initially, the program is in C language hence program_name.c
i. The pre-processor:
This is the first phase through which source code is passed. This phase includes:
 Removal of Comments
 Expansion of Macros
 Expansion of the included files.
The preprocessed output is stored in the program_name i

ii. The compiler:


The compiler analyzes the expanded code by lexical analyzer, syntax analyzer and semantic
analyzer. The next step is to compile program_name.i and produce an intermediate compiled
output file program_name.s, which is an assembly code. This file is in assembly level instructions.
iii. The assembler:
The assembler converts assembly language code to machine language code. In this phase the
program_name.s is taken as input and turned into program_name.o by assembler. This file contain
machine level instructions. At this phase, only existing code is converted into machine language,
the function calls like printf() are not resolved.
iv. The linker:
This is the final phase in which all the linking of function calls with their definitions are done.
Linker knows where all these functions are implemented. Linker does some extra work also, it
adds some extra code to our program which is required when the program starts and ends.
v. The loader:
Loader is the program of the operating system which loads the executable from the disk into the
primary memory(RAM) for execution. It allocates the memory space to the executable module in
main memory and then transfers control to the beginning instruction of the program.
CE442 Design of Language Processors 16CE076

Program:
To find area of a circle:

Output:

1. Pre-processed code:
CE442 Design of Language Processors 16CE076

2. Object code:
CE442 Design of Language Processors 16CE076

3. Assembly code:
CE442 Design of Language Processors 16CE076

4. Executable code:

Conclusion: We are able to learn how to implement a program by understanding how it is


processed by the pre-processor, compiler, assembler, loader and linker.
CE442 Design of Language Processors 16CE076

Practical-2
Aim : Implement a program to implement DFA for any RE. Program will also take input string
from the user and check that string is accepted by DFA or not.
Software Requirements: Code block
Hardware Requirements: PC / Laptop
Knowledge Requirements: Basic knowledge of conversation of DFA from from RE
Theory:
DFA (Deterministic Finite Automaton or Acceptor) is a finite state machine that accepts or rejects
strings of symbols. DFA accepts the string if it reaches the final state and rejects otherwise.
Now the problem is, provided a string as input character by character and we have to check whether
the string starts and ends with ‘a’. We can only store the current character, since there is no concept
of memory and hence the DFA cannot store the string provided. Otherwise, we could have just
checked the first and last character for this problem. The input set for this problem is (a, b).
We cannot store anything accept the current character, which make this program a little different
and tough than other string related problems
Code :
#include<stdio.h>
#include<conio.h>
int ninputs;
int check(char,int );
int dfa[10][10];
char c[10], string[10];
int main()
{
int nstates, nfinals;
int f[10];
int i,j,s=0,final=0;
printf("enter the number of states that your dfa consist of \n");
scanf("%d",&nstates);
printf("enter the number of input symbol that dfa have \n");
scanf("%d",&ninputs);
printf("\nenter input symbols\t");
for(i=0; i<ninputs; i++)
{
printf("\n\n %d input\t", i+1);
printf("%c",c[i]=getch());
CE442 Design of Language Processors 16CE076

}
printf("\n\nenter number of final states\t");
scanf("%d",&nfinals);

for(i=0;i<nfinals;i++)
{
printf("\n\nFinal state %d : q",i+1);
scanf("%d",&f[i]);
}
printf("-----------------------------------------------------------------------");
printf("\n\ndefine transition rule as (initial state, input symbol ) = final state\n");
for(i=0; i<ninputs; i++)
{
for(j=0; j<nstates; j++)
{
printf("\n(q%d , %c ) = q",j,c[i]);
scanf("%d",&dfa[i][j]);
}
}
do
{
i=0;
printf("\n\nEnter Input String.. ");
scanf("%s",string);

while(string[i]!='\0')
if((s=check(string[i++],s))<0)
break;
for(i=0 ;i<nfinals ;i++)
if(f[i] ==s )
final=1;
if(final==1)
printf("\n valid string");
else
printf("invalid string");
getch();
printf("\nDo you want to continue.? \n(y/n) ");
}
while(getch()=='y');
getch();
}
CE442 Design of Language Processors 16CE076

int check(char b,int d)


{
int j;
for(j=0; j<ninputs; j++)
if(b==c[j])
return(dfa[d][j]);
return -1;
}

Output :

Conclusion: We learn how to implement any DFA from any RE.


CE442 Design of Language Processors 16CE076

Practical-3
Aim : Implement Lexical Analyzer using C Language. (The lexical analyzer should ignore
redundant spaces, tabs and new lines. It should also ignore comments. Simulate the same
in C language.)
Software Requirements: Code block
Hardware Requirements: PC / Laptop
Knowledge Requirements: Basic knowledge of Lexical Analyzer
Theory:

Compiler is responsible for converting high level language in machine language. There are several
phases involved in this and lexical analysis is the first phase.

Lexical analyzer reads the characters from source code and convert it into tokens.

Different tokens or lexemes are:

 Keywords
 Identifiers
 Operators
 Constants

Code :
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<ctype.h>
int isKeyword(char buffer[]){
CE442 Design of Language Processors 16CE076

char keywords[32][10] = {"auto","break","case","char","const","continue","default",


"do","double","else","enum","extern","float","for","goto",
"if","int","long","register","return","short","signed",
"sizeof","static","struct","switch","typedef","union",
"unsigned","void","volatile","while"};
int i, flag = 0;
for(i = 0; i < 32; ++i){
if(strcmp(keywords[i], buffer) == 0){
flag = 1;
break;
}
}
return flag;
}
int main(){
char ch, buffer[15], operators[] = "+-*/%=";
FILE *fp;
int i,j=0;
fp = fopen("p1.txt","r");

if(fp == NULL){
printf("error while opening the file\n");
exit(0);
}
while((ch = fgetc(fp)) != EOF){
for(i = 0; i < 6; ++i){
if(ch == operators[i])
printf("%c is operator\n", ch);
}
if(isalnum(ch)){
CE442 Design of Language Processors 16CE076

buffer[j++] = ch;
}
else if((ch == ' ' || ch == '\n') && (j != 0)){
buffer[j] = '\0';
j = 0;

if(isKeyword(buffer) == 1)
printf("%s is keyword\n", buffer);
else
printf("%s is indentifier\n", buffer);
}
}
fclose(fp);
return 0;
}
Input: Output :

Conclusion: We learn how to ignore redundant spaces, tabs and new lines through lexical
analyzer.
CE442 Design of Language Processors 16CE076

Practical-4

Aim: Implement following programs using Lex.


a) Write a Lex program to print out all numbers from the given file.
b) Write a Lex program to printout all HTML tags in file.
c) Write a Lex program which adds line numbers to the given file and display the same onto
the standard output.
d) Write a Lex program to count the number of comment lines in a given C program. Also
eliminate them and copy that program into separate file.

Software Requirements: Linux Operating System


Hardware Requirements: PC / Laptop
Knowledge Requirements: Basic knowledge of Lexical programming
Theory:
Lex is a computer program that generates lexical analyzers and was written by Mike Lesk and
Eric Schmidt.
Lex reads an input stream specifying the lexical analyzer and outputs source code implementing
the lexer in the C programming language.

Code:
a) %{
#include<stdio.h>
%}
digit [0-9]
%option noyywrap
%%
{digit} {printf("%s\n",yytext);}
.|\n ;
%%
int main()
{
yylex();
return 0;
}
Output:
CE442 Design of Language Processors 16CE076

b) %{
#include<stdio.h>
%}
%option noyywrap
%%
"<"[^>]*">" {printf("%s\n",yytext);}
.|\n ;
%%
int main(int argc, char **argv)
{
yyin=fopen(argv[1],"r");
yylex();
fclose(yyin);
return 0;
}

Output:

c) %{
int line_number = 1;
%}
line .*\n
%%
{line} { printf("%10d %s", line_number++, yytext); }
%%
CE442 Design of Language Processors 16CE076

int yywrap(){}

int main(int argc, char*argv[])


{
extern FILE *yyin;

yyin = fopen("test.c","r");
yylex();

return 0;
}

Output:

d) %{
int com=0;
%}
%option noyywrap

%%
"/*"[^\n]+"*/" {com++;fprintf(yyout, " ");}
"//"([a-z]|[0-9]|[A-Z]|" ")* {com++;fprintf(yyout, " ");}
%%
int main(int argc, char **argv)
{
yyin=fopen(argv[1],"r");
yyout=fopen("output", "w");
yylex();
printf("Comment=%d\n",com);
return 0;
}
CE442 Design of Language Processors 16CE076

Output:

TEST
FILE

HTML
FILE

CONCLUSION:
We implemented the above mentioned programs using lex and understood the
working of lexical analyzer.
CE442 Design of Language Processors 16CE076

Practical-5

Aim: Implement a program that specifies the type of the Grammar (Follow the
Chomsky Hierarchy)
Software Requirements: Code block
Hardware Requirements: PC / Laptop
Knowledge Requirements: Basic knowledge of different types of grammer
Theory:
Type 0 : Unrestricted Grammar:

Type-0 grammars include all formal grammars. Type 0 grammar language are recognized by turing
machine. These languages are also known as the Recursively Enumerable languages.

Type 1: Context Sensitive Grammar) :

Type-1 grammars generate the context-sensitive languages. The language generated by the
grammar are recognized by the Linear Bound Automata
In Type 1 ,
I. First of all Type 1 grammar should be Type 0.
II. Grammar Production in the form of

Type 2: Context Free Grammar:

Type-2 grammars generate the context-free languages. The language generated by the grammar
is recognized by a Pushdown automata. Type-2 grammars generate the context-free languages.
In Type 2,
1. First of all it should be Type 1.
2. Left hand side of production can have only one variable.

Type 3: Regular Grammar:

Type-3 grammars generate regular languages. These languages are exactly all
languages that can be accepted by a finite state automaton.
Type 3 is most restricted form of grammar.
Type 3 should be in the given form only :
CE442 Design of Language Processors 16CE076

V –> VT* / T*.


(or)
V –> T*V /T*
Code:
package grammarcheck;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Scanner;
import java.util.Stack;
public class GrammarCheck {
static int flag = 0, flag1 = 0;
static Stack<Character> stack = new Stack<Character>();
public static void main(String[] args) throws Exception {
// TODO code application logic here
Scanner sc = new Scanner(System.in);
String filePath = "Grammar.txt";
fileOperation(filePath);
}
static void fileOperation(String path) throws Exception {
BufferedReader br = new BufferedReader(new FileReader(path));
int flagMultiComment = 0;
String st;
while ((st = br.readLine()) != null) {
System.out.println(st);
stack.clear();
// System.out.println(st);
char stackStore[] = st.toCharArray();
int productionSymbol = 0;
int variableCount = 0;
int terminalCount = 0;
for (int i = 0; i < stackStore.length; i++) {
stack.push(stackStore[i]);
String variableRegex = "[A-Z]{1}";
String terminalRegex = "[a-z]";
String checkVariable = stack.peek().toString();
if (productionSymbol == 0) {
if (checkVariable.matches("-")) {
CE442 Design of Language Processors 16CE076

productionSymbol += 1;
stack.pop();
}
if (!checkVariable.matches("-")) {
if (checkVariable.matches(variableRegex)) {
variableCount += 1;
// System.out.println(variableCount);
}
if (checkVariable.matches(terminalRegex)) {
terminalCount += 1;
}
stack.pop();
}
}
if (productionSymbol == 1 && checkVariable.matches(">")) {
productionSymbol += 1;
stack.pop();
}
}
if (variableCount + terminalCount == 1) {
System.out.println("Type 2 or 3");
//type 2;
//type 3;
}
if (variableCount > 1 || terminalCount > 1 || variableCount + terminalCount > 1) {
// System.out.println("Type 0 or 1");

type0(variableCount, terminalCount);
//type 0;
//type 1;
}
System.out.println("");
}
if (flag1 < flag) {
System.out.println("Type-0 Grammar");
flag1 = 0;
flag = 0;
}
if (flag1 > flag) {
System.out.println("Type-1 Grammar");
flag = 0;
flag1 = 0;
}
CE442 Design of Language Processors 16CE076

static void type0(int variable, int terminal) {

String variableRegex = "[A-Z]";


String terminalRegex = "[a-z]";
int alpha = variable + terminal;
int variableCounter = 0;
int terminalCounter = 0;
int totalVariable = 0;
int totalTerminal = 0;

List<Integer> beta = new ArrayList<Integer>();

int stackSize = stack.size();

for (int i = 0; i < stackSize; i++) {

String topStack = stack.peek().toString();


if (!topStack.equals("|")) {
if (topStack.matches(variableRegex)) {
variableCounter += 1;
stack.pop();
}
if (topStack.matches(terminalRegex)) {
terminalCounter += 1;
stack.pop();
}
if (topStack.matches("~")) {
stack.pop();
}

}
if (topStack.equals("|")) {
stack.pop();
System.out.println("");
beta.add(variableCounter + terminalCounter);
//System.out.print(variableCounter);
//System.out.print(terminalCounter);
variableCounter = 0;
terminalCounter = 0;
}
CE442 Design of Language Processors 16CE076

if (topStack.equals("$")) {
System.out.println("");
beta.add(variableCounter + terminalCounter);
//System.out.print(variableCounter);
//System.out.print(terminalCounter);
}

}
Iterator<Integer> iterator = beta.iterator();
while (iterator.hasNext()) {
Integer beta1 = iterator.next();
if (beta1 >= alpha) {

flag1 += 1;

} else {
flag += 1;
}
}
}
}

Output:

Conclusion: From this practical, we came to know about Chomsky Hierarchy type of the
Grammar.
CE442 Design of Language Processors 16CE076

PRACTICAL-6
AIM: Implement a program that remove left recursion on given grammar.
Software Requirements: Code block
Hardware Requirements: PC / Laptop
Knowledge Requirements: Basic knowledge of Recursion.
Theory:
Left Recursion: A grammar is left recursive if it has a nonterminal A such that there is a
derivation A -> Aα | β where α and β are sequences of terminals and nonterminals that do not start
with A.
While designing a top down-parser, if the left recursion exist in the grammar then the parser falls in
an infinite loop, here because A is trying to match A itself, which is not possible. We can eliminate
the above left recursion by rewriting the offending production. As-

A -> βA'
A' -> αA' | epsilon
Left Factoring: Left factoring is required to eliminate non-determinism of a grammar. Suppose a
grammar, S -> abS | aSb
Here, S is deriving the same terminal a in the production rule(two alternative choices for S), which
follows non-determinism. We can rewrite the production to defer the decision of S as-

S -> aS'
S' -> bS | Sb
Thus, S' can be replaced for bS or Sb

Program:
#include<stdio.h>
#include<string.h>
void main() {
char input[100],*l,*r,*temp,tempprod[20],productions[25][50];
int i=0,j=0,flag=0;
printf("Enter the productions: ");
scanf("%s",input);
l = strtok(input,"->");
r = strtok(NULL,"->");
temp = strtok(r,"|");
CE442 Design of Language Processors 16CE076

while(temp) {
if(temp[0] == l[0]) {
flag = 1;
sprintf(productions[i++],"%s'->%s%s'\0",l,temp+1,l);
}
else
sprintf(productions[i++],"%s->%s%s'\0",l,temp,l);
temp = strtok(NULL,"|");
}
sprintf(productions[i++],"%s->\356\0",l);
if(flag == 0)
printf("The given productions don't have Left Recursion");
else
for(j=0;j<i;j++) {
printf("\n%s",productions[j]);
}
}

OUTPUT:

Conclusion: From this Practical i understood about left recursion.


CE442 Design of Language Processors 16CE076

PRACTICAL-7
AIM: Implement a program that performs left factoring on given grammar.
Software Requirements: Code block
Hardware Requirements: PC / Laptop
Knowledge Requirements: Basic knowledge of Left factoring
Theory:
Left factoring is removing the common left factor that appears in two productions of the same non-
terminal. It is done to avoid back-tracing by the parser.

Consider a part of regular grammar,

E->aE+bcD

E->aE+cBD

Here, grammar is non-left recursive, and unambiguous but there is left factoring.

How to resolve ?

E=aB | aC | aD | ............

then,

E=aX

X=B | C | D |...........

So, the above grammar will be as :

E=aE+X

X=bcD | cBD

Program:
#include<stdio.h>
#include<string.h>
void main() {
char input[100],*l,*r,*temp,tempprod[20],productions[25][50];
int i=0,j=0,flag=0;
printf("Enter the productions: ");
scanf("%s",input);
l = strtok(input,"->");
CE442 Design of Language Processors 16CE076

r = strtok(NULL,"->");
temp = strtok(r,"|");
while(temp) {
if(temp[0] == l[0]) {
flag = 1;
sprintf(productions[i++],"%s'->%s%s'\0",l,temp+1,l);
}
else
sprintf(productions[i++],"%s->%s%s'\0",l,temp,l);
temp = strtok(NULL,"|");
}
sprintf(productions[i++],"%s->\356\0",l);
if(flag == 0)
printf("The given productions don't have Left Recursion");
else
for(j=0;j<i;j++) {
printf("\n%s",productions[j]);
}
}

OUTPUT:

Conclusion: We implemented a program that is left recursion and we removed left recursion
from the production.

You might also like