

Chapter 4 Lexical Analysis

Our review of the underlying cognitive mechanisms involved
in word recognition has covered a wide range of
topics — from the contextual effects phases, to logogens,
to connectionism, to lexical decision task (LDT).

Lexical decision task is a priming task in which a subject
is shown a related word and asked to evaluate quickly
whether a second string of letters makes a legal word
or not.

Robert L. Solso

4.1 Motivation of the Chapter

According to cognitive science, human understanding of language starts with
word recognition. Without this phase, language understanding cannot take place
at all. Similarly, as the first phase of a compiler, the main task of the
lexical analyzer is to read the input characters of the source program, group
them into lexemes, and produce as output a sequence of tokens, one for each
lexeme in the source program. However, before we discuss the lexical analyzer
further, we would like to discuss language understanding in terms of
intelligence. We want to explore why lexical analysis is needed in language
understanding.
Among the various kinds of intelligence that human beings possess, language
ability is no doubt a very important one. People use it to communicate with
each other, to express thoughts and feelings, and to record past, present, and
future things for themselves or for others. Even if oral language is said to be
an ability that other higher animals also possess, written language is
certainly a unique characteristic of mankind.
From the perspective of intelligence, how do human beings produce and
understand language? From the research and discoveries of cognitive scientists
we know that a baby starts learning language by grasping words, or
vocabulary [1]. Only when one has grasped enough words can he/she understand
the real things in his/her surroundings. Some estimates (Baddeley,

Y. Su et al., Principles of Compilers


© Higher Education Press, Beijing and Springer-Verlag Berlin Heidelberg 2011

1990) showed that the number of words a person knows is about 20,000 to
40,000, and that recognition memory is many times that number. With these
words in mind, a person can work out the meaning of a string of words, provided
that he/she also knows the rules by which the words are arranged. Understanding
a language therefore starts from understanding words. Language is composed of
sentences, and each sentence is a string of words arranged according to
established rules. For written language, the hierarchy of a sentence is
lexeme → word or morphology → phrase → sentence. For a sentence expressed in
sound, the hierarchy is phoneme → syllable → sound words → sound sentence.
Each layer is bound by grammar rules. According to modern linguists,
therefore, understanding a language involves five layers: phonetic analysis,
lexical analysis, syntactic analysis, semantic analysis, and pragmatic
analysis. Phonetic analysis means that, according to the phoneme rules, the
individual phonemes are separated one by one from the speech sound stream.
Then, according to the phoneme morphological rules, the syllables and their
corresponding lexemes or words are found one by one. For a sentence of written
language, phonetic analysis is not necessary, because lexical analysis is
carried out directly through reading. When a person reads a language with which
he/she is familiar, the understanding layers are those mentioned above,
excluding phonetic analysis. When one wants to understand oral language,
phonetic analysis must be included. Phonetic analysis is therefore the
essential basis for understanding oral language.
Take English as an example. English has approximately 45 different phonemes.
For example, when you hear someone say “right” and “light”, if you are a
native English speaker, you will have no difficulty distinguishing the
phonemes r and l. But if the native language of the speaker is Japanese, it is
likely that he/she cannot pronounce them distinctly. Because Chinese has many
words with the same pronunciation, a similar situation is likely to arise. Only
when the whole context is analyzed can these words be told apart.
Lexical analysis, therefore, is an essential step in language understanding,
and equally in compilation, where it serves as the basis for understanding
programs. This is why we devote this chapter to it, and why we regard it as the
first step of compilation.
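
To make the compiler-side task described at the start of this section more concrete (read input characters, group them into lexemes, emit one token per lexeme), the following is a minimal sketch in Python. It is not the book's own algorithm. The token classes (NUMBER, IDENT, OP, SKIP), the toy expression language, and the names Token, TOKEN_SPEC, and tokenize are all assumptions chosen only for illustration.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str    # token class, e.g. "NUMBER" (hypothetical class names)
    lexeme: str  # the exact characters grouped from the input


# Hypothetical token classes for a toy expression language (an assumption,
# not taken from the text). Each pair is (class name, regular expression).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),  # whitespace separates lexemes but yields no token
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source: str) -> Iterator[Token]:
    """Scan the source left to right and yield one token per lexeme."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield Token(match.lastgroup, match.group())
```

Under these assumptions, `list(tokenize("x = 42 + y"))` yields five tokens whose lexemes are `x`, `=`, `42`, `+`, and `y`, classified as IDENT, OP, NUMBER, OP, and IDENT respectively. A character that fits no token class, such as `$`, raises a `SyntaxError`, which mirrors the way a reader stumbles over an unrecognizable word.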

4.2 Lexical Analyzer

To discuss the role of the lexical analyzer, we should first discuss the role
of the compiler, since the lexical analyzer is part of it. The role of the
compiler is to compile, or translate, one language into another, usually into
a language executable on a computer. In other words, it compiles or
