NLP Unit I
Natural Language Processing (NLP) is a field of research which determines the way computers can be used
to understand and manage natural language text or speech to do useful things.
Natural Language Processing (NLP) is a subfield of Artificial Intelligence used to narrow
the communication gap between computers and humans. It originated from the idea of Machine
Translation (MT), which came into existence during the 1950s. The primary aim was to translate one human
language into another, for example Russian into English, using computers; later, the idea of
converting human language into computer language and vice versa emerged, making communication with
machines easier.
NLP is a process in which input provided in a human language is converted into a useful form of
representation. The field of NLP is primarily concerned with getting computers to perform interesting and
useful tasks with human languages, and secondarily with helping us come to a better understanding of
human language.
As stated above, the idea emerged from the need for Machine Translation in the 1950s, when the
original languages were English and Russian. Other languages, such as Chinese, also came into
the picture in the early 1960s.
In the 1960s, NLP got a new life when the idea and need of Artificial Intelligence emerged.
In 1978, LUNAR was developed by W. A. Woods; it could analyze, compare and evaluate the chemical data on
lunar rock and soil composition that was accumulating as a result of the Apollo moon missions, and could
answer related questions.
In the 1980s, computational grammar became a very active field of research, linked
with the science of reasoning about meaning and considering the user's beliefs and intentions.
In the 1990s, the pace of growth of NLP increased. Grammars, tools and practical resources
related to NLP became available, along with parsers. Probabilistic and data-driven models had become quite
standard by then.
By 2000, engineers had a large amount of spoken and textual data available for building systems. Today, a
large amount of work is being done in the field of NLP using Machine Learning and Deep Neural Networks,
with which we can build state-of-the-art models for text classification, question answering, sentiment
classification, and so on.
A variety of outputs can be generated by the system. The output from a system that incorporates NLP
might be an answer from a database, a command to change some data in a database, a spoken response,
the semantics, part of speech or morphology of a word, or some other action on the part of the system.
Remember that these are the outputs of the system as a whole, not the outputs of the NLP component of the
system.
Figure 1.2.1 shows a generic NLP system and its input-output variety. Figure 1.2.2 shows a typical view of
what might be inside the NLP box of Figure 1.2.1. Each of the boxes in Figure 1.2.2 represents one of the
types of processing that make up an NLP analysis.
Figure 1.2.1 Generic NLP system.
NLU vs NLG:
• NLU explains the meaning behind written text or speech in natural language; NLG generates natural
language using machines.
• NLU understands human language and converts it into data; NLG uses structured data and generates
meaningful narratives out of it.
The NLP can broadly be divided into various levels as shown in Figure 1.3.1.
Figure 1.3.1 Levels of NLP
1. Phonology:
It is concerned with the interpretation of speech sounds within and across words.
2. Morphology:
It deals with how words are constructed from more basic meaning units called morphemes. A morpheme is
the primitive unit of meaning in a language. For example, “truth+ful+ness”.
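Morpheme segmentation can be sketched with a simple suffix-stripping routine. This is an illustrative toy, not a real morphological analyzer; the suffix list below is an assumption added here for demonstration:

```python
# Illustrative sketch: greedily strip known suffixes to approximate
# morpheme segmentation. The suffix list is a toy assumption, not a
# complete inventory of English morphology.
SUFFIXES = ["ness", "ful", "ly", "ing", "ed", "s"]

def segment(word):
    """Peel recognized suffixes off the end of a word, innermost last."""
    morphemes = []
    changed = True
    while changed:
        changed = False
        for suf in SUFFIXES:
            # Require a reasonably long remaining stem before stripping.
            if word.endswith(suf) and len(word) > len(suf) + 2:
                morphemes.insert(0, suf)
                word = word[: -len(suf)]
                changed = True
                break
    return [word] + morphemes

print(segment("truthfulness"))  # ['truth', 'ful', 'ness']
```

Real systems must also handle spelling changes at morpheme boundaries (e.g. "happy" + "-ness" → "happiness"), which this sketch ignores.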
3. Syntax:
It concerns how words can be put together to form correct sentences and determines what structural role
each word plays in the sentence and what phrases are subparts of other phrases.
For example, “the dog ate my homework”
4. Semantics:
It is the study of the meaning of words and how these meanings combine in sentences to form sentence
meanings. It is the study of context-independent meaning. For example, "plant" may denote an industrial
plant or a living organism.
Pragmatics concerns how sentences are used in different situations and how use affects the interpretation
of the sentence. Discourse context deals with how the immediately preceding sentences affect the
interpretation of the next sentence, for example when interpreting pronouns and the temporal aspects
of the information.
5. Reasoning:
To produce an answer to a question which is not explicitly stored in a database, a Natural Language Interface
to Database (NLIDB) carries out reasoning based on the data stored in the database. For example, consider a
database that holds student academic information, and a user poses a query such as: "Which student is
likely to fail in the Science subject?" To answer the query, the NLIDB needs a domain expert to narrow
down the reasoning process.
1.4 The Study of Language:
Language is studied in several different academic disciplines. Each discipline defines its own set of
problems and has its own methods for addressing them. The linguist, for instance, studies the structure of
language itself, considering questions such as why certain combinations of words form sentences but others
do not, and why a sentence can have some meanings but not others. The psycholinguist, on the other hand,
studies the processes of human language production and comprehension, considering questions such as how
people identify the appropriate structure of a sentence and when they decide on the appropriate meaning for
words. The philosopher considers how words can mean anything at all and how they identify objects in the
world. Philosophers also consider what it means to have beliefs, goals, and intentions, and how these
cognitive capabilities relate to language. The goal of the computational linguist is to develop a
computational theory of language, using the notions of algorithms and data structures from computer science. Of
course, to build a computational model, you must take advantage of what is known from all the other
disciplines.
Table 1.1 summarizes these different approaches to studying language.
a) Machine Translation
Machine translation is used to convert text or speech from one natural language to another. It is an
integral part of Natural Language Processing in which translation is performed from a source language to a
target language while preserving the meaning of the sentence. Example: Google Translate.
b)Information Retrieval
It refers to the human-computer interaction (HCI) that happens when we use a machine to search a body of
information for information objects (content) that match our search query. A person's query is matched
against a set of documents to find a subset of 'relevant' documents. Examples: Google, Yahoo, AltaVista, etc.
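The matching of a query against a document set can be sketched as a toy ranked-retrieval function that scores each document by how many query terms it shares; real engines use far more sophisticated weighting (e.g. TF-IDF), so this is only an illustration:

```python
# Minimal sketch of query-document matching: score each document by the
# number of query terms it contains, then rank by that score.
def retrieve(query, documents):
    q_terms = set(query.lower().split())
    scored = []
    for doc in documents:
        d_terms = set(doc.lower().split())
        score = len(q_terms & d_terms)   # shared terms between query and doc
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]

docs = ["the cat sat on the mat",
        "dogs chase cats in the park",
        "stock prices rose today"]
print(retrieve("cat on mat", docs))  # ['the cat sat on the mat']
```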
c) Text Categorization
Text categorization (also known as text classification or topic spotting) is the task of automatically sorting a
set of documents into predefined categories.
Uses of Text Categorization
• Filtering of content
• Spam filtering
• Identification of document content
• Survey coding
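As a minimal sketch of the idea (a toy rule, not a trained classifier), a document can be assigned to the category whose keyword list it overlaps most; the categories and keyword lists below are illustrative assumptions:

```python
# Toy keyword-based text categorizer: pick the category whose keyword set
# overlaps the document's words the most. Purely illustrative.
CATEGORIES = {
    "sports": {"match", "goal", "team", "score"},
    "finance": {"stock", "market", "price", "profit"},
}

def categorize(text):
    words = set(text.lower().split())
    best, best_overlap = None, 0
    for label, keywords in CATEGORIES.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best, best_overlap = label, overlap
    return best

print(categorize("the team scored a late goal to win the match"))  # sports
```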
d) Information Extraction
Information extraction identifies specific pieces of information in unstructured or semi-structured textual
documents, and transforms unstructured information in a corpus of documents or web pages into a
structured database.
Applied to different types of text:
• Newspaper articles
• Web pages
• Scientific articles
• Newsgroup messages
• Classified ads
• Medical notes
e) Grammar Correction
In word-processor software like MS Word, NLP techniques are widely used for spelling correction and
grammar checking.
f) Sentiment Analysis
Sentiment analysis is also known as opinion mining. It is mainly used on the web to analyse the behaviour,
attitude, and emotional state of the sender. This application is implemented through a combination of NLP
and statistics: values (neutral, positive or negative) are assigned to the text to identify the mood of the
context (sad, happy, angry, etc.).
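The value-assignment idea can be sketched with a tiny lexicon-based scorer; the positive and negative word lists below are toy assumptions, far smaller than the lexicons used in practice:

```python
# Minimal lexicon-based sentiment scorer: sum +1/-1 for words found in
# small positive/negative lists, then map the total to a label.
POSITIVE = {"happy", "good", "great", "love"}
NEGATIVE = {"sad", "bad", "angry", "hate"}

def sentiment(text):
    score = 0
    for word in text.lower().split():
        if word in POSITIVE:
            score += 1
        elif word in NEGATIVE:
            score -= 1
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this great movie"))  # positive
```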
g) Question-Answering Systems
Question answering focuses on building systems that automatically answer questions asked by humans in a
natural language. Such a system presents only the requested information, instead of searching full
documents as a search engine does. The basic idea behind a QA system is that the user just has to ask a
question and the system will retrieve the most appropriate and correct answer for that question.
E.g.
Q. "What is the birthplace of Shree Krishna?"
A. Mathura
h) Spam Detection
Spam detection is used to detect unwanted e-mails before they reach a user's inbox.
i) Chatbot
The chatbot is one of the most important applications of NLP. It is used by many companies to provide
chat-based customer services.
j) Speech Recognition
Speech recognition is used for converting spoken words into text. It is used in applications such as mobile
devices, home automation, video recovery, dictating to Microsoft Word, voice biometrics, voice user
interfaces, and so on.
k) Text Summarization
This task aims to create short summaries of longer documents while retaining the core content and
preserving the overall meaning of the text.
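One simple approach (an assumed illustration, not the only method) is extractive summarization: score each sentence by the frequency of its words across the document and keep the top-scoring sentences. Splitting sentences on '.' is a simplifying assumption here:

```python
# Sketch of frequency-based extractive summarization: rank sentences by the
# document-wide frequency of the words they contain.
from collections import Counter

def summarize(text, n_sentences=1):
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    # Count how often each word appears across the whole document.
    freq = Counter(word for s in sentences for word in s.lower().split())
    def score(sentence):
        return sum(freq[w] for w in sentence.lower().split())
    ranked = sorted(sentences, key=score, reverse=True)
    return ". ".join(ranked[:n_sentences]) + "."

text = "NLP studies language. NLP studies computers and language. The weather is nice."
print(summarize(text))  # NLP studies computers and language.
```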
NLU enables human-computer interaction. It is the comprehension of human language such as English,
Hindi, Telugu, Spanish and French, for example, that allows computers to understand commands without
the formalized syntax of computer languages. NLU also enables computers to communicate back to humans
in their own languages.
The main purpose of NLU is to create chat- and voice-enabled bots that can interact with the public without
supervision. Many major IT companies, such as Amazon, Apple, Google and Microsoft, and startups have
NLU projects underway.
NLU analyzes data to determine its meaning by using algorithms to reduce human speech into a structured
ontology -- a data model consisting of semantics and pragmatics definitions. Two fundamental concepts of
NLU are intent and entity recognition.
Intent recognition is the process of identifying the user's sentiment in input text and determining their
objective. It is the first and most important part of NLU because it establishes the meaning of the text.
Entity recognition is a specific type of NLU that focuses on identifying the entities in a message, then
extracting the most important information about those entities. There are two types of entities: named
entities and numeric entities. Named entities are grouped into categories -- such as people, companies and
locations. Numeric entities are recognized as numbers, currencies and percentages.
For example, a request for an island camping trip on Vancouver Island on Aug. 18 might be broken down
like this: ferry tickets [intent] / need: camping lot reservation [intent] / Vancouver Island [location] / Aug.
18 [date].
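The breakdown above can be sketched with toy pattern rules; the intent keywords and entity regexes below are hypothetical stand-ins for the trained models that real NLU systems use:

```python
# Toy illustration of intent and entity recognition via hand-written rules.
import re

# Hypothetical intent lexicon: any overlap with these words triggers the intent.
INTENT_KEYWORDS = {
    "book_trip": {"camping", "trip", "reserve", "tickets"},
}

def parse_request(text):
    words = set(text.lower().replace(",", "").split())
    intents = [i for i, kw in INTENT_KEYWORDS.items() if words & kw]
    # Crude entity patterns: a capitalized island name and a month-day date.
    location = re.search(r"on ([A-Z][a-z]+ Island)", text)
    date = re.search(r"(Aug\.? \d{1,2})", text)
    return {
        "intents": intents,
        "location": location.group(1) if location else None,
        "date": date.group(1) if date else None,
    }

print(parse_request("Book a camping trip on Vancouver Island on Aug. 18"))
```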
IVR and message routing. Interactive Voice Response (IVR) is used for self-service and call routing.
Early iterations were strictly touchtone and did not involve AI. However, as IVR technology advanced,
features such as NLP and NLU have broadened its capabilities and users can interact with the phone system
via voice. The system processes the user's voice, converts the words to text, and then parses the
grammatical structure of the sentence to determine the probable intent of the caller.
Customer support and service through intelligent personal assistants. NLU is the technology
behind chatbots, computer programs that converse with a human in natural language via text or
voice. Chatbots follow a script and can only answer questions in that script. These intelligent personal
assistants can be a useful addition to customer service. For example, chatbots are used to provide answers to
frequently asked questions. Accomplishing this involves layers of different processes in NLU technology,
such as feature extraction and classification, entity linking and knowledge management.
Machine translation. Machine learning (ML) is a branch of AI that enables computers to learn and change
behavior based on training data. Machine learning algorithms are also used to generate natural language text
from scratch. In the case of translation, a machine learning algorithm analyzes millions of pages of text --
say, contracts or financial documents -- to learn how to translate them into another language. The more
documents it analyzes, the more accurate the translation. For example, if a user is translating data with an
automatic language tool such as a dictionary, it will perform a word-for-word substitution. However, when
using machine translation, it will look up the words in context, which helps return a more accurate
translation.
Data capture. Data capture is the process of gathering and recording information about an object, person or
event. For example, if an e-commerce company used NLU, it could ask customers to enter their shipping
and billing information verbally. The software would understand what the customer meant and enter the
information automatically.
Conversational interfaces. Many voice-activated devices -- including Amazon Alexa and Google Home --
allow users to speak naturally. By using NLU, conversational interfaces can understand and respond to
human language by segmenting words and sentences, recognizing grammar, and using semantic knowledge
to infer intent.
1.7 The Different Levels of Language Analysis
A natural language-system must use considerable knowledge about the structure of the language
itself, including what the words are, how words combine to form sentences, what the words mean, how
word meanings contribute to sentence meanings, and so on. However, we cannot completely account for
linguistic behavior without also taking into account another aspect of what makes humans intelligent —
their general world knowledge and their reasoning abilities. For example, to answer questions or to
participate in a conversation, a person not only must know a lot about the structure of the language being
used, but also must know about the world in general and the conversational setting in particular.
The following are some of the different forms of knowledge relevant for natural language understanding:
Phonetic and phonological knowledge - concerns how words are related to the sounds that realize them.
Such knowledge is crucial for speech-based systems.
Morphological knowledge - concerns how words are constructed from more basic meaning units called
morphemes. A morpheme is the primitive unit of meaning in a language (for example, the meaning of the
word "friendly" is derivable from the meaning of the noun "friend" and the suffix "-ly", which transforms a
noun into an adjective).
Syntactic knowledge - concerns how words can be put together to form correct sentences and determines
what structural role each word plays in the sentence and what phrases are subparts of what other phrases.
Semantic knowledge - concerns what words mean and how these meanings -combine in sentences to form
sentence meanings. This is the study of context-independent meaning - the meaning a sentence has
regardless of the context in which it is used.
Pragmatic knowledge - concerns how sentences are used in different situations and how use affects the
interpretation of the sentence.
Discourse knowledge-concerns how the immediately preceding sentences affect the interpretation of the
next sentence. This information is especially important for interpreting pronouns and for interpreting the
temporal aspects of the information conveyed.
World knowledge - includes the general knowledge about the structure of the world that language users
must have in order to, for example, maintain a conversation. It includes what each language user must know
about the other user’s beliefs and goals.
Ambiguity can occur at all NLP levels. It is a property of linguistic expressions. If an expression
(word/phrase/sentence) has more than one meaning, we refer to it as ambiguous. To represent meaning, we
must have a more precise language. The tools to do this come from mathematics and logic and involve the
use of formally specified representation languages. Formal languages are specified from very simple
building blocks. The most fundamental is the notion of an atomic symbol which is distinguishable from any
other atomic symbol simply based on how it is written. Useful representation languages have the following
two properties:
● The representation must be precise and unambiguous. You should be able to express every distinct
reading of a sentence as a distinct formula in the representation.
● The representation should capture the intuitive structure of the natural language sentences that it
represents. For example, sentences that appear to be structurally similar should have similar structural
representations, and the meanings of two sentences that are paraphrases of each other should be closely
related to each other.
These sentences share certain structural properties. In each, the noun phrases are "John", "Mary", and "the
book", and the act described is some selling action. In other respects, these sentences are significantly
different. For instance, even though both sentences are always either true or false in the exact same
situations, you could only give sentence 1 as an answer to the question "What did John do for Mary?"
Sentence 2 is a much better continuation of a sentence beginning with the phrase "After it fell in the river",
as sentences 3 and 4 show. Following the standard convention in linguistics, this book will use an asterisk
(*) before any example of an ill-formed or questionable sentence.
Most syntactic representations of language are based on the notion of context-free grammars, which
represent sentence structure in terms of what phrases are subparts of other phrases. This information is often
presented in a tree form, such as the one shown in Figure 1.4, which shows two different structures for the
sentence "Rice flies like sand". In the first reading, the sentence is formed from a noun phrase (NP)
describing a type of fly, rice flies, and a verb phrase (VP) that asserts that these flies like sand. In the
second structure, the sentence is formed from a noun phrase describing a type of substance, rice, and a verb
phrase stating that this substance flies like sand (say, if you throw it). The two structures also give further
details on the structure of the noun phrase and verb phrase and identify the part of speech for each word. In
particular, the word "like" is a verb (V) in the first reading and a preposition (P) in the second.
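The two readings can be written down as bracketed trees. The sketch below simply encodes both structures as nested tuples and confirms they cover the same word string; no parsing is performed:

```python
# The two structures for "Rice flies like sand" as bracketed trees.
reading_1 = ("S",
             ("NP", ("N", "Rice"), ("N", "flies")),   # "rice flies" as a noun phrase
             ("VP", ("V", "like"), ("NP", ("N", "sand"))))
reading_2 = ("S",
             ("NP", ("N", "Rice")),
             ("VP", ("V", "flies"),
              ("PP", ("P", "like"), ("NP", ("N", "sand")))))

def leaves(tree):
    """Collect the words at the fringe of a bracketed tree."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words

# Both structures yield the same word string: the sentence is ambiguous.
print(leaves(reading_1) == leaves(reading_2))  # True
```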
4. After it fell in the river, the book was sold to Mary by John.
Many other structural properties can be revealed by considering sentences that are not well-formed.
Sentence 5 is ill-formed because the subject and the verb do not agree in number (the subject is singular and
the verb is plural), while 6 is ill-formed because the verb put requires some modifier that describes where
John put the object.
2) Syntactic Analysis:
Syntactic analysis consists of analysing the words in a sentence for grammar and ordering the words in a way
that shows the relationships among them. For example, a sentence such as "The school goes to boy" is
rejected by the English syntactic analyzer.
3) Semantic Analysis:
Semantic analysis assigns meanings to the structures created by the syntactic analyzer. This component
transfers linear sequences of words into structures and shows how the words are associated with each other.
Semantics focuses only on the literal meaning of words, phrases, and sentences; it draws only the
dictionary meaning, the real meaning, from the given text. The structures assigned by the syntactic
analyzer do not always have meaning.
The text is checked for meaningfulness by mapping the syntactic structures to objects in the task
domain. E.g. "colorless green idea" would be rejected by the semantic analysis, as "colorless green" does
not make any sense.
4) Discourse Integration:
The meaning of any sentence depends upon the meaning of the sentence just before it, and it may also
influence the meaning of the immediately succeeding sentence. For example, in the sentence "He wanted
that", the word "that" depends upon the prior discourse context.
5) Pragmatic Analysis:
Pragmatic analysis is concerned with the overall communicative and social content and its effect on
interpretation. It means abstracting or deriving the meaningful use of language in situations. In this analysis,
what was said is reinterpreted as what was truly meant. It involves deriving those aspects of language which
require real-world knowledge.
E.g., "Close the window" should be interpreted as a request instead of an order.
Morphological Analysis:
Consider an English interface to an operating system where the following sentence is typed: "I want
to print Bill's .init file".
Morphological analysis must do the following things:
• Pull apart the word "Bill's" into the proper noun "Bill" and the possessive suffix "'s".
• Recognize the sequence “.init” as a file extension that is functioning as an adjective in the sentence.
This process will usually assign syntactic categories to all the words in the sentence. Consider the word
“prints”. This word is either a plural noun or a third person singular verb (he prints).
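These morphological steps can be sketched as a toy analyzer; the lexicon entry for "prints" is an assumption added here for illustration:

```python
# Toy morphological analyzer for the steps described above: split a
# possessive, flag a file-extension token, and report lexical ambiguity.
AMBIGUOUS = {"prints": ["plural-noun", "verb-3sg"]}  # illustrative lexicon

def analyze(token):
    if token.endswith("'s"):
        # Separate the root from the possessive suffix.
        return {"root": token[:-2], "suffix": "'s", "role": "possessive"}
    if token.startswith("."):
        # ".init" functions as an adjective modifying "file".
        return {"root": token, "role": "file-extension (adjective use)"}
    if token in AMBIGUOUS:
        return {"root": token, "role": AMBIGUOUS[token]}
    return {"root": token, "role": "unknown"}

print(analyze("Bill's"))
print(analyze(".init"))
print(analyze("prints"))
```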
Syntactic analysis:
This method examines the structure of a sentence and performs a detailed analysis of the sentence and the
semantics of the statement. In order to perform this, the system is expected to have thorough knowledge of
the grammar of the language. The basic unit of any language is the sentence, made up of a group of words
that have their own meanings and are linked together to present an idea or thought. Apart from having
meanings, words fall under categories called parts of speech. In English, there are eight different parts of
speech: nouns, pronouns, adjectives, verbs, adverbs, prepositions, conjunctions and interjections.
In English language, a sentence S is made up of a noun phrase (NP) and a verb phrase (VP), i.e.
S=NP+VP
The noun phrase (NP) normally has an article or determiner (D), an adjective (ADJ) and the
noun (N), i.e.
NP=D+ADJ+N
A noun phrase may also contain a prepositional phrase (PP), which has a preposition (P), a determiner (D)
and the noun (N), i.e.
PP=P+D+N
The verb phrase (VP) has a verb (V) and the object of the verb. The object of the verb may be a noun (N)
with its determiner (D), i.e.
VP=V+D+N
These are some of the rules of English grammar that help one to construct a small parser for NLP.
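A minimal recursive-descent recognizer for a toy version of such rules (S → NP VP, NP → D (ADJ) N, VP → V NP) might look like the sketch below; the small lexicon is an assumption added for illustration:

```python
# Toy recursive-descent recognizer for: S -> NP VP, NP -> D (ADJ) N, VP -> V NP.
LEXICON = {
    "the": "D", "a": "D",
    "big": "ADJ", "hungry": "ADJ",
    "dog": "N", "bone": "N",
    "ate": "V", "chased": "V",
}

def parse_np(tokens, i):
    """NP -> D (ADJ) N. Return the index after the NP, or None on failure."""
    if i < len(tokens) and LEXICON.get(tokens[i]) == "D":
        i += 1
        if i < len(tokens) and LEXICON.get(tokens[i]) == "ADJ":
            i += 1
        if i < len(tokens) and LEXICON.get(tokens[i]) == "N":
            return i + 1
    return None

def parse_vp(tokens, i):
    """VP -> V NP."""
    if i < len(tokens) and LEXICON.get(tokens[i]) == "V":
        return parse_np(tokens, i + 1)
    return None

def parse_s(sentence):
    """S -> NP VP; accept only if every token is consumed."""
    tokens = sentence.lower().split()
    after_np = parse_np(tokens, 0)
    if after_np is None:
        return False
    return parse_vp(tokens, after_np) == len(tokens)

print(parse_s("the hungry dog ate a bone"))  # True
print(parse_s("dog ate bone"))               # False
```

A full parser would build the tree structure rather than just accept or reject, but the control flow mirrors the grammar rules directly.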
Syntactic analysis must exploit the results of morphological analysis to build a structural description of the
sentence. The goal of this process, called parsing, is to convert the flat list of words that forms the sentence
into a structure that defines the units that are represented by that flat list. The important thing here is that a
flat sentence has been converted into a hierarchical structure and that the structures correspond to meaning
units when semantic analysis is performed. Reference markers are shown in the parenthesis in the parse
tree. Each one corresponds to some entity that has been mentioned in the sentence.
Semantic Analysis:
The structures created by the syntactic analyzer are assigned meanings, and a mapping is made between the
syntactic structures and objects in the task domain. Structures for which no such mapping is possible may
be rejected.
Semantic analysis must do two important things: it must map individual words into appropriate objects in
the knowledge base or database, and it must create the correct structures to correspond to the way the
meanings of the individual words combine with each other.
Discourse Integration:
The meaning of an individual sentence may depend on the sentences that precede it, and it may also
influence the meanings of the sentences that follow it. In the example above, we do not know whom the
pronoun "I" or the proper noun "Bill" refers to. Pinning down these references requires an appeal to a model
of the current discourse context, from which we can learn that the current user is USER068 and that the only
person named "Bill" about whom we could be talking is USER073. Once the correct referent for Bill is
known, we can also determine exactly which file is being referred to.
Pragmatic Analysis
The structure representing what was said is reinterpreted to determine what was actually meant. The
final step toward effective understanding is to decide what to do as a result. One possible thing to do is to
record what was said as a fact and be done with it. For some sentences, whose intended effect is clearly
declarative, that is precisely the correct thing to do. But for other sentences, including this one, the intended
effect is different. We can discover this intended effect by applying a set of rules that characterize
cooperative dialogues. The final step in pragmatic processing is to translate from the knowledge-based
representation to a command to be executed by the system.
Results of each of the main processes combine to form a natural language system.
The result of the understanding process is lpr /ali/stuff.init. All of the processes are important in a
complete natural language understanding system. Not all programs are written with exactly these
components; sometimes two or more of them are collapsed. Doing that usually results in a system that is
easier to build for restricted subsets of English but harder to extend to wider coverage.
Words
The Elements of Simple Noun Phrases
Verb Phrases and Simple Sentences
Noun Phrases Revisited
Adjective Phrases
Adverbial Phrases
1. Words
At first glance the most basic unit of linguistic structure appears to be the word. The word, though, is
far from the fundamental element of study in linguistics; it is already the result of a complex set of
more primitive parts. The study of morphology concerns the construction of words from more basic
components corresponding roughly to meaning units. There are two basic ways that new words are
formed, traditionally classified as inflectional forms and derivational forms. Inflectional forms use a
root form of a word and typically add a suffix so that the word appears in the appropriate form given
the sentence. Verbs are the best examples of this in English. Each verb has a basic form that then is
typically changed depending on the subject and the tense of the sentence. For example, the verb sigh
will take suffixes such as -s, -ing, and -ed to create the verb forms sighs, sighing, and sighed,
respectively. These new words are all verbs and share the same basic meaning. Derivational
morphology involves the derivation of new words from other forms. The new words may be in
completely different categories from their subparts. For example, the noun friend is made into the
adjective friendly by adding the suffix -ly. A more complex derivation would allow you to derive the
noun friendliness from the adjective form. There are many interesting issues concerned with how
words are derived and how the choice of word form is affected by the syntactic structure of the
sentence that constrains it.
Traditionally, linguists classify words into different categories based on their uses. Two related areas
of evidence are used to divide words into categories. The first area concerns the word's contribution to
the meaning of the phrase that contains it, and the second area concerns the actual syntactic structures
in which the word may play a role. For example, you might posit the class noun as those words that
can be used to identify the basic type of object, concept, or place being discussed, and adjective
as those words that further qualify the object, concept, or place. Thus green would be an adjective
and book a noun, as shown in the phrases the green book and green books. But things are not so
simple: green might play the role of a noun, as in That green is lighter than the other, and book
might play the role of a modifier, as in the book worm. In fact, most nouns seem to be able to be used
as a modifier in some situations. Perhaps the classes should be combined, since they overlap a great
deal. But other forms of evidence exist. Consider what words could complete the sentence It’s so . . . .
You might say It’s so green, It’s so hot, It’s so true, and so on. Note that although book can be a
modifier in the book worm, you cannot say *It’s so book about anything. Thus there are two classes of
modifiers: adjective modifiers and noun modifiers.
Consider again the case where adjectives can be used as nouns, as in the green. Not all adjectives
can be used in such a way. For example, the noun phrase the hot can be used, given a context
where there are hot and cold plates, in a sentence such as The hot are on the table. But this refers to the
hot plates; it cannot refer to hotness in the way the phrase the green refers to green. With this evidence
you could subdivide adjectives into two subclasses—those that can also be used to describe a concept
or quality directly, and those that cannot. Alternatively, however, you can simply say that green is
ambiguous between being an adjective or a noun and, therefore, falls in both classes. Since green can
behave like any other noun, the second solution seems the most direct.
Using similar arguments, we can identify four main classes of words in English that contribute to
the meaning of sentences. These classes are nouns, adjectives, verbs, and adverbs. Sentences are built
out of phrases centered on these four word classes. Of course, there are many other classes of words
that are necessary to form sentences, such as articles, pronouns, prepositions, particles, quantifiers,
conjunctions, and so on. But these classes are fixed in the sense that new words in these classes are
rarely introduced into the language. New nouns, verbs, adjectives and adverbs, on the other hand, are
regularly introduced into the language as it evolves. As a result, these classes are called the open class
words, and the others are called the closed class words.
A word in any of the four open classes may be used to form the basis of a phrase. This word is
called the head of the phrase and indicates the type of thing, activity, or quality that the phrase
describes. For example, with noun phrases, the head word indicates the general classes of objects
being described. The phrases
the dog
the mangy dog
the mangy dog at the pound
are all noun phrases that describe an object in the class of dogs. The first describes a member from
the class of all dogs, the second an object from the class of mangy dogs, and the third an object
from the class of mangy dogs that are at the pound. The word dog is the head of each of these phrases.
Figure 2.1 Examples of heads and complements (noun phrases, verb phrases, and phrases such as happy that he’d won the prize, intermittently throughout the day, and inside the house).
Excluding pronouns and proper names, the head of a noun phrase is usually a common noun.
Nouns divide into two main classes:
count nouns—nouns that describe specific objects or sets of objects.
mass nouns—nouns that describe composites or substances.
Count nouns acquired their name because they can be counted. There may be one dog or many dogs,
one book or several books, one crowd or several crowds. If a single count noun is used to describe a
whole class of objects, it must be in its plural form. Thus you can say Dogs are friendly but not *Dog
is friendly.
Mass nouns cannot be counted. There may be some water, some wheat, or some sand. If you try
to count with a mass noun, you change the meaning. For example, some wheat refers to a portion of
some quantity of wheat, whereas one wheat is a single type of wheat rather than a single grain of
wheat. A mass noun can be used to describe a whole class of material without using a plural form.
Thus you say Water is necessary for life, not *Waters are necessary for life.
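The count/mass distinction can be encoded as a lexical feature. The following Python sketch (the word lists and the `generic_statement` helper are illustrative, not from the text) shows how the generic form of a noun phrase depends on that feature:

```python
# Small illustrative lexicon: count nouns vs. mass nouns.
COUNT = {"dog", "book", "crowd"}
MASS = {"water", "wheat", "sand"}

def generic_statement(noun):
    """A whole class is named by the plural of a count noun
    (Dogs are friendly) but by the bare singular of a mass noun
    (Water is necessary for life)."""
    if noun in COUNT:
        return noun + "s"   # simplistic pluralization for this sample
    if noun in MASS:
        return noun
    raise KeyError(noun)

print(generic_statement("dog"))    # -> dogs   (Dogs are friendly.)
print(generic_statement("water"))  # -> water  (Water is necessary for life.)
```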
In addition to a head, a noun phrase may contain specifiers and qualifiers preceding the head.
The qualifiers further describe the general class of objects identified by the head, while the specifiers
indicate how many such objects are being described, as well as how the objects being described relate
to the speaker and hearer. Specifiers are constructed out of ordinals (such as first and second),
cardinals (such as one and two), and determiners. Determiners can be subdivided into the following
general classes:
articles—the words the, a, and an.
possessives—noun phrases followed by the suffix ’s, such as John’s and the fat man’s, as well as
possessive pronouns, such as her, my, and whose.
quantifying determiners—words such as some, every, most, no, any, both, and half.
Number      First Person    Second Person    Third Person
singular    I               you              he (masculine)
                                             she (feminine)
                                             it (neuter)
plural      we              you              they
Figure 2.2 The subject pronouns
A simple noun phrase may have at most one determiner, one ordinal, and one cardinal. It is
possible to have all three, as in the first three contestants. An exception to this rule exists with a few
quantifying determiners such as many, few, several, and little. These words can be preceded by an
article, yielding noun phrases such as the few songs we knew. Using this evidence, you could
subcategorize the quantifying determiners into those that allow this and those that don't, but the present
coarse categorization is fine for our purposes at this time.
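The ordering constraint on specifiers (at most one determiner, then at most one ordinal, then at most one cardinal) can be checked mechanically. Below is a minimal Python sketch; the word lists are tiny illustrative samples and the function name `valid_specifiers` is my own:

```python
# Illustrative samples of each specifier category.
DETERMINERS = {"the", "a", "an", "some", "every", "no"}
ORDINALS = {"first", "second", "third", "last"}
CARDINALS = {"one", "two", "three", "four"}

def valid_specifiers(words):
    """True if words follow the pattern: at most one determiner,
    then at most one ordinal, then at most one cardinal."""
    stages = [DETERMINERS, ORDINALS, CARDINALS]
    stage = 0
    for w in words:
        # Skip past categories that cannot (or can no longer) match this word.
        while stage < len(stages) and w not in stages[stage]:
            stage += 1
        if stage == len(stages):
            return False  # word out of order, or its category already used
        stage += 1        # each category may appear at most once
    return True

print(valid_specifiers(["the", "first", "three"]))  # True: the first three contestants
print(valid_specifiers(["first", "the"]))           # False: *first the
print(valid_specifiers(["the", "the"]))             # False: *the the
```

Quantifiers such as many and few, which can follow an article (the few songs we knew), would need a further category in a fuller treatment.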
The qualifiers in a noun phrase occur after the specifiers (if any) and before the head. They consist
of adjectives and nouns being used as modifiers. The following are more precise definitions:
adjectives—words that attribute qualities to objects yet do not refer to the qualities themselves (for
example, angry is an adjective that attributes the quality of anger to something).
noun modifiers—mass or count nouns used to modify another noun, as in the cook book or the ceiling
paint can.
Before moving on to other structures, consider the different inflectional forms that nouns take and
how they are realized in English. Two forms of nouns—the singular and plural forms—have already
been mentioned. Pronouns take forms based on person (first, second, and third) and gender
(masculine, feminine, and neuter). Each of these distinctions reflects a systematic analysis that is
almost wholly explicit in some languages, such as Latin, while implicit in others. In French, for
example, nouns are classified by their gender. In English many of these distinctions are not explicitly
marked except in a few cases. The pronouns provide the best example of this. They distinguish
number, person, gender, and case (that is, whether they are used as possessive, subject, or object), as
shown in Figures 2.2 through 2.4.
Number      First Person    Second Person    Third Person
singular    me              you              him (masculine)
                                             her (feminine)
                                             it (neuter)
plural      us              you              them
Figure 2.3 The object pronouns
Form                 Examples                        Example uses
base                 hit, cry, go, be                I want to go.
simple present       hit, cries, go, am              The dog cries every day. I am thirsty.
simple past          hit, cried, went, was           I was thirsty. I went to the store.
present participle   hitting, crying, going, being   I'm going to the store. Being the last in line aggravates me.
past participle      hit, cried, gone, been          I've been there before. The cake was gone.
Figure 2.6 The five verb forms
A verb phrase consists of auxiliaries and modifiers followed by the head verb and its complements.
Every verb must appear in one of the five possible forms shown in Figure 2.6.
Verbs can be divided into several different classes: the auxiliary verbs, such as be, do, and have;
the modal verbs, such as will, can, and could; and the main verbs, such as eat, run, and believe. The
auxiliary and modal verbs usually take a verb phrase as a complement, which produces a sequence
of verbs, each the head of its own verb phrase. These sequences are used to form sentences with
different tenses.
The tense system identifies when the proposition described in the sentence is said to be true. The
tense system is complex; only the basic forms are outlined in Figure 2.7. In addition, verbs may be in
the progressive tense. Corresponding to the tenses listed in Figure 2.7 are the progressive tenses shown
in Figure 2.8. Each progressive tense is formed by the normal tense construction of the verb be
followed by a present participle.
Verb groups also encode person and number information in the first word in the verb group. The
person and number must agree with the noun phrase that is the subject of the verb phrase. Some verbs
distinguish nearly all the possibilities, but most verbs distinguish only the third person singular (by
adding an -s suffix). Some examples are shown in Figure 2.9.

            First               Second               Third
singular    I am / I walk       you are / you walk   he is / she walks
plural      we are / we walk    you are / you walk   they are / they walk
Figure 2.9 Person and number agreement
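This agreement pattern is easy to state in code. The sketch below (forms and function names are illustrative, covering only the simplified paradigm shown above) conjugates be irregularly and marks only the third person singular for other verbs:

```python
# Irregular paradigm for "be", keyed by (person, number).
BE = {("1", "sg"): "am", ("2", "sg"): "are", ("3", "sg"): "is",
      ("1", "pl"): "are", ("2", "pl"): "are", ("3", "pl"): "are"}

def conjugate(verb, person, number):
    """Pick the agreeing present-tense form: "be" is irregular,
    while most verbs mark only third person singular with -s."""
    if verb == "be":
        return BE[(person, number)]
    if person == "3" and number == "sg":
        return verb + "s"
    return verb

print(conjugate("be", "3", "sg"))    # -> is
print(conjugate("walk", "3", "sg"))  # -> walks
print(conjugate("walk", "1", "pl"))  # -> walk
```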
Transitive verbs generally have corresponding passive forms, in which the object of the active
sentence appears as the subject:
Jack saw the ball.        The ball was seen by Jack.
I will find the clue.     The clue will be found by me.
Jack hit me.              I was hit by Jack.
Some verbs allow two noun phrases to follow them in a sentence; for example, Jack gave Sue a
book or Jack found me a key. In such sentences the second NP corresponds to the object NP outlined
earlier and is sometimes called the direct object. The other NP is called the indirect object.
Generally, such sentences have an equivalent sentence where the indirect object appears with a
preposition, as in Jack gave a book to Sue or Jack found a key for me.
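The alternation between the double-NP form and the prepositional form can be sketched as a simple rewriting step. Whether the verb takes to or for is treated here as a lexical property, encoded in a small illustrative table (`PREP` and the function name are my own):

```python
# Which preposition introduces the indirect object is a fact about the verb.
PREP = {"gave": "to", "found": "for"}

def to_prepositional(verb, indirect, direct):
    """Rewrite 'V indirect direct' as 'V direct to/for indirect'."""
    return f"{verb} {direct} {PREP[verb]} {indirect}"

print(to_prepositional("gave", "Sue", "a book"))  # -> gave a book to Sue
print(to_prepositional("found", "me", "a key"))   # -> found a key for me
```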
Particles
Some verb forms are constructed from a verb and an additional word called a particle. Particles
generally overlap with the class of prepositions considered in the next section. Some examples are up,
out, over, and in. With verbs such as look, take, or put, you can construct many different verbs by
combining the verb with a particle (for example, look up, look out, look over, and so on). In some
sentences the difference between a particle and a preposition results in two different readings for the
same sentence. For example, look over the paper would mean reading the paper, if you consider over a
particle (the verb is look over). In contrast, the same sentence would mean looking at something else
behind or above the paper, if you consider over a preposition (the verb is look).
You can make a sharp distinction between particles and prepositions when the object of the verb
is a pronoun. With a verb-particle sentence, the pronoun must precede the particle, as in I looked it up.
With the prepositional reading, the pronoun follows the preposition, as in I looked up it. Particles
also may follow the object NP. Thus you can say I gave up the game to Mary or I gave the game up to
Mary. This is not allowed with prepositions; for example, you cannot say *I climbed the ladder up.
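The pronoun-position diagnostic can be expressed as a small word-order rule. In this Python sketch (the function name and pronoun list are illustrative) a pronoun object is placed before a particle but after a preposition:

```python
# A few object pronouns, for the diagnostic.
PRONOUNS = {"it", "them", "him", "her", "me", "us", "you"}

def order_object(verb, word, obj, is_particle):
    """Return the grammatical ordering of verb, particle/preposition, and object:
    a pronoun object must precede a particle (I looked it up) but
    follows a preposition (I looked up it)."""
    if is_particle and obj in PRONOUNS:
        return f"{verb} {obj} {word}"
    return f"{verb} {word} {obj}"

print(order_object("looked", "up", "it", is_particle=True))          # -> looked it up
print(order_object("looked", "up", "it", is_particle=False))         # -> looked up it
print(order_object("looked", "up", "the number", is_particle=True))  # -> looked up the number
```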
Clausal Complements
Many verbs allow clauses as complements. Clauses share most of the same properties of sentences
and may have a subject, indicate tense, and occur in passivized forms. One common clause form
consists of a sentence form preceded by the complementizer that, as in that Jack ate the pizza. This
clause will be identified by the expression S[that], indicating a special subclass of S structures. This
clause may appear as the complement of the verb know, as in Sam knows
that Jack ate the pizza. The passive is possible, as in Sam knows that the pizza was eaten by Jack.
Another clause type involves the infinitive form of the verb. The VP[inf] clause is simply a VP
starting in the infinitive form, as in the complement of the verb wish in Jack wishes to eat the pizza.
An infinitive sentence S[inf] form is also possible where the subject is indicated by a for phrase, as in
Jack wishes for Sam to eat the pizza.
Another important class of clauses are sentences with complementizers that are wh-words, such as
who, what, where, why, whether, and how many. These question clauses, S[WH], can be used as a
complement of verbs such as know, as in Sam knows whether we went to the party and The police
know who committed the crime.
modifies book.)
In contrast, a verb like put can take any PP that describes a location, as in Jack put the book in the box.
Jack put the book inside the box. Jack put the book by the door.
To account for this, we allow complement specifications that indicate prepositional phrases with
particular prepositions. Thus the verb give would have a complement of the form NP+PP[to].
Similarly the verb decide would have a complement form NP+PP[about], and the verb blame would
have a complement form NP+PP[on], as in Jack blamed the accident on the police.
Verbs such as put, which take any phrase that can describe a location (complement
NP+Location), are also common in English. While locations are typically prepositional phrases, they
also can be noun phrases, such as home, or particles, such as back or here. A distinction can be made
between phrases that describe locations and phrases that describe a path of motion, although many
location phrases can be interpreted either way. The distinction can be made in some cases, though. For
instance, prepositional phrases beginning with to generally indicate a path of motion. Thus they cannot
be used with a verb such as put that requires a location (for example, *I put the ball to the box). This
distinction will be explored further in Chapter 4.
Figure 2.11 summarizes many of the verb complement structures found in English. A full list
would contain over 40 different forms. Note that while the examples typically use a different verb for
each form, most verbs will allow several different complement structures.

Verb     Complement Structure     Example
laugh    Empty (intransitive)     Jack laughed.
find     NP (transitive)          Jack found a key.
give     NP+NP (bitransitive)     Jack gave Sue the paper.
give     NP+PP[to]                Jack gave the book to the library.
reside   Location phrase          Jack resides in Rochester.
put      NP+Location phrase       Jack put the book inside.
speak    PP[with]+PP[about]       Jack spoke with Sue about the book.
try      VP[to]                   Jack tried to apologize.
tell     NP+VP[to]                Jack told the man to go.
wish     S[to]                    Jack wished for the man to go.
keep     VP[ing]                  Jack keeps hoping for the best.
catch    NP+VP[ing]               Jack caught Sam looking in his desk.
watch    NP+VP[base]              Jack watched Sam eat the pizza.
regret   S[that]                  Jack regretted that he'd eaten the whole thing.
tell     NP+S[that]               Jack told Sue that he was sorry.
seem     ADJP                     Jack seems unhappy in his new job.
think    NP+ADJP                  Jack thinks Sue is happy in her job.
know     S[WH]                    Jack knows where the money is.
Figure 2.11 Some common verb complement structures in English
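A parser can represent such a table directly as a subcategorization lexicon mapping each verb to the set of complement structures it allows. The following sketch uses a few entries in the spirit of Figure 2.11 (the dictionary layout and function name are illustrative choices, not a prescribed format):

```python
# A tiny subcategorization lexicon: verb -> allowed complement structures.
SUBCAT = {
    "laugh": {"Empty"},
    "find": {"NP"},
    "give": {"NP+NP", "NP+PP[to]"},
    "put": {"NP+Location"},
    "know": {"S[WH]", "S[that]"},
}

def allows(verb, frame):
    """True if the lexicon licenses this complement structure for the verb."""
    return frame in SUBCAT.get(verb, set())

print(allows("give", "NP+PP[to]"))  # True:  Jack gave the book to the library.
print(allows("laugh", "NP"))        # False: *Jack laughed a key.
```

A parser consulting this table would reject an analysis that pairs a verb with a complement structure it does not license.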
Many nouns, such as desire, reluctance, and research, take an infinitive VP form as a
complement, as in the noun phrases his desire to release the guinea pig, a reluctance to open the
case again, and the doctor’s research to find a cure for cancer. These nouns, in fact, can also take the
S[inf] form, as in my hope for John to open the case again.
Noun phrases can also be built out of clauses, which were introduced in the last section as the
complements for verbs. For example, a that clause (S[that]) can be used as the subject of a sentence,
as in the sentence That George had the
ring was surprising. Infinitive forms of verb phrases (VP[inf]) and sentences (S[inf]) can also function
as noun phrases, as in the sentences To own a car would be delightful and For us to complete a
project on time would be unprecedented. In addition, the gerundive forms (VP[ing] and S[ing]) can
also function as noun phrases, as in the sentences Giving up the game was unfortunate and John’s
giving up the game caused a riot.
Relative clauses involve sentence forms used as modifiers in noun phrases. These clauses are
often introduced by relative pronouns such as who, which, that, and so on, as in
The man who gave Bill the money . . . The rug that George gave to Ernest . . .
The man whom George gave the money to . . .
In each of these relative clauses, the embedded sentence is the same structure as a regular sentence
except that one noun phrase is missing. If this missing NP is filled in with the NP that the sentence
modifies, the result is a complete sentence that captures the same meaning as what was conveyed by
the relative clause. The missing NPs in the preceding three sentences occur in the subject position, in
the object position, and as object to a preposition, respectively. Deleting the relative pronoun and
filling in the missing NP in each produces the following:
The man gave Bill the money. George gave the rug to Ernest. George gave the money to the man.
As was true earlier, relative clauses can be modified in the same ways as regular sentences. In
particular, passive forms of the preceding sentences would be as follows:
Bill was given the money by the man. The rug was given to Ernest by George.
The money was given to the man by George.
Correspondingly, these sentences could have relative clauses in the passive form as follows:
The man Bill was given the money by . . .
Notice that some relative clauses need not be introduced by a relative pronoun. Often the relative
pronoun can be omitted, producing what is called a base relative clause, as in the NP the man
George gave the money to. Yet another form deletes the relative pronoun and an auxiliary be form,
creating a reduced relative clause, as in the NP the man given the money, which means the same as
the NP the man who was given the money.
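The "missing NP" analysis can be simulated at the string level: inserting the modified noun phrase at the gap position recovers the complete sentence. A minimal sketch follows (the gap position is supplied by hand here, where a real parser would have to find it):

```python
def fill_gap(clause_words, np_words, gap_index):
    """Insert the modified NP at the position of the missing NP
    to reconstruct the complete underlying sentence."""
    return " ".join(clause_words[:gap_index] + np_words + clause_words[gap_index:])

# "The man who gave Bill the money": the gap is in subject position.
print(fill_gap(["gave", "Bill", "the", "money"], ["the", "man"], 0))
# -> the man gave Bill the money

# "The rug that George gave to Ernest": the gap is in object position.
print(fill_gap(["George", "gave", "to", "Ernest"], ["the", "rug"], 2))
# -> George gave the rug to Ernest
```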
2.4 Adjective Phrases
You have already seen simple adjective phrases (ADJPs) consisting of a single adjective in several
examples. More complex adjective phrases are also possible, as adjectives may take many of the
same complement forms that occur with verbs. This includes specific prepositional phrases, as with
the adjective pleased, which takes the complement form PP[with] (for example, Jack was pleased with
the prize), or angry with the complement form PP[at] (for example, Jack was angry at the committee).
Angry also may take an S[that] complement form, as in Jack was angry that he was left behind. Other
adjectives take infinitive forms, such as the adjective willing with the complement form VP[inf], as in
Jack seemed willing to lead the chorus.
These more complex adjective phrases are most commonly found as the complements of verbs
such as be or seem, or following the head in a noun phrase. They generally cannot be used as modifiers
preceding the heads of noun phrases (for example, consider *the angry at the committee man vs. the
angry man vs. the man angry at the committee).
Adjective phrases may also take a degree modifier preceding the head, as in the adjective phrase
very angry or somewhat fond of Mary. More complex degree modifications are possible, as in far too
heavy and much more desperate. Finally, certain constructs have degree modifiers that involve their
own complement forms, as in too stupid to come in out of the rain, so boring that everyone fell asleep,
and as slow as a dead horse.
2.5 Adverbial Phrases
Adverbs may occur in several different positions in sentences: in the sentence initial position (for
example, Then, Jack will open the drawer), in the verb sequence (for example, Jack then will open the
drawer, Jack will then open the drawer), and in the sentence final position (for example, Jack opened
the drawer then). The exact restrictions on what adverb can go where, however, is quite idiosyncratic
to the particular adverb.
In addition to these adverbs, adverbial modifiers can be constructed out of a wide range of
constructs, such as prepositional phrases indicating, among other things, location (for example, in the
box) or manner (for example, in great haste); noun phrases indicating, among other things, frequency
(for example, every day); or clauses indicating, among other things, the time (for example, when the
bomb exploded). Such adverbial phrases, however, usually cannot occur except in the sentence initial
or sentence final position. For example, we can say Every day
John opens his drawer or John opens his drawer every day, but not *John every day opens his drawer.
Because of the wide range of forms, it generally is more useful to consider adverbial phrases
(ADVPs) by function rather than syntactic form. Thus we can consider manner, temporal, duration,
location, degree, and frequency adverbial phrases each as its own form. We considered the location
and degree forms earlier, so here we will consider some of the others.
Temporal adverbials occur in a wide range of forms: adverbial particles (for example, now),
noun phrases (for example, today, yesterday), prepositional phrases (for example, at noon, during the
fight), and clauses (for example, when the clock struck noon, before the fight started).
Frequency adverbials also can occur in a wide range of forms: particles (for example, often), noun
phrases (for example, every day), prepositional phrases (for example, at every party), and clauses
(for example, every time that John comes for a visit).
Duration adverbials appear most commonly as prepositional phrases (for example, for three
hours, about 20 feet) and clauses (for example, until the moon turns blue).
Manner adverbials occur in a wide range of forms, including particles (for example, slowly),
noun phrases (for example, this way), prepositional phrases (for example, in great haste), and
clauses (for example, by holding the embers at the end of a stick).
In the analyses that follow, adverbials will most commonly occur as modifiers of the action or
state described in a sentence. As such, an issue arises as to how to distinguish verb complements from
adverbials. One distinction is that adverbial phrases are always optional. Thus you should be able to
delete the adverbial and still have a sentence with approximately the same meaning (missing,
obviously, the contribution of the adverbial). Consider the sentences
Jack put the box by the door. Jack ate the pizza by the door.
In the first sentence the prepositional phrase is clearly a complement, since deleting it to produce *Jack
put the box results in a nonsensical utterance. On the other hand, deleting the phrase from the second
sentence has only a minor effect: Jack ate the pizza is just a less general assertion about the same
situation described by Jack ate the pizza by the door.
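This deletion test can be phrased as a check against the verb's required complements. The sketch below (the `REQUIRED` table and function name are illustrative) marks a phrase as an adverbial exactly when the sentence remains licensed after deleting it:

```python
# Required complements for a couple of verbs: "put" needs both an object NP
# and a location phrase, while "eat" needs only an object NP.
REQUIRED = {"put": ["NP", "Location"], "eat": ["NP"]}

def still_grammatical(verb, remaining_complements):
    """True if the complements left after deleting a phrase
    still satisfy the verb's requirements."""
    return remaining_complements == REQUIRED.get(verb, [])

# Jack put the box by the door -> delete the PP: *Jack put the box
print(still_grammatical("put", ["NP"]))  # False: the PP was a complement
# Jack ate the pizza by the door -> delete the PP: Jack ate the pizza
print(still_grammatical("eat", ["NP"]))  # True: the PP was an adverbial
```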