
Unit-II

Syntax
• Parsing Natural Languages
• Treebanks: A Data-Driven Approach to Syntax
• Representation of Syntactic Structure
• Parsing Algorithms
• Models for Ambiguity Resolution in Parsing
• Multilingual Issues
Syntax: Introduction
• Parsing uncovers the hidden structure of linguistic input.
• In natural language applications, the predicate-argument structure of sentences can be useful.
• In NLP the syntactic analysis of the input can vary from:
• Very low level: POS tagging.
• Very high level: recovering a structural analysis that identifies the dependency between predicates and arguments in the sentence.
• The major problem in parsing natural language is ambiguity.
Parsing Natural Languages
• Let us look at the following spoken sentences:
• He wanted to go for a drive-in movie.
• He wanted to go for a drive in the country.
• There is a natural pause between “drive” and “in” in the second sentence.
• This gap reflects an underlying hidden structure of the sentence.
• Parsing provides a structural description that identifies such a break in the intonation.
Parsing Natural Languages
• Let us look at another sentence:
• The cat who lives dangerously had nine lives.
• A text-to-speech system needs to know that the first instance of the
word lives is a verb and the second instance a noun.
• This is an instance of the POS tagging problem.
• Another important application where parsing is important is text
summarization.
Parsing Natural Languages
• Let us look at an example for summarization:
• Beyond the basic level, the operations of the three products vary widely.
• The above sentence can be summarized as follows:
• The operations of the products vary.
• To do this, we first parse the original sentence.
Parsing Natural Languages
• Deleting the circled constituents (PP, CD, and ADVP) in the previous diagram results in the shorter sentence.
• Let us look at another example:
• Open borders imply increasing racial fragmentation in EUROPEAN
COUNTRIES.
• In the above example the capitalized phrase can be replaced with
other phrases without changing the meaning of the sentence.
• Open borders imply increasing racial fragmentation in the countries of Europe.
• Open borders imply increasing racial fragmentation in European states.
• Open borders imply increasing racial fragmentation in Europe.
• Open borders imply increasing racial fragmentation in European nations.
• Open borders imply increasing racial fragmentation in the European countries.
Parsing Natural Languages
• In NLP syntactic parsing is used in many applications like:
• Statistical Machine Translation
• Information extraction from text collections
• Language summarization
• Producing entity grids for language generation
• Error correction in text
• Knowledge acquisition from language
Treebanks: A Data-Driven Approach to Syntax
• Parsing recovers information that is not explicit in the input sentence.
• This means parsers require some additional knowledge, beyond the input sentence, in order to produce the syntax analysis as output.
• We can write down the rules of the syntax of a sentence as a CFG.
• Here we have a CFG which represents a simple grammar of transitive verbs in English (verbs that have a subject and an object noun phrase (NP)), plus modifiers of verb phrases (VP) in the form of prepositional phrases (PP).
Treebanks: A Data-Driven Approach to Syntax

The above CFG can produce the syntax analysis of a sentence like:
John bought a shirt with pockets
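The CFG figure itself is not reproduced above, so the following sketch assumes a plausible set of rules of the kind the text describes; it uses NLTK's chart parser to show that the sentence receives two analyses (the PP attached to the object NP or to the VP):

```python
import nltk

# An assumed toy grammar of transitive verbs with PP modifiers,
# approximating the CFG described in the text.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N | N | NP PP | 'John'
VP -> V NP | VP PP
PP -> P NP
Det -> 'a'
N -> 'shirt' | 'pockets'
V -> 'bought'
P -> 'with'
""")

# The chart parser enumerates every parse licensed by the grammar;
# this sentence gets exactly two.
parser = nltk.ChartParser(grammar)
for tree in parser.parse("John bought a shirt with pockets".split()):
    print(tree)
```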
Treebanks: A Data-Driven Approach to Syntax
• Parsing the previous sentence gives us two possible derivations.
Treebanks: A Data-Driven Approach to Syntax
• Writing a CFG for the syntactic analysis of natural language is
problematic.
• A simple list of rules does not consider interactions between different
components in the grammar.
• Listing all possible syntactic constructions in a language is a difficult
task.
• It is difficult to exhaustively list lexical properties of words. This is a
typical knowledge acquisition problem.
• One more problem is that the rules interact with each other in
combinatorially explosive ways.
Treebanks: A Data-Driven Approach to Syntax
• Let us look at an example of noun phrases as a binary branching tree.
• N -> N N (recursive rule)
• N -> 'natural' | 'language' | 'processing' | 'book'
• For the input “natural language processing” the recursive rules produce two ambiguous parses.
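This ambiguity can be reproduced directly; a minimal sketch using NLTK with exactly the two rules above:

```python
import nltk

# The recursive noun-compound grammar from the slide.
grammar = nltk.CFG.fromstring("""
N -> N N
N -> 'natural' | 'language' | 'processing' | 'book'
""")

# Prints the two bracketings of "natural language processing":
#   (natural (language processing)) and ((natural language) processing)
parser = nltk.ChartParser(grammar)
for tree in parser.parse("natural language processing".split()):
    print(tree)
```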
Treebanks: A Data-Driven Approach to Syntax
• For CFGs it can be proved that the number of parses obtained by using the recursive rule n times is the Catalan number of n (1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, …):

Cat(n) = (2n)! / ((n+1)! n!)

• For the input “natural language processing book” only one out of the five parses obtained using the above CFG is correct.
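The counts are easy to verify numerically; note that Cat(3) = 5 matches the five parses of the four-word compound:

```python
from math import comb

def catalan(n):
    # Cat(n) = (2n)! / ((n+1)! * n!): the number of binary trees
    # produced by n applications of the recursive rule N -> N N.
    return comb(2 * n, n) // (n + 1)

print([catalan(n) for n in range(10)])
# [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862]
```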
Treebanks: A Data-Driven Approach to Syntax
• This is the second knowledge acquisition problem: we need to know not only the rules but also which analysis is the most plausible for a given input sentence.
• The construction of a treebank is a data-driven approach to syntax analysis that allows us to address both knowledge acquisition bottlenecks in one stroke.
• A treebank is a collection of sentences where each sentence is provided with a complete syntax analysis.
• The syntax analysis for each sentence has been judged by a human expert.
Treebanks: A Data-Driven Approach to Syntax
• A set of annotation guidelines is written before the annotation process to ensure a consistent scheme of annotation throughout the treebank.
• No set of syntactic rules is provided by a treebank.
• No exhaustive set of rules is assumed to exist, even though assumptions about syntax are implicit in a treebank.
• The consistency of syntax analysis in a treebank is measured using interannotator agreement, by having approximately 10% of the material annotated by more than one annotator.
• Treebanks provide annotations of syntactic structure for a large sample of sentences.
Treebanks: A Data-Driven Approach to Syntax
• A supervised machine learning method can be used to train the parser.
• Treebanks solve the first knowledge acquisition problem of finding the
grammar underlying the syntax analysis because the analysis is directly
given instead of a grammar.
• The second problem of knowledge acquisition is also solved by
treebanks.
• Each sentence in a treebank has been given its most plausible syntactic analysis.
• Supervised learning algorithms can be used to learn a scoring function
over all possible syntax analyses.
Treebanks: A Data-Driven Approach to Syntax
• For new, unseen input the parser uses the scoring function to return the syntax analysis that has the highest score.
• Two main approaches to syntax analysis that are used to construct treebanks are:
• Dependency graphs
• Phrase structure trees
• These two representations are very closely connected to each other and, under some assumptions, one can be converted to the other.
• Dependency analysis is used for free word order languages such as the Indian languages.
• Phrase structure analysis is used to provide additional information about long-distance dependencies for languages like English and French.
Treebanks: A Data-Driven Approach to Syntax
• In the discussion to follow we examine three main components for
building a parser:
• The representation of syntactic structure: this involves the use of a varying amount of linguistic knowledge to build a treebank.
• The training and decoding algorithms: these deal with the potentially exponential search space.
• Methods to model ambiguity: these provide a way to rank parses and recover the most likely parse.
Representation of Syntactic Structure
• Syntax Analysis using dependency graphs
• Syntax Analysis using phrase structure trees
Syntax Analysis using Dependency Graphs
• In dependency graphs the head of a phrase is connected with the
dependents in that phrase using directed connections.
• The head-dependent relationship can be semantic (head-modifier) or
syntactic (head-specifier).
• The main difference between dependency graphs and phrase structure trees is that dependency analysis makes minimal assumptions about syntactic structure.
• Dependency graphs treat the words in the input sentence as the only
vertices in the graph which are linked together by directed arcs
representing syntactic dependencies.
Syntax Analysis using Dependency Graphs
• One typical definition of a dependency graph is as follows:
• In dependency syntactic parsing the task is to derive a syntactic structure for an input sentence by identifying the syntactic head of each word in the sentence.
• The nodes are the words of the input sentence and the arcs are the binary relations from head to dependent.
• It is often assumed that all words except one have a syntactic head.
• This means that the graph will be a tree with the single independent node as the root.
• In labeled dependency parsing the parser assigns a specific type to each dependency relation holding between a head word and a dependent word.
• In the current discussion we consider only dependency trees, where each word depends on exactly one parent: either another word or a dummy root symbol.
Syntax Analysis using Dependency Graphs
• In dependency trees the index 0 is used to indicate the root symbol, and the directed arcs are drawn from the head word to the dependent word.
Syntax Analysis using Dependency Graphs
• In the figure in the previous slide [fakulte,N3,7] is the seventh word in
the sentence with POS tag N3 and it has dative case.
• Here is a textual representation of a labeled dependency tree:
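The figure with the textual representation is not reproduced here; as an illustration (using a made-up English sentence rather than the Turkish example above), a labeled dependency tree can be written down as one record per word, with head index 0 denoting the dummy root:

```python
# (index, word, head_index, label) records for "John bought a shirt";
# each word depends on exactly one parent, and 0 is the dummy root.
tree = [
    (1, "John",   2, "SUBJ"),
    (2, "bought", 0, "ROOT"),
    (3, "a",      4, "DET"),
    (4, "shirt",  2, "OBJ"),
]
```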
Syntax Analysis using Dependency Graphs
• An important notion in dependency analysis is the notion of projectivity.
• A projective dependency tree is one where, if we put the words in a linear order based on the sentence with the root symbol in the first position, the dependency arcs can be drawn above the words without any crossing dependencies.
• Example:
Syntax Analysis using Dependency Graphs
• Let us look at an example where a sentence contains an extraposition to the right of a noun phrase: a modifier phrase that requires a crossing dependency.
Syntax Analysis using Dependency Graphs
• English has very few cases in a treebank that need such a non-projective analysis.
• In languages like Czech, Turkish, and Telugu the number of non-projective dependencies is much higher.
• Let us look at a multilingual comparison of crossing dependencies across a few languages:
• Ar = Arabic; Ba = Basque; Ca = Catalan; Ch = Chinese; Cz = Czech; En = English; Gr = Greek; Hu = Hungarian; It = Italian; Tu = Turkish
Syntax Analysis using Dependency Graphs
• Dependency graphs in treebanks do not explicitly distinguish between
projective and non-projective dependency tree analyses.
• Parsing algorithms are sometimes forced to distinguish between
projective and non-projective dependencies.
• Let us try to set up dependency links in a CFG.
Syntax Analysis using Dependency Graphs

• In the CFG the terminal symbols are x0, x1, x2, x3.
• The asterisk picks out a single symbol in the right-hand side of each rule that specifies the dependency link.
• We can look at the asterisk as either a separate annotation on the non-terminal or simply as a new nonterminal in the probabilistic context-free grammar (PCFG).
Syntax Analysis using Dependency Graphs
• The dependency tree equivalent to the preceding CFG is as follows:
• If we can convert a dependency tree into an equivalent CFG, then the dependency tree must be projective.
Syntax Analysis using Dependency Graphs
• In a CFG converted from a dependency tree we have only the following three types of rules:
• One type of rule to introduce the terminal symbol.
• Two types of rules where Y is dependent on X or vice versa.
• The head word of X or Y can be traced by following the asterisk symbol.
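Schematically (a sketch in the notation of the text, where X, Y, Z stand for nonterminals, x for a word, and the asterisk marks the child that contributes the head):
• X -> x (introduces the terminal symbol x)
• Z -> X* Y (Y depends on X; the head of Z comes from X)
• Z -> X Y* (X depends on Y; the head of Z comes from Y)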
Syntax Analysis using Dependency Graphs
• Now let us look at an example of a non-projective dependency tree:
• When we convert this dependency tree to a CFG with the * notation, it can only capture the fact that x3 depends on x2 or that x1 depends on x3.
Syntax Analysis using Dependency Graphs
• There is no CFG that can capture the non-projective dependency.
• Projective dependency can be defined as follows:
• For each word in the sentence its descendants form a contiguous substring of the
sentence.
• Non-Projectivity can be defined as follows:
• A non-projective dependency means that there is a word in the sentence such
that its descendants do not form a contiguous substring of the sentence.
• In a non-projective dependency tree there is a non-terminal Z such that Z derives the spans (xi, xk) and (xk+p, xj) for some p > 0.
Syntax Analysis using Dependency Graphs
• It means there must be a rule Z -> P Q where P derives (xi, xk) and Q derives (xk+p, xj).
• By the definition of a CFG this is valid only if p = 0, because P and Q must derive contiguous substrings.
• Hence a non-projective dependency tree cannot be converted to a CFG.
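The contiguity definition above can be checked mechanically; here is a minimal sketch, assuming heads[i] gives the head of word i+1 and 0 denotes the root:

```python
def is_projective(heads):
    # heads[i] is the head of word i+1; head 0 is the dummy root.
    n = len(heads)
    children = {i: [] for i in range(n + 1)}
    for dep, head in enumerate(heads, start=1):
        children[head].append(dep)

    def descendants(node):
        out = {node}
        for child in children[node]:
            out |= descendants(child)
        return out

    # Projective iff every word's descendants form a contiguous span.
    for word in range(1, n + 1):
        span = descendants(word)
        if max(span) - min(span) + 1 != len(span):
            return False
    return True

print(is_projective([2, 0, 4, 2]))  # True: every span is contiguous
print(is_projective([3, 0, 2, 2]))  # False: word 1 attaches across word 2
```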
Syntax Analysis using Phrase Structure Trees
• A phrase structure syntax analysis can be viewed as implicitly having a
predicate argument structure associated with it.
• Now let us look at an example sentence:
• Mr. Baker seems especially sensitive
Syntax Analysis using Phrase Structure Trees
• The same sentence gets the following dependency tree analysis.
• We can find some similarity between phrase structure trees and dependency trees.
• We now look at some examples of phrase structure analysis in treebanks and see how null elements are used to localize certain predicate-argument dependencies in the tree structure.
Syntax Analysis using Phrase Structure Trees
• In this example we can see that an NP dominates a trace *T* which is
a null element (same as ϵ).
• The empty trace has an index and is associated with the WHNP (WH-
noun phrase) constituent with the same index.
• This co-indexing allows us to infer the predicate-argument structure
shown in the tree.
Syntax Analysis using Phrase Structure Trees
• In this example “the ball” is actually not the logical subject of the
predicate.
• It has been displaced due to the passive construction.
• “Chris,” the actual subject, is marked as LGS (logical subject in passives), enabling the recovery of the predicate-argument structure for the sentence.
Syntax Analysis using Phrase Structure Trees
• In this example we can see that different syntactic phenomena are
combined in the corpus.
• Both analyses are combined to provide the predicate-argument structure in such cases.
Syntax Analysis using Phrase Structure Trees
• Here we see a pair of examples to show how null elements are used
to annotate the presence of a subject for a predicate even if it is not
explicit in the sentence.
• In the first case the analysis marks the missing subject of “take back” as the object of the verb “persuaded.”
• In the second case the missing subject of “take back” is the subject of the verb “promised.”
Syntax Analysis using Phrase Structure Trees
• The dependency analyses for “persuaded” and “promised” do not make such a distinction.
• The dependency analysis for the two sentences is as follows:
Syntax Analysis using Phrase Structure Trees
• Most statistical parsers trained on phrase structure treebanks ignore these differences.
• Here we look at one example from the Chinese treebank, which uses IP instead of S.
• This is a move from transformational grammar-based phrase structure to government-binding (GB) based phrase structure.
• It is very difficult to take a CFG-based parser initially developed for English parsing and adapt it to Chinese parsing by training it on a Chinese phrase structure treebank.
Parsing Algorithms
• Introduction to Parsing Algorithms
• Shift-Reduce Parsing
• Hypergraphs and Chart Parsing
• Minimum Spanning Trees and Dependency Parsing
Parsing Algorithms
• Given an input sentence a parser produces an output analysis of that
sentence.
• The analysis will be consistent with the treebank used to train the parser.
• Treebanks do not need to have an explicit grammar.
• Here for better understanding we assume the existence of a CFG.
Parsing Algorithms
• Let us look at an example of a CFG that is used to derive strings such as “a and b or c” from the start symbol N.
• An important concept in parsing is the derivation.
• In a derivation each line is called a sentential form.
• The method of derivation used here is called rightmost derivation.
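The grammar figure for this slide is not reproduced here; assuming the toy grammar N -> N 'and' N | N 'or' N | 'a' | 'b' | 'c', one rightmost derivation of “a and b or c” (rewriting the rightmost nonterminal at every step) is:
• N ⇒ N or N ⇒ N or c ⇒ N and N or c ⇒ N and b or c ⇒ a and b or c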
Parsing Algorithms
• An interesting property of rightmost derivation is revealed if we arrange the derivation in reverse order.
• The reversed sequence corresponds to the construction of the corresponding parse tree from left to right, one symbol at a time.
Parsing Algorithms
• There can be many parse trees for the given input string.
• Another rightmost derivation for the given input string is as follows:
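The figure is not reproduced here; under the same assumed toy grammar, the other rightmost derivation, corresponding to the second parse tree, is:
• N ⇒ N and N ⇒ N and N or N ⇒ N and N or c ⇒ N and b or c ⇒ a and b or c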
Shift-Reduce Parsing
• Every CFG has an equivalent automaton called a pushdown automaton.
Shift-Reduce Parsing
• The shift-reduce parsing algorithm is defined as follows:
Shift-Reduce Parsing
• For the example “a and b or c” the parsing steps are as follows:
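The step-by-step figure is not reproduced here; the sketch below assumes the toy grammar N -> N 'and' N | N 'or' N | 'a' | 'b' | 'c' and uses a greedy reduce-first strategy in place of the oracle, printing one line per shift or reduce step:

```python
# Assumed toy grammar: N -> N 'and' N | N 'or' N | 'a' | 'b' | 'c'
RULES = [
    (("N", "and", "N"), "N"),
    (("N", "or", "N"), "N"),
    (("a",), "N"), (("b",), "N"), (("c",), "N"),
]

def shift_reduce(tokens):
    stack, buffer = [], list(tokens)
    while buffer or len(stack) > 1:
        for rhs, lhs in RULES:                   # reduce whenever possible
            if tuple(stack[-len(rhs):]) == rhs:
                del stack[-len(rhs):]
                stack.append(lhs)
                print("reduce ->", stack)
                break
        else:                                    # otherwise shift a token
            stack.append(buffer.pop(0))
            print("shift  ->", stack)
    return stack

shift_reduce("a and b or c".split())  # ends with stack == ['N']
```

Because this strategy always reduces as soon as it can, the run recovers the left-branching parse ((a and b) or c); choosing between the two possible analyses is exactly what the oracle is needed for.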
Shift-Reduce Parsing for Dependency Parsing
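The transition figure for this slide is not reproduced; below is a minimal sketch of the arc-standard variant of shift-reduce parsing for dependencies, in which an oracle chooses among SHIFT, LEFT-ARC, and RIGHT-ARC actions (the sentence and the scripted action sequence are illustrative assumptions):

```python
def arc_standard(n_words, oracle):
    # Arc-standard shift-reduce dependency parsing: 0 is the root,
    # words are numbered 1..n_words; arcs are (head, dependent) pairs.
    stack, buffer, arcs = [0], list(range(1, n_words + 1)), []
    while buffer or len(stack) > 1:
        action = oracle(stack, buffer)
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":    # second-from-top depends on top
            arcs.append((stack[-1], stack.pop(-2)))
        elif action == "RIGHT-ARC":   # top depends on second-from-top
            arcs.append((stack[-2], stack.pop()))
    return arcs

# Scripted "oracle" for "John bought a shirt" (words numbered 1..4):
actions = iter(["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "SHIFT",
                "LEFT-ARC", "RIGHT-ARC", "RIGHT-ARC"])
print(arc_standard(4, lambda stack, buffer: next(actions)))
# [(2, 1), (4, 3), (2, 4), (0, 2)]: bought->John, shirt->a,
# bought->shirt, root->bought
```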
Hypergraphs and Chart Parsing
• Shift-reduce parsing yields a linear-time parser but requires access to an oracle.
• CFGs in the worst case need backtracking and have worst-case parsing algorithms which run in O(n³) time, where n is the size of the input.
• Variants of this algorithm are used in statistical parsers that attempt to search the space of possible parse trees without the limitation of left-to-right parsing.
Hypergraphs and Chart Parsing
• Our example CFG G is rewritten as a new CFG Gc which contains at most two non-terminals on the right-hand side of each rule.
Hypergraphs and Chart Parsing
• We can specialize the CFG Gc to a particular input string by creating a new CFG that represents all possible parse trees that are valid in grammar Gc for this particular input sentence.
• For the input “a and b or c” a new CFG Cf that represents the forest of parse trees can be constructed.
• Let the input string be annotated with span positions: 0 a 1 and 2 b 3 or 4 c 5, where each number marks the position between tokens.
Hypergraphs and Chart Parsing
• Here a parsing algorithm is defined as taking as input a CFG and an input string, and producing a specialized CFG that represents all legal parses of the input.
• The parser has to create all the valid specialized rules, from the start symbol nonterminal that spans the entire string down to the leaf nodes that are the input tokens.
Hypergraphs and Chart Parsing
• Now let us look at the steps the parser has to take to construct a
specialized CFG.
• Let us consider the rules that generate only lexical items:
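The figure listing these rules is not reproduced; in a chart parser they correspond to seeding the chart with the width-one spans. The sketch below is a minimal CYK recognizer, assuming a binarization of the toy grammar (the nonterminal names A, O, and P are illustrative, not taken from the figure):

```python
from collections import defaultdict
from itertools import product

# Assumed binarized grammar Gc: N -> N P, P -> A N | O N,
# with lexical rules N -> a|b|c, A -> and, O -> or.
UNARY = {"a": {"N"}, "b": {"N"}, "c": {"N"}, "and": {"A"}, "or": {"O"}}
BINARY = {("N", "P"): {"N"}, ("A", "N"): {"P"}, ("O", "N"): {"P"}}

def cyk(tokens):
    # chart[(i, k)] holds every nonterminal deriving tokens[i:k];
    # the filled chart is exactly the specialized CFG for this input.
    n = len(tokens)
    chart = defaultdict(set)
    for i, tok in enumerate(tokens):            # width-1 (lexical) spans
        chart[(i, i + 1)] |= UNARY.get(tok, set())
    for width in range(2, n + 1):               # wider spans, bottom-up
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):           # every split point
                for left, right in product(chart[(i, j)], chart[(j, k)]):
                    chart[(i, k)] |= BINARY.get((left, right), set())
    return "N" in chart[(0, n)]                 # start symbol spans all

print(cyk("a and b or c".split()))  # True: the parse forest is non-empty
```

The three nested loops (span width, start position, split point) make the O(n³) running time mentioned earlier explicit.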
