Unit II Part of Speech Tagging and Syntactic Parsing
Sentence: "The quick brown fox jumps over the lazy dog."
Tagged Sentence: "The/DT quick/JJ brown/JJ fox/NN jumps/VBZ over/IN the/DT lazy/JJ dog/NN ./."
1. Rule-Based Tagging:
o Uses a set of hand-crafted rules to assign tags. For instance, if a word follows a
determiner, it might be a noun. This approach can be accurate but is labor-
intensive and may not generalize well.
2. Statistical Tagging:
o Uses probabilistic models based on observed frequencies in annotated corpora.
The Hidden Markov Model (HMM) is a common example. It calculates the
probability of a sequence of tags given a sequence of words.
3. Machine Learning Tagging:
o Employs machine learning algorithms like Conditional Random Fields (CRF)
or neural networks to predict POS tags based on training data. These methods
learn patterns from large annotated datasets and tend to be more accurate and
adaptable.
4. Deep Learning Tagging:
o Utilizes advanced neural network architectures, such as Recurrent Neural
Networks (RNNs) and Transformers, to capture complex dependencies in text.
These models often provide state-of-the-art performance in POS tagging tasks.
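For reference, the following minimal sketch shows how a pre-trained statistical tagger can be applied to the example sentence above; it assumes the NLTK library is installed along with its tokenizer and perceptron-tagger data packages.
```python
# Minimal POS-tagging illustration using NLTK's pre-trained perceptron tagger.
# Assumes: pip install nltk, plus the 'punkt' and 'averaged_perceptron_tagger' data packages.
import nltk

nltk.download("punkt", quiet=True)                        # tokenizer models
nltk.download("averaged_perceptron_tagger", quiet=True)   # statistical tagger

sentence = "The quick brown fox jumps over the lazy dog."
tokens = nltk.word_tokenize(sentence)   # ["The", "quick", "brown", ...]
tags = nltk.pos_tag(tokens)             # [("The", "DT"), ("quick", "JJ"), ...]
print(tags)
```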
Practical Application
Named Entity Recognition (NER) is the process of detecting named entities within a text and
categorizing them into predefined classes. These entities typically include persons, organizations,
locations, dates, and other proper names. Common approaches to NER include:
1. Rule-Based Systems:
o Description: Use handcrafted rules and dictionaries to identify named entities.
o Example: Regular expressions might be used to match patterns typical of dates
or names.
o Pros: Can be effective for specific, well-defined tasks.
o Cons: Limited flexibility and scalability; may not handle variations well.
2. Statistical Models:
o Hidden Markov Models (HMM): These models use probabilistic sequences
to recognize patterns and entities based on previous data.
o Conditional Random Fields (CRF): A type of discriminative model that
provides more accurate recognition by considering the context of neighboring
words.
o Pros: Generally more flexible than rule-based systems and can improve with
more data.
o Cons: Require substantial annotated training data and feature engineering.
3. Machine Learning Models:
o Support Vector Machines (SVM): Classify tokens into entity categories using
feature vectors.
o Pros: Effective for various types of text.
o Cons: Performance depends heavily on feature selection; may require extensive
tuning and feature engineering.
4. Deep Learning Models:
o Recurrent Neural Networks (RNNs): Useful for capturing sequential
dependencies in text.
o Long Short-Term Memory Networks (LSTMs): An advanced type of RNN
that handles long-range dependencies better.
o Transformers: Modern architectures like BERT (Bidirectional Encoder
Representations from Transformers) and GPT (Generative Pre-trained
Transformer) capture rich context and provide state-of-the-art NER performance.
NER Process:
Example Sentence: "Barack Obama was born on August 4, 1961, in Honolulu, Hawaii."
1. Tokenization: Breaking down the sentence into individual tokens (words and
punctuation).
o Tokens: ["Barack", "Obama", "was", "born", "on", "August", "4", ",", "1961",
",", "in", "Honolulu", ",", "Hawaii", "."]
2. Entity Recognition:
o "Barack Obama" – Person
o "August 4, 1961" – Date
o "Honolulu" – Location
o "Hawaii" – Location
3. Tagged Sentence:
o "Barack Obama/PERSON was born on August 4, 1961/DATE, in
Honolulu/LOCATION, Hawaii/LOCATION."
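As an illustration, a pre-trained NER pipeline such as spaCy's small English model can produce this kind of tagging automatically (assuming spaCy and the en_core_web_sm model are installed); note that spaCy's own label set uses GPE rather than LOCATION for places.
```python
# NER sketch using spaCy (assumes: pip install spacy && python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Barack Obama was born on August 4, 1961, in Honolulu, Hawaii.")

for ent in doc.ents:
    # ent.label_ is spaCy's entity type, e.g. PERSON, DATE, GPE (geo-political entity)
    print(ent.text, "->", ent.label_)
```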
Challenges in NER
1. Ambiguity: Some words or phrases can be ambiguous and may refer to different
entities based on context.
2. Variability: Entities can be expressed in multiple forms or spellings.
3. Domain-Specific Terms: Different domains may use specialized terminology that
needs tailored NER models.
4. Language and Cultural Differences: NER systems must handle different languages
and cultural contexts effectively.
Applications of NER
CASE-STUDY QUESTION:
(a) Describe how the application will use NER to extract and categorize relevant
information from customer queries and reviews.
(b) Explain the workflow of the application, including the integration of NER for tasks
such as automatically routing queries, summarizing feedback, and enhancing
search functionalities.
(c) Illustrate with examples and diagrams how NER can be applied in these scenarios.
Answer:
Objective: Build a customer support and feedback-analysis application for an e-commerce
platform that uses NER to extract and categorize key information (products, dates,
organizations, locations) from customer queries and reviews.
Workflow
1. Data Collection:
o Source: Collect customer queries and reviews from various channels (e.g.,
email, chat, feedback forms).
2. Preprocessing:
o Tokenization: Break down text into sentences and words.
o Normalization: Convert text to lowercase, remove special characters, etc.
3. Named Entity Recognition (NER):
o Application: Apply an NER model to identify and classify named entities
within the customer queries and reviews.
o Diagram: (not shown)
4. Example Workflow:
o Input Query: "Can I get a refund for my Samsung Galaxy S21 purchased on
July 10, 2024?"
o NER Output:
Samsung Galaxy S21: PRODUCT
July 10, 2024: DATE
5. Query Routing:
o Automatic Routing: Use extracted entities to route queries to appropriate
departments or support agents.
6. Feedback Summarization:
o Summarization: Aggregate and summarize feedback related to specific
products, brands, or services.
o Diagram: (not shown)
7. Enhanced Search Functionality:
o Search Improvement: Enhance search functionality by tagging and indexing
named entities.
o Example: Allow users to search for reviews or support documents by specific
products, brands, or organizations.
Integration Example
Input: "I need assistance with my Dell XPS 13 laptop purchased on January 5, 2024."
NER Output:
Dell XPS 13: PRODUCT
January 5, 2024: DATE
Workflow:
Identification: The system identifies the product as "Dell XPS 13" and the date
"January 5, 2024."
Routing: Routes the query to the Technical Support team for product-related issues
and includes details for a potential warranty check.
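A hedged sketch of how extracted entities could drive this routing step is shown below; the extract_entities helper, the entity labels, and the routing table are illustrative assumptions rather than part of any specific library.
```python
# Illustrative entity-based routing sketch. The NER step is represented by a
# hypothetical extract_entities() helper; in practice it would wrap a trained NER model.
def extract_entities(query: str) -> dict:
    # Placeholder output matching the example above (a real system would call an NER model).
    return {"PRODUCT": "Dell XPS 13", "DATE": "January 5, 2024"}

# Assumed routing rules: which department handles which entity type.
ROUTING_RULES = {
    "PRODUCT": "Technical Support",
    "ORDER_ID": "Order Management",
    "PAYMENT": "Billing",
}

def route_query(query: str) -> str:
    entities = extract_entities(query)
    for entity_type, department in ROUTING_RULES.items():
        if entity_type in entities:
            return department
    return "General Support"

print(route_query("I need assistance with my Dell XPS 13 laptop purchased on January 5, 2024."))
# -> Technical Support
```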
Input Reviews:
"The Samsung Galaxy Note 20 has a fantastic camera but poor battery life."
"Samsung's customer service was responsive but slow."
NER Output:
Samsung Galaxy Note 20: PRODUCT
Samsung: ORGANIZATION
Summary:
Product Feedback: Positive about the camera, negative about battery life.
Service Feedback: Negative about customer service response time.
Conclusion
By incorporating Named Entity Recognition (NER) into the customer support workflow of an
e-commerce platform, the application can efficiently handle and categorize customer
interactions. This enables automated query routing to relevant departments, insightful
summarization of customer feedback, and enhanced search functionalities based on named
entities. This approach improves operational efficiency and customer satisfaction by ensuring
that queries and feedback are accurately addressed and easily accessible.
Conditional Random Fields are a type of undirected graphical model, which provides a way to
encode complex dependencies between labels in a sequence. Unlike traditional models that
might independently predict each label, CRFs consider the entire sequence of labels when
making predictions, allowing them to capture dependencies between neighboring labels.
1. Sequence Data:
o Input Sequence: The sequence of observed data, such as the words in a sentence.
o Output Sequence: The sequence of labels we want to predict, such as POS tags
for each word.
2. Graphical Model:
o CRFs use a graph structure where each node represents a random variable (in
this case, a label for a word), and edges represent dependencies between these
variables.
o For sequence labeling, the graph is typically a linear chain, where each node is
connected to its neighboring nodes.
3. Feature Functions:
o CRFs rely on feature functions that capture various aspects of the input data and
label sequences. These features can include:
The word itself (e.g., "quick")
The previous label (e.g., if the previous label was a noun, the current
word might be more likely to be a verb)
The previous word (e.g., "brown" could influence the tag of "fox")
1. Training:
o Using a labeled corpus, CRFs learn the weights for feature functions. For
example, if "quick" frequently follows "The" and is often tagged as JJ, the CRF
model will assign higher weights to such features.
2. Inference:
o Given a new sentence, the CRF model uses the learned weights to calculate the
probability of different label sequences and selects the most probable sequence.
For instance, it might determine that "quick" is best labeled as JJ based on its
context in the sentence.
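The sketch below illustrates this feature-based training and inference loop using the sklearn-crfsuite library; the toy corpus and the feature template are simplified assumptions made only for illustration.
```python
# CRF POS-tagging sketch using sklearn-crfsuite (pip install sklearn-crfsuite).
# The feature template below is a simplified assumption; real taggers use richer features.
import sklearn_crfsuite

def word2features(sent, i):
    word = sent[i][0]
    return {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "word.isdigit": word.isdigit(),
        "suffix3": word[-3:],
        "prev_word": sent[i - 1][0].lower() if i > 0 else "<BOS>",
        "next_word": sent[i + 1][0].lower() if i < len(sent) - 1 else "<EOS>",
    }

# Tiny toy corpus: each sentence is a list of (word, tag) pairs.
train = [[("The", "DT"), ("quick", "JJ"), ("fox", "NN"), ("jumps", "VBZ")],
         [("The", "DT"), ("lazy", "JJ"), ("dog", "NN"), ("sleeps", "VBZ")]]

X_train = [[word2features(s, i) for i in range(len(s))] for s in train]
y_train = [[tag for _, tag in s] for s in train]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X_train, y_train)

test = [[("The", ""), ("quick", ""), ("dog", ""), ("jumps", "")]]
X_test = [[word2features(s, i) for i in range(len(s))] for s in test]
print(crf.predict(X_test))   # e.g. [["DT", "JJ", "NN", "VBZ"]]
```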
Advantages of CRFs
Challenges of CRFs
Conclusion
Conditional Random Fields are a powerful tool for sequence prediction tasks like POS tagging,
offering a sophisticated approach to capturing dependencies between labels and the context of
observed data. By learning from a combination of local and global features, CRFs can achieve
high accuracy in labeling tasks, making them a popular choice in NLP applications.
To understand Trigram HMMs, it's helpful to first grasp the basics of HMMs:
4.VITERBI ALGORITHM
The Viterbi algorithm is a dynamic programming algorithm used for finding the most
probable sequence of hidden states in Hidden Markov Models (HMMs). It is particularly useful
in problems involving sequence prediction, such as Part-of-Speech (POS) tagging, speech
recognition, and biological sequence analysis.
The Viterbi algorithm calculates the most likely sequence of hidden states (the Viterbi
path) given a sequence of observed events, based on the parameters of the HMM. It efficiently
computes this sequence by using a dynamic programming approach to avoid redundant
calculations.
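A compact sketch of the Viterbi recursion is given below; the tag set and all probabilities are toy values chosen only to make the example runnable.
```python
# Viterbi decoding sketch for a toy HMM POS tagger. All probabilities are illustrative.
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for `words` under the given HMM parameters."""
    V = [{}]      # V[i][tag] = best probability of any path ending in `tag` at position i
    back = [{}]   # backpointers used to recover the best path

    for t in tags:  # initialisation: start probability times emission of the first word
        V[0][t] = start_p.get(t, 0.0) * emit_p.get(t, {}).get(words[0], 1e-8)
        back[0][t] = None

    for i in range(1, len(words)):  # recursion: extend every path by one word
        V.append({})
        back.append({})
        for t in tags:
            prob, prev = max(
                (V[i - 1][pt] * trans_p.get(pt, {}).get(t, 1e-8)
                 * emit_p.get(t, {}).get(words[i], 1e-8), pt)
                for pt in tags
            )
            V[i][t] = prob
            back[i][t] = prev

    # termination: pick the best final tag, then follow backpointers
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.insert(0, back[i][path[0]])
    return path

# Toy parameters (assumed, for illustration only)
tags = ["DT", "NN", "VBZ"]
start_p = {"DT": 0.8, "NN": 0.15, "VBZ": 0.05}
trans_p = {"DT": {"NN": 0.9, "VBZ": 0.05, "DT": 0.05},
           "NN": {"VBZ": 0.8, "NN": 0.1, "DT": 0.1},
           "VBZ": {"DT": 0.6, "NN": 0.3, "VBZ": 0.1}}
emit_p = {"DT": {"the": 0.9}, "NN": {"dog": 0.5, "fox": 0.5}, "VBZ": {"barks": 1.0}}

print(viterbi(["the", "dog", "barks"], tags, start_p, trans_p, emit_p))  # ['DT', 'NN', 'VBZ']
```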
Components of HMMs
To understand the Viterbi algorithm, it's important to know the basic components of a
Hidden Markov Model:
5.SYNTACTIC PARSING
Syntactic parsing, also known as syntactic analysis or parsing, is a process in natural
language processing (NLP) and computational linguistics that involves analyzing the syntactic
structure of a sentence. The goal is to determine how words in a sentence are grouped together
to form phrases and how these phrases relate to each other according to the rules of grammar.
Syntactic parsing is essential for understanding the grammatical structure of sentences, which
is crucial for various NLP tasks such as machine translation, information extraction, and
question answering.
1. Syntax: Syntax refers to the set of rules, principles, and processes that govern the
structure of sentences in a language. It involves the arrangement of words and phrases
to create well-formed sentences.
2. Parse Tree: A parse tree (or syntax tree) is a hierarchical tree structure that represents
the syntactic structure of a sentence. Each node in the tree represents a syntactic
category, such as a phrase or a part of speech, and the edges represent grammatical
relationships between these categories.
3. Constituents: Constituents are the parts of a sentence that function as a single unit
within the syntactic structure. Constituents can be words, phrases, or clauses. For
example, in the sentence "The cat sat on the mat," "The cat" and "on the mat" are
constituents.
4. Grammatical Categories: These are the classes of words and phrases based on their
function and role in a sentence. Common categories include nouns (N), verbs (V),
adjectives (Adj), adverbs (Adv), prepositions (P), and determiners (Det).
1. Constituency Parsing:
o Description: Constituency parsing involves identifying the hierarchical
structure of a sentence based on constituent phrases. It focuses on how words
group together to form phrases and how these phrases are organized in a
hierarchical tree structure.
o Example: For the sentence "The quick brown fox jumps over the lazy dog,"
constituency parsing would break it down into noun phrases (NP) and verb
phrases (VP), and further into their components.
2. Dependency Parsing:
o Description: Dependency parsing focuses on the grammatical relationships
between words in a sentence. It represents the syntactic structure as a set of
directed links (dependencies) between words, where each link indicates a
syntactic relationship.
o Example: In the sentence "She eats an apple," a dependency parser would
identify "eats" as the root word with "She" as its subject and "an apple" as its
object.
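As a quick illustration, a pre-trained dependency parser such as spaCy's (assuming the en_core_web_sm model is installed) produces exactly this kind of head-dependent structure for the example sentence.
```python
# Dependency-parsing sketch with spaCy (assumes en_core_web_sm is installed).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("She eats an apple")

for token in doc:
    # token.dep_ is the dependency relation; token.head is the word it depends on
    print(f"{token.text:<6} --{token.dep_}--> {token.head.text}")
# Expected pattern: "eats" is the root, "She" its nsubj, "apple" its dobj, "an" a det of "apple".
```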
1. Top-Down Parsing:
o Description: Top-down parsing starts from the root of the parse tree and tries
to expand it to match the input sentence. It begins with the highest-level
grammar rules and breaks them down into smaller components.
o Algorithms: Recursive descent parsing is a common top-down approach, where
a set of recursive procedures attempts to match the sentence structure with the
grammar rules.
2. Bottom-Up Parsing:
o Description: Bottom-up parsing starts from the individual words or tokens and
combines them to build up to the full parse tree. It constructs the parse tree
incrementally by merging smaller units into larger units.
o Algorithms: The Earley parser and the Shift-Reduce parser are examples of
bottom-up parsing techniques. The Earley parser is particularly versatile and
can handle a wide range of grammars.
3. Chart Parsing:
o Description: Chart parsing is a dynamic programming approach that uses a
chart to keep track of partially parsed parts of the input. It combines top-down
and bottom-up techniques to efficiently parse sentences.
o Algorithms: The CYK (Cocke-Younger-Kasami) algorithm is a well-known
chart parsing algorithm for context-free grammars. The Earley parser is another
chart-based parsing method.
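To make the idea concrete, the sketch below runs NLTK's chart parser over a small hand-written CFG; the grammar is a toy assumption that only covers the example sentence.
```python
# Chart-parsing sketch with NLTK over a toy context-free grammar.
import nltk

grammar = nltk.CFG.fromstring("""
S  -> NP VP
NP -> Det N | Det Adj N
VP -> V PP | V NP
PP -> P NP
Det -> 'the'
Adj -> 'lazy' | 'quick'
N  -> 'fox' | 'dog'
V  -> 'jumps'
P  -> 'over'
""")

parser = nltk.ChartParser(grammar)
tokens = "the quick fox jumps over the lazy dog".split()

for tree in parser.parse(tokens):
    tree.pretty_print()   # draws the constituency tree as ASCII art
```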
1. Ambiguity: Natural language is often ambiguous, and sentences can have multiple
valid parses. Handling such ambiguities requires sophisticated algorithms and models.
2. Complexity: Parsing complex sentences with long dependencies or nested structures
can be computationally expensive and challenging.
3. Grammar Coverage: Creating comprehensive grammars that cover all possible
sentence structures in a language is difficult, especially for languages with rich syntactic
variations.
Conclusion
A. Top-down parsing
Predictive Parsing:
o Description: A subset of recursive descent parsing that uses lookahead tokens
to make parsing decisions. It’s more efficient when combined with LL(1)
grammars.
o LL(1) Grammars: A CFG is LL(1) if it can be parsed by a top-down parser
with a single lookahead token. LL(1) parsing is efficient and straightforward but
limited to a specific class of grammars.
B. Bottom-up parsing
Shift-Reduce Parsing:
o Description: This approach builds the parse tree from the input tokens by
shifting tokens onto a stack and reducing them to non-terminals based on the
grammar rules.
o Algorithm: The algorithm involves two primary operations—shift (move a
token from the input to the stack) and reduce (replace a sequence on the stack
with a non-terminal).
LR Parsing:
o Description: LR parsing is a bottom-up parsing method that handles a broader
class of grammars, including many ambiguous ones. It constructs the parse tree
by looking ahead and managing shifts and reductions efficiently.
o Types: SLR (Simple LR), LALR (Look-Ahead LR), and canonical LR(1) parsers,
which trade parse-table size against the class of grammars they can handle.
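The sketch below illustrates the bottom-up shift-reduce strategy described in this subsection using NLTK's demonstration ShiftReduceParser; note that this is a teaching implementation that resolves shift/reduce choices greedily, and the toy grammar is an assumption.
```python
# Shift-reduce parsing sketch using NLTK's demonstration parser.
# nltk.ShiftReduceParser is a teaching implementation; it resolves conflicts greedily
# and may fail on grammars that a full LR parser would handle.
import nltk

grammar = nltk.CFG.fromstring("""
S  -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N  -> 'dog' | 'ball'
V  -> 'chases'
""")

parser = nltk.ShiftReduceParser(grammar, trace=2)   # trace=2 prints each shift/reduce step
for tree in parser.parse("the dog chases the ball".split()):
    print(tree)
```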
4. Chart Parsing
Chart Parsing:
o Description: Chart parsing is a general parsing method that uses a chart (or
parse table) to keep track of partially parsed parts of the input. It is often
combined with dynamic programming to handle complex grammars efficiently.
o Algorithm: The chart parsing algorithm combines top-down and bottom-up
approaches and is used in various parsing strategies, including the CYK and
Earley algorithms.
5. Ambiguity Handling
Ambiguity: CFGs can be ambiguous, meaning that a single input string may have multiple
valid parse trees. Efficient parsing must handle ambiguity in a way that can either choose
the most probable parse or enumerate all possible parses.
Approaches:
o Probabilistic Parsing: Use probabilistic CFGs (PCFGs) to assign probabilities
to different parses and select the most likely one.
o Constraint Handling: Use additional constraints or disambiguation rules to
resolve ambiguities.
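As a sketch of probabilistic parsing, the example below defines a toy PCFG in NLTK and uses its ViterbiParser to return only the most probable parse of a classically ambiguous sentence; the grammar and its rule probabilities are illustrative assumptions.
```python
# Probabilistic parsing sketch: a toy PCFG disambiguated with NLTK's ViterbiParser.
import nltk

pcfg = nltk.PCFG.fromstring("""
S   -> NP VP        [1.0]
NP  -> 'I'          [0.4] | Det N [0.4] | NP PP [0.2]
VP  -> V NP         [0.7] | VP PP [0.3]
PP  -> P NP         [1.0]
Det -> 'a'          [1.0]
N   -> 'man'        [0.5] | 'telescope' [0.5]
V   -> 'saw'        [1.0]
P   -> 'with'       [1.0]
""")

parser = nltk.ViterbiParser(pcfg)
# "I saw a man with a telescope" is structurally ambiguous; the parser returns
# the single most probable tree under the grammar's rule probabilities.
for tree in parser.parse("I saw a man with a telescope".split()):
    print(tree)
    print("probability:", tree.prob())
```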
Grammar Simplification: Transforming CFGs into simpler forms, such as Chomsky Normal
Form (CNF), can improve parsing efficiency. However, this may not always be feasible for all grammars.
Parsing Strategies: Hybrid approaches that combine different parsing techniques (e.g.,
combining top-down and bottom-up parsing) can sometimes achieve better
performance.
Conclusion
The table below compares the parsing techniques discussed above.

| Parsing Technique | Description | Types of Grammars | Time Complexity | Space Complexity | Strengths | Weaknesses | Use Cases |
|---|---|---|---|---|---|---|---|
| Recursive Descent Parsing | A top-down approach using recursive procedures to match input with grammar rules. | LL(1) grammars | O(n) | High due to recursion stack | Simple and intuitive implementation; easy to understand. | Struggles with left recursion and ambiguous grammars. | Simple programming languages, educational tools. |
| Predictive Parsing | A subset of recursive descent parsing that uses a lookahead token to make parsing decisions. | LL(1) grammars | O(n) | High due to recursion stack | Efficient for LL(1) grammars; avoids backtracking. | Limited to LL(1) grammars; may require grammar transformation. | Simple languages, some configuration file formats. |
| Shift-Reduce Parsing | A bottom-up approach using shift and reduce operations to construct parse trees. | LR(k), SLR, LALR, LR(1) grammars | O(n) to O(n^3) | Moderate; stack and parse table required | Can handle a broad class of grammars, including ambiguous ones. | More complex to implement and understand; space usage can be high. | Compilers, complex language parsers. |
| LR Parsing | A specific type of shift-reduce parsing with various flavors for different types of grammars. | LR(1), LALR, SLR | O(n) to O(n^3) | Moderate; requires a parse table | Handles a wide range of grammars; good error detection. | Requires large parse tables; more complex to implement. | Programming languages, robust language parsers. |
| CYK Algorithm | A dynamic programming approach for parsing grammars in Chomsky Normal Form (CNF). | Any CFG in CNF | O(n^3 · G), where G is the grammar size | High due to table size | Guarantees parsing of any CFG in CNF; efficient for fixed-length lookahead. | — | — |
| Earley Parser | A dynamic programming parser that can handle all CFGs, including those in CNF. | Any CFG | O(n^3) | Moderate; chart-based | Can handle ambiguous and complex grammars; versatile. | Complexity in implementation; can be slower for very large inputs. | Natural language processing, parsing with rich grammars. |
| Chart Parsing | A general dynamic programming approach using a chart to track partially parsed parts of the input. | Any CFG | Varies; often O(n^3) | Moderate; depends on chart size | Combines top-down and bottom-up methods; handles complex structures. | Space complexity can be high; implementation complexity. | Linguistic analysis, complex syntactic structures. |
8.SEMANTIC PARSING
Semantic parsing is the process of mapping natural language sentences into formal
representations that capture their meaning. Unlike syntactic parsing, which focuses on the
grammatical structure of sentences, semantic parsing aims to understand and represent the
meaning behind the text in a way that is useful for further computational processing, such as
question answering, information retrieval, or automated reasoning.
1. Formal Representation:
o Meaning Representation: The formal structure used to capture the meaning of
a sentence. Common representations include logical forms, semantic graphs,
and structured query languages.
o Logical Forms: These are formal representations of the meaning of a sentence
using logical constructs, such as predicates, functions, and variables.
2. Semantic Ambiguity:
o Lexical Ambiguity: Words with multiple meanings can lead to different
interpretations.
o Syntactic Ambiguity: Different syntactic structures can result in different
meanings.
o Contextual Ambiguity: The meaning of a sentence can depend on the context
in which it is used.
3. Semantic Roles:
o Rolesets: Define the roles that different words or phrases play in the context of
a sentence. For instance, in the sentence "John gave Mary a book," John is the
giver, Mary is the recipient, and the book is the object.
1. Rule-Based Approaches:
o Pattern Matching: Uses predefined patterns to match parts of the text to
semantic roles or structures. For example, a rule might map the phrase "X gave
Y Z" to a logical form where X is the giver, Y is the recipient, and Z is the object.
o Template-Based Parsing: Uses templates to extract information. Templates are
predefined structures that represent common patterns in text.
2. Compositional Semantics:
o Lambda Calculus: A formal system used to represent functions and their
applications. Lambda calculus can be used to build meaning representations by
composing the meanings of individual words or phrases.
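A minimal sketch of this compositional idea is shown below, using Python lambdas to stand in for lambda-calculus terms; the predicate and constant names are illustrative assumptions.
```python
# Compositional-semantics sketch: building a logical form for "John gave Mary a book"
# with Python lambdas standing in for lambda-calculus terms. Predicate names are illustrative.

# Lexical meaning as a curried function (a lambda term):
#   gave  =  λz. λy. λx. give(x, y, z)   (agent, recipient, object)
gave = lambda obj: lambda recipient: lambda agent: f"give({agent}, {recipient}, {obj})"
john, mary, book = "john", "mary", "book"

# Function application mirrors the syntactic composition of the sentence:
vp = gave(book)(mary)        # "gave Mary a book"  ->  λx. give(x, mary, book)
logical_form = vp(john)      # "John gave Mary a book"
print(logical_form)          # give(john, mary, book)
```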
1. Question Answering:
o Semantic parsing enables systems to understand and respond to natural
language questions by mapping questions to formal representations that can be
queried against a knowledge base or database.
2. Information Extraction:
o Extracts structured information from unstructured text. For example, identifying
entities, relationships, and events from news articles.
3. Machine Translation:
o Helps improve translation accuracy by understanding and preserving the
meaning of sentences during translation.
4. Dialogue Systems:
o Powers conversational agents and chatbots by understanding user inputs and
generating appropriate responses based on the parsed meaning.
5. Textual Entailment and Inference:
o Determines if the meaning of one text implies or contradicts the meaning of
another text, which is useful for tasks like document summarization and fact
verification.
1. Ambiguity:
o Handling various types of ambiguities (lexical, syntactic, and contextual) and
providing a single coherent meaning representation can be challenging.
2. Scalability:
o Scaling semantic parsing methods to handle diverse and large-scale text
corpora, especially when dealing with complex and varied sentence structures.
3. Domain Adaptation:
o Adapting semantic parsing systems to different domains or specialized areas
(e.g., medical texts, legal documents) requires domain-specific knowledge and
adjustments.
4. Contextual Understanding:
o Understanding the context and nuances of natural language requires advanced
models and techniques that can grasp subtleties and implied meanings.
Conclusion
Semantic parsing is a crucial component of NLP that bridges the gap between natural
language and formal representations. By translating sentences into meaningful structures,
semantic parsing enables a wide range of applications from question answering to information
extraction. The field encompasses various techniques, from rule-based and compositional
approaches to machine learning and neural network-based methods. Despite the challenges,
advancements in semantic parsing continue to improve the ability of systems to understand and
work with human language effectively.
1. Model Training:
o Training Data: Requires a corpus of text where each word is annotated with its
corresponding POS tag.
o Estimate Probabilities:
Transition Probabilities: Estimate the probability of transitioning from
one POS tag to another. This is calculated from the training data using
counts of tag sequences.
Emission Probabilities: Estimate the probability of a word being
associated with a particular POS tag. This is calculated from the counts
of words given tags in the training data.
Initial Probabilities: Estimate the probability of a POS tag being the
starting tag of a sentence.
2. Tagging Process:
o Input: A sequence of words that need to be tagged.
o Decoding: Use algorithms to find the most likely sequence of POS tags for the
given sequence of words.
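The sketch below illustrates the count-based (maximum-likelihood) estimation described in the Model Training step above, using a toy tagged corpus; the resulting probability tables are exactly what a decoder such as the Viterbi algorithm (Section 4) consumes. No smoothing is applied, which a real tagger would need for unseen words and tags.
```python
# Sketch: maximum-likelihood estimation of HMM parameters from a tiny tagged corpus.
from collections import Counter, defaultdict

corpus = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
          [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")]]

init_counts = Counter()              # C(tag starts a sentence)
trans_counts = defaultdict(Counter)  # C(previous tag -> tag)
emit_counts = defaultdict(Counter)   # C(tag emits word)
tag_counts = Counter()

for sent in corpus:
    init_counts[sent[0][1]] += 1
    for i, (word, tag) in enumerate(sent):
        tag_counts[tag] += 1
        emit_counts[tag][word] += 1
        if i > 0:
            trans_counts[sent[i - 1][1]][tag] += 1

# Relative-frequency estimates (no smoothing)
initial = {t: c / len(corpus) for t, c in init_counts.items()}
transition = {p: {t: c / sum(cs.values()) for t, c in cs.items()} for p, cs in trans_counts.items()}
emission = {t: {w: c / tag_counts[t] for w, c in cs.items()} for t, cs in emit_counts.items()}

print(initial)     # {'DT': 1.0}
print(transition)  # {'DT': {'NN': 1.0}, 'NN': {'VBZ': 1.0}}
print(emission)    # {'DT': {'the': 1.0}, 'NN': {'dog': 0.5, 'cat': 0.5}, ...}
```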
Algorithms: Decoding is typically done with the Viterbi algorithm (see Section 4), which uses
dynamic programming to find the most probable tag sequence efficiently.
Advantages:
Challenges:
Assumption of Independence: HMMs assume that the current tag depends only on the
previous tag (Markov property), which might be limiting in cases where longer-range
dependencies are important.
Data Sparsity: Rare words or unseen tags in the training data can lead to poor estimates
of probabilities. Smoothing techniques and more sophisticated models like CRFs
(Conditional Random Fields) or deep learning approaches can help mitigate this issue.
Key Points:
Goal: To map natural language text to formal meaning representations, such as logical
forms, semantic graphs, or database queries.
Techniques: Can involve rule-based systems, compositional semantics (e.g., lambda
calculus), and machine learning methods (e.g., supervised learning with neural
networks).
Applications: Useful for tasks like question answering, information extraction, and
natural language interfaces to databases.
Example: The question "Show me flights from Boston to New York" can be mapped to a
structured query such as SELECT * FROM flights WHERE origin = 'Boston' AND destination = 'New York'.
Key Points:
Example:
12.SYNTACTIC PARSING
Syntactic Parsing is the process of analyzing a sentence's grammatical structure to
understand its syntax—how words are organized and related to each other according to the
rules of grammar. This process involves constructing a parse tree that represents the syntactic
structure of the sentence.
Key Points:
Example: