Unit 3 NLP


Transition Network

A transition network, also known as a transition diagram or state diagram, is a graphical representation of states and transitions in a computational process or system. It is widely used in computer science to model the behaviour of finite automata, state machines, and certain types of grammars.

Basic Concepts of Transition Networks

● State: Represents a condition or situation in the computational process.
● Transition: The movement from one state to another, triggered by an input or condition.
● Initial State: The state where the process begins.
● Final/Accepting State: The state where the process ends or accepts the input.

Types of Transition Networks

1. Finite State Machine (FSM):
○ Deterministic Finite Automaton (DFA) - In a DFA, for each state and input symbol, there is exactly one transition to a next state (see the sketch after this list).
○ Non-deterministic Finite Automaton (NFA) - In an NFA, for each state and input symbol, there can be multiple possible transitions.
2. Pushdown Automaton (PDA): A PDA is like an NFA but with an additional stack for storage, useful for recognizing context-free languages.
3. Turing Machine: It consists of an infinite tape, a head for reading and writing, and a set of states and transitions.
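To make the DFA idea concrete, here is a minimal sketch in Python; the states, alphabet, and transition table are illustrative examples, not taken from these notes:

```python
# Minimal DFA sketch: accepts binary strings containing an even number of 1s.
# States ("even"/"odd"), alphabet, and transitions are illustrative.

transitions = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

def dfa_accepts(string, start="even", accepting={"even"}):
    """Run the DFA over the input; accept if it ends in an accepting state."""
    state = start
    for symbol in string:
        state = transitions[(state, symbol)]  # exactly one next state per (state, symbol)
    return state in accepting

print(dfa_accepts("1011"))  # False: three 1s
print(dfa_accepts("1001"))  # True: two 1s
```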

Recursive Transition Networks (RTN)

Definition: A Recursive Transition Network (RTN) is a type of transition network used to describe the syntax of languages, especially programming languages and natural languages. It extends finite state machines by allowing recursion, which means a network can call itself as a subroutine.

Key Features:
● States and Transitions: Like a finite state machine, it has states and
transitions.
● Recursion: A state can transition to a "subnetwork," which is essentially
a call to another part of the network (or itself).

Example: Consider a simple language that describes arithmetic expressions. The RTN can be used to parse expressions like "a + b * c".

● States: S (start), E (expression), T (term), F (factor)
● Transitions:
○ From S, go to E.
○ From E, you can go to T.
○ From T, you can go to F.
○ From F, on encountering 'a', 'b', or 'c', accept and return to T or E.
○ From E, on encountering '+', go back to E (recursive call).
○ From T, on encountering '*', go back to T (recursive call).

This allows handling nested structures like "a + (b * c)".
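One way to see the recursive calls in action is a recursive-descent sketch of the same E/T/F network; the grammar and token set below are illustrative assumptions:

```python
# Recursive-descent sketch of E -> T ('+' T)*, T -> F ('*' F)*, F -> id | '(' E ')'.
# Each function plays the role of a subnetwork that can call itself (recursion).

def parse_expression(tokens, pos=0):
    """E: a term, optionally followed by '+' and another term."""
    pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        pos = parse_term(tokens, pos + 1)
    return pos

def parse_term(tokens, pos):
    """T: a factor, optionally followed by '*' and another factor."""
    pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        pos = parse_factor(tokens, pos + 1)
    return pos

def parse_factor(tokens, pos):
    """F: an identifier, or a parenthesised expression (the recursive call)."""
    if tokens[pos] == "(":
        pos = parse_expression(tokens, pos + 1)
        assert tokens[pos] == ")", "expected ')'"
        return pos + 1
    assert tokens[pos] in {"a", "b", "c"}, "expected an identifier"
    return pos + 1

tokens = ["a", "+", "(", "b", "*", "c", ")"]
assert parse_expression(tokens) == len(tokens)  # parses "a + (b * c)" completely
print("parsed")
```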

Augmented Transition Networks (ATN)

Definition: An Augmented Transition Network (ATN) is an advanced form of transition network used mainly for natural language processing. It extends RTNs by adding features like registers, tests, and actions, making it more powerful and flexible.

Key Features:

● Registers: Storage areas to keep track of information during the parsing process.
● Tests: Conditions that must be met for transitions to occur.
● Actions: Operations performed when transitions occur, such as storing values in registers.

Example: Consider parsing sentences in English. An ATN can be used to parse "The cat sleeps" and handle more complex sentences.

● States: S (start), NP (noun phrase), VP (verb phrase), N (noun), V (verb)
● Transitions:
○ From S, go to NP (noun phrase).
○ From NP, on encountering "the" or "a", move to N (noun).
○ From N, on encountering "cat" or "dog", store the noun in a register.
○ After the NP is complete, go to VP (verb phrase).
○ From VP, go to V (verb).
○ From V, on encountering "sleeps" or "runs", check the register for the noun.

Registers and Actions:

● Store the noun "cat" in a register.
● When encountering "sleeps", check if there is a noun in the register.
● If all conditions are met, accept the sentence.
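The register-and-test idea can be sketched in a few lines of Python; the tiny vocabulary, the SUBJECT register name, and the network structure below are illustrative assumptions, not a full ATN implementation:

```python
# ATN-style sketch: parse "the cat sleeps" with a SUBJECT register,
# a test (the register must be filled), and actions (storing the noun).

DETERMINERS = {"the", "a"}
NOUNS = {"cat", "dog"}
VERBS = {"sleeps", "runs"}

def parse_sentence(words):
    registers = {}                       # storage carried along the parse
    i = 0
    # NP subnetwork: optional determiner, then a noun
    if i < len(words) and words[i] in DETERMINERS:
        i += 1
    if i < len(words) and words[i] in NOUNS:
        registers["SUBJECT"] = words[i]  # action: store the noun
        i += 1
    else:
        return False
    # VP subnetwork: a verb, with a test on the register
    if i < len(words) and words[i] in VERBS:
        if "SUBJECT" not in registers:   # test: a subject must already be stored
            return False
        i += 1
    else:
        return False
    return i == len(words)

print(parse_sentence("the cat sleeps".split()))  # True
print(parse_sentence("sleeps the cat".split()))  # False
```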

Top-Down Parsing Algorithm:

1. Start from the start symbol of the grammar.
2. Predict the next production rule based on the current non-terminal.
3. Match the predicted symbols with the input string from left to right.
4. If a match is found, proceed to the next symbol.
5. If a mismatch occurs, backtrack and try another production rule.
6. Repeat until the entire input string is parsed or no rules are left to try.
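A rough sketch of this predict-match-backtrack loop; the toy grammar below is an assumption for illustration, not from the notes:

```python
# Top-down parse with backtracking over a toy grammar.
# Non-terminals are expanded via predictions; terminals must match the input;
# on a mismatch we backtrack and try the next production.

GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "N": [["cat"], ["dog"]],
    "VP": [["sleeps"], ["runs"]],
}

def parse(symbols, words):
    """Return True if the symbol list can derive exactly the word list."""
    if not symbols:
        return not words                      # both exhausted -> success
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:                      # non-terminal: predict each rule
        return any(parse(expansion + rest, words) for expansion in GRAMMAR[first])
    # terminal: must match the next input word
    return bool(words) and words[0] == first and parse(rest, words[1:])

print(parse(["S"], "the cat sleeps".split()))  # True
print(parse(["S"], "the cat barks".split()))   # False
```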

Bottom-Up Parsing Algorithm:

1. Start with the input string.
2. Identify and reduce the substring that matches the right-hand side of a
production rule.
3. Replace the matched substring with the non-terminal on the left-hand side
of the rule.
4. Repeat the process until the entire input string is reduced to the start
symbol or no more reductions are possible.
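A small shift-reduce sketch of the same idea; the grammar and input are illustrative, and the greedy reduction strategy is a simplification rather than a complete bottom-up parser:

```python
# Bottom-up (shift-reduce) sketch: shift words onto a stack and reduce any
# stack suffix that matches the right-hand side of a rule.

RULES = [
    ("NP", ["the", "N"]),
    ("N", ["cat"]),
    ("VP", ["sleeps"]),
    ("S", ["NP", "VP"]),
]

def shift_reduce(words):
    stack = []
    buffer = list(words)
    while True:
        # reduce greedily while some rule's right-hand side ends the stack
        reduced = True
        while reduced:
            reduced = False
            for lhs, rhs in RULES:
                if stack[-len(rhs):] == rhs:
                    stack[-len(rhs):] = [lhs]  # replace RHS with the non-terminal
                    reduced = True
        if not buffer:
            break
        stack.append(buffer.pop(0))            # shift the next word
    return stack == ["S"]

print(shift_reduce("the cat sleeps".split()))  # True
```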

Feature Systems

Definition: Feature systems are used in computational linguistics and formal grammar to provide additional information about syntactic and semantic properties of words or phrases, enhancing the basic structure provided by traditional grammar rules.

Key Features:

● Attributes: Features are attributes assigned to grammatical units (e.g., number, gender, tense).
● Values: Each feature has a set of possible values (e.g., number can be
singular or plural).
● Feature Structures: Collections of features that describe properties of a
linguistic unit (e.g., a noun phrase).

Example: Consider the sentence "The cat sees the dogs."

● Noun Phrase (NP): [Det: 'the', N: 'cat', Number: singular]
● Verb Phrase (VP): [V: 'sees', Number: singular]
● Noun Phrase (NP): [Det: 'the', N: 'dogs', Number: plural]

Usage: Feature systems ensure agreement and compatibility within the grammar, such as subject-verb agreement in number.
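A minimal sketch of feature structures as dictionaries with a simple agreement check; the feature names and values follow the example above, while the agree helper is an illustrative assumption, not a standard API:

```python
# Feature structures as plain dictionaries, with a number-agreement check.

np_cat  = {"Det": "the", "N": "cat",  "Number": "singular"}
vp_sees = {"V": "sees",  "Number": "singular"}
np_dogs = {"Det": "the", "N": "dogs", "Number": "plural"}

def agree(a, b, feature):
    """Two structures agree on a feature if neither contradicts the other."""
    return feature not in a or feature not in b or a[feature] == b[feature]

print(agree(np_cat, vp_sees, "Number"))   # True: singular matches singular
print(agree(np_dogs, vp_sees, "Number"))  # False: plural vs singular
```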

Augmented Grammar

Definition: Augmented grammars extend traditional grammars by incorporating additional mechanisms like semantic actions, feature structures, and conditions to handle more complex syntactic and semantic relationships.

Key Features:

● Semantic Actions: Actions or rules that are executed during parsing to manage semantic information.
● Conditions: Logical tests that determine whether a rule can be applied
based on the current parsing context.
● Feature Structures: Integration of feature systems to enhance syntactic
rules with additional attributes.

Example: Consider a simple augmented grammar for a sentence structure:

● Rule: S → NP VP
○ Semantic Action: Ensure the number feature of NP matches VP.
● Rule: NP → Det N
○ Condition: The features of Det and N must be compatible.
● Rule: VP → V NP
○ Semantic Action: Ensure the verb agrees with the subject in
number.
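A rough sketch of how a rule can carry such a condition; the rule representation and the number_agrees check are assumptions made for illustration:

```python
# Augmented-rule sketch: a plain production paired with a condition that must
# hold over the children's feature structures before the rule can apply.

def number_agrees(np, vp):
    """Condition for S -> NP VP: subject and verb share the Number feature."""
    return np.get("Number") == vp.get("Number")

rule_s = {
    "lhs": "S",
    "rhs": ["NP", "VP"],
    "condition": number_agrees,
}

np = {"N": "cat", "Number": "singular"}
vp = {"V": "sees", "Number": "singular"}

if rule_s["condition"](np, vp):
    print("S -> NP VP applies: features are compatible")
else:
    print("rule blocked: number mismatch")
```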

Basic Feature System for English

Definition: A feature system for English assigns specific attributes and values
to words and phrases to capture grammatical properties and ensure proper
syntactic and semantic agreement.

Key Features:

● Number:
○ Values: Singular, Plural
○ Example: "cat" (singular), "cats" (plural)
● Person:
○ Values: First, Second, Third
○ Example: "I" (first), "you" (second), "he" (third)
● Gender:
○ Values: Masculine, Feminine, Neuter
○ Example: "he" (masculine), "she" (feminine), "it" (neuter)
● Case:
○ Values: Nominative, Accusative, Genitive
○ Example: "he" (nominative), "him" (accusative), "his" (genitive)
● Tense:
○ Values: Present, Past, Future
○ Example: "sees" (present), "saw" (past), "will see" (future)
● Aspect:
○ Values: Simple, Progressive, Perfect
○ Example: "sees" (simple), "is seeing" (progressive), "has seen"
(perfect)
● Voice:
○ Values: Active, Passive
○ Example: "sees" (active), "is seen" (passive)
● Mood:
○ Values: Indicative, Imperative, Subjunctive
○ Example: "sees" (indicative), "see!" (imperative), "were" in "If I
were" (subjunctive)

Morphological Analysis

Definition: Morphological analysis is the process of studying the structure, formation, and classification of words in a language by breaking them down into their smallest meaningful units, known as morphemes.

Key Points:

● Morpheme:
○ Definition: The smallest unit of meaning in a language.
○ Types: Free morphemes (standalone words) and bound morphemes
(prefixes, suffixes).
○ Example: "unhappiness" (un- (prefix), happy (root), -ness (suffix))
● Types of Morphological Processes:
○ Inflection:
■ Changes the form of a word to express different grammatical
features.
■ Example: "walk" → "walks" (present tense), "walked" (past
tense)
○ Derivation:
■ Creates new words by adding prefixes or suffixes.
■ Example: "happy" → "unhappy" (prefix), "happiness"
(suffix)
○ Compounding:
■ Combines two or more free morphemes to create a new
word.
■ Example: "notebook" (note + book), "blackboard" (black +
board)
● Steps in Morphological Analysis:
○ Segmentation: Breaking down words into morphemes.
○ Classification: Identifying the types and roles of morphemes.
○ Reassembly: Combining morphemes to understand the word's
structure and meaning.
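A toy segmentation sketch following the steps above; the affix lists are illustrative assumptions and the stemming is deliberately naive:

```python
# Naive morpheme segmentation sketch: strip a known prefix and suffix,
# then treat what remains as the root.

PREFIXES = ["un", "re", "dis"]
SUFFIXES = ["ness", "ing", "ed", "s"]

def segment(word):
    morphemes = []
    # segmentation step 1: peel off a known prefix
    for p in PREFIXES:
        if word.startswith(p):
            morphemes.append(p + "-")
            word = word[len(p):]
            break
    # segmentation step 2: peel off a known suffix
    suffix = None
    for s in SUFFIXES:
        if word.endswith(s):
            suffix = "-" + s
            word = word[:-len(s)]
            break
    morphemes.append(word)  # classification: what remains is treated as the root
    if suffix:
        morphemes.append(suffix)
    return morphemes

print(segment("unhappiness"))  # ['un-', 'happi', '-ness']
```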

Lexicon

A lexicon is a collection or database of words and their meanings, including information about their usage, grammatical properties, and relationships with other words. In computational linguistics, a lexicon is an essential component for tasks such as parsing, translation, and natural language processing.

Key Components:

● Entries: Individual words or phrases.
● Lemma: The base form of a word.
○ Example: "run" is the lemma for "running," "ran," "runs."
● Part of Speech (POS): Grammatical category of the word.
○ Example: Noun, verb, adjective, adverb.
● Morphological Information: Details about the word’s inflectional
forms.
○ Example: Singular/plural forms, past/present tenses.
● Syntactic Information: How the word fits into sentences.
○ Example: Subcategorization frames (e.g., verb "give" requires a
subject, an object, and an indirect object).
● Semantic Information: Meaning of the word and its relationships to
other words.
○ Example: Synonyms, antonyms, hyponyms.

Example: Consider the word "run" in a lexicon entry:

● Lemma: run
● POS: Verb
● Morphological Information: runs, ran, running
● Syntactic Information: [subject NP, object NP]
● Semantic Information:
○ Definition: Move at a speed faster than a walk.
○ Synonyms: sprint, jog
○ Antonyms: walk, stand
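The entry for "run" maps naturally onto a small dictionary structure; this is a sketch whose field names simply mirror the components listed above:

```python
# Lexicon entry sketch for "run", mirroring the components above.
lexicon = {
    "run": {
        "lemma": "run",
        "pos": "verb",
        "forms": ["runs", "ran", "running"],     # morphological information
        "subcat": ["subject NP", "object NP"],   # syntactic information
        "definition": "Move at a speed faster than a walk.",
        "synonyms": ["sprint", "jog"],
        "antonyms": ["walk", "stand"],
    }
}

entry = lexicon["run"]
print(entry["pos"], entry["forms"])  # verb ['runs', 'ran', 'running']
```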
