Semantic Analysis: 2018/19 Sem I

• Semantic Analysis involves the extraction of context-independent aspects of a sentence's
meaning, including the semantic roles of entities mentioned in the sentence and
quantification information such as cardinality, iteration, and dependency.

[Pipeline diagram: Syntactic Analysis → parse tree → Semantic Analysis → context-independent meaning + other relevant information]

• Semantic Analysis is an important component for many NLP applications.


• The culture of a society has an impact on semantic analysis.
How do we represent the meanings of tella, injera, besso, tej, teff, etc., so as to
translate them into a foreign language?

• There is a close link between the life of a society and the lexicon of the language it
speaks.
For example:
words for ice and snow
politeness expressed in Amharic, but not in English
• This theory, the Natural Semantic Metalanguage (NSM), is based on the notion of
semantic primitives, which are claimed to have the following properties:
Universally meaningful concepts
Do not need definition in terms of other words
Represent basic concepts
Reduce a language to a core that enables the full development of the language
Every language has essentially the same NSM, though the syntax may differ.
Substantives: I, YOU, SOMEONE, PEOPLE, SOMETHING/THING, BODY
Determiners: THIS, THE SAME, OTHER
Quantifiers: ONE, TWO, SOME, ALL, MUCH/MANY
Evaluators: GOOD, BAD
Descriptors: BIG, SMALL
Mental predicates: THINK, KNOW, WANT, FEEL, SEE, HEAR
Speech: SAY, WORDS, TRUE
Actions, events and movement: DO, HAPPEN, MOVE
Existence and possession: THERE IS, HAVE
Life and death: LIVE, DIE
Time: WHEN/TIME, NOW, BEFORE, AFTER, A LONG TIME, A SHORT TIME, FOR SOME TIME
Space: WHERE/PLACE, HERE, ABOVE, BELOW, FAR, NEAR, SIDE, INSIDE
Logical concepts: NOT, MAYBE, CAN, BECAUSE, IF
Intensifier, augmentor: VERY, MORE
Taxonomy, partonomy: KIND OF, PART OF
Similarity: LIKE

How do we then define LIE?


LIE: “What a person does when he says something not true because he wants someone
to think it true”
• A semantic role is the underlying relationship that a participant has with the main verb
in a clause.
Semantic roles are identified from the grammatical relations.

• Semantic roles are discussed at three different levels of generality.


Verb-Specific Semantic Roles
E.g., runner, killer, hearer, broken, etc.
Thematic Relations, which are generalizations across the verb-specific roles
E.g., agent, instrument, experiencer, theme, patient
Generalized Semantic Roles, which are generalizations across thematic relations.
Two arguments are used: actor and undergoer.
Actor is a generalization across agent, experiencer, instrument and other
roles.
Undergoer is a generalization subsuming patient, theme, recipient and
other roles.
Verb-Specific Semantic Roles → Thematic Relations → Generalized Semantic Roles

Thinker, Believer, Knower, Presumer → Cognizer → Actor
Hearer, Smeller, Feeler, Taster → Perceiver → Actor
Liker, Lover, Hater → Emoter → Actor
Giver, Runner, Killer, Speaker, Dancer → Agent → Actor
Located, Moved, Given → Theme → Undergoer
Broken, Destroyed, Killed → Patient → Undergoer
Given to, Sent to, Handed to → Recipient → Undergoer

(Cognizer, Perceiver, and Emoter are subtypes of Experiencer; actor and undergoer in
turn map onto grammatical relations. Increasing generalization and increasing
neutralization of semantic contrasts from left to right.)


• The notions of actor and undergoer capture generalizations across verb types that are
expressed by underlying grammatical relations.
• Actor is distinct from subject, and undergoer is distinct from direct object.
• The syntactic subject can be either an actor (active voice) or an undergoer (passive
voice).
• In clauses with intransitive verbs, the syntactic subject may be either an actor or an
undergoer, depending upon the class of the verb.
• Examples:
Active voice: “Abebe killed the lion.”
“Abebe” is the syntactic subject and “the lion” is syntactic direct object.
“Abebe” is the actor and “the lion” is the undergoer.
Passive voice: “The lion was killed by Abebe.”
“The lion” is the syntactic subject and “Abebe” is an oblique object (inside the by-phrase).
“Abebe” is still the actor and “the lion” is still the undergoer.
• What requirements do we have for meaning representations?
Verifiability: The system should allow us to compare representations to facts in a
Knowledge Base (KB).
Lack of Ambiguity: The system should allow us to represent meanings unambiguously.
E.g., the Amharic word for “teachers” has at least two representations.
Vagueness: The system should allow us to represent vagueness
E.g., Abebe lives somewhere in the center of Addis Ababa.
Inference: Draw valid conclusions based on the meaning representation of
inputs and its store of background knowledge.
E.g., Does Abebe eat besso?
Canonical Form: Inputs that mean the same thing have the same representation.
E.g., Abebe eats besso.
Besso, Abebe will eat.
What Abebe eats is besso.
It’s besso that Abebe eats.
Compositionality: The meaning of a composite expression can be derived from the
separate, independent meanings of its constituents.
E.g., Brown fox
• Given a set of semantic roles, how can we associate grammatical relations with
semantic roles and represent meaning?
There are no specialized mechanisms of semantic role assignment.
Everything is predication.
A function returning a Boolean is called a predicate.
For example, “Abebe ate besso” can be represented as “ate(Abebe, besso)”.
But, what are the possible arguments?
Predicate arguments can be complicated.
Nouns (entities/instances) take no arguments.
Verbs (events) are predicational and take one argument, a complement.
Prepositions (relations) are relational and take two arguments.
Adjectives (states) are predicational and take one argument, but require some
help; thus an adjective is always the complement of a verb, which then projects
an external argument.
• Examples:
Abebe ate besso
thing(besso)
ate(Abebe, x) ∧ thing(x)
Brown fox
Brown(x) ∧ fox(x)
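As a rough illustration of “a function returning a Boolean is a predicate”, here is a
minimal Python sketch; the tiny world model (the fact sets) is invented for this example:

    def thing(x):
        return x in {"besso", "injera"}              # assumed facts about the world

    def ate(eater, food):
        return (eater, food) in {("Abebe", "besso")}

    def brown(x):
        return x in {"fox1"}

    def fox(x):
        return x in {"fox1"}

    # "Abebe ate besso": ate(Abebe, x) ∧ thing(x), with x = besso
    print(ate("Abebe", "besso") and thing("besso"))  # True

    # "Brown fox": Brown(x) ∧ fox(x), with x = fox1
    print(brown("fox1") and fox("fox1"))             # True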

• Not ideal as a meaning representation, and it doesn't do everything we want, but it is close:
Supports the determination of truth
Supports compositionality of meaning
Supports question-answering (via variables)
Supports inference

• What are its elements?


• What else do we need?
Sentence → AtomicSentence
| Sentence Connective Sentence
| Quantifier Variable, ... Sentence
| ¬Sentence
| (Sentence)
AtomicSentence→ Predicate(Term, ...)
Term→ Function(Term, ...)
| Constant
| Variable

Connective → ∨ | ∧ | ⇒
Quantifier → ∃ | ∀
Constant → A | Abebe | Car1
Variable → x | y | z |...
Predicate → Red | Owns | Serves| Near |...
Function → FatherOf | Plus | LocationOf |...
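This grammar maps naturally onto a small abstract syntax tree. Below is a minimal
Python sketch (the class and field names are illustrative, not part of the slides):

    from dataclasses import dataclass

    @dataclass
    class Term:
        # A Constant or Variable has no args; a Function application has args.
        name: str
        args: tuple = ()

    @dataclass
    class AtomicSentence:
        # A Predicate applied to terms, e.g. Owns(Abebe, Car1).
        predicate: str
        args: tuple

    owns = AtomicSentence("Owns", (Term("Abebe"), Term("Car1")))
    near = AtomicSentence("Near", (Term("LocationOf", (Term("SheratonAddis"),)),
                                   Term("LocationOf", (Term("AAU"),))))
    print(owns, near, sep="\n")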
• Terms are names used to represent objects.
Constants
Refer to a specific object in the world being described.
Conventionally depicted as single capitalized letters such as A and B or
single capitalized words like Abebe.
For example: Abebe, Car1
Functions
Correspond to concepts that are often expressed as possessives.
Provide a convenient way to refer to specific objects without having to
associate a named constant with them.
For example: LocationOf(AAU), FatherOf(Abebe), Plus(1,2)
Variables
Used to make assertions and draw inferences about objects without having
to make any reference to any particular named object.
Making statements about a particular unknown object;
Making statements about all the objects in some world of objects.
• A predicate represents a property of or relation between terms that can be true or false.
In a given interpretation, an n-ary predicate can be defined as a function from
tuples of n terms to {True, False}.
For example: Brother(Abebe, Kebede), Left-of(Square1, Square2),
GreaterThan(plus(1,1), plus(0,1))

• Connectives are used to compose complex representations.


Truth table:

P  Q  |  ¬P  P∧Q  P∨Q  P⇒Q
F  F  |  T   F    F    T
F  T  |  T   F    T    T
T  F  |  F   F    T    F
T  T  |  F   T    T    T
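The table can be reproduced mechanically; in the sketch below, P ⇒ Q is computed as
(not P) or Q:

    for P in (False, True):
        for Q in (False, True):
            row = (P, Q, not P, P and Q, P or Q, (not P) or Q)
            print(" ".join("T" if v else "F" for v in row))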
• An atomic sentence is simply a predicate applied to a set of terms.
For example: Owns(Abebe, Car1)
Sold(Abebe, Car1, Kebede)
Semantics is True or False depending on the interpretation

• The standard propositional connectives (∨, ¬, ∧, ⇒) can be used to construct
complex sentences:
For example: Owns(Abebe, Car1) ∨ Owns(Kebede, Car1)
Sold(Abebe, Car1, Kebede) ⇒ ¬Owns(Abebe, Car1)
Semantics same as in propositional logic.
• Quantifiers allow statements about entire collections of objects rather than having to
enumerate the objects by name.
Universal quantifier: ∀
Asserts that a sentence is true for all values of variable x
For example: ∀x Loves(x, FOPC)
∀x Whale(x) ⇒ Mammal(x)
∀x Bird(x) ⇒ Black(x)

Existential quantifier: ∃
Asserts that a sentence is true for at least one value of a variable x.
∃x Loves(x, FOPC)
∃x(Cat(x) ∧ Color(x,Black) ∧ Owns(Abebe,x))
• Universal and existential quantification are logically related to each other:
∀x ¬Loves(x,Abebe) ⇔ ¬∃x Loves(x,Abebe)
∀x Loves(x,Kebede) ⇔ ¬∃x ¬Loves(x,Kebede)

• General Identities
∀x ¬P ⇔ ¬∃x P
¬∀x P ⇔ ∃x ¬P
∀x P ⇔ ¬∃x ¬P
∃x P ⇔ ¬∀x ¬P
∀x (P(x) ∧ Q(x)) ⇔ ∀x P(x) ∧ ∀x Q(x)
∃x (P(x) ∨ Q(x)) ⇔ ∃x P(x) ∨ ∃x Q(x)
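Over a finite domain, ∀ corresponds to Python's all() and ∃ to any(), so these identities
can be checked directly. A small sketch with an invented domain and predicates:

    domain = [0, 1, 2, 3]
    P = lambda x: x % 2 == 0      # "x is even"
    Q = lambda x: x >= 0          # "x is non-negative"

    # ∀x ¬P(x) ⇔ ¬∃x P(x)
    assert all(not P(x) for x in domain) == (not any(P(x) for x in domain))
    # ¬∀x P(x) ⇔ ∃x ¬P(x)
    assert (not all(P(x) for x in domain)) == any(not P(x) for x in domain)
    # ∀x (P(x) ∧ Q(x)) ⇔ ∀x P(x) ∧ ∀x Q(x)
    assert all(P(x) and Q(x) for x in domain) == \
           (all(P(x) for x in domain) and all(Q(x) for x in domain))
    print("All identities hold on this domain")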
• Sheraton Addis is a hotel
Hotel(SheratonAddis)

• Sheraton Addis serves Ethiopian food.


Serves(SheratonAddis, EthiopianFood)

• I have only five Birr and I don’t have a lot of time


Have(Speaker, FiveBirr) ∧ ¬Have(Speaker, LotOfTime)

• Sheraton Addis is near AAU


Near(LocationOf(SheratonAddis),LocationOf(AAU))
• A hotel that serves Ethiopian food near AAU
∃x Hotel(x) ∧ Serves(x, EthiopianFood) ∧ Near(LocationOf(x), LocationOf(AAU))
If Sheraton Addis is a hotel that serves Ethiopian food near AAU, substituting
Sheraton Addis for x results in the following:
Hotel(SheratonAddis) ∧ Serves(SheratonAddis,EthiopianFood) ∧ Near(LocationOf(SheratonAddis),
LocationOf(AAU))

• All Ethiopian hotels serve Ethiopian food


∀x EthiopianHotel(x) ⇒ Serves(x, EthiopianFood)

Every substitution of a known object for x must result in a sentence that is true.
EthiopianHotel(SheratonAddis) ⇒ Serves(SheratonAddis,EthiopianFood)
EthiopianHotel(HiltonAddis) ⇒ Serves(HiltonAddis,EthiopianFood)
EthiopianHotel(Ghion) ⇒ Serves(Ghion,EthiopianFood)
...
What happens when we consider a substitution from a set of objects that are
not Ethiopian hotels?
EthiopianHotel(KampalaSheraton) ⇒ Serves(KampalaSheraton, EthiopianFood)
EthiopianHotel(NairobiHilton) ⇒ Serves(NairobiHilton, EthiopianFood)
...
The sentence is always true

What happens when we substitute x with an irrelevant object?


EthiopianHotel(AnbesaBus) ⇒ Serves(AnbesaBus, EthiopianFood)
EthiopianHotel(EthioTelecom) ⇒ Serves(EthioTelecom, EthiopianFood)
EthiopianHotel(Computer) ⇒ Serves(Computer, EthiopianFood)
...

The sentence is still true


• Variables in logical formula must be either existentially (∃) or universally (∀) quantified.
• To satisfy an existentially quantified variable, there must be at least one substitution
that results in a true sentence.
• Sentences with universally quantified variables must be true under all possible
substitutions.
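The substitution behaviour above can be model-checked over a finite set of objects.
A minimal sketch (the fact sets are assumed for illustration):

    ethiopian_hotels = {"SheratonAddis", "HiltonAddis", "Ghion"}
    serves_ethiopian_food = {"SheratonAddis", "HiltonAddis", "Ghion"}
    objects = ethiopian_hotels | {"KampalaSheraton", "AnbesaBus", "Computer"}

    def implies(p, q):
        return (not p) or q

    # ∀x EthiopianHotel(x) ⇒ Serves(x, EthiopianFood): true under every
    # substitution, including irrelevant objects, because a false antecedent
    # makes the implication true.
    print(all(implies(x in ethiopian_hotels, x in serves_ethiopian_food)
              for x in objects))                                   # True

    # ∃x Hotel(x) ∧ Serves(x, EthiopianFood): one true substitution suffices.
    print(any(x in ethiopian_hotels and x in serves_ethiopian_food
              for x in objects))                                   # True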

• Limitations:
If you are interested in football, Ethiopian Coffee is playing today.
HaveInterestIn(Hearer, Football) ⇒ Playing(EthiopianCoffee, Today)
Flawed if the antecedent is false: the match takes place whether or not
the hearer is interested.
One more beer and I will fall off this stool.
A simple-minded translation of this sentence might consist of a
conjunction of two clauses.
The use of the word and obscures the fact that this sentence instead has
an implication underlying it.
Your money or your life!
• A semantic network is a network which represents semantic relations among
concepts.
This is often used as a form of knowledge representation.
It is a directed or undirected graph consisting of vertices, which represent
concepts, and edges, which represent semantic relations between them.
Example (the edges of the network):
Cat is a Mammal; Bear is a Mammal; Whale is a Mammal
Mammal is an Animal; Fish is an Animal
Mammal has Vertebrae; Cat has Fur; Bear has Fur
Whale lives in Water; Fish lives in Water

• A semantic network is used when one has knowledge that is best understood as a
set of concepts that are related to one another.
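Since a semantic network is just a labelled directed graph, it can be stored as a list of
(source, relation, target) edges. A minimal sketch of the example network above:

    edges = [
        ("Cat", "is a", "Mammal"), ("Bear", "is a", "Mammal"),
        ("Whale", "is a", "Mammal"), ("Mammal", "is an", "Animal"),
        ("Fish", "is an", "Animal"), ("Mammal", "has", "Vertebrae"),
        ("Cat", "has", "Fur"), ("Bear", "has", "Fur"),
        ("Whale", "lives in", "Water"), ("Fish", "lives in", "Water"),
    ]

    def related(concept):
        # All outgoing edges of a concept.
        return [(rel, target) for source, rel, target in edges if source == concept]

    print(related("Whale"))    # [('is a', 'Mammal'), ('lives in', 'Water')]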
• The lexicon has a highly systematic structure that governs what words can mean and
how they can be used.

• An individual entry in the lexicon is called a lexeme.

• A lexeme is considered as a pairing of a particular orthographic and phonological form
with some form of symbolic meaning representation (sense).

• A lexicon is therefore a finite list of lexemes.

• The structure of a lexicon consists of relations among words and their meanings, as
well as the internal structure of individual words.

• Lexical Semantics is the linguistic study of the systematic, meaning-related
structure of the lexicon.
• Homonymy: refers to the relation that holds between words that have the same form
but different meaning.

• Example: 1. A bank can hold the investments in a custodial account in the client’s
name.

2. As the agriculture burgeons on the east bank, the river will shrink more.
This relationship is traditionally denoted by placing a superscript on the
orthographic form of the word as bank1 and bank2.

This notation indicates that these are two separate lexemes, with distinct and
unrelated meanings, that happen to share an orthographic form.
• Homophones: words with the same pronunciation but different spellings.
Example: to, two, too [too]

• Homographs: words with the same orthographic forms but different pronunciation.
Example: desert [dizurt/dezurt]
• Homonymy usually leads an application into dealing with ambiguity.
• In Spelling Correction, homophones can lead to real-word spelling errors (errors whose result is itself a valid word).
Example: weather and whether are erroneously interchanged.
• In Speech Recognition, homophones pose serious problems for language models.
Example: to, two and too cause obvious problems.
• Text-to-Speech Systems are particularly vulnerable to homographs with distinct
pronunciations.
The problem can be avoided through the use of part of speech tagging.
Example: desert (V) => [dizurt]
desert (N) => [dezurt]

• Information Retrieval Systems are degraded in the presence of homographs.


For example, users seeking information about arid areas with the keyword desert
are unlikely to be satisfied with documents concerning defections.
• Polysemy: refers to the relation that holds within a single lexeme with multiple
related meanings.

• Example: 1. A bank can hold the investments in a custodial account in the client’s
name.

2. Some banks provide blood only to emergency patients.

• To distinguish homonymy from polysemy, two criteria are used: history and etymology
of the lexeme in question.

For example, etymology reveals that bank1 and bank2 have Italian and
Scandinavian origins, respectively.

However, the use of banks (as in blood bank) is related to the sense of bank1,
and therefore there is no need to consider the usage of banks as homonym to
bank1.
• Synonymy: refers to the relation that holds between different lexemes with the same
meaning.

• Examples:

Nouns: student, pupil

Verbs: buy, purchase

Adjectives: sick, ill

Adverbs: quickly, speedily


• Hyponymy: refers to the relation of a lexeme which shares a “type-of” relationship
with another lexeme (hypernym).

Also called an “is-a” relationship.


• Examples:

Automobile, motor vehicle

human, hominid

teacher, educator

university, educational institution

• An example of a lexical database with a rich set of lexical relations is WordNet.

• WordNet is a freely and publicly available lexical database of English in which lexemes
are interlinked by means of conceptual-semantic and lexical relations.
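One way to query WordNet programmatically is through NLTK; the sketch below assumes
nltk is installed and the WordNet data has been downloaded (nltk.download('wordnet')):

    from nltk.corpus import wordnet as wn

    # The senses (synsets) of "bank", each with a gloss:
    for synset in wn.synsets("bank")[:3]:
        print(synset.name(), "-", synset.definition())

    # Lexical relations, e.g. hypernyms ("is-a" parents) of one nominal sense:
    print(wn.synset("bank.n.01").hypernyms())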
• Latent Semantic Analysis (LSA) aims to discover something about the meaning
behind the words; about the topics in the documents.
• What is the difference between topics and words?
Words are observable
Topics are not observable; they are latent.
• How do we find the topics from the words automatically?
We can imagine them as:
a compression of words
a combination of words
• Implements the idea that the meaning of a passage is the sum of the meanings of
its words:
meaning(word1) + meaning(word2) + … + meaning(wordn) = meaning(passage)
• This “bag of words” equation treats a passage as an unordered set of word tokens
whose meanings are additive.
• By creating an equation of this kind for every passage of language that a learner
observes, we get a large system of linear equations.
• Represent the document as a vector where each entry corresponds to a different
word and the number at that entry corresponds to how many times that word was
present in the document (or some function of it)
Number of words is huge.
Select and use a smaller set of words that are of interest.
E.g., uninteresting words: ‘and’, ‘the’, ‘at’, ‘is’, etc. These are called
stop-words.
Stemming: remove endings. E.g., ‘learn’, ‘learning’, ‘learnable’, ‘learned’
could all be substituted by the single stem ‘learn’.
Other simplifications can also be invented and used.
The set of different remaining words is called the dictionary or vocabulary. Fix an
ordering of the terms in the dictionary so that you can refer to them by their
index.
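A minimal end-to-end sketch of these steps with NumPy (the toy documents are invented):
build the term-by-document count matrix over the fixed vocabulary, then compress it
with a truncated SVD, which is the core computation of LSA:

    import numpy as np

    docs = ["abebe eats besso", "abebe eats injera", "students learn semantics"]
    stop_words = {"and", "the", "at", "is"}

    vocab = sorted({w for d in docs for w in d.split() if w not in stop_words})
    index = {w: i for i, w in enumerate(vocab)}     # fixed ordering of terms

    # Term-document count matrix: X[i, j] = count of term i in document j.
    X = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            if w in index:
                X[index[w], j] += 1

    # LSA: a rank-k SVD maps words and documents into k latent "topic" dimensions.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = 2
    doc_topics = (np.diag(s[:k]) @ Vt[:k]).T        # documents in topic space
    print(doc_topics.round(2))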
