Natural Language Processing: Neural Question Answering
Mamatha H R
Department of Computer Science and Engineering
What is Question Answering?
• Open-domain question answering deals with questions about nearly anything, and
can only rely on general ontologies and world knowledge. These systems usually
have much more data available from which to extract the answer.
[Figure: example web search snippet answering a factoid question, with the answer "Bram Stoker"]
Question Answering: IBM’s Watson
• Factoid questions: questions that can be answered with simple facts expressed in short texts, for example:
  • Who wrote "The Universal Declaration of Human Rights"?
  • How many calories are there in two slices of apple pie?
  • What is the average age of the onset of autism?
  • Where is Apple Computer based?
  • Where is the Louvre Museum located?
• Complex (narrative) questions:
  • In children with an acute febrile illness, what is the efficacy of acetaminophen in reducing fever?
  • What do scholars think about Jefferson's position on dealing with pirates?
Commercial systems: mainly factoid questions
• IR-based approaches
  • TREC; IBM Watson; Google
• Knowledge-based and hybrid approaches
  • IBM Watson; Apple Siri; Wolfram Alpha; True Knowledge (Evi)
Many questions can already be answered by web search
IR-based Question Answering
IR-based Factoid QA (Traditional Way)
[Figure: IR-based factoid QA architecture — question processing (query formulation and answer-type detection), document retrieval with indexing over the document collection, passage retrieval of relevant passages, and answer processing producing the final answer]
QA has 3 stages, sketched in code below:
1. Question Processing
2. Passage Retrieval
3. Answer Processing
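A minimal sketch of how the three stages compose, in Python. Every helper here (formulate_query, retrieve_passages, extract_answer) is a hypothetical placeholder for illustration, not a real library API:

```python
# Minimal sketch of the 3-stage IR-based factoid QA pipeline.
# All helper functions are hypothetical placeholders for illustration.

def formulate_query(question: str) -> str:
    """Stage 1 (question processing): drop question words, keep content terms."""
    stop = {"what", "who", "where", "when", "is", "the", "of", "a"}
    return " ".join(w for w in question.lower().rstrip("?").split() if w not in stop)

def retrieve_passages(query: str, collection: list[str], k: int = 3) -> list[str]:
    """Stage 2 (passage retrieval): rank passages by naive term overlap."""
    terms = set(query.split())
    ranked = sorted(collection, key=lambda p: -len(terms & set(p.lower().split())))
    return ranked[:k]

def extract_answer(question: str, passages: list[str]) -> str:
    """Stage 3 (answer processing): a real system would run a neural reader here."""
    return passages[0] if passages else ""

collection = ["The Louvre Museum is located in Paris, France.",
              "Apple Computer is based in Cupertino, California."]
question = "Where is the Louvre Museum located?"
print(extract_answer(question, retrieve_passages(formulate_query(question), collection)))
```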
IR-based Factoid QA (Traditional Way)
• QUESTION PROCESSING
  • Detect question type, answer type, focus, relations
  • Formulate queries to send to a search engine
• ANSWER PROCESSING
  • Extract candidate answers
  • Rank candidates using evidence from the text and external sources
IR-based Factoid QA (Neural Way)
• Retrievers use a term weight for each document word. Two term-weighting schemes are common:
  • tf-idf
  • BM25
• The tf-idf value for word $t$ in document $d$ is the product of term frequency $\mathrm{tf}_{t,d}$ and IDF:
  $\text{tf-idf}(t,d) = \mathrm{tf}_{t,d} \times \mathrm{idf}_t$
  where $\mathrm{tf}_{t,d} = \log_{10}(\mathrm{count}(t,d) + 1)$ and $\mathrm{idf}_t = \log_{10}(N / \mathrm{df}_t)$, with $N$ the number of documents in the collection and $\mathrm{df}_t$ the number of documents containing $t$.
• Document scoring is done by the cosine of the document vector $d$ with the query vector $q$:
  $\mathrm{score}(q,d) = \cos(q,d) = \dfrac{q \cdot d}{|q|\,|d|}$
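As a toy check of the formula, assume a made-up collection of $N = 4$ documents in which term $t$ appears in $\mathrm{df}_t = 3$ documents and $\mathrm{count}(t,d) = 2$:

$\mathrm{tf}_{t,d} = \log_{10}(2+1) = 0.477, \quad \mathrm{idf}_t = \log_{10}(4/3) = 0.125, \quad \text{tf-idf}(t,d) = 0.477 \times 0.125 \approx 0.060$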
IR-based Factoid QA (Retriever)
• Equivalently, the cosine can be written as the dot product of the two unit vectors:
  $\mathrm{score}(q,d) = \dfrac{q}{|q|} \cdot \dfrac{d}{|d|}$
• Using the tf-idf values and spelling out the dot product as a sum of products:
  $\mathrm{score}(q,d) = \displaystyle\sum_{t \in q} \dfrac{\text{tf-idf}(t,q)}{\sqrt{\sum_{q_i \in q} \text{tf-idf}^2(q_i,q)}} \times \dfrac{\text{tf-idf}(t,d)}{\sqrt{\sum_{d_i \in d} \text{tf-idf}^2(d_i,d)}}$
• Queries are usually very short, so each query word is likely to have a count of 1. The cosine normalization for the query (the division by $|q|$) is the same for all documents, so it won't change the ranking between any two documents $D_i$ and $D_j$. So we generally use the following simple score for a document $d$ given a query $q$:
  $\mathrm{score}(q,d) = \displaystyle\sum_{t \in q} \dfrac{\text{tf-idf}(t,d)}{|d|}$
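A minimal Python sketch of this simplified scoring formula over a toy in-memory collection (the documents and query here are invented for illustration):

```python
import math
from collections import Counter

# Toy collection; each document is a list of tokens.
docs = [["sweet", "love", "sorrow"],
        ["love", "how", "sweet", "sweet"],
        ["sorrow", "nurse"],
        ["nurse", "love"]]
N = len(docs)
df = Counter(t for d in docs for t in set(d))  # document frequency per term

def tf_idf(term: str, doc: list[str]) -> float:
    tf = math.log10(doc.count(term) + 1)
    idf = math.log10(N / df[term]) if df[term] else 0.0
    return tf * idf

def doc_norm(doc: list[str]) -> float:
    # |d|: Euclidean norm of the document's tf-idf vector
    return math.sqrt(sum(tf_idf(t, doc) ** 2 for t in set(doc)))

def score(query: list[str], doc: list[str]) -> float:
    # Simplified score: sum of tf-idf(t, d) over query terms, divided by |d|
    n = doc_norm(doc)
    return sum(tf_idf(t, doc) for t in query) / n if n else 0.0

q = ["sweet", "love"]
ranking = sorted(range(N), key=lambda i: -score(q, docs[i]))
print(ranking)
```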
IR-based Factoid QA (Retriever)
• An example: scoring a query against a collection of 4 nano documents by computing tf-idf values and ranking the documents.
  [Table: tf-idf values per term, shown here only for documents 1 and 2, and the resulting document ranking]
• A variant of tf-idf is BM25, which adds two parameters: $k$, adjusting the balance between term frequency and idf, and $b$, controlling the importance of document-length normalization:
  $\mathrm{score}(q,d) = \displaystyle\sum_{t \in q} \log\!\left(\dfrac{N}{\mathrm{df}_t}\right) \cdot \dfrac{\mathrm{tf}_{t,d}}{k\left(1 - b + b \cdot \dfrac{|d|}{|d_{avg}|}\right) + \mathrm{tf}_{t,d}}$
  where $|d_{avg}|$ is the length of the average document.
• If $k = 0$, BM25 reverts to no use of term frequency, just a binary selection of terms in the query (plus idf).
• A large $k$ results in raw term frequency (plus idf).
• $b$ ranges from 1 (scaling by document length) to 0 (no length scaling).
• Note that Manning et al. (2008) suggest reasonable values are $k \in [1.2, 2]$ and $b = 0.75$.
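A matching BM25 sketch in Python over the same kind of toy collection, using the suggested defaults $k = 1.2$ and $b = 0.75$:

```python
import math
from collections import Counter

docs = [["sweet", "love", "sorrow"],
        ["love", "how", "sweet", "sweet"],
        ["sorrow", "nurse"],
        ["nurse", "love"]]
N = len(docs)
df = Counter(t for d in docs for t in set(d))
avg_len = sum(len(d) for d in docs) / N

def bm25(query: list[str], doc: list[str], k: float = 1.2, b: float = 0.75) -> float:
    s = 0.0
    for t in query:
        if df[t] == 0:
            continue
        tf = doc.count(t)
        idf = math.log10(N / df[t])
        # Length-normalized tf saturation: large k -> closer to raw tf; k = 0 -> binary
        s += idf * tf / (k * (1 - b + b * len(doc) / avg_len) + tf)
    return s

q = ["sweet", "love"]
for i, d in enumerate(docs):
    print(i, round(bm25(q, d), 3))
```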
IR-based Factoid QA (Reader)
• The reader's goal is to compute the probability $P(a|q,p)$ that each possible span $a$ of passage $p$ is the answer to question $q$.
• If each span $a$ starts at position $a_s$ and ends at position $a_e$, we make the simplifying assumption that this probability can be estimated as
  $P(a|q,p) \approx p_{start}(a_s|q,p) \, p_{end}(a_e|q,p)$
• Thus for each token $p_i$ in the passage we'll compute two probabilities: $p_{start}(i)$ that $p_i$ is the start of the answer span, and $p_{end}(i)$ that $p_i$ is the end of the answer span.
IR-based Factoid QA (Reader)
A standard baseline algorithm for reading comprehension is to pass the question and passage to an encoder like BERT, as strings separated by a [SEP] token, resulting in an encoder output embedding for every passage token $p_i$; a sketch of the span-prediction head that sits on top follows below.
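A numpy sketch of this span-prediction setup, assuming learned start and end vectors $S$ and $E$ applied over the encoder output embeddings. The random vectors below stand in for a trained encoder and trained head, so the output is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
T, H = 8, 16                      # passage length, hidden size
P = rng.normal(size=(T, H))       # stand-in for encoder output embeddings p'_i
S = rng.normal(size=H)            # learned span-start vector (random stand-in)
E = rng.normal(size=H)            # learned span-end vector (random stand-in)

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

p_start = softmax(P @ S)          # p_start(i): prob. that token i starts the answer
p_end = softmax(P @ E)            # p_end(i):   prob. that token i ends the answer

# Best span (a_s, a_e) with a_s <= a_e under the independence assumption
best = max(((i, j) for i in range(T) for j in range(i, T)),
           key=lambda ij: p_start[ij[0]] * p_end[ij[1]])
print(best, p_start[best[0]] * p_end[best[1]])
```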
IR-based Factoid QA (Reader)
• For span-based question answering, represent the question as the 1st sequence and the passage as the 2nd sequence.
• Many datasets (like SQuAD 2.0 and Natural Questions) also contain (question, passage) pairs in which the answer is not contained in the passage.
• For entity linking in knowledge-based QA, there are 2 algorithms:
  • one simple classic baseline that uses anchor dictionaries and information from the Wikipedia graph structure (Ferragina and Scaiella, 2011),
  • and one modern neural algorithm (Li et al., 2020).
• We'll very briefly look into the modern neural algorithm for KB QA.
Neural Graph-based linking
• Recent entity linking models are based on biencoders: encoding a candidate mention span, encoding an entity, and computing the dot product between the encodings. This allows embeddings for all the entities in the knowledge base to be precomputed and cached.
• The ELQ linking algorithm of Li et al. (2020) is given a question $q$ and a set of candidate entities from Wikipedia with associated Wikipedia text, and outputs tuples $(e, m_s, m_e)$ of entity id, mention start, and mention end.
• It does this by encoding each Wikipedia entity using text from Wikipedia, encoding each mention span using text from the question, and computing their similarity, as sketched below.
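A sketch of the biencoder scoring idea described above: entity encodings are precomputed and cached, a mention encoding is computed from the question, and the score is their dot product. The encoders below are random stand-ins for illustration, not the actual ELQ encoders:

```python
import numpy as np

rng = np.random.default_rng(1)
H = 32

# Precomputed and cached entity embeddings (stand-ins for encoded Wikipedia text)
entities = ["Paris", "Ada Lovelace", "Apple Inc."]
entity_vecs = rng.normal(size=(len(entities), H))

def encode_mention(question: str, start: int, end: int) -> np.ndarray:
    # Stand-in for the mention-span encoder run over the question text
    seed = abs(hash(question[start:end])) % (2 ** 32)
    return np.random.default_rng(seed).normal(size=H)

m = encode_mention("when was ada lovelace born?", 9, 21)  # span "ada lovelace"
scores = entity_vecs @ m                                  # dot-product similarity
print(entities[int(np.argmax(scores))], scores)
```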
QA by Semantic Parsing
• The second kind of knowledge-based QA uses a semantic parser to map the question to a structured program whose execution produces the answer.
• These logical forms can be thought of as some version of predicate calculus, a query language like SQL or SPARQL, or some other executable program, like the example sketched below.
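For instance, "When was Ada Lovelace born?" could map to a SPARQL query as its executable logical form. A sketch, where the Wikidata identifiers (Q7259 for Ada Lovelace, P569 for date of birth) are assumptions for illustration:

```python
# "When was Ada Lovelace born?" mapped to an executable SPARQL logical form.
# The entity/property IDs (Q7259, P569) are assumed Wikidata identifiers.
question = "When was Ada Lovelace born?"
logical_form = """
SELECT ?birthDate WHERE {
  wd:Q7259 wdt:P569 ?birthDate .   # Ada Lovelace --date of birth--> ?birthDate
}
"""
print(question, "->", logical_form)
```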
QA by Semantic Parsing
• The task is then to take pairs of (question, logical form) training tuples and produce a system that maps new questions to their logical forms.
• A common baseline algorithm is a simple sequence-to-sequence model, for example using BERT to represent the question tokens and passing them to a biLSTM encoder-decoder that generates the logical form, as sketched below.
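A minimal PyTorch sketch of this baseline: a biLSTM encoder over (stand-in) question-token embeddings initializes an LSTM decoder over logical-form tokens. The embedding layer stands in for BERT representations, and the weights are untrained; this only shows the shape of the model:

```python
import torch
import torch.nn as nn

class Seq2SeqParser(nn.Module):
    def __init__(self, vocab_q: int, vocab_lf: int, emb: int = 64, hid: int = 128):
        super().__init__()
        self.embed_q = nn.Embedding(vocab_q, emb)    # stand-in for BERT token reps
        self.embed_lf = nn.Embedding(vocab_lf, emb)
        self.encoder = nn.LSTM(emb, hid, bidirectional=True, batch_first=True)
        self.decoder = nn.LSTM(emb, 2 * hid, batch_first=True)
        self.out = nn.Linear(2 * hid, vocab_lf)      # predicts next logical-form token

    def forward(self, q_ids, lf_ids):
        enc, (h, c) = self.encoder(self.embed_q(q_ids))
        # Merge the two directions' final states to initialize the decoder
        h0 = torch.cat([h[0], h[1]], dim=-1).unsqueeze(0)
        c0 = torch.cat([c[0], c[1]], dim=-1).unsqueeze(0)
        dec, _ = self.decoder(self.embed_lf(lf_ids), (h0, c0))
        return self.out(dec)                         # (batch, lf_len, vocab_lf)

model = Seq2SeqParser(vocab_q=1000, vocab_lf=200)
q = torch.randint(0, 1000, (2, 12))    # batch of 2 questions, 12 tokens each
lf = torch.randint(0, 200, (2, 8))     # teacher-forced logical-form prefixes
print(model(q, lf).shape)              # torch.Size([2, 8, 200])
```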
Course References
Text Book:
1. "Introduction to Natural Language Processing", Jacob Eisenstein, MIT Press, Adaptive Computation and Machine Learning series, 18th October 2019.
   The open-source softcopy is available at https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf
Reference Books:
1. "Speech and Language Processing", Daniel Jurafsky and James H. Martin, 2nd edition paperback, 2013.
   The more up-to-date 3rd edition draft is available at http://web.stanford.edu/~jurafsky/slp3/
THANK YOU
Dr. Mamatha H R
Professor, Department of Computer Science
[email protected]
+91 80 2672 1983 Extn 712