Natural Language Processing: Neural Question Answering


NATURAL LANGUAGE PROCESSING

Neural Question Answering

Mamatha H R
Department of Computer Science and Engineering
NATURAL LANGUAGE PROCESSING
What is Question Answering?

One of the oldest NLP tasks (punched card systems in 1961)


Simmons, Klein, McConlogue. 1964. Indexing and
Dependency Logic for Answering English Questions.
American Documentation 15:30, 196-204

Question answering (QA) is a computer science discipline within the fields of information retrieval (IR) and natural language processing (NLP), concerned with building systems that automatically answer questions posed by humans in natural language.
NATURAL LANGUAGE PROCESSING
Question Answering Research
Question answering research attempts to deal with a wide range of question types, including fact, list, definition, how, why, hypothetical, semantically constrained, and cross-lingual questions.

• Closed-domain question answering deals with questions under a specific domain (for example, medicine or automotive maintenance), and can exploit domain-specific knowledge frequently formalized in ontologies (e.g., MEANS).

• Open-domain question answering deals with questions about nearly anything, and
can only rely on general ontologies and world knowledge. These systems usually
have much more data available from which to extract the answer.

• Multimodal question answering uses multiple modalities of user input, such as text and images, to answer questions.
NATURAL LANGUAGE PROCESSING
Question Answering: IBM’s Watson

On Feb 16, 2011, IBM’s Watson QA system won the TV game show Jeopardy!, surpassing humans at answering clues like:

“William Wilkinson’s ‘An Account of the Principalities of Wallachia and Moldavia’ inspired this author’s most famous novel.” (Answer: Bram Stoker)
NATURAL LANGUAGE PROCESSING
Question Answering: IBM’s Watson

➢ Watson applies advanced NLP, IR, knowledge representation and reasoning, and ML technologies to open-domain question answering.
➢ Watson is loaded with millions of documents, including dictionaries, encyclopedias, taxonomies, religious texts, novels, plays, and other reference material that it can use to build its knowledge.
NATURAL LANGUAGE PROCESSING
Apple’s Siri
NATURAL LANGUAGE PROCESSING
Types of Questions in Modern Systems

• Factoid questions
  • Who wrote “The Universal Declaration of Human Rights”?
  • How many calories are there in two slices of apple pie?
  • What is the average age of the onset of autism?
  • Where is Apple Computer based?

Factoid questions can be answered with simple facts expressed in short texts, like the following:
  1. Where is the Louvre Museum located?
  2. What is the average age of the onset of autism?

• Complex (narrative) questions:
  • In children with an acute febrile illness, what is the efficacy of acetaminophen in reducing fever?
  • What do scholars think about Jefferson’s position on dealing with pirates?
NATURAL LANGUAGE PROCESSING
Commercial systems: mainly factoid questions

Question                                                Answer
Where is the Louvre Museum located?                     In Paris, France
What’s the abbreviation for limited partnership?        L.P.
What are the names of Odin’s ravens?                    Huginn and Muninn
What currency is used in China?                         The yuan
What kind of nuts are used in marzipan?                 almonds
What instrument does Max Roach play?                    drums
What is the telephone number for Stanford University?   650-723-2300
NATURAL LANGUAGE PROCESSING
Paradigms for QA

• IR-based approaches
• TREC; IBM Watson; Google
• Knowledge-based and Hybrid approaches
• IBM Watson; Apple Siri; Wolfram Alpha; True Knowledge (Evi)
NATURAL LANGUAGE PROCESSING
Many questions can already be answered by web search
NATURAL LANGUAGE PROCESSING
IR-based Question Answering
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Traditional Way)

[Figure: the traditional IR-based factoid QA pipeline. A question goes through Question Processing (query formulation and answer type detection); the resulting query is run against an indexed document collection for Document Retrieval and then Passage Retrieval; the relevant passages feed Answer Processing, which produces the answer.]
QA has 3 stages:
1. Question Processing
2. Passage Retrieval
3. Answer Processing
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Traditional Way)

• QUESTION PROCESSING
  • Detect question type, answer type, focus, relations
  • Formulate queries to send to a search engine

• PASSAGE RETRIEVAL
  • Retrieve ranked documents
  • Break into suitable passages and rerank

• ANSWER PROCESSING
  • Extract candidate answers
  • Rank candidates using evidence from the text and external sources

[Figure: the same QA pipeline diagram as on the previous slide.]
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Neural Way)

• Neural factoid QA is a 2-stage process:
  • Retrieval: returns relevant documents from the collection.
  • Reading: a neural reading comprehension system extracts answer spans.

• A factoid question and a large document collection (such as Wikipedia or a crawl of the web) are given to the QA system, and it returns an answer, which is a span of text extracted from a document. This task is often called open-domain QA.
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Retriever)
• When a user poses a query to a retrieval system, it returns an ordered set of documents from some document collection.

• A document refers to a unit of text that the system indexes and retrieves (web pages, scientific papers, news articles, or even shorter passages like paragraphs).
• A collection refers to the set of documents being used to satisfy user requests.
• A term refers to a word in a collection, but it may also include phrases.
• Finally, a query represents a user’s information need expressed as a set of terms.

• An inverted index, given a query term, gives a list of documents that contain the term.

[Figure: the high-level architecture of an ad hoc retrieval engine.]
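To make the inverted index concrete, here is a minimal sketch; the toy documents and ids are illustrative, not from the slides:

```python
# A minimal sketch of an inverted index over a toy in-memory collection;
# each term maps to a postings list of document ids that contain it.
from collections import defaultdict

docs = {
    1: "sweet love of mine",
    2: "sweet sorrow of parting",
}

inverted_index = defaultdict(list)
for doc_id, text in docs.items():
    for term in set(text.split()):        # index each unique term once per doc
        inverted_index[term].append(doc_id)

# Given a query term, return the documents that contain it.
print(sorted(inverted_index["sweet"]))    # -> [1, 2]
print(sorted(inverted_index["parting"]))  # -> [2]
```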
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Retriever)

• The basic IR architecture uses the vector space model:
  ➢ queries and documents are mapped to vectors based on unigram word counts,
  ➢ and the cosine similarity between the vectors is used to rank candidate documents (Salton, 1971).
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Retriever)

Match between a document and query is scored

• Retrievers use a term weight for each document word. Two term-weighting schemes are common:
  • tf-idf
  • BM25

• The term frequency and inverse document frequency are

$$\mathrm{tf}_{t,d} = \log_{10}\big(\mathrm{count}(t,d) + 1\big) \qquad \mathrm{idf}_t = \log_{10}\frac{N}{\mathrm{df}_t}$$

where N is the total number of documents in the collection and df_t is the number of documents containing term t.

• The tf-idf value for word t in document d is then the product of term frequency tf_{t,d} and IDF:

$$\mathrm{tf\text{-}idf}_{t,d} = \mathrm{tf}_{t,d} \cdot \mathrm{idf}_t$$

• Document scoring is done by the cosine of the document vector d with the query vector q:

$$\mathrm{score}(q,d) = \cos(q,d) = \frac{q \cdot d}{|q|\,|d|}$$
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Retriever)
The above expression can be rewritten, using the tf-idf values and spelling out the dot product as a sum of products:

$$\mathrm{score}(q,d) = \sum_{t \in q} \frac{\mathrm{tf\text{-}idf}_{t,q}}{\sqrt{\sum_{q_i \in q} \mathrm{tf\text{-}idf}^2_{q_i,q}}} \cdot \frac{\mathrm{tf\text{-}idf}_{t,d}}{\sqrt{\sum_{d_i \in d} \mathrm{tf\text{-}idf}^2_{d_i,d}}}$$

Queries are usually very short, so each query word is likely to have a count of 1. And the cosine normalization for the query (the division by |q|) is the same for all documents, so it won’t change the ranking between any two documents D_i and D_j. So we generally use the following simplified score for a document d given a query q:

$$\mathrm{score}(q,d) = \sum_{t \in q} \frac{\mathrm{tf\text{-}idf}_{t,d}}{|d|}$$
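A minimal sketch of this simplified scoring in Python; the nano collection below is illustrative, not the slides’ example:

```python
# A minimal sketch of score(q, d) = sum over query terms of tf-idf(t, d) / |d|,
# using the log-weighted tf and idf defined above. Toy docs are illustrative.
import math

docs = ["sweet sorrow", "how sweet is love", "nurse love", "fear of death"]
N = len(docs)
doc_tokens = [d.split() for d in docs]

def df(term):
    return sum(1 for toks in doc_tokens if term in toks)

def tf_idf(term, toks):
    tf = math.log10(toks.count(term) + 1)                 # tf = log10(count + 1)
    idf = math.log10(N / df(term)) if df(term) else 0.0   # idf = log10(N / df)
    return tf * idf

def score(query, toks):
    # |d| is the Euclidean norm of the document's tf-idf vector
    norm = math.sqrt(sum(tf_idf(t, toks) ** 2 for t in set(toks)))
    return sum(tf_idf(t, toks) for t in query.split()) / norm if norm else 0.0

for i, toks in enumerate(doc_tokens, 1):
    print(f"doc {i}: {score('sweet love', toks):.3f}")
```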
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Retriever)
An example: a query against a collection of 4 nano documents, computing tf-idf values and seeing the rank of the documents.

[Figure: worked example — tf-idf values computed for each document (shown for docs 1 and 2), and the documents ranked by the simplified tf-idf score above.]


NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (BM25)
• Another variant in the tf-idf family is the BM25 weighting scheme (sometimes called Okapi BM25, after the Okapi IR system in which it was introduced (Robertson et al., 1995)).

• BM25 adds two parameters:
  • k, which adjusts the balance between tf and idf, and
  • b, which controls the importance of document length normalization.

• The BM25 score of a document d given a query q is:

$$\mathrm{score}(q,d) = \sum_{t \in q} \log_{10}\!\left(\frac{N}{\mathrm{df}_t}\right) \cdot \frac{\mathrm{tf}_{t,d}}{k\left(1 - b + b\,\frac{|d|}{|d_{avg}|}\right) + \mathrm{tf}_{t,d}}$$

where |d_avg| is the length of the average document in the text collection from which documents are drawn.

• If k = 0, BM25 reverts to no use of term frequency, just a binary selection of terms in the query (plus idf).
• A large k results in raw term frequency (plus idf).
• b ranges from 1 (full scaling by document length) to 0 (no length scaling).
• Note that Manning et al. (2008) suggest reasonable values are k ∈ [1.2, 2] and b = 0.75.
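A minimal sketch of BM25 scoring with those suggested defaults; the toy collection is again illustrative, not from the slides:

```python
# A minimal sketch of BM25 scoring as defined above, with k = 1.2, b = 0.75.
import math

docs = [["sweet", "sorrow"], ["how", "sweet", "is", "love"],
        ["nurse", "love"], ["fear", "of", "death"]]
N = len(docs)
d_avg = sum(len(d) for d in docs) / N          # average document length

def bm25(query, doc, k=1.2, b=0.75):
    score = 0.0
    for t in query:
        df = sum(1 for d in docs if t in d)
        if df == 0:
            continue                            # term absent from collection
        idf = math.log10(N / df)
        tf = doc.count(t)
        # term-frequency saturation with document-length normalization
        score += idf * tf / (k * (1 - b + b * len(doc) / d_avg) + tf)
    return score

for i, d in enumerate(docs, 1):
    print(f"doc {i}: {bm25(['sweet', 'love'], d):.3f}")
```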
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Reader)

• The second stage of IR-based question answering is the reader.

• The reader’s job is to take a passage as input and produce the answer.
• In extractive QA, the answer is a span of text in the passage.
• For example, given a question like
  “How tall is Mt. Everest?”
  and a passage that contains the clause “Reaching 29,029 feet at its summit”,
  a reader will output 29,029 feet.


NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Reader: Answer Span Extraction)
• The answer extraction task is commonly modeled by span labeling: the reader identifies a span (a contiguous string of text) in the passage that constitutes an answer.

• Neural span algorithms for reading comprehension are given
  • a question q of n tokens q_1, …, q_n and
  • a passage p of m tokens p_1, …, p_m.

• Their goal is to compute the probability P(a|q, p) that each possible span a is the answer.

• If each span a starts at position a_s and ends at position a_e, we make the simplifying assumption that this probability can be estimated as

$$P(a|q,p) = P_{start}(a_s|q,p)\,P_{end}(a_e|q,p)$$

• Thus for each token p_i in the passage we compute two probabilities: P_start(i), that p_i is the start of the answer span, and P_end(i), that p_i is the end of the answer span.
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Reader)
A standard baseline algorithm for reading comprehension is to pass the question and passage to an encoder like BERT, as strings separated by a [SEP] token, resulting in an encoder output embedding p'_i for every passage token p_i.
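A hedged sketch of this encoding step, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (neither is specified in the slides):

```python
# A minimal sketch; the tokenizer builds [CLS] question [SEP] passage [SEP].
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

question = "How tall is Mt. Everest?"
passage = "Reaching 29,029 feet at its summit, Mt. Everest ..."

inputs = tokenizer(question, passage, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual embedding p'_i per input token: (batch 1, seq_len, 768)
token_embeddings = outputs.last_hidden_state
print(token_embeddings.shape)
```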
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Reader)
• For span-based question answering, represent the question as the first sequence and the passage as the second sequence.

• Also add a linear layer that will be trained in the fine-tuning phase to predict the start and end positions of the span.

• Also add two new special vectors:
  • a span-start embedding S and
  • a span-end embedding E,
  which will be learned in fine-tuning.
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Reader)
• To get a span-start probability P_start(i) for each output token p'_i, we compute the dot product between S and p'_i and then use a softmax to normalize over all tokens p'_j in the passage:

$$P_{start}(i) = \frac{\exp(S \cdot p'_i)}{\sum_j \exp(S \cdot p'_j)}$$

• Similarly, we compute a span-end probability P_end(i):

$$P_{end}(i) = \frac{\exp(E \cdot p'_i)}{\sum_j \exp(E \cdot p'_j)}$$


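A minimal sketch of these two softmaxes, with random stand-ins for the learned vectors and the token embeddings:

```python
# A minimal sketch of the span-start/span-end softmaxes; p, S, and E are
# random stand-ins for the encoder output and the fine-tuned vectors.
import torch

m, h = 12, 768                   # passage length, hidden size
p = torch.randn(m, h)            # p'_i: one embedding per passage token
S = torch.randn(h)               # span-start embedding (learned in fine-tuning)
E = torch.randn(h)               # span-end embedding (learned in fine-tuning)

p_start = torch.softmax(p @ S, dim=0)   # P_start(i) = exp(S·p'_i) / Σ_j exp(S·p'_j)
p_end = torch.softmax(p @ E, dim=0)     # P_end(i) analogous, using E

print(p_start.sum().item(), p_end.sum().item())   # both ≈ 1.0
```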
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Reader)

• The score of a candidate span from position i to position j is

$$S \cdot p'_i + E \cdot p'_j$$

and the highest-scoring span with j ≥ i is chosen as the model prediction.
• The training loss for fine-tuning is the negative sum of the log-likelihoods of the correct start and end positions for each instance:

$$L = -\log P_{start}(s) - \log P_{end}(e)$$

where s and e are the gold start and end positions.
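Continuing the sketch above, we can pick the best span and compute the loss; the gold indices here are hypothetical, purely for illustration:

```python
# Continues the earlier sketch (reuses p, S, E, m).
start_logits = p @ S
end_logits = p @ E

# score(i, j) = S·p'_i + E·p'_j, restricted to spans with j >= i
scores = start_logits[:, None] + end_logits[None, :]
valid = torch.triu(torch.ones(m, m, dtype=torch.bool))
scores = scores.masked_fill(~valid, float("-inf"))
flat = torch.argmax(scores)              # argmax over the flattened score matrix
i, j = flat // m, flat % m
print(f"predicted span: tokens {i.item()}..{j.item()}")

# L = -log P_start(s) - log P_end(e) for hypothetical gold positions s, e
s_gold, e_gold = 3, 5
loss = -(torch.log_softmax(start_logits, dim=0)[s_gold]
         + torch.log_softmax(end_logits, dim=0)[e_gold])
print(f"loss: {loss.item():.3f}")
```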
NATURAL LANGUAGE PROCESSING
IR-based Factoid QA (Reader)

• Many datasets (like SQuAD 2.0 and Natural Questions) also contain (question, passage) pairs in which the answer is not contained in the passage.

• So we also need a way to estimate the probability that the answer to a question is not in the document.
• This is standardly done by treating questions with no answer as having the [CLS] token as the answer,
• and hence the answer span’s start and end indices both point at [CLS].
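A minimal sketch of this decision rule, pretending (for illustration) that index 0 of the sequence in the sketch above is [CLS]; the threshold is a hypothetical value that would be tuned on development data:

```python
# Continues the earlier sketch; pretend index 0 is [CLS] for illustration.
null_score = start_logits[0] + end_logits[0]   # "no answer" span at [CLS]
best_span_score = scores[1:, 1:].max()         # best real span (excludes index 0)

threshold = 0.0                                # hypothetical; tuned on dev data
if null_score - best_span_score > threshold:
    print("predict: no answer in this passage")
else:
    print("predict: the best-scoring span")
```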
NATURAL LANGUAGE PROCESSING
Knowledge-based Approaches (Siri)

• Build a semantic representation of the query


• Times, dates, locations, entities, numeric quantities
• Map from this semantics to query structured data or
resources
• Geospatial databases
• Ontologies (Wikipedia infoboxes, DBpedia, WordNet, Yago)
• Restaurant review sources and reservation services
• Scientific databases
NATURAL LANGUAGE PROCESSING
Knowledge-based approaches

• In Knowledge-based QA, answering a natural language question is


done by mapping it to a query over a structured database.

• Two common paradigms are used for knowledge-based QA.


1. Graph-based QA models the knowledge base as a graph, often with entities as nodes and relations or propositions as edges between nodes.
2. QA by semantic parsing, using semantic parsing methods.
NATURAL LANGUAGE PROCESSING
Knowledge-based approaches
Both KB QA approaches use entity linking.

• Entity linking is the task of associating a mention in text with the


representation of some real-world entity in an ontology.

• The most common ontology for factoid question-answering is Wikipedia.

• Entity linking is done in (roughly) two stages:
  • mention detection and
  • mention disambiguation.
NATURAL LANGUAGE PROCESSING
Knowledge-based approaches

• Two algorithms:
  • a simple classic baseline that uses anchor dictionaries and information from the Wikipedia graph structure (Ferragina and Scaiella, 2011),
  • and a modern neural algorithm (Li et al., 2020).

• We’ll very briefly look at the modern neural algorithm for KB QA.
NATURAL LANGUAGE PROCESSING
Neural Graph-based linking
• Recent entity linking models are based on biencoders:
  • encoding a candidate mention span,
  • encoding an entity,
  • and computing the dot product between the encodings.

• This allows embeddings for all the entities in the knowledge base to be precomputed and cached.

• The ELQ linking algorithm of Li et al. (2020) is given a question q and a set of candidate entities from Wikipedia with associated Wikipedia text, and outputs tuples (e, m_s, m_e) of entity id, mention start, and mention end.

• As the figure shows, it does this by encoding each Wikipedia entity using text from Wikipedia, encoding each mention span using text from the question, and computing their similarity.
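A minimal sketch of the biencoder dot-product scoring; the mention encoder is a random stand-in and the entity count is arbitrary, not from ELQ itself:

```python
# A minimal sketch of biencoder entity scoring: entity embeddings are
# precomputed and cached, mention encodings are computed at query time.
import torch

num_entities, h = 1000, 768
entity_embeddings = torch.randn(num_entities, h)   # precomputed offline

def encode_mention(question, span):
    # stand-in for a BERT-based mention encoder over the question text
    torch.manual_seed(hash((question, span)) % (2**31))
    return torch.randn(h)

mention_vec = encode_mention("Who wrote Dracula?", (2, 3))
scores = entity_embeddings @ mention_vec           # one dot product per entity
top = torch.topk(scores, k=5).indices
print("top candidate entity ids:", top.tolist())
```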
NATURAL LANGUAGE PROCESSING
QA by Semantic Parsing
• The second kind of knowledge-based QA uses a semantic parser to map the question to a
structured program to produce an answer.

• These logical forms can be thought of as some version of predicate calculus, a query language like SQL or SPARQL, or some other executable program, as in the examples below.

[Figure: example questions paired with their logical forms.]
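Illustrative (question, logical form) pairs of the kind such systems train on; the Wikidata identifiers and the GeoQuery-style form are examples supplied here, not taken from the slides:

```python
# Hedged, illustrative training pairs; the exact formalism varies by dataset.
training_pairs = [
    # SPARQL over Wikidata (Q7259 = Ada Lovelace, P569 = date of birth)
    ("When was Ada Lovelace born?",
     "SELECT ?dob WHERE { wd:Q7259 wdt:P569 ?dob }"),
    # GeoQuery-style logical form
    ("What states border Texas?",
     "answer(state(next_to(stateid('texas'))))"),
]
for question, logical_form in training_pairs:
    print(f"{question}  ->  {logical_form}")
```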
NATURAL LANGUAGE PROCESSING
QA by Semantic Parsing

• The task is then to take those pairs of training tuples and produce a system that maps from new questions to their logical forms.
• A common baseline algorithm is a simple sequence-to-sequence model, for example using BERT to represent question tokens and passing them to a biLSTM encoder-decoder, as sketched below.
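A minimal sketch of such a baseline, assuming PyTorch; the dimensions, vocabulary size, and the fixed-context decoder are simplifying choices for illustration, not the slides’ exact architecture:

```python
# A minimal sketch: BERT embeddings -> biLSTM encoder -> LSTM decoder that
# emits logical-form token logits. Dimensions and vocab are illustrative.
import torch
import torch.nn as nn

class Seq2SeqParser(nn.Module):
    def __init__(self, hidden=256, bert_dim=768, lf_vocab=500):
        super().__init__()
        self.encoder = nn.LSTM(bert_dim, hidden, bidirectional=True,
                               batch_first=True)
        self.decoder = nn.LSTM(hidden * 2, hidden * 2, batch_first=True)
        self.out = nn.Linear(hidden * 2, lf_vocab)   # logical-form vocabulary

    def forward(self, bert_embeddings, max_len=20):
        enc, _ = self.encoder(bert_embeddings)
        # feed the final encoder state as a fixed context at each decode step
        context = enc[:, -1:, :].repeat(1, max_len, 1)
        dec, _ = self.decoder(context)
        return self.out(dec)       # (batch, max_len, lf_vocab) token logits

question_embeddings = torch.randn(1, 8, 768)   # stand-in for BERT output
logits = Seq2SeqParser()(question_embeddings)
print(logits.shape)                # torch.Size([1, 20, 500])
```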
NATURAL LANGUAGE PROCESSING
Course References
Text Book:
1. “Introduction to Natural Language Processing”, Jacob Eisenstein, MIT Press, Adaptive Computation and Machine Learning series, 18th October 2019.
   An open-source softcopy is available at
   https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf

Reference Books:
1. “Speech and Language Processing”, Daniel Jurafsky and James H. Martin, 2nd edition paperback, 2013.
   The more up-to-date 3rd edition draft is available at http://web.stanford.edu/~jurafsky/slp3/
THANK YOU

Dr. Mamatha H R
Professor, Department of Computer Science
[email protected]
+91 80 2672 1983 Extn 712
