1 Natural Language Processing-Intro


Natural Language Processing
Introduction
• Natural language processing (NLP) is an area of computer science and
artificial intelligence concerned with the interactions between
computers and human (natural) languages, in particular how to
program computers to process and analyze large amounts of natural
language data.

• NLP is a branch of artificial intelligence that helps computers
understand, interpret and manipulate human language.

• NLP draws from many disciplines, including computer science and
computational linguistics, in its pursuit to fill the gap between human
communication and computer understanding.
Evolution of natural language processing

• The history of natural language processing generally started in the 1950s, although
work can be found from earlier periods. In 1950, Alan Turing published an article
titled "Computing Machinery and Intelligence" which proposed what is now called
the Turing test as a criterion of intelligence.
• Although NLP is not a new science, the technology is advancing rapidly thanks to
increased interest in human-to-machine communication, plus the availability of big
data, powerful computing and enhanced algorithms.

• As humans, we may speak and write in English, Hindi, Spanish or Chinese, but a
computer's native language, known as machine code or machine language, is
largely incomprehensible to most people.

• At your device’s lowest levels, communication occurs not with words but through
millions of zeros and ones that produce logical actions.
Why is NLP important?

Large volumes of textual data

• Natural language processing helps computers communicate with
humans in their own language and scales other language-related tasks.
For example, NLP makes it possible for computers to read text, hear
speech, interpret it, measure sentiment and determine which parts are
important.
• Today’s machines can analyse more language-based data than humans,
without fatigue and in a consistent, unbiased way. Considering the
staggering amount of unstructured data that’s generated every day,
from medical records to social media, automation will be critical to fully
analyze text and speech data efficiently.
Why is NLP important?

Structuring a highly unstructured data source

• Human language is astoundingly complex and diverse. We express ourselves in infinite ways, both
verbally and in writing. Not only are there hundreds of languages and dialects, but within each
language is a unique set of grammar and syntax rules, terms and slang.

• When we write, we often misspell or abbreviate words, or omit punctuation. When we speak, we
have regional accents, and we mumble, stutter and borrow terms from other languages.

• While supervised and unsupervised learning, and specifically deep learning, are now widely used
for modelling human language, there’s also a need for syntactic and semantic understanding and
domain expertise that are not necessarily present in these machine learning approaches.

• NLP is important because it helps resolve ambiguity in language and adds useful numeric structure
to the data for many downstream applications, such as speech recognition or text analytics.
How does NLP work?
Breaking down the elemental pieces of language
• Natural language processing includes many different techniques for interpreting
human language, ranging from statistical and machine learning methods to rules-
based and algorithmic approaches. We need a broad array of approaches because
the text- and voice-based data varies widely, as do the practical applications.

• Basic NLP tasks include tokenization and parsing, lemmatization/stemming,
part-of-speech tagging, language detection and identification of semantic
relationships.

• In general terms, NLP tasks break down language into shorter, elemental pieces,
try to understand relationships between the pieces and explore how the pieces
work together to create meaning.
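As a rough illustration of two of the basic tasks named above, the sketch below tokenizes a sentence and applies a deliberately naive stemmer. The suffix list and the function names are assumptions made for this example; production stemmers such as the Porter stemmer use far richer rules.

```python
import re

# Illustrative suffix list (an assumption for this sketch, not a real stemmer's rules).
SUFFIXES = ("ing", "ed", "es", "s")

def tokenize(text):
    """Split text into lowercase word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def naive_stem(token):
    """Strip the first matching suffix, keeping a stem of at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

tokens = tokenize("The parsers parsed the sentences, tagging words.")
stems = [naive_stem(t) for t in tokens]
print(tokens)
print(stems)
```

The crude stems ("pars", "sentenc", "tagg") show why stemming is often replaced by lemmatization, which maps words to dictionary forms instead of chopping suffixes.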
How does NLP work?
These underlying tasks are often used in higher-level NLP capabilities, such as:
• Content categorization. A linguistic-based document summary, including search and
indexing, content alerts and duplication detection.
• Topic discovery and modelling. Accurately capture the meaning and themes in text
collections, and apply advanced analytics to text, like optimization and forecasting.
• Contextual extraction. Automatically pull structured information from text-based
sources.
• Sentiment analysis. Identify the mood or subjective opinions within large amounts
of text, including average sentiment and opinion mining.
• Speech-to-text and text-to-speech conversion. Transform voice commands into
written text, and vice versa.
• Document summarization. Automatically generate synopses of large bodies of text.
• Machine translation. Automatically translate text or speech from one language to
another.
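To make one of these capabilities concrete, here is a minimal lexicon-based sketch of sentiment analysis that also computes the average sentiment over several texts. The word lists and sample reviews are hypothetical and chosen only for illustration; real systems typically use much larger lexicons or trained models.

```python
import re

# Hypothetical mini-lexicon, for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}

def sentiment_score(text):
    """Count positive words minus negative words; above zero suggests a positive tone."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

# Made-up sample texts.
reviews = [
    "Great phone, I love the camera",
    "Terrible battery and poor screen",
    "It arrived on Monday",
]
scores = [sentiment_score(r) for r in reviews]
average = sum(scores) / len(scores)
print(scores, average)
```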
How computers make sense of textual data
NLP and Text analytics
• Natural language processing goes hand in hand with text analytics, which counts,
groups and categorizes words to extract structure and meaning from large
volumes of content. Text analytics is used to explore textual content and derive
new variables from raw text that may be visualized, filtered, or used as inputs to
predictive models or other statistical methods.
• NLP and text analytics are used together for many applications, including:
• Investigative discovery. Identify patterns and clues in emails or written reports to
help detect and solve crimes.
• Subject-matter expertise. Classify content into meaningful topics so you can take
action and discover trends.
• Social media analytics. Track awareness and sentiment about specific topics and
identify key influencers.
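The idea of counting words and deriving new variables from raw text can be sketched as a small document-term matrix. The two sample documents are invented for this example, and the helper name `word_counts` is an assumption.

```python
import re
from collections import Counter

# Invented sample documents.
documents = [
    "The suspect sent an email about the meeting",
    "The meeting report mentions the email twice: email one and email two",
]

def word_counts(text):
    """Count lowercase alphabetic words in a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

counts = [word_counts(d) for d in documents]

# The vocabulary is every distinct word seen across all documents.
vocabulary = sorted(set().union(*counts))

# One row per document, one column per vocabulary word: numeric variables
# that could feed a chart, a filter, or a predictive model.
matrix = [[c[w] for w in vocabulary] for c in counts]
print(vocabulary)
print(matrix)
```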
How computers make sense of textual data
Everyday NLP Examples
There are many common and practical applications of NLP in everyday life.
• Virtual assistants like Alexa or Siri.
• Spam filtering, a statistical NLP technique that compares the words in spam
to those in valid emails to identify junk mail.
• The automatic transcript of a voicemail in your email inbox or smartphone
app. That's speech-to-text conversion, an NLP capability.
• Navigating a website by using its built-in search bar, or by selecting suggested
topic, entity or category tags. This uses NLP methods for search, topic modelling,
entity extraction and content categorization.
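One common way to compare the words in spam with those in valid mail is a naive Bayes classifier. The sketch below assumes that technique, which the slides do not name, along with a tiny invented training set; the function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def train(messages):
    """messages: list of (text, label) pairs, label being 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    best_label, best_score = None, -math.inf
    for label in ("spam", "ham"):
        total = sum(word_counts[label].values())
        # Start from the log prior probability of the class.
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for word in tokenize(text):
            # Add-one (Laplace) smoothing avoids log(0) for unseen words.
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training data, for illustration only.
training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train(training)
print(classify("claim your free money", *model))
print(classify("monday meeting agenda", *model))
```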
Components of NLP

Natural Language Understanding (NLU)

• NLU directly enables human-computer interaction (HCI).

• NLU's understanding of natural human languages enables computers to
understand commands without the formalized syntax of computer
languages, and to communicate back to humans in their own languages.
• NLU is tasked with communicating with untrained individuals and
understanding their intent, meaning that NLU goes beyond understanding
words and interprets meaning. NLU is even programmed with the ability
to understand meaning in spite of common human errors like
mispronunciations or transposed letters or words.
Components of NLP

Natural Language Understanding (NLU)

• NLU encompasses one of the narrower but especially complex
challenges of AI: how best to handle unstructured inputs that are
governed by poorly defined and flexible rules, and convert them into a
structured form that a machine can understand and act upon.
• While humans are able to effortlessly handle mispronunciations, swapped
words, contractions, colloquialisms, and other quirks, machines are less
adept at handling unpredictable inputs.
Components of NLP

Natural Language Understanding (NLU)


• NLU uses algorithms to reduce human speech into a structured ontology.
• AI can break speech down into such things as intent, timing, locations and
sentiments.
• For example, a request for a trip to Shimla on the 8th of August might
break down something like this: Rahul tickets [intent] / need: trip
reservation [intent] / Shimla [location] / August 8th [date].
• The main drive behind NLU is to create chat- and speech-enabled bots that
can interact effectively with the public without supervision.
• NLU is a pursuit of many start-ups and major IT companies. Companies working
on NLU include Medium's Lola, Amazon with Alexa and Lex, Apple's Siri,
Google's Assistant and Microsoft's Cortana.
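The breakdown of a request into intent, location and date can be sketched with simple rules. This is only a toy, rule-based illustration under stated assumptions: the location list, the intent keywords and the function name `parse_request` are all hypothetical, and real NLU systems generally rely on trained models instead of keyword lookups.

```python
import re

# Hypothetical gazetteer and intent keywords, for illustration only.
KNOWN_LOCATIONS = {"shimla", "delhi", "manali"}
INTENT_KEYWORDS = {"trip": "trip reservation", "tickets": "ticket booking"}

# Matches dates such as "8th of august" or "8 August".
DATE_PATTERN = re.compile(
    r"\b(\d{1,2})(?:st|nd|rd|th)?\s+(?:of\s+)?"
    r"(january|february|march|april|may|june|july|august|"
    r"september|october|november|december)\b",
    re.IGNORECASE,
)

def parse_request(text):
    """Turn a free-text request into a small structured frame."""
    words = re.findall(r"[a-z]+", text.lower())
    intents = [INTENT_KEYWORDS[w] for w in words if w in INTENT_KEYWORDS]
    locations = [w.capitalize() for w in words if w in KNOWN_LOCATIONS]
    match = DATE_PATTERN.search(text)
    date = f"{match.group(2).capitalize()} {int(match.group(1))}" if match else None
    return {"intent": intents, "location": locations, "date": date}

frame = parse_request("I need a trip to Shimla on the 8th of august")
print(frame)
```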
Components of NLP

Natural Language Generation (NLG)

• NLG is the natural language processing task of generating natural
language from a machine representation system such as a knowledge
base or a logical form.
• An NLG system is like a translator that converts data into a natural language
representation.
• NLG may be viewed as the opposite of NLU: in NLU, the system needs to
disambiguate the input sentence to produce the machine representation
language, whereas in NLG the system needs to make decisions about how to
put a concept into words.
Components of NLP

Natural Language Generation (NLG)

• A simple example is a system that generates template-based sentences.
These systems do not typically involve grammar rules, but may generate a
sentence with pre-specified words.

Example : “Your credit card payment is due on [Date]”
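A template-based generator of this kind can be sketched in a few lines. The placeholder name `due_date`, the function name and the sample date are assumptions made for this example.

```python
from datetime import date
from string import Template

# The fixed wording comes from the slide; only the date slot varies.
TEMPLATE = Template("Your credit card payment is due on $due_date")

def render_reminder(due):
    """Fill the date slot with an ISO-formatted date (YYYY-MM-DD)."""
    return TEMPLATE.substitute(due_date=due.isoformat())

# Hypothetical due date.
message = render_reminder(date(2025, 8, 8))
print(message)
```

No grammar rules are involved: the sentence is fixed text plus one slot, which is why such systems are easy to build but limited in what they can express.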


Phases of NLP

• Phonology: Organizing sound systematically
• Morphology: Construction of words from primitive meaningful units
• Lexical Analysis: Identifying and analysing the structure of words
• Syntactic Analysis: Arranging words to make a sentence
• Semantic Analysis: Meaning of words and how to combine words into meaningful
sentences
• Pragmatic Analysis: Study of language by considering the context in which it is
used
• Discourse Analysis: How a preceding sentence can affect the interpretation of the
next sentence
Phases of NLP
• Phonology: Phonology deals with the study of the sound patterns of a language.

Phoneme: The smallest unit of sound that can distinguish meaning within a
language. For example, /p/ and /b/ distinguish "pat" from "bat".
