IJREAS, Volume 6, Issue 3 (March 2016), ISSN 2249-3905
International Journal of Research in Engineering and Applied Sciences (Impact Factor: 6.573)
Natural Language Processing: A Review
Sethunya R. Joseph, Hlomani Hlomani, Keletso Letsholo, Freeson Kaniwa, Kutlwano Sedimo
Computer Science Department,
Botswana International University of Science and Technology,
Palapye, Botswana
ABSTRACT
Natural Language Processing (NLP) is a way of analyzing texts by computerized means. NLP involves
gathering knowledge on how human beings understand and use language, in order to develop
appropriate tools and techniques which could make computer systems understand and manipulate
natural languages to perform various desired tasks. This paper reviews the literature on NLP and
outlines its history. It is based on document analysis. The paper could be beneficial to those who
wish to study and learn about NLP.
Keywords: NLP, machine translation, machine learning, computational techniques, linguists
1. Introduction
Various researchers have explained Natural Language Processing (NLP) as an area of research and
application that explores how computers can be used to understand and manipulate natural language text
or speech to do useful things ([2]; [3]; [6]; [7]).
Liddy [1] defines NLP as a theoretically motivated range of computational techniques for analyzing and
representing naturally occurring texts, at one or more levels of linguistic analysis, for the purpose of
achieving human-like language processing for a range of tasks or applications. The term NLP is normally
used to describe the function of software or hardware components in a computer system which analyze
or synthesize spoken or written language [ ]. The epithet "natural" is meant to distinguish human
speech and writing from more formal languages, such as mathematical notations or programming
languages, whose vocabulary and syntax are comparatively restricted [14].
Research and development in NLP over the last sixty years, as stated by Church and Rau [16], can be
categorized into the following five areas:
Natural Language Understanding
Natural Language Generation
Speech or Voice recognition
Machine Translation
Spelling Correction and Grammar Checking
The increasing demand for software that processes text of all kinds has been tremendously influenced by
the advent of the Internet and the World Wide Web. Over the past decade, Internet publishing has become a
commonplace activity for private individuals, commercial enterprises, and government organizations, as
well as traditional media companies, and the medium of most of these communications and transactions
is primarily natural language [14]. Various forms of keyword processing provide access to Web sites as
well as organizational principles for retrieving, navigating and browsing web pages within those sites.
Search engines and spam filters are now part of everyday life and work well enough that their viability as
products is not in question [14].
Language is more than the transfer of information. It is a set of resources that enable us to share
meanings, but it is not best thought of as a means for encoding meanings [ ]. The foundations of NLP fall
within a number of disciplines: computer and information sciences, linguistics, mathematics,
electrical and electronic engineering, psychology, artificial intelligence and robotics, etc. NLP applications
comprise a number of fields of study, such as natural language text processing and summarization,
machine translation, user interfaces, multilingual and cross-language information retrieval, speech
recognition, artificial intelligence and expert systems, and so on ([6]; [7]).
2. Scope and objective
Based on document analysis, this paper summarizes information on NLP: a general overview, its
history, and previous works. It then considers applications of NLP. The challenges and failures of
NLP, together with current and future research directions, are also discussed briefly. The paper is
intended to give an understanding to researchers, scholarly peers and companies who wish to stay
abreast of NLP technologies and applications, past, present and future.
3. Previous Works On NLP (Brief History)
NLP research dates back to the late 1940s, with machine translation (MT) said to be the first
computer-based application related to natural language. Weaver and Booth started one of the
earliest MT projects in 1946, on computer translation based on expertise in breaking enemy codes during
World War II. However, it is generally agreed that Weaver's memorandum of 1949 brought
the idea of MT to general notice and inspired many projects. Weaver suggested using ideas from
cryptography and information theory for language translation ([1]; [29]). According to Liddy [1], the earliest
work in MT followed the basic view that the only difference between languages was vested in their
vocabularies and the permitted word orders. Hence, systems built from this perspective
basically used dictionary lookup for appropriate words for translation, and reordered the words after
translation to fit the word-order rules of the target language. This was done without considering the
lexical ambiguity inherent in natural language. It generated poor results and called for researchers to
come up with a more sufficient theory of language. It was Chomsky's 1957 publication of Syntactic
Structures, which introduced the idea of generative grammar [25], that gave linguists a
better understanding of how they could help machine translation. Subsequently, other NLP
application areas began to emerge, such as speech recognition [1].
Since 1960 there have been significant developments, both in the production of prototype systems and
in theoretical issues. Work has mainly focused on the issue of how to represent meaning and on developing
computationally tractable solutions that the then-existing theories of grammar were not able to produce.
Examples are: Chomsky's transformational model of linguistics [ ]; the case grammar of
Fillmore [26]; the semantic networks of Quillian [27]; and the conceptual dependency theory of Schank, which
explained syntactic anomalies and provided semantic representations. Representation formalisms
included Wilks' preference semantics [28] and Kay's functional grammar. The augmented transition networks
of Woods extended the power of phrase-structure grammar by incorporating mechanisms from
programming languages [9].
Besides theoretical developments, many prototype systems were developed. According to Liddy [1]
these include: Weizenbaum's ELIZA [30], which was built to replicate the conversation between a
psychologist and a patient, simply by permuting or echoing the user input; Winograd's SHRDLU
simulation [8] of a robot that manipulated blocks on a tabletop, which showed that natural language
understanding was indeed possible for the computer; PARRY [31], which embodied a theory of paranoia in a system
that used groups of keywords instead of single keywords and used synonyms if keywords were not
found; and LUNAR, developed by Woods [9] as an interface system to a database of information
about lunar rock samples, using augmented transition networks and procedural semantics ([1]; [4]).
By the 1980s, substantial work had been done on natural language generation. For example, McKeown's
discourse planner TEXT [32] and McDonald's response generator MUMBLE [34] used rhetorical
predicates to produce declarative descriptions in the form of short texts, usually paragraphs; TEXT's
ability to generate comprehensible responses online was a notable achievement. However, by the early 1980s, there was an increasing
awareness of the limitations of isolated solutions to NLP problems and a general push towards
applications that worked with language in a broad, real-world context. From then to the present,
NLP has grown swiftly. This growth can be accredited to the advent of technologies such as the Internet,
fast computers with increased memory, and the increased availability of large amounts of electronic text [1].
4. Natural Language Processing Overview
Given NLP's lineage, it is clear that many of its early theories and methods derived from the field of
linguistics [4]. A major shift occurred in the early 1990s with the move to a reliance on empirical
methodologies, as opposed to the introspective generalizations that characterized the Chomsky era, which held sway
in theoretical linguistics. Liddy et al. [4] contend that the focus in NLP shifted from what might be
possible to do in a language and still have it be grammatically acceptable, to what is actually observed to
occur in naturally occurring text - that is, performance data. As more and larger corpora became available,
empirical methods and evaluation, rather than introspection-based methods and evaluation, became the
norm [4].
NLP researchers are now developing next-generation NLP systems that deal reasonably well with general
text and account for a good portion of the variability and ambiguity of a language. Statistical approaches
have thrived in dealing with many generic problems in computational linguistics, such as part-of-speech
identification, word sense disambiguation, etc., and have become standard throughout NLP [1]. A brief
illustration of the disambiguation task follows.
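To make the word sense disambiguation task concrete, the following is a minimal sketch using the classic Lesk algorithm as implemented in NLTK. Lesk is dictionary-overlap based rather than statistical, but it illustrates the task itself; the example sentence is invented, and the snippet assumes nltk is installed with the WordNet data downloaded.

```python
# Word sense disambiguation with the Lesk algorithm (NLTK).
# Requires: pip install nltk, then a one-time nltk.download("wordnet").
from nltk.wsd import lesk

# "bank" is ambiguous; the surrounding words supply the context.
context = "I went to the bank to deposit my money".split()
sense = lesk(context, "bank")  # picks the synset whose gloss best overlaps the context

if sense is not None:
    print(sense.name(), "-", sense.definition())
```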
Liddy's [1] sentiments are also shared by Liddy et al. [4]: the availability of larger, performance-oriented
corpora supported the use of statistical (machine learning) methods to learn the
transformations that in previous approaches were performed by hand-built rules, eventually providing
the empirical proof that statistical processing could accomplish some language analysis tasks at a level
comparable to human performance. Liddy et al. [4] argue further that at the center of this move lay the
understanding that most of the work to be effected by language processing algorithms is too complex to
be captured by rules constructed by human generalization, and rather requires machine learning methods.
According to Ringger et al. [7], the early statistical part-of-speech tagging algorithms, based on
Hidden Markov Models, were said to achieve performance comparable to that of humans.
On the test sections of the Penn Treebank [35], and also on unseen portions of the
Brown Corpus [33], an up-to-date statistical parser was shown to perform more accurately than a
broad-coverage rule-based parser [7]. Framing questions in the noisy channel model of information
theory, with the use of probability theory, maximum entropy, and mutual information, produced tangible
advances in automatic capabilities [7]. A sketch of the Hidden Markov Model approach to tagging is given below.
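The following is a minimal sketch of HMM part-of-speech tagging with Viterbi decoding. The tag set, vocabulary, and probabilities are toy values invented for illustration; real systems estimate them from a tagged corpus such as the Penn Treebank.

```python
# Toy HMM part-of-speech tagger with Viterbi decoding.
TAGS = ["DET", "NOUN", "VERB"]

# Transition probabilities P(tag_i | tag_{i-1}); "<s>" is the start state.
trans = {
    "<s>":  {"DET": 0.60, "NOUN": 0.30, "VERB": 0.10},
    "DET":  {"DET": 0.05, "NOUN": 0.85, "VERB": 0.10},
    "NOUN": {"DET": 0.10, "NOUN": 0.30, "VERB": 0.60},
    "VERB": {"DET": 0.50, "NOUN": 0.40, "VERB": 0.10},
}

# Emission probabilities P(word | tag); unseen words get a small floor.
emit = {
    "DET":  {"the": 0.7, "a": 0.3},
    "NOUN": {"dog": 0.4, "walk": 0.2, "park": 0.4},
    "VERB": {"walks": 0.5, "saw": 0.5},
}
FLOOR = 1e-6

def viterbi(words):
    """Return the most probable tag sequence for `words`."""
    # best[t] = (probability, tag path) of the best path ending in tag t.
    best = {t: (trans["<s>"][t] * emit[t].get(words[0], FLOOR), [t])
            for t in TAGS}
    for w in words[1:]:
        new = {}
        for t in TAGS:
            p, prev = max(
                (best[s][0] * trans[s][t] * emit[t].get(w, FLOOR), s)
                for s in TAGS)
            new[t] = (p, best[prev][1] + [t])
        best = new
    return max(best.values())[1]

print(viterbi(["the", "dog", "walks"]))  # -> ['DET', 'NOUN', 'VERB']
```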
The aforesaid transformations came about because of newly available extensive electronic resources
(e.g. sizable corpora, such as the Brown Corpus and others) which were collected
and distributed by the Linguistic Data Consortium. These were followed by lexical resources
such as WordNet, which provided lexical-semantic knowledge bases (i.e. it enabled use of the semantic
level of processing), and the Penn Treebank, which provided gold-standard syntactic resources that
steered the development and testing of progressively richer algorithmic analysis tools [4].
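As a brief illustration, a lexical-semantic knowledge base like WordNet can be queried through NLTK's interface (the word "bank" below is chosen arbitrarily; the snippet assumes nltk is installed with the WordNet data downloaded).

```python
# Querying the WordNet lexical-semantic knowledge base via NLTK.
# Requires: pip install nltk, then a one-time nltk.download("wordnet").
from nltk.corpus import wordnet as wn

# Each synset groups word forms that share a single sense.
for synset in wn.synsets("bank")[:3]:
    print(synset.name(), "-", synset.definition())

# Hypernym links expose the semantic hierarchy many NLP systems exploit.
dog = wn.synset("dog.n.01")
print(dog.hypernyms())
```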
A shift from the closed domains of the earliest NLP research (from the 60s through the 80s) to
open domains (e.g. newswire) has been made possible and supported by the increasing availability of
realistically sized resources coupled with machine learning methods. This broadening of domains was
further enabled by the availability of the broad-ranging textual resources of the web [4].
In parallel with these moves towards the use of more real-world data came the realization that
NLP researchers should evaluate their work on a larger scale, hence the introduction of empirically
based, blind evaluations across systems. These efforts led to the development of metrics such as BLEU
and ROUGE, which are integral to today's NLP research and which can be computed automatically,
with the results fed back into the research [4].
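The appeal of such metrics is that they are cheap to compute. As a minimal sketch, sentence-level BLEU can be calculated with NLTK; the reference and candidate sentences below are toy examples invented for illustration.

```python
# Sentence-level BLEU via NLTK: n-gram overlap between a system output
# and one or more human reference translations.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "sat", "on", "the", "mat"]]  # human reference(s)
candidate = ["the", "cat", "is", "on", "the", "mat"]     # system output

# Smoothing avoids zero scores when some higher-order n-grams never match.
score = sentence_bleu(reference, candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU = {score:.3f}")
```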
Concomitant with these advances in statistical capabilities, but moving at a slower pace, was the
demonstration that higher levels of human language analysis are amenable to NLP. The lower levels
(morphological, lexical, and syntactic) deal with smaller units of analysis and are considered to be
more rule-oriented, and therefore more amenable to statistical analysis, while the higher levels (with
semantics as a middle level, and discourse and pragmatics as the higher levels) admit of more free
choice and variability in usage; that is, these levels allow more variation, more exceptions, and
perhaps less regularity. Even so, Rhetorical Structure Theory by Mann & Thompson [10] demonstrated
that much larger units of analysis (e.g., treatises, instructional guides, etc.) are amenable to
computational analysis [4].
Wiebe et al. [13] state that, in information extraction, increasingly complex phenomena such as
subjectivity and opinion are identified automatically. Charniak et al. [11] and Quirk et al. [12] point out
that, in the most recent machine translation results, syntax-based MT outperforms surface-level word
and phrase replacement systems. These developments have led to the realization that NLP, through the
blending of statistical and symbolic methods, together with lexical resources such as WordNet,
syntactic and semantic resources such as PropBank, and the availability of large-scale corpora on which
to test and evaluate approaches, is gaining ground on the goal of realistic, human-like language
comprehension and production [4].
5. Applications of NLP
According to Church and Rau [16], in recent years natural language text interpretation and
processing technologies have gained an increasing level of sophistication. For example, generic
engines are now available which can deliver semantic representations for sentences, or deliver sentences
from representations. It is now possible to build very targeted systems for specific purposes, for example
finding index terms in open text, and to judge what level of syntax analysis is appropriate.
NLP technologies are becoming extremely important in the creation of user-friendly decision-support
systems for everyday non-expert users, particularly in the areas of knowledge acquisition, information
retrieval and language translation [16].
NLP technology has progressively advanced. This has happened for the following reasons: the web has
provided researchers with a readily accessible corpus of electronic documents on an unprecedented
scale; academia has placed a new emphasis upon empirical approaches to language processing that rely
more heavily upon corpus statistics than linguistic theory; and modern networked machines are capable
of processing millions of documents and performing the billions of calculations needed to build
statistical profiles of large corpora [14].
Massive quantities of text are becoming available in electronic form, ranging from published documents
such as electronic dictionaries, encyclopedias, libraries and archives for information retrieval services,
to private databases, personal email and faxes [15]. Online information services are reaching mainstream
computer users. Media attention has reached the point where hardly a day goes by without a new article
on the national information infrastructure, digital libraries, networked services, digital convergence or
intelligent agents. This attention is moving NLP along the critical path for all kinds of novel applications
[15] (see Figure 1).
Figure 1: A diagram showing the NLP continuum (adapted from Church and Rau [16])
Figure 1 shows a number of technologies, ranging from well-understood technologies, such as string
matching, to more forward-looking technologies, such as grammar checkers, conceptual search, event
extraction, interlingua, and so on.
There are many examples of the aforementioned technologies; this paper does not provide an
exhaustive list. Examples of NLP application products, as stated by Church and Rau [16], include
word processing and desktop publishing: WordPerfect (Novell) offers Grammatik 6, which checks for
grammatical errors and attempts to fix them, and Microsoft has demonstrated a considerable long-term
commitment to improving the technology by hiring a research group that has made significant
contributions to grammar checking [24]. Other examples of application products include: finite-state
automata (a practical set of algorithms that efficiently represent functions from strings to strings);
transducer techniques (which facilitate the retrieval of answers to service repair questions from a text
database of repair manuals); information retrieval products such as Xerox XSoft's Visual Recall, and OCR
products such as Xerox Imaging Systems' TextBridge; lexical products through the Desktop Document
Systems (DDS) division (which uses the technology to create a range of multilingual components that
can be embedded in information retrieval, translation, and other document management applications);
and spelling checkers (e.g. the Xerox MemoryWriter typewriter). Kaplan and Kay worked on these
issues for over a decade at Xerox PARC [22]; a simple sketch of the idea behind such dictionary-based
spelling checkers follows below.
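The following is an illustrative sketch of dictionary-based spelling correction: generate every string within one edit of the input and keep those found in a lexicon. The toy lexicon is invented, and this is a simplification for exposition only; the Xerox work cited above used finite-state transducers, which compile this kind of computation into far more efficient machines.

```python
# Toy dictionary-based spelling checker using one-edit candidates.
LETTERS = "abcdefghijklmnopqrstuvwxyz"
LEXICON = {"natural", "language", "spelling", "checker"}  # toy lexicon

def edits1(word):
    """All strings one deletion, transposition, substitution or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    substitutions = {a + c + b[1:] for a, b in splits if b for c in LETTERS}
    inserts = {a + c + b for a, b in splits for c in LETTERS}
    return deletes | transposes | substitutions | inserts

def correct(word):
    """Return `word` if known, else a lexicon word one edit away."""
    if word in LEXICON:
        return word
    candidates = edits1(word) & LEXICON
    return min(candidates, default=word)  # deterministic choice for the sketch

print(correct("langauge"))  # -> 'language'
```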
These applications (such as the spell checker for the Xerox MemoryWriter typewriter) were the basis for a
start-up company called Microlytics, formed in 1985 and later (1987) merged into the publicly traded
Selectronics Corp [16]. Through Microlytics, the original Kaplan and Kay algorithms found their way into
spelling checkers and thesauri, such as those included in popular systems such as MicroPro, Claris,
MacWrite II, Microsoft Word 4 (the thesaurus), Symantec, and WordFinder software sold to the PC and
Apple Macintosh user community.
The hand-held language-related device market is now dominated by companies such as Casio,
Seiko, Fuji, Xerox, Eurotronics, Franklin, Sharp and other, primarily Asian, manufacturers [16]. Word
processing and information management were previously cited as two of the better examples of
commercial opportunities for natural language processing [16]. The importance of information
management is beginning to be appreciated as vast quantities of text become available in electronic
form (e.g. digital libraries); it was not all that long ago that researchers referred to the Brown Corpus
as a "large" corpus [16]; [33]. Today, Dialog, Westlaw, Lexis-Nexis and other major vendors of
online information services are archiving hundreds of megabytes per night, the equivalent of one
Brown Corpus per hour [16]. Another significant recent development in NLP applications is a
spell checker for the native Tswana language, designed to work with Mozilla Firefox and
LibreOffice [19].
6. Challenges and failures
Church and Rau [16] point out that even though we should know better, it is so appealing to
fantasize about intelligent computers that understand human communication, that hyperbole is
practically unavoidable. Sometimes these practices work out for the best. Symantec, for example, a
highly successful vendor of software tools for the PC, started with a product called Q&A, an NLP
program for querying a database. The Q&A was successful because of its unique packaging of
AI/NLP with a good simple database facility. Neither would have been successful in isolation. The
AI/NLP generated initial sales, but the real value was in the database. People bought the product
because they were intrigued with the AI/NLP technology, but most users ended up turning off the
AI/NLP features [20]. But all too often excessive optimism results in a manic-like cycle of euphoric
activity followed by severe depression. In 1954, Georgetown University demonstrated what would
now be called a "toy" system. It was designed to translate a small corpus of approximately 50
Russian sentences into English. Little if any attempt was made to generalize to sentences beyond
the tiny test corpus ([16]; [29]).
The limitations of today's practical language processing technology have been summarized by
Bobrow and Weischedel [18] as follows:
1. Current systems have limited discourse capabilities that are almost exclusively
handcrafted. Thus current systems are limited to viewing interaction, translation, and
writing text as processing a sequence of either isolated sentences or loosely related
paragraphs. Consequently, the user must adapt to such limited discourse.
2. Domains must be narrow enough so that the constraints on the relevant semantic concepts
and relations can be expressed using current knowledge representation techniques, i.e.,
primarily in terms of types and sorts. Processing may be viewed abstractly as the
application of recursive tree-rewriting rules, including filtering out trees not matching a
certain pattern.
3. Handcrafting is necessary, particularly in the grammatical components of systems (the
component technology that exhibits least dependence on the application domain). Lexicons
and axiomatizations of critical facts must be developed for each domain, and these remain
time-consuming tasks.
4. The user must still adapt to the machine, but, as the products testify, the user can do so
effectively.
7. Current and Future progress of NLP
Active research on NLP phenomena includes the following areas ([17]; [18]):
Syntactic phenomena: those that pertain to the structure of a sentence and the order of words in the
sentence, based on the grammatical classes of words rather than their meaning (e.g. discriminative
models for scoring parses, coarse-to-fine efficient approximate parsing, dependency grammar).
Machine translation (e.g. models and algorithms, low-resource and morphologically complex languages).
Semantic phenomena: those that pertain to the meaning of a sentence relatively independent of the
context in which the language occurs (e.g. sentiment analysis, summarization, information extraction,
slot-filling, discourse analysis, textual entailment).
Pragmatic phenomena, including speech: those that relate the meaning of a sentence to the context in
which it occurs. This context can be linguistic (such as the previous text or dialogue) or non-linguistic
(such as knowledge about the person who produced the language, the goals of the communication, or
the objects in the current visual field) (e.g. language modelling of syntax and semantics, models of
acoustics and pronunciation).
Speech recognition and information retrieval have finally gone commercial, and there is a vast amount
of text and speech on the Internet, cell phones, etc. It is now clear that studies of almost any aspect of a
language are possible, e.g. formalizing insights such as discrete knowledge (what is possible) and
continuous knowledge (what is likely); studying the formalism mathematically; and developing,
implementing and testing algorithms on real data (a small sketch of a statistical language model, one of
the areas listed above, is given below). The current and anticipated improvements to NLP are: adding
features to existing interfaces; fully implementing back-end processing (e.g. information extraction and
normalization to build databases); and hand-held devices with translators and personal conversation
recorders with topical search [17].
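The following is a minimal sketch of a statistical bigram language model with add-one smoothing, illustrating the "continuous knowledge (what is likely)" idea above. The three-sentence training corpus is invented for illustration.

```python
# Bigram language model with Laplace (add-one) smoothing.
from collections import Counter

corpus = ["the dog walks", "the dog barks", "a cat walks"]
sents = [["<s>"] + line.split() + ["</s>"] for line in corpus]

unigram = Counter(w for s in sents for w in s)
bigram = Counter(pair for s in sents for pair in zip(s, s[1:]))
V = len(unigram)  # vocabulary size, including the boundary markers

def prob(word, prev):
    """P(word | prev) with add-one smoothing over the vocabulary."""
    return (bigram[(prev, word)] + 1) / (unigram[prev] + V)

# "dog barks" was seen in training, "cat barks" was not, so the former
# receives a higher smoothed probability.
print(prob("barks", "dog"), prob("barks", "cat"))
```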
8. Conclusions
As a computerized approach to analyzing text, NLP is continually moving forward. Researchers are
continually trying to gather knowledge on how human beings understand and use various
languages. This aids in the development of appropriate tools and techniques which make computer
systems understand and manipulate natural languages to perform various tasks. Technologies
such as string matching, keyword search and glossary lookup are now well established, while more
forward-looking technologies such as grammar checkers, conceptual search, event extraction and
interlingua are ongoing and moving forward.
References
[1] E.D. Liddy, "Natural Language Processing", 2001.
[2] N. Kaur, V. Pushe and R. Kaur, "Natural Language Processing Interface for Synonym", International Journal of Computer Science and Mobile Computing, Vol. 3, Issue 7, July 2014, pp. 638-642, ISSN 2320-088X.
[3] S. Vijayarani, J. Ilamathi and Nithya, "Preprocessing Techniques for Text Mining - An Overview", International Journal of Computer Science & Communication Networks, Vol. , Issue 1, pp. 7-16, ISSN 2249-5789.
[4] E.D. Liddy, E. Hovy, J. Lin, J. Prager, D. Radev, L. Vanderwende and R. Weischedel, "Natural Language Processing", one of five reports based on the MINDS workshops.
[5] G. Chowdhury, "Natural language processing", Annual Review of Information Science and Technology, Vol. 37, 2003, pp. 51-89, ISSN 0066-4200.
[6] S. Jusoh and H.M. Alfawareh, "Natural language interface for online sales", in Proceedings of the International Conference on Intelligent and Advanced Systems (ICIAS 2007), Malaysia: IEEE, November 2007, pp. 224-228.
[7] E.K. Ringger, R.C. Moore, E. Charniak, L. Vanderwende and H. Suzuki, "Using the Penn Treebank to Evaluate Non-Treebank Parsers", in Proceedings of the Language Resources and Evaluation Conference (LREC), Lisbon, Portugal, 2004.
[8] T. Winograd, "Procedures as a Representation for Data in a Computer Program for Understanding Natural Language", MIT-AI-TR-235, 1971.
[9] W.A. Woods, "Transition Network Grammars for Natural Language Analysis", Communications of the ACM, Vol. 13, No. 10, 1970.
[10] W.C. Mann and S. Thompson, "Rhetorical Structure Theory: Toward a Functional Theory of Text Organization", Text, Vol. 8, No. 3, 1988, pp. 243-281.
[11] E. Charniak, K. Knight and K. Yamada, "Syntax-based Language Models for Statistical Machine Translation", in Proceedings of MT Summit IX, 2003.
[12] C. Quirk, A. Menezes and C. Cherry, "Dependency Treelet Translation: Syntactically Informed Phrasal SMT", in Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, Michigan, 2005.
[13] J. Wiebe, E. Breck, C. Buckley, C. Cardie, P. Davis, B. Fraser, D. Litman, D. Pierce, E. Riloff, T. Wilson, D. Day and M. Maybury, "Recognizing and Organizing Opinions Expressed in the World Press", in Proceedings of the 2003 AAAI Spring Symposium on New Directions in Question Answering, 2003.
[14] P. Jackson and I. Moulinier, Natural Language Processing for Online Applications, Cambridge University Press, New York, 2012, pp. 7-9.
[15] R. Bose, "Natural language processing: Current state and future directions", International Journal of the Computer, the Internet and Management, Vol. 12, No. 1 (January-April 2004), pp. 1-11.
[16] K.W. Church and L.F. Rau, "Commercial applications of Natural Language Processing", Communications of the ACM, Vol. 38, No. 11, November 1995.
[17] J. Eisner, "Current and future NLP research".
[18] R.J. Bobrow and R.M. Weischedel, Challenges in Natural Language Processing, Cambridge University Press, New York, 1993.
[19] D. Bailey et al., "Tswana Spell Checker", available online: https://addspns.mozilla.org/.../fi.../addon/tswana-spell-checker, last accessed 11/14/2015.
[20] W. Frakes and R. Baeza-Yates (Eds.), Information Retrieval: Data Structures & Algorithms, Prentice Hall, Englewood Cliffs, NJ, 1992.
[21] W. Francis and H. Kucera, Houghton Mifflin, Brown University, 1982.
[22] R.M. Kaplan and M. Kay, "Regular models of phonological rule systems", American J. Computational Linguistics, 1994.
[24] K. Jensen, G. Heidorn and S. Richardson, Natural Language Processing: The PLNLP Approach, Kluwer Academic Publishers, Boston, 1993.
[25] N. Chomsky, Syntactic Structures, The Hague: Mouton & Co., 1957; reprinted 1978, Peter Lang Publishing.
[26] C.J. Fillmore, "The Case for Case", in Bach and Harms (Eds.): Universals in Linguistic Theory, New York: Holt, Rinehart and Winston, pp. 1-88, 1968.
[27] R. Quillian, "A notation for representing conceptual information: An application to semantics and mechanical English paraphrasing", SP-1395, System Development Corporation, Santa Monica, 1963.
[28] Y. Wilks, "Preference Semantics", 1973.
[29] J. Hutchins, "The history of Machine translation in a nutshell", available online: http://ourworld.compuserve.com/homepages/WJHutchins, revised November 2005, retrieved 11/13/2015.
[30] J. Weizenbaum, "ELIZA - A Computer Program For the Study of Natural Language Communication Between Man And Machine", Communications of the ACM, Vol. 9, No. 1, pp. 36-45, January 1966, doi:10.1145/365153.365168.
[31] V. Cerf, "PARRY encounters the DOCTOR", IETF RFC 439, January 1973.
[32] K.R. McKeown, "Discourse Strategies for Generating Natural Language Text", Artificial Intelligence.
[33] W.N. Francis and H. Kucera, "The Brown Corpus Manual", available online: http://clu.uni.no/icame/brown/bcm.html, accessed 13/11/2015.
[34] R. Rubinoff, "Adapting MUMBLE: experience with natural language generation", in HLT: Proceedings of the Workshop on Strategic Computing Natural Language, pp. 211.
[35] A. Taylor, "The Penn Treebank: Overview", available at http://www.ldc.upenn.edu.